java.lang.Object
  org.apache.tika.parser.html.DefaultHtmlMapper
public class DefaultHtmlMapper extends java.lang.Object implements HtmlMapper
The default HTML mapping rules in Tika.
| Constructor Summary |
|---|
| DefaultHtmlMapper() |
| Method Summary | |
|---|---|
| boolean | isDiscardElement(java.lang.String name): Checks whether all content within the given HTML element should be discarded instead of including it in the parse output. |
| java.lang.String | mapSafeElement(java.lang.String name): Maps "safe" HTML element names to semantic XHTML equivalents. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
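
The contract of the two methods summarized above can be illustrated with a minimal sketch. It assumes a locally declared stand-in interface (`SketchHtmlMapper`, hypothetical) so that it compiles without Tika on the classpath. The element names in the tables are assumptions chosen for illustration, not the actual rules of `DefaultHtmlMapper`.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the two-method mapper contract described above.
// It is declared locally so that the sketch compiles without Tika.
interface SketchHtmlMapper {
    String mapSafeElement(String name);
    boolean isDiscardElement(String name);
}

// Illustrative mapper. The element names are assumptions made for this
// example, not DefaultHtmlMapper's real mapping tables.
class SketchMapper implements SketchHtmlMapper {
    private static final Map<String, String> SAFE = Map.of(
            "P", "p",
            "H1", "h1",
            "B", "b");
    private static final Set<String> DISCARD = Set.of("SCRIPT", "STYLE");

    @Override
    public String mapSafeElement(String name) {
        // A null result means: drop the tag itself, keep processing its content.
        return SAFE.get(name.toUpperCase(Locale.ROOT));
    }

    @Override
    public boolean isDiscardElement(String name) {
        // true means: drop the element together with everything inside it.
        return DISCARD.contains(name.toUpperCase(Locale.ROOT));
    }
}
```

With this sketch, `new SketchMapper().mapSafeElement("P")` yields `"p"`, an unlisted name such as `"BLINK"` yields `null`, and `isDiscardElement("SCRIPT")` yields `true`.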
| Constructor Detail |
|---|
public DefaultHtmlMapper()
| Method Detail |
|---|
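public java.lang.String mapSafeElement(java.lang.String name)

Description copied from interface: HtmlMapper

Maps "safe" HTML element names to semantic XHTML equivalents. If the element is unsafe, this method returns null and the element will be ignored, but the content inside it is still processed. See the HtmlMapper.isDiscardElement(String) method for a way to discard the entire contents of an element.

Specified by: mapSafeElement in interface HtmlMapper
Parameters: name - HTML element name (upper case)
Returns: the XHTML equivalent of the element name, or null if the element is unsafe

public boolean isDiscardElement(java.lang.String name)

Description copied from interface: HtmlMapper

Checks whether all content within the given HTML element should be discarded instead of including it in the parse output.

Specified by: isDiscardElement in interface HtmlMapper
Parameters: name - HTML element name (upper case)
Returns: true if content inside the named element should be ignored, false otherwise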
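
The difference between the two methods (ignore the tag but keep its content, versus drop the whole subtree) is easiest to see from the consumer's side. The following is a rough sketch of how a parser might apply the two decisions to a flat event stream. It is not Tika's actual parsing code: `MapperWalk`, the `+NAME` / `-NAME` event encoding, and the lambdas standing in for the mapper are all hypothetical.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical consumer of the mapper contract. Events use a toy encoding:
// "+NAME" opens an element, "-NAME" closes it, anything else is text.
class MapperWalk {
    static String render(List<String> events,
                         Function<String, String> mapSafe,
                         Predicate<String> discard) {
        StringBuilder out = new StringBuilder();
        int discardDepth = 0; // greater than 0 while inside a discarded element
        for (String e : events) {
            if (e.startsWith("+")) {
                String name = e.substring(1);
                if (discardDepth > 0 || discard.test(name)) {
                    discardDepth++; // skip this element and all of its content
                } else {
                    String mapped = mapSafe.apply(name);
                    if (mapped != null) {
                        out.append('<').append(mapped).append('>');
                    } // null: tag is dropped, content is still processed
                }
            } else if (e.startsWith("-")) {
                String name = e.substring(1);
                if (discardDepth > 0) {
                    discardDepth--;
                } else {
                    String mapped = mapSafe.apply(name);
                    if (mapped != null) {
                        out.append("</").append(mapped).append('>');
                    }
                }
            } else if (discardDepth == 0) {
                out.append(e); // text outside any discarded element
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> events = List.of(
                "+P", "Hello ", "+FONT", "world", "-FONT",
                "+SCRIPT", "alert(1)", "-SCRIPT", "-P");
        // Assumed rules for the demo: only P is safe, SCRIPT is discarded.
        String result = render(events,
                n -> n.equals("P") ? "p" : null,
                n -> n.equals("SCRIPT"));
        System.out.println(result); // prints <p>Hello world</p>
    }
}
```

In this run the `FONT` tag disappears while its text survives, because the stand-in for `mapSafeElement` returns `null` for it. The `SCRIPT` element vanishes along with its content, because the stand-in for `isDiscardElement` returns `true`.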