Package de.l3s.boilerpipe.sax
Class BoilerpipeSAXInput
java.lang.Object
    de.l3s.boilerpipe.sax.BoilerpipeSAXInput

All Implemented Interfaces: BoilerpipeInput
public final class BoilerpipeSAXInput extends java.lang.Object implements BoilerpipeInput
Parses an InputSource using SAX and returns a TextDocument.
Constructor Summary
Constructor | Description
BoilerpipeSAXInput(org.xml.sax.InputSource is) | Creates a new instance of BoilerpipeSAXInput for the given InputSource.
Method Summary
Modifier and Type | Method | Description
TextDocument | getTextDocument() | Retrieves the TextDocument using a default HTML parser.
TextDocument | getTextDocument(BoilerpipeHTMLParser parser) | Retrieves the TextDocument using the given HTML parser.
Constructor Detail
BoilerpipeSAXInput
public BoilerpipeSAXInput(org.xml.sax.InputSource is) throws org.xml.sax.SAXException
Creates a new instance of BoilerpipeSAXInput for the given InputSource.
Parameters: is - the InputSource to parse
Throws: org.xml.sax.SAXException
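The constructor takes a standard org.xml.sax.InputSource, so the HTML can come from any reader, byte stream, or system ID. As a minimal sketch using only JDK classes, the following wraps an in-memory HTML string in an InputSource. The class name InputSourceSketch and the helper fromHtml are hypothetical names chosen for illustration; they are not part of boilerpipe. The boilerpipe call itself appears only as a comment, since it requires the library on the classpath.

```java
import java.io.StringReader;
import org.xml.sax.InputSource;

public class InputSourceSketch {

    /** Hypothetical helper: wraps an in-memory HTML string in a SAX InputSource. */
    public static InputSource fromHtml(String html) {
        // When a character stream is set, a SAX parser reads it directly and
        // ignores any encoding declaration, so no charset handling is needed here.
        return new InputSource(new StringReader(html));
    }

    public static void main(String[] args) {
        InputSource is = fromHtml(
            "<html><head><title>Demo</title></head>"
            + "<body><p>Hello, boilerpipe.</p></body></html>");

        // Confirms the source carries a character stream and no byte stream.
        System.out.println("character stream set: " + (is.getCharacterStream() != null));
        System.out.println("byte stream set: " + (is.getByteStream() != null));

        // With boilerpipe on the classpath, the source would then be handed over,
        // following the signatures documented on this page:
        //   TextDocument doc = new BoilerpipeSAXInput(is).getTextDocument();
    }
}
```

An InputSource backed by a Reader can typically be consumed only once, so a fresh InputSource is needed for each document that is parsed.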
Method Detail
getTextDocument
public TextDocument getTextDocument() throws BoilerpipeProcessingException
Retrieves the TextDocument using a default HTML parser.
Specified by: getTextDocument in interface BoilerpipeInput
Returns: A TextDocument.
Throws: BoilerpipeProcessingException
getTextDocument
public TextDocument getTextDocument(BoilerpipeHTMLParser parser) throws BoilerpipeProcessingException
Retrieves the TextDocument using the given HTML parser.
Parameters: parser - The parser used to transform the input into boilerpipe's internal representation.
Returns: The retrieved TextDocument.
Throws: BoilerpipeProcessingException