Package de.jungblut.crawl.extraction
Class HtmlExtrator
java.lang.Object
  de.jungblut.crawl.extraction.HtmlExtrator

All Implemented Interfaces:
  Extractor<HtmlExtrator.HtmlFetchResult>
public final class HtmlExtrator extends java.lang.Object implements Extractor<HtmlExtrator.HtmlFetchResult>
Extractor for raw HTML.

Author:
  thomas.jungblut
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | HtmlExtrator.HtmlFetchResult | Article content fetch result.
Constructor Summary
Constructors
Constructor | Description
HtmlExtrator() |
Method Summary
All Methods | Instance Methods | Concrete Methods
Modifier and Type | Method | Description
HtmlExtrator.HtmlFetchResult | extract(java.lang.String site) | Extracts all the needed content from a given URL and returns it.
Method Detail
extract
public final HtmlExtrator.HtmlFetchResult extract(java.lang.String site)
Description copied from interface: Extractor
Extracts all the needed content from a given URL and returns it. Returns null if nothing should be returned or if nothing could be parsed.

Specified by:
  extract in interface Extractor<HtmlExtrator.HtmlFetchResult>
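
Because extract may return null, callers need a null check before using the result. The following is a minimal, self-contained sketch of that contract, not the library's actual code: the Extractor interface here is a local stand-in with the same method shape as documented above, and RawHtmlResult, StubHtmlExtractor, and tryExtract are hypothetical names introduced only for illustration. The fields of the real HtmlFetchResult are not shown on this page, so none are assumed.

```java
import java.util.Optional;

// Local stand-in (assumption) mirroring the documented contract:
// extract returns a result, or null if nothing could be parsed.
interface Extractor<T> {
    T extract(String site);
}

// Hypothetical result type; the real HtmlFetchResult is not described here.
final class RawHtmlResult {
    final String url;
    final String html;

    RawHtmlResult(String url, String html) {
        this.url = url;
        this.html = html;
    }
}

// Toy extractor that performs no network I/O. It returns null for input
// that does not look like an HTTP(S) URL, imitating the "nothing could
// be parsed" case.
final class StubHtmlExtractor implements Extractor<RawHtmlResult> {
    @Override
    public RawHtmlResult extract(String site) {
        if (site == null || !site.startsWith("http")) {
            return null;
        }
        return new RawHtmlResult(site, "<html><body>stub for " + site + "</body></html>");
    }
}

public class ExtractorNullHandlingDemo {
    // Wraps the nullable result in an Optional so the caller has to
    // handle the missing case explicitly.
    static <T> Optional<T> tryExtract(Extractor<T> extractor, String site) {
        return Optional.ofNullable(extractor.extract(site));
    }

    public static void main(String[] args) {
        Extractor<RawHtmlResult> extractor = new StubHtmlExtractor();
        for (String site : new String[] {"http://example.com", "not-a-url"}) {
            String summary = tryExtract(extractor, site)
                .map(r -> r.html.length() + " chars from " + r.url)
                .orElse("no result for " + site);
            System.out.println(summary);
        }
    }
}
```

With the real class, the same pattern would presumably apply by passing a new HtmlExtrator() wherever an Extractor<HtmlExtrator.HtmlFetchResult> is expected.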