Package de.jungblut.crawl.extraction
Interface Summary

Interface                           Description
Extractor<T extends FetchResult>    Simple extraction logic interface for a site and a result.
Class Summary

Class                                        Description
ArticleContentExtrator                       Extractor for news articles.
ArticleContentExtrator.ContentFetchResult    Article content fetch result.
HtmlExtrator                                 Extractor for raw html.
HtmlExtrator.HtmlFetchResult                 Article content fetch result.
OutlinkExtractor                             Outlink extractor, parses a page just for its outlinks.
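
The summary above gives only one-line descriptions, so the following is a minimal sketch of how an extractor that "parses a page just for its outlinks" might be shaped. Every type in it is a local stand-in written for illustration: the nested `FetchResult` and `Extractor` are not the package's actual declarations, and the method name `extract`, its parameters, the `outlinks` field, and the class `SimpleOutlinkExtractor` are all assumptions. To stay runnable without network access, the sketch takes already-fetched HTML as a string and finds links with a regular expression, which is a simplification and not necessarily what the library does.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractorSketch {

    // Hypothetical stand-in for the package's FetchResult: a URL plus its outlinks.
    static class FetchResult {
        final String url;
        final Set<String> outlinks;

        FetchResult(String url, Set<String> outlinks) {
            this.url = url;
            this.outlinks = outlinks;
        }
    }

    // Hypothetical stand-in for Extractor<T extends FetchResult>. The method
    // name and parameters are assumptions, not the library's declaration.
    interface Extractor<T extends FetchResult> {
        T extract(String url, String html);
    }

    // Collects absolute http(s) href targets from already-fetched HTML.
    static class SimpleOutlinkExtractor implements Extractor<FetchResult> {
        private static final Pattern HREF = Pattern.compile(
            "href\\s*=\\s*\"(https?://[^\"]+)\"", Pattern.CASE_INSENSITIVE);

        @Override
        public FetchResult extract(String url, String html) {
            // LinkedHashSet drops duplicates while keeping discovery order.
            Set<String> links = new LinkedHashSet<>();
            Matcher m = HREF.matcher(html);
            while (m.find()) {
                links.add(m.group(1));
            }
            return new FetchResult(url, links);
        }
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://example.com/a\">A</a>"
                    + "<a href=\"/relative\">B</a>"
                    + "<a href=\"https://example.org/c\">C</a>";
        FetchResult result =
            new SimpleOutlinkExtractor().extract("http://example.com", html);
        System.out.println(result.outlinks);
    }
}
```

In this sketch relative links such as `/relative` are skipped because the pattern only matches absolute `http` or `https` targets. A real crawler would more likely resolve them against the page URL and use an HTML parser instead of a regular expression.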