Package de.jungblut.crawl
Interface Crawler<T extends FetchResult>
-
- Type Parameters:
T - the result type, which can be overridden by a subtype of FetchResult.
- All Known Implementing Classes:
MultithreadedCrawler, SequentialCrawler
public interface Crawler<T extends FetchResult>
Basic crawler interface. All implementations should implicitly provide a constructor that takes the same arguments as setup and redirects the call to it.
- Author:
- thomas.jungblut
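The constructor convention described above can be sketched as follows. This is a minimal illustration, not the library's code: `FetchResult`, `Extractor`, `ResultWriter` and `Crawler` are redeclared here as simplified stand-ins so the block compiles on its own, the `extract` and `write` method shapes are assumptions, and `SimpleCrawler` is a hypothetical implementation that visits only the seed URLs.

```java
import java.io.IOException;
import java.util.concurrent.ExecutionException;

// Simplified stand-ins for the library types (assumed shapes, not the real API).
class FetchResult {
    final String url;
    FetchResult(String url) { this.url = url; }
}

interface Extractor<T extends FetchResult> {
    T extract(String url); // assumed method shape
}

interface ResultWriter<T extends FetchResult> {
    void write(T result) throws IOException; // assumed method shape
}

interface Crawler<T extends FetchResult> {
    void setup(int fetches, Extractor<T> extractor, ResultWriter<T> writer) throws IOException;
    void process(String... seedUrl) throws InterruptedException, ExecutionException;
}

// Hypothetical implementation: the constructor mirrors setup and delegates to it.
class SimpleCrawler<T extends FetchResult> implements Crawler<T> {
    private int fetches;
    private Extractor<T> extractor;
    private ResultWriter<T> writer;

    SimpleCrawler(int fetches, Extractor<T> extractor, ResultWriter<T> writer) throws IOException {
        setup(fetches, extractor, writer);
    }

    // final, because it is called from the constructor
    @Override
    public final void setup(int fetches, Extractor<T> extractor, ResultWriter<T> writer) throws IOException {
        this.fetches = fetches;
        this.extractor = extractor;
        this.writer = writer;
    }

    // Visits the seeds only, stopping once the fetch limit is reached.
    @Override
    public void process(String... seedUrl) throws InterruptedException, ExecutionException {
        int done = 0;
        for (String url : seedUrl) {
            if (done >= fetches) {
                break;
            }
            try {
                writer.write(extractor.extract(url));
            } catch (IOException e) {
                throw new ExecutionException(e);
            }
            done++;
        }
    }
}
```

With this shape, `new SimpleCrawler<>(2, extractor, writer)` leaves the crawler configured exactly as a later call to `setup(2, extractor, writer)` would.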
-
Method Summary
Modifier and Type    Method    Description
void    process(java.lang.String... seedUrl)    Starts the crawler, beginning at the given seed URLs.
void    setup(int fetches, Extractor<T> extractor, ResultWriter<T> writer)    Sets up this crawler.
-
Method Detail
-
setup
void setup(int fetches, Extractor<T> extractor, ResultWriter<T> writer) throws java.io.IOException
Sets up this crawler.
- Parameters:
fetches - the maximum number of fetches to perform.
extractor - the Extractor used to extract a FetchResult.
writer - the ResultWriter used to write the result to a sink.
- Throws:
java.io.IOException
-
process
void process(java.lang.String... seedUrl) throws java.lang.InterruptedException, java.util.concurrent.ExecutionException
Starts the crawler, beginning at the given seed URLs. The actual crawling logic is implemented by the crawler itself.
- Throws:
java.lang.InterruptedException
java.util.concurrent.ExecutionException
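Because process declares two checked exceptions, callers need to decide how to handle each. One possible approach is sketched below, under stated assumptions: `SeedProcessor` is a hypothetical stand-in with the same shape as process, and `CrawlRunner` is an illustrative helper, not part of the library.

```java
import java.util.concurrent.ExecutionException;

// Hypothetical stand-in with the same method shape as Crawler.process.
interface SeedProcessor {
    void process(String... seedUrl) throws InterruptedException, ExecutionException;
}

// Illustrative helper (an assumption, not library code).
final class CrawlRunner {
    private CrawlRunner() {}

    // Returns true if the crawl completed and false if the thread was interrupted.
    // A failed crawl is rethrown as an unchecked exception carrying the original cause.
    static boolean run(SeedProcessor crawler, String... seeds) {
        try {
            crawler.process(seeds);
            return true;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers further up can react to it.
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Crawl failed", e.getCause());
        }
    }
}
```

Given a configured crawler instance, a call might look like `CrawlRunner.run(crawler::process, "http://example.org")`, where the seed URL is a placeholder.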
-