| Package | Description |
|---|---|
| de.jungblut.crawl | |

| Modifier and Type | Class and Description |
|---|---|
| `class` | `ConsoleResultWriter<T extends FetchResult>` Simple class that writes results to the console. |
| `class` | `ResultWriterAdapter<T extends FetchResult>` Empty adapter class for a `ResultWriter`. |
| `class` | `SequenceFileResultWriter<T extends FetchResult>` Writes the results to a SequenceFile at "files/crawl/result.seq". |
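
The "empty adapter" idea behind `ResultWriterAdapter` can be sketched as follows. This page does not list the methods of `ResultWriter` or `FetchResult`, so the types below are simplified stand-ins: the method names `open`, `write`, `close` and `getUrl`, and the class `CollectingResultWriter`, are assumptions made for illustration and may not match the library.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types: the real de.jungblut.crawl interfaces are not shown on this page,
// so every method name here (getUrl, open, write, close) is an assumption.
interface FetchResult {
    String getUrl();
}

interface ResultWriter<T extends FetchResult> {
    void open();
    void write(T result);
    void close();
}

// Empty adapter: every method is a no-op, so subclasses override only what they need.
class ResultWriterAdapter<T extends FetchResult> implements ResultWriter<T> {
    @Override public void open() { }
    @Override public void write(T result) { }
    @Override public void close() { }
}

// Hypothetical writer that only cares about write() and keeps the URLs in memory.
class CollectingResultWriter<T extends FetchResult> extends ResultWriterAdapter<T> {
    private final List<String> urls = new ArrayList<>();

    @Override
    public void write(T result) {
        urls.add(result.getUrl());
    }

    List<String> getUrls() {
        return urls;
    }
}

class AdapterExample {
    // Drives the writer through a full open/write/close cycle and returns what it collected.
    static List<String> run() {
        CollectingResultWriter<FetchResult> writer = new CollectingResultWriter<>();
        writer.open();
        writer.write(() -> "http://example.com/a");
        writer.write(() -> "http://example.com/b");
        writer.close();
        return writer.getUrls();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The design choice illustrated here is that a subclass of the adapter stays short because it inherits no-op defaults for everything it does not override.
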

| Modifier and Type | Method and Description |
|---|---|
| `void` | `SequentialCrawler.setup(int fetches, Extractor<T> extractor, ResultWriter<T> writer)` |
| `void` | `MultithreadedCrawler.setup(int fetches, Extractor<T> extractor, ResultWriter<T> writer)` |
| `void` | `Crawler.setup(int fetches, Extractor<T> extractor, ResultWriter<T> writer)` Sets up this crawler. |

| Constructor and Description |
|---|
| `FetchResultPersister(ResultWriter<T> resWriter)` |
| `FetchResultPersister(ResultWriter<T> resWriter, org.apache.hadoop.conf.Configuration conf)` |
| `MultithreadedCrawler(int fetches, Extractor<T> extractor, ResultWriter<T> writer)` Constructs a new MultithreadedCrawler with 32 threads working on batches of 10 URLs at a time. |
| `MultithreadedCrawler(int threadPoolSize, int batchSize, int fetches, Extractor<T> extractor, ResultWriter<T> writer)` Constructs a new MultithreadedCrawler. |
| `SequentialCrawler(int fetches, Extractor<T> extractor, ResultWriter<T> writer)` |

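
The relationship between the two `MultithreadedCrawler` constructors can be pictured as a short form that fills in defaults for the long form. The sketch below is not the library's source: the class `CrawlerSettingsSketch` and its constant names are hypothetical, the extractor and writer arguments are left out to keep it self-contained, and reading "10" as the default batch size is an assumption based on the `batchSize` parameter of the longer constructor.

```java
// Hypothetical sketch of constructor delegation; not the library's implementation.
class CrawlerSettingsSketch {
    // Defaults taken from the constructor description; the constant names are assumptions.
    static final int DEFAULT_THREAD_POOL_SIZE = 32;
    static final int DEFAULT_BATCH_SIZE = 10;

    final int threadPoolSize;
    final int batchSize;
    final int fetches;

    // Short form: only the number of fetches is given, the rest falls back to the defaults.
    CrawlerSettingsSketch(int fetches) {
        this(DEFAULT_THREAD_POOL_SIZE, DEFAULT_BATCH_SIZE, fetches);
    }

    // Full form: every tuning parameter is explicit.
    CrawlerSettingsSketch(int threadPoolSize, int batchSize, int fetches) {
        this.threadPoolSize = threadPoolSize;
        this.batchSize = batchSize;
        this.fetches = fetches;
    }

    public static void main(String[] args) {
        CrawlerSettingsSketch defaults = new CrawlerSettingsSketch(1000);
        CrawlerSettingsSketch tuned = new CrawlerSettingsSketch(8, 50, 1000);
        System.out.println(defaults.threadPoolSize + " threads, batch size " + defaults.batchSize);
        System.out.println(tuned.threadPoolSize + " threads, batch size " + tuned.batchSize);
    }
}
```

Under these assumptions, callers who are happy with the defaults use the short form, and the full form is reserved for tuning thread count and batch size.
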
Copyright © 2016. All rights reserved.