| Package | Description |
|---|---|
| de.l3s.icrawl.crawler | |
| de.l3s.icrawl.crawler.analysis | Analysis of the content of crawled Web pages. |
| de.l3s.icrawl.crawler.frontier | Crawler queue / frontier management |
| de.l3s.icrawl.crawler.io | Fetching and storing of archive snapshots |
| Modifier and Type | Method and Description |
|---|---|
| static CrawlUrl | CrawlUrl.fromSeed(String url, float priority) |
| CrawlUrl | Resource.getUrl() |
| CrawlUrl | CrawlUrl.merge(CrawlUrl mergee): Merge with other instance. |
| CrawlUrl | CrawlUrl.outlink(String url, float priority, ZonedDateTime crawlTime) |
| Modifier and Type | Method and Description |
|---|---|
| CrawlUrl | CrawlUrl.merge(CrawlUrl mergee): Merge with other instance. |
| Constructor and Description |
|---|
| CrawledResource(CrawlUrl url, Snapshot resource, double relevance, ZonedDateTime modifiedDate) |
| CrawledResource(CrawlUrl url, Snapshot resource, double relevance, ZonedDateTime modifiedDate, Duration snapshotsDuration, double minRelevance, double maxRelevance) |
| Resource(CrawlUrl url, Map<String,String> headers, String content) |
| Modifier and Type | Method and Description |
|---|---|
| Collection<CrawlUrl> | ResourceAnalyser.Result.getOutlinks() |
| Modifier and Type | Method and Description |
|---|---|
| ResourceAnalyser.Result | ResourceAnalyser.analyse(Snapshot resource, CrawlUrl url) |
| Modifier and Type | Method and Description |
|---|---|
| Optional<CrawlUrl> | Frontier.pop() |
| Optional<CrawlUrl> | BaseFrontier.pop() |
| protected Optional<CrawlUrl> | InMemoryFrontier.popInternal() |
| protected Optional<CrawlUrl> | FileBasedFrontier.popInternal() |
| protected abstract Optional<CrawlUrl> | BaseFrontier.popInternal() |
| Modifier and Type | Method and Description |
|---|---|
| protected void | InMemoryFrontier.pushInternal(CrawlUrl url) |
| protected void | FileBasedFrontier.pushInternal(CrawlUrl url) |
| protected abstract void | BaseFrontier.pushInternal(CrawlUrl url) |
| Modifier and Type | Method and Description |
|---|---|
| void | Frontier.push(Collection<CrawlUrl> urls) |
| void | BaseFrontier.push(Collection<CrawlUrl> urls) |
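The frontier signatures listed above (public `push`/`pop` on `Frontier` and `BaseFrontier`, protected abstract `pushInternal`/`popInternal` overridden by `InMemoryFrontier` and `FileBasedFrontier`) suggest a template-method layout. The following is a minimal sketch of that layout, not the library's actual code: every type here is a simplified stand-in declared locally, the two-field `CrawlUrl` is hypothetical, and the "highest priority first" ordering is an assumption that this page does not confirm.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.PriorityQueue;

public class FrontierSketch {

    /** Hypothetical stand-in for CrawlUrl: only a URL string and a priority. */
    static final class CrawlUrl {
        final String url;
        final float priority;

        CrawlUrl(String url, float priority) {
            this.url = url;
            this.priority = priority;
        }
    }

    /** Public contract, mirroring the two Frontier methods listed above. */
    interface Frontier {
        void push(Collection<CrawlUrl> urls);

        Optional<CrawlUrl> pop();
    }

    /** Template-method base: the public methods delegate to protected hooks. */
    abstract static class BaseFrontier implements Frontier {
        @Override
        public void push(Collection<CrawlUrl> urls) {
            for (CrawlUrl url : urls) {
                pushInternal(url);
            }
        }

        @Override
        public Optional<CrawlUrl> pop() {
            return popInternal();
        }

        protected abstract void pushInternal(CrawlUrl url);

        protected abstract Optional<CrawlUrl> popInternal();
    }

    /** In-memory variant; assumed ordering is highest priority first. */
    static class InMemoryFrontier extends BaseFrontier {
        private final PriorityQueue<CrawlUrl> queue = new PriorityQueue<>(
                Comparator.comparingDouble((CrawlUrl u) -> u.priority).reversed());

        @Override
        protected void pushInternal(CrawlUrl url) {
            queue.add(url);
        }

        @Override
        protected Optional<CrawlUrl> popInternal() {
            // poll() returns null on an empty queue, which maps to Optional.empty().
            return Optional.ofNullable(queue.poll());
        }
    }

    public static void main(String[] args) {
        Frontier frontier = new InMemoryFrontier();
        frontier.push(List.of(
                new CrawlUrl("http://example.org/a", 0.2f),
                new CrawlUrl("http://example.org/b", 0.9f)));
        // Drain the frontier until pop() reports that it is empty.
        for (Optional<CrawlUrl> next = frontier.pop(); next.isPresent(); next = frontier.pop()) {
            System.out.println(next.get().url + " (priority " + next.get().priority + ")");
        }
    }
}
```

Returning `Optional<CrawlUrl>` from `pop()` lets callers treat an empty frontier as a normal outcome rather than handling `null` or an exception. A file-backed variant would, under the same assumptions, only need to override the two protected hooks.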
| Modifier and Type | Method and Description |
|---|---|
| List<Snapshot> | ArchiveFetcher.get(CrawlUrl url, TimeSpecification referenceTime) |
| void | CsvStorer.storeNotFound(CrawlUrl url) |
| void | ZipFileStorer.storeNotFound(CrawlUrl url) |
| void | LoggingStorer.storeNotFound(CrawlUrl url) |
| void | ResultStorer.storeNotFound(CrawlUrl url) |
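One plausible way the two kinds of methods in this table fit together is that a caller asks an `ArchiveFetcher` for snapshots and reports the URL through `ResultStorer.storeNotFound` when none come back. The sketch below illustrates that flow under loudly stated assumptions: all types are local stand-ins with only the signatures shown above, `RecordingStorer` and `fetchOrReport` are hypothetical names, and whether the real crawler treats an empty list as "not found" is not stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

public class StorerSketch {

    /** Hypothetical stand-ins; the real classes carry more state. */
    static final class CrawlUrl {
        final String url;

        CrawlUrl(String url) {
            this.url = url;
        }
    }

    static final class Snapshot { }

    static final class TimeSpecification { }

    /** Mirrors ArchiveFetcher.get(CrawlUrl, TimeSpecification) from the table. */
    interface ArchiveFetcher {
        List<Snapshot> get(CrawlUrl url, TimeSpecification referenceTime);
    }

    /** Mirrors ResultStorer.storeNotFound(CrawlUrl) from the table. */
    interface ResultStorer {
        void storeNotFound(CrawlUrl url);
    }

    /** Hypothetical storer that just remembers which URLs were missing. */
    static class RecordingStorer implements ResultStorer {
        final List<String> notFound = new ArrayList<>();

        @Override
        public void storeNotFound(CrawlUrl url) {
            notFound.add(url.url);
        }
    }

    /**
     * Fetches snapshots and reports the URL as not found when the list is
     * empty (an assumed convention). Returns the number of snapshots.
     */
    static int fetchOrReport(ArchiveFetcher fetcher, ResultStorer storer,
                             CrawlUrl url, TimeSpecification referenceTime) {
        List<Snapshot> snapshots = fetcher.get(url, referenceTime);
        if (snapshots.isEmpty()) {
            storer.storeNotFound(url);
        }
        return snapshots.size();
    }

    public static void main(String[] args) {
        // A fetcher that never finds anything, to exercise the not-found path.
        ArchiveFetcher emptyArchive = (url, time) -> List.of();
        RecordingStorer storer = new RecordingStorer();
        fetchOrReport(emptyArchive, storer,
                new CrawlUrl("http://example.org/missing"), new TimeSpecification());
        System.out.println("Not found: " + storer.notFound);
    }
}
```

Because every storer in the table shares the same `storeNotFound(CrawlUrl)` signature, the calling code in this sketch depends only on the `ResultStorer` interface, so a CSV, ZIP, or logging backend could be swapped in without changing it.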
Copyright © 2017. All rights reserved.