| Package | Description |
|---|---|
| `de.l3s.icrawl.contentanalysis` | Analysis of crawled documents |
| `de.l3s.icrawl.crawler` | |

| Modifier and Type | Method and Description |
|---|---|
| `DocumentVector` | `LanguageModels.buildDocumentVector(Locale language, String document, de.l3s.icrawl.contentanalysis.LanguageModel.KeywordMatcher keywordMatcher)` |
| `DocumentVector` | `LanguageModel.buildDocumentVector(String document, de.l3s.icrawl.contentanalysis.LanguageModel.KeywordMatcher keywordMatcher)` |
| `static DocumentVector` | `DocumentVector.merge(Collection<DocumentVector> vectors, boolean useDocumentFrequency)` |
| `DocumentVector` | `DocumentVector.topN(int n)` |

| Modifier and Type | Method and Description |
|---|---|
| `Map<Locale,DocumentVector>` | `DocumentVectorSimilarity.getReferenceVectors()` |

| Modifier and Type | Method and Description |
|---|---|
| `double` | `DocumentVector.cosineSimilarity(DocumentVector other)` |
| `double` | `DocumentVector.dotProduct(DocumentVector other)` |
| `double` | `LanguageModels.getSimilarity(Locale language, String doc, DocumentVector reference, de.l3s.icrawl.contentanalysis.LanguageModel.KeywordMatcher matcher)`<br>Calculates the cosine similarity of the doc to the specification. |

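
The listing above names `dotProduct` and `cosineSimilarity` but does not show how they are computed. As a rough illustration of the standard definitions, the sketch below computes both over sparse term-weight maps. Everything in it is an assumption made for illustration: `SparseVectorSketch`, its `Map<String, Double>` representation, the `norm` helper, and the choice to return 0 for an empty vector are hypothetical and are not taken from the `DocumentVector` class.

```java
import java.util.Map;

/** Hypothetical stand-in for a sparse term-weight vector; not the actual DocumentVector class. */
public final class SparseVectorSketch {
    private SparseVectorSketch() {}

    /** Sums the products of the weights of terms that occur in both vectors. */
    public static double dotProduct(Map<String, Double> a, Map<String, Double> b) {
        // Iterate over the smaller map so the cost is bounded by the shorter vector.
        Map<String, Double> small = a.size() <= b.size() ? a : b;
        Map<String, Double> large = small == a ? b : a;
        double sum = 0.0;
        for (Map.Entry<String, Double> entry : small.entrySet()) {
            Double other = large.get(entry.getKey());
            if (other != null) {
                sum += entry.getValue() * other;
            }
        }
        return sum;
    }

    /** Euclidean length of the vector. */
    public static double norm(Map<String, Double> v) {
        return Math.sqrt(dotProduct(v, v));
    }

    /** Dot product divided by the product of the two vector lengths. */
    public static double cosineSimilarity(Map<String, Double> a, Map<String, Double> b) {
        double denominator = norm(a) * norm(b);
        // Convention chosen for this sketch: an empty or all-zero vector has similarity 0.
        return denominator == 0.0 ? 0.0 : dotProduct(a, b) / denominator;
    }

    public static void main(String[] args) {
        Map<String, Double> doc = Map.of("crawl", 2.0, "archive", 1.0);
        Map<String, Double> reference = Map.of("crawl", 1.0, "web", 1.0);
        System.out.println(cosineSimilarity(doc, reference));
    }
}
```

In this example the two vectors share only the term `crawl`, so the dot product is 2 and the similarity is 2 / √10, about 0.632. Whether the real class normalizes weights, applies the correction factors, or treats empty vectors differently is not stated in this listing.
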
| Modifier and Type | Method and Description |
|---|---|
| `static DocumentVectorSimilarity` | `DocumentVectorSimilarity.fromVectors(Map<Locale,DocumentVector> referenceVectors, Map<Locale,Set<String>> keywords, Locale defaultLanguage, LanguageModels languageModels, Map<Locale,Double> correctionFactors)` |
| `static DocumentVector` | `DocumentVector.merge(Collection<DocumentVector> vectors, boolean useDocumentFrequency)` |

| Constructor and Description |
|---|
| `DocumentVectorSimilarity(Map<Locale,DocumentVector> referenceVectors, Map<Locale,de.l3s.icrawl.contentanalysis.LanguageModel.KeywordMatcher> matchers, Locale defaultLanguage, Map<Locale,Double> correctionFactors)` |

| Modifier and Type | Method and Description |
|---|---|
| `Map<Locale,DocumentVector>` | `ArchiveCrawlSpecification.getReferenceVectors()` |

| Constructor and Description |
|---|
| `ArchiveCrawlSpecification(String name, List<String> seedUrls, List<String> referenceDocuments, TimeSpecification referenceTime, Map<Locale,DocumentVector> referenceVectors, Map<Locale,Set<String>> keywords, String description, Locale defaultLanguage, Map<Locale,Double> correctionFactors)` |

Copyright © 2017. All rights reserved.