public class BigramCollocationFinder
extends java.lang.Object
Finding collocations requires first calculating the frequencies of words and their appearance in the context of other words. Often the collection of words will then require filtering to retain only useful content terms. Each ngram of words may then be scored according to some association measure, in order to determine the relative likelihood of each ngram being a collocation.
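The count, filter, and score steps described above can be sketched in plain Java. This is a minimal illustration, not the implementation behind this class: the class name `CollocationSketch`, the method `topBigrams`, and the use of pointwise mutual information (PMI) as the association measure are all assumptions made for the example. The class documented here reports p-values, so it presumably relies on a different statistic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only (hypothetical names); not the code of BigramCollocationFinder.
class CollocationSketch {

    // Returns up to k adjacent word pairs ranked by PMI, highest first,
    // keeping only pairs that occur at least minFreq times.
    static List<Map.Entry<String, Double>> topBigrams(String[] tokens, int minFreq, int k) {
        Map<String, Integer> unigrams = new HashMap<>();
        Map<String, Integer> bigrams = new HashMap<>();

        // Step 1: count word frequencies and adjacent-pair frequencies.
        for (int i = 0; i < tokens.length; i++) {
            unigrams.merge(tokens[i], 1, Integer::sum);
            if (i + 1 < tokens.length) {
                bigrams.merge(tokens[i] + " " + tokens[i + 1], 1, Integer::sum);
            }
        }

        // Assumption: the token count is used as the sample size for both
        // word and pair probabilities (a common simplification).
        double n = tokens.length;
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, Integer> e : bigrams.entrySet()) {
            int c12 = e.getValue();
            // Step 2: drop pairs that are too rare to score reliably.
            if (c12 < minFreq) {
                continue;
            }
            String[] w = e.getKey().split(" ");
            int c1 = unigrams.get(w[0]);
            int c2 = unigrams.get(w[1]);
            // Step 3: PMI = log( P(w1 w2) / (P(w1) * P(w2)) ).
            double pmi = Math.log((c12 * n) / ((double) c1 * c2));
            scored.add(Map.entry(e.getKey(), pmi));
        }

        // Step 4: rank by score in descending order and keep the top k.
        scored.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        return new ArrayList<>(scored.subList(0, Math.min(k, scored.size())));
    }

    public static void main(String[] args) {
        String[] tokens =
            "new york is big and new york is busy but new ideas are new".split(" ");
        for (Map.Entry<String, Double> b : topBigrams(tokens, 2, 5)) {
            System.out.println(b.getKey() + "\t" + b.getValue());
        }
    }
}
```

With a minimum frequency of 2, only "new york" and "york is" survive the filter in this toy input. "york is" ranks first because "new" also occurs in other contexts, which lowers the PMI of "new york".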
| Constructor and Description |
|---|
| BigramCollocationFinder(int minFreq): Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| BigramCollocation[] | find(Corpus corpus, double p): Finds bigram collocations in the given corpus whose p-value is less than the given threshold. |
| BigramCollocation[] | find(Corpus corpus, int k): Finds top k bigram collocations in the given corpus. |
public BigramCollocationFinder(int minFreq)
minFreq - the minimum frequency of collocation.
public BigramCollocation[] find(Corpus corpus, int k)
public BigramCollocation[] find(Corpus corpus, double p)
p - the p-value threshold