public class CompiledTokenizedLM extends Object implements LanguageModel.Sequence, LanguageModel.Tokenized
CompiledTokenizedLM implements a tokenized bounded sequence language model. Instances are read from streams of bytes created by compiling a TokenizedLM; see that class for more information.

Nested classes/interfaces inherited from interface LanguageModel:
LanguageModel.Conditional, LanguageModel.Dynamic, LanguageModel.Process, LanguageModel.Sequence, LanguageModel.Tokenized

| Modifier and Type | Method and Description |
|---|---|
| double | log2Estimate(char[] cs, int start, int end): Returns an estimate of the log (base 2) probability of the specified character slice. |
| double | log2Estimate(CharSequence cSeq): Returns an estimate of the log (base 2) probability of the specified character sequence. |
| double | tokenLog2Probability(String[] tokens, int start, int end): Returns the log (base 2) probability of the specified token slice in the underlying token n-gram distribution. |
| double | tokenProbability(String[] tokens, int start, int end): Returns the probability of the specified token slice in the token n-gram distribution. |
| String | toString(): Returns a string-based representation of this compiled language model. |
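
The slice-based methods above take a start index that is inclusive and an end index that is one past the last element, and the two token methods report the same quantity on different scales (a probability p and its base-2 logarithm). The following is a minimal sketch of those two conventions only. ToyUniformTokenModel and SliceDemo are hypothetical stand-ins, not LingPipe classes, and the uniform per-token probability is an assumption made purely to keep the arithmetic easy to follow; a real CompiledTokenizedLM estimates probabilities from its compiled n-gram counts.

```java
import java.util.Arrays;

// Hypothetical stand-in, NOT part of LingPipe: every token gets the same
// probability 1 / vocabularySize, independent of context.
final class ToyUniformTokenModel {
    private final double log2PerToken;

    ToyUniformTokenModel(int vocabularySize) {
        // log2(1 / vocabularySize), computed via natural logs.
        this.log2PerToken = -Math.log(vocabularySize) / Math.log(2.0);
    }

    // Scores tokens[start] .. tokens[end - 1]; end is exclusive.
    double tokenLog2Probability(String[] tokens, int start, int end) {
        if (start < 0 || end > tokens.length || start > end) {
            throw new IndexOutOfBoundsException(
                "start=" + start + " end=" + end + " length=" + tokens.length);
        }
        // Log probabilities of independent tokens add up.
        return (end - start) * log2PerToken;
    }

    // Same quantity on the linear scale: p = 2^(log2 p).
    double tokenProbability(String[] tokens, int start, int end) {
        return Math.pow(2.0, tokenLog2Probability(tokens, start, end));
    }

    // Returns the tokens a (start, end) pair actually covers.
    static String[] slice(String[] tokens, int start, int end) {
        return Arrays.copyOfRange(tokens, start, end);
    }
}

final class SliceDemo {
    public static void main(String[] args) {
        String[] tokens = {"the", "cat", "sat", "down"};
        ToyUniformTokenModel lm = new ToyUniformTokenModel(4);

        // start = 1, end = 3 covers exactly the two tokens "cat" and "sat".
        System.out.println(Arrays.toString(ToyUniformTokenModel.slice(tokens, 1, 3)));
        System.out.println(lm.tokenLog2Probability(tokens, 1, 3));
        System.out.println(lm.tokenProbability(tokens, 1, 3));
    }
}
```

Working on the log scale keeps long slices from underflowing to zero, which is the usual reason to prefer a log2 method over its linear counterpart when scoring long token sequences.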
public String toString()
Warning: For a large model the output may be very long, and accumulating it in a string buffer may exhaust memory.

public double log2Estimate(CharSequence cSeq)
Returns an estimate of the log (base 2) probability of the specified character sequence.
Specified by: log2Estimate in interface LanguageModel
Parameters:
cSeq - Character sequence to estimate.

public double log2Estimate(char[] cs, int start, int end)
Returns an estimate of the log (base 2) probability of the specified character slice.
Specified by: log2Estimate in interface LanguageModel
Parameters:
cs - Underlying array of characters.
start - Index of first character in slice.
end - One plus index of last character in slice.

public double tokenLog2Probability(String[] tokens, int start, int end)
Returns the log (base 2) probability of the specified token slice in the underlying token n-gram distribution.
Specified by: tokenLog2Probability in interface LanguageModel.Tokenized
Parameters:
tokens - Underlying array of tokens.
start - Index of first token in slice.
end - Index of one past the last token in the slice.

public double tokenProbability(String[] tokens, int start, int end)
Returns the probability of the specified token slice in the token n-gram distribution.
Specified by: tokenProbability in interface LanguageModel.Tokenized
Parameters:
tokens - Underlying array of tokens.
start - Index of first token in slice.
end - Index of one past the last token in the slice.

Copyright © 2016 Alias-i, Inc. All rights reserved.