public final class TokenizerModel extends BaseModel
TokenizerModel is the model used by a learnable Tokenizer.

See Also: TokenizerME

Fields inherited from class BaseModel: TRAINING_CUTOFF_PROPERTY, TRAINING_EVENTHASH_PROPERTY, TRAINING_ITERATIONS_PROPERTY

| Constructor and Description |
|---|
| TokenizerModel(File modelFile): Initializes the current instance. |
| TokenizerModel(InputStream in): Initializes the current instance. |
| TokenizerModel(MaxentModel tokenizerModel, Map<String,String> manifestInfoEntries, TokenizerFactory tokenizerFactory): Initializes the current instance. |
| TokenizerModel(String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization): Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory. |
| TokenizerModel(String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization, Map<String,String> manifestInfoEntries): Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory. |
| TokenizerModel(String language, MaxentModel tokenizerMaxentModel, Dictionary abbreviations, boolean useAlphaNumericOptimization, Map<String,String> manifestInfoEntries): Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory. |
| TokenizerModel(URL modelURL): Initializes the current instance. |

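The File, InputStream, and URL constructors all read a serialized model and declare IOException and InvalidFormatException. A minimal sketch of the stream-based loading pattern follows. So that it compiles without OpenNLP on the classpath, it uses a hypothetical `StreamLoader` interface and `ModelFiles` helper, neither of which is part of the library; a constructor taking an InputStream would be passed in as the loader. This page does not say whether the constructor closes the stream, so the sketch closes it itself.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public final class ModelFiles {

    /** Hypothetical stand-in for a constructor that reads a model from a stream. */
    @FunctionalInterface
    public interface StreamLoader<T> {
        T load(InputStream in) throws IOException;
    }

    private ModelFiles() { }

    /**
     * Opens the file, hands a buffered stream to the loader, and closes the
     * stream afterwards, whether or not loading succeeded.
     */
    public static <T> T load(Path modelPath, StreamLoader<T> loader) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(modelPath))) {
            return loader.load(in);
        }
    }
}
```

With OpenNLP available, a call would presumably look like `ModelFiles.load(path, TokenizerModel::new)`, where `path` is whatever model file the caller has; any read or format failure would surface to the caller as an IOException.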
| Modifier and Type | Method and Description |
|---|---|
| Dictionary | getAbbreviations() |
| TokenizerFactory | getFactory() |
| MaxentModel | getMaxentModel() |
| static void | main(String[] args) |
| boolean | useAlphaNumericOptimization() |

Methods inherited from class BaseModel: getArtifact, getLanguage, getManifestProperty, getVersion, isLoadedFromSerialized, serialize

public TokenizerModel(MaxentModel tokenizerModel, Map<String,String> manifestInfoEntries, TokenizerFactory tokenizerFactory)
Parameters:
tokenizerModel - the model
manifestInfoEntries - the manifest
tokenizerFactory - the factory

public TokenizerModel(String language, MaxentModel tokenizerMaxentModel, Dictionary abbreviations, boolean useAlphaNumericOptimization, Map<String,String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
abbreviations - the dictionary containing the abbreviations
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is not
manifestInfoEntries - the additional metadata which should be written into the manifest

public TokenizerModel(String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization, Map<String,String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is not
manifestInfoEntries - the additional metadata which should be written into the manifest

public TokenizerModel(String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is not

public TokenizerModel(InputStream in) throws IOException, InvalidFormatException
Parameters:
in - the input stream to load the model from
Throws:
IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format

public TokenizerModel(File modelFile) throws IOException, InvalidFormatException
Parameters:
modelFile - the file containing the tokenizer model
Throws:
IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format

public TokenizerModel(URL modelURL) throws IOException, InvalidFormatException
Parameters:
modelURL - the URL pointing to the tokenizer model
Throws:
IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format

public TokenizerFactory getFactory()
public MaxentModel getMaxentModel()
public Dictionary getAbbreviations()
public boolean useAlphaNumericOptimization()
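
This page only says that the alphanumeric optimization is either enabled or not. One plausible reading, offered here as an assumption rather than as the library's implementation, is that a whitespace-delimited chunk made up solely of ASCII letters and digits is emitted as one token without consulting the statistical model. The class name `AlphaNumericShortcut`, the method `canSkipModel`, and the pattern below are hypothetical.

```java
import java.util.regex.Pattern;

public final class AlphaNumericShortcut {

    // Assumed pattern: one or more ASCII letters or digits and nothing else.
    private static final Pattern ALPHANUMERIC = Pattern.compile("^[A-Za-z0-9]+$");

    private AlphaNumericShortcut() { }

    /**
     * Returns true when a whitespace-delimited chunk can be emitted as a single
     * token without asking the statistical model for split points.
     */
    public static boolean canSkipModel(String chunk, boolean useAlphaNumericOptimization) {
        if (chunk.length() < 2) {
            return true; // fewer than two characters: no interior split point exists
        }
        return useAlphaNumericOptimization && ALPHANUMERIC.matcher(chunk).matches();
    }
}
```

Under this reading, the flag trades a small amount of flexibility for speed: purely alphanumeric chunks are never split, so the model is evaluated only for chunks containing punctuation or other characters.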
public static void main(String[] args) throws IOException
Throws:
IOException

Copyright © 2015 The Apache Software Foundation. All rights reserved.