public class DocumentCategorizerME extends Object implements DocumentCategorizer
Maxent implementation of DocumentCategorizer.

| Constructor and Description |
|---|
| `DocumentCategorizerME(DoccatModel model)` Initializes the current instance with a doccat model. |
| `DocumentCategorizerME(DoccatModel model, FeatureGenerator... featureGenerators)` Deprecated. |

| Modifier and Type | Method and Description |
|---|---|
| `double[]` | `categorize(String documentText)` Categorizes the given text. |
| `double[]` | `categorize(String[] text)` Categorizes the given text. |
| `double[]` | `categorize(String[] text, Map<String,Object> extraInformation)` |
| `double[]` | `categorize(String documentText, Map<String,Object> extraInformation)` Categorizes the given text. |
| `String` | `getAllResults(double[] results)` |
| `String` | `getBestCategory(double[] outcome)` |
| `String` | `getCategory(int index)` |
| `int` | `getIndex(String category)` |
| `int` | `getNumberOfCategories()` |
| `Map<String,Double>` | `scoreMap(String text)` Returns a map in which the key is the category name and the value is the score. |
| `SortedMap<Double,Set<String>>` | `sortedScoreMap(String text)` Returns a map with the score as a key in ascending order. |
| `static DoccatModel` | `train(String languageCode, ObjectStream<DocumentSample> samples)` Deprecated. |
| `static DoccatModel` | `train(String languageCode, ObjectStream<DocumentSample> samples, TrainingParameters mlParams, DoccatFactory factory)` |
| `static DoccatModel` | `train(String languageCode, ObjectStream<DocumentSample> samples, TrainingParameters mlParams, FeatureGenerator... featureGenerators)` Deprecated. |

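The relationship between the `double[]` returned by `categorize` and the lookup methods (`getBestCategory`, `scoreMap`, `sortedScoreMap`) may be easier to see in a small standalone sketch. The code below does not use OpenNLP and is not its implementation: `ScoreSketch`, its `CATEGORIES` array, and the sample scores are hypothetical, and the helper methods take an already-computed score array rather than text. It only assumes that each array position holds the score of one category.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative stand-in, NOT the OpenNLP implementation: it only mimics how a
// double[] of per-category scores can be read back into category names.
class ScoreSketch {
    // Hypothetical category labels; in OpenNLP these come from the trained model.
    static final String[] CATEGORIES = {"sports", "politics", "tech"};

    // Analogue of getBestCategory: the label at the index of the highest score.
    static String bestCategory(double[] outcome) {
        int best = 0;
        for (int i = 1; i < outcome.length; i++) {
            if (outcome[i] > outcome[best]) {
                best = i;
            }
        }
        return CATEGORIES[best];
    }

    // Analogue of scoreMap: category name -> score.
    static Map<String, Double> scoreMap(double[] outcome) {
        Map<String, Double> map = new LinkedHashMap<>();
        for (int i = 0; i < outcome.length; i++) {
            map.put(CATEGORIES[i], outcome[i]);
        }
        return map;
    }

    // Analogue of sortedScoreMap: score -> categories sharing it, keys ascending.
    static SortedMap<Double, Set<String>> sortedScoreMap(double[] outcome) {
        SortedMap<Double, Set<String>> sorted = new TreeMap<>();
        for (int i = 0; i < outcome.length; i++) {
            sorted.computeIfAbsent(outcome[i], k -> new TreeSet<>()).add(CATEGORIES[i]);
        }
        return sorted;
    }

    public static void main(String[] args) {
        double[] outcome = {0.2, 0.2, 0.6}; // made-up scores for illustration
        System.out.println(bestCategory(outcome));
        System.out.println(scoreMap(outcome));
        System.out.println(sortedScoreMap(outcome));
    }
}
```

In this sketch the value type `Set<String>` is what lets two categories with the same score (here `sports` and `politics`) share a single key, which is presumably why `sortedScoreMap` is declared with that shape.
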
`public DocumentCategorizerME(DoccatModel model, FeatureGenerator... featureGenerators)`
Deprecated.
Parameters: `model`, `featureGenerators`

`public DocumentCategorizerME(DoccatModel model)`
Initializes the current instance with a doccat model.
Parameters: `model`

`public double[] categorize(String[] text, Map<String,Object> extraInformation)`
Specified by: `categorize` in interface `DocumentCategorizer`

`public double[] categorize(String[] text)`
Categorizes the given text.
Specified by: `categorize` in interface `DocumentCategorizer`
Parameters: `text`

`public double[] categorize(String documentText, Map<String,Object> extraInformation)`
Categorizes the given text. The tokenizer is obtained from `DoccatFactory.getTokenizer()` and defaults to `SimpleTokenizer`.
Specified by: `categorize` in interface `DocumentCategorizer`

`public double[] categorize(String documentText)`
Categorizes the given text.
Specified by: `categorize` in interface `DocumentCategorizer`

`public Map<String,Double> scoreMap(String text)`
Returns a map in which the key is the category name and the value is the score.
Specified by: `scoreMap` in interface `DocumentCategorizer`
Parameters: `text` - the input text to classify

`public SortedMap<Double,Set<String>> sortedScoreMap(String text)`
Returns a map with the score as a key in ascending order.
Specified by: `sortedScoreMap` in interface `DocumentCategorizer`
Parameters: `text` - the input text to classify

`public String getBestCategory(double[] outcome)`
Specified by: `getBestCategory` in interface `DocumentCategorizer`

`public int getIndex(String category)`
Specified by: `getIndex` in interface `DocumentCategorizer`

`public String getCategory(int index)`
Specified by: `getCategory` in interface `DocumentCategorizer`

`public int getNumberOfCategories()`
Specified by: `getNumberOfCategories` in interface `DocumentCategorizer`

`public String getAllResults(double[] results)`
Specified by: `getAllResults` in interface `DocumentCategorizer`

`public static DoccatModel train(String languageCode, ObjectStream<DocumentSample> samples, TrainingParameters mlParams, FeatureGenerator... featureGenerators) throws IOException`
Deprecated. Use `train(String, ObjectStream, TrainingParameters, DoccatFactory)` instead.
Throws: `IOException`

`public static DoccatModel train(String languageCode, ObjectStream<DocumentSample> samples, TrainingParameters mlParams, DoccatFactory factory) throws IOException`
Throws: `IOException`

`public static DoccatModel train(String languageCode, ObjectStream<DocumentSample> samples) throws IOException`
Deprecated. Use `train(String, ObjectStream, TrainingParameters, DoccatFactory)` instead.
Parameters: `languageCode`, `samples`
Throws: `IOException`, `ObjectStreamException`

Copyright © 2015 The Apache Software Foundation. All rights reserved.