Package opennlp.tools.doccat
Interface DocumentCategorizer
- All Known Implementing Classes:
- DocumentCategorizerME
public interface DocumentCategorizer
Interface for classes which categorize documents.
Method Summary
- double[] categorize(String[] text): Categorizes the given text, provided in separate tokens.
- double[] categorize(String[] text, Map<String, Object> extraInformation): Categorizes the given text provided as tokens along with the provided extraInformation.
- String getAllResults(double[] results): Retrieves the name of the category associated with the given probabilities.
- String getBestCategory(double[] outcome): Retrieves the best category from previously generated outcome probabilities.
- String getCategory(int index): Retrieves the category at a given index.
- int getIndex(String category): Retrieves the index of a certain category.
- int getNumberOfCategories(): Retrieves the number of categories.
- Map<String, Double> scoreMap(String[] text): Retrieves a Map in which the key is the category name and the value is the score.
- SortedMap<Double, Set<String>> sortedScoreMap(String[] text): Retrieves a SortedMap of the scores sorted in ascending order, together with their associated categories.
Method Details
categorize
double[] categorize(String[] text, Map<String, Object> extraInformation)
Categorizes the given text provided as tokens along with the provided extraInformation.
- Parameters:
- text - The tokens of text to categorize.
- extraInformation - The extra information used for this context.
- Returns:
- The per category probabilities.
 
categorize
double[] categorize(String[] text)
Categorizes the given text, provided in separate tokens.
- Parameters:
- text - The tokens of text to categorize.
- Returns:
- The per category probabilities.
 
getBestCategory
String getBestCategory(double[] outcome)
Retrieves the best category from previously generated outcome probabilities.
- Parameters:
- outcome - An array of computed outcome probabilities.
- Returns:
- The best category represented as String.
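As a rough illustration of what "best category" means here, the sketch below picks the category whose probability is highest. It is a self-contained stand-in, not the OpenNLP implementation: the class `BestCategorySketch`, the `CATEGORIES` list, and the tie-breaking rule (the first maximum wins) are assumptions made for this example only.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in illustrating the getBestCategory contract; not OpenNLP code. */
final class BestCategorySketch {

    // Assumed category names; position i corresponds to outcome[i].
    static final List<String> CATEGORIES = Arrays.asList("sports", "politics", "tech");

    /** Returns the category with the highest probability (first one wins on ties). */
    static String getBestCategory(double[] outcome) {
        if (outcome.length != CATEGORIES.size()) {
            throw new IllegalArgumentException("expected one probability per category");
        }
        int best = 0;
        for (int i = 1; i < outcome.length; i++) {
            if (outcome[i] > outcome[best]) {
                best = i;
            }
        }
        return CATEGORIES.get(best);
    }

    public static void main(String[] args) {
        double[] outcome = {0.2, 0.1, 0.7};
        System.out.println(getBestCategory(outcome)); // prints "tech"
    }
}
```

In real use, the `outcome` array would normally be the value returned by one of the `categorize` methods rather than a hand-written literal.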
 
getIndex
int getIndex(String category)
Retrieves the index of a certain category.
- Parameters:
- category - The category for which the index is to be found.
- Returns:
- The index.
 
getCategory
String getCategory(int index)
Retrieves the category at a given index.
- Parameters:
- index - The index for which the category shall be found.
- Returns:
- The category represented as String.
 
getNumberOfCategories
int getNumberOfCategories()
Retrieves the number of categories.
- Returns:
- The number of categories.
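The lookup methods getIndex, getCategory, and getNumberOfCategories together describe one mapping between category names and positions. The sketch below shows that relationship with a plain list. It is not OpenNLP code: the class `CategoryIndexSketch`, the sample category names, and the choice to return -1 for an unknown category are all assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the category lookup methods; not OpenNLP code. */
final class CategoryIndexSketch {

    // Assumed category names, fixed at construction time.
    private final List<String> categories;

    CategoryIndexSketch(String... categories) {
        this.categories = Arrays.asList(categories);
    }

    /** Index of the given category, or -1 if unknown (the -1 convention is an assumption). */
    int getIndex(String category) {
        return categories.indexOf(category);
    }

    /** Category stored at the given index. */
    String getCategory(int index) {
        return categories.get(index);
    }

    /** Total number of known categories. */
    int getNumberOfCategories() {
        return categories.size();
    }

    public static void main(String[] args) {
        CategoryIndexSketch sketch = new CategoryIndexSketch("sports", "politics", "tech");
        // For known categories, getIndex and getCategory are inverses of each other.
        for (int i = 0; i < sketch.getNumberOfCategories(); i++) {
            String category = sketch.getCategory(i);
            System.out.println(i + " -> " + category + " -> " + sketch.getIndex(category));
        }
    }
}
```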
 
getAllResults
String getAllResults(double[] results)
Retrieves the name of the category associated with the given probabilities.
- Parameters:
- results - The probabilities of each category.
- Returns:
- The name of the outcome.
 
scoreMap
Map<String, Double> scoreMap(String[] text)
Retrieves a Map in which the key is the category name and the value is the score.
- Parameters:
- text - The tokenized input text to classify.
- Returns:
- A Map with the category name as key and the score as value.
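To make the shape of the returned map concrete, the following sketch pairs each category name with its score. It is a simplified stand-in, not the OpenNLP implementation: unlike the real method, the hypothetical `ScoreMapSketch.scoreMap` takes already computed probabilities instead of tokens, and the category names are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in illustrating the scoreMap result shape; not OpenNLP code. */
final class ScoreMapSketch {

    /**
     * Builds a map from category name to score.
     * Assumption: probabilities[i] is the score of categories[i].
     */
    static Map<String, Double> scoreMap(String[] categories, double[] probabilities) {
        if (categories.length != probabilities.length) {
            throw new IllegalArgumentException("expected one probability per category");
        }
        // LinkedHashMap keeps insertion order, which makes the printed output predictable.
        Map<String, Double> scores = new LinkedHashMap<>();
        for (int i = 0; i < categories.length; i++) {
            scores.put(categories[i], probabilities[i]);
        }
        return scores;
    }

    public static void main(String[] args) {
        String[] categories = {"sports", "politics", "tech"};
        double[] probabilities = {0.2, 0.1, 0.7};
        System.out.println(scoreMap(categories, probabilities));
        // prints {sports=0.2, politics=0.1, tech=0.7}
    }
}
```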
 
sortedScoreMap
SortedMap<Double, Set<String>> sortedScoreMap(String[] text)
Retrieves a SortedMap of the scores sorted in ascending order, together with their associated categories. Many categories can have the same score, hence the Set as value.
- Parameters:
- text - The tokenized input text to classify.
- Returns:
- A SortedMap with the score as key and the set of categories sharing that score as value.
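The reason for the Set value is easiest to see with tied scores. The sketch below groups categories by score in a `TreeMap`, which iterates its keys in ascending order. It is a simplified stand-in, not the OpenNLP implementation: the hypothetical `SortedScoreMapSketch.sortedScoreMap` takes already computed probabilities instead of tokens, and it uses a `TreeSet` only so that the printed order of tied categories is predictable.

```java
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical stand-in illustrating the sortedScoreMap result shape; not OpenNLP code. */
final class SortedScoreMapSketch {

    /**
     * Groups categories by score, with scores in ascending order.
     * Assumption: probabilities[i] is the score of categories[i].
     */
    static SortedMap<Double, Set<String>> sortedScoreMap(String[] categories, double[] probabilities) {
        if (categories.length != probabilities.length) {
            throw new IllegalArgumentException("expected one probability per category");
        }
        SortedMap<Double, Set<String>> sorted = new TreeMap<>();
        for (int i = 0; i < categories.length; i++) {
            // Categories with an identical score end up in the same set.
            sorted.computeIfAbsent(probabilities[i], score -> new TreeSet<>()).add(categories[i]);
        }
        return sorted;
    }

    public static void main(String[] args) {
        String[] categories = {"sports", "politics", "tech"};
        double[] probabilities = {0.25, 0.5, 0.25};
        System.out.println(sortedScoreMap(categories, probabilities));
        // prints {0.25=[sports, tech], 0.5=[politics]}
    }
}
```

Because the keys ascend, the highest-scoring categories sit at `lastKey()` of the returned map.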
 
 