public class TikaLanguageDetector
extends org.apache.tika.language.detect.LanguageDetector
Because it works only on trigrams, it is not suitable for short texts.
There are better-performing language detectors. This module is still here in the hope that we'll get around to improving it, because it is elegant and could be improved fairly trivially.
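The limitation on short texts follows from simple counting: a text of n characters contains at most n - 2 overlapping character trigrams, so a very short input leaves almost nothing to compare against a language profile. The sketch below is only an illustration of that arithmetic; the class `TrigramSketch` and the method `trigramCounts` are hypothetical names and are not part of Tika, and the code makes no claim about how this detector extracts or normalizes trigrams internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not Tika code.
public class TrigramSketch {

    /** Counts overlapping character trigrams in the given text, in order of first appearance. */
    public static Map<String, Integer> trigramCounts(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // A trigram starts at every index i where three characters remain.
        for (int i = 0; i + 3 <= text.length(); i++) {
            counts.merge(text.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two characters yield no trigram at all.
        System.out.println(trigramCounts("hi"));
        // Five characters yield only three trigrams.
        System.out.println(trigramCounts("hello"));
    }
}
```

With so few observations, any statistical comparison against per-language trigram frequencies is bound to be unreliable, which is consistent with the caveat above.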
| Constructor and Description |
|---|
| TikaLanguageDetector() |
| Modifier and Type | Method and Description |
|---|---|
| void | addText(char[] cbuf, int off, int len) |
| List<org.apache.tika.language.detect.LanguageResult> | detectAll() |
| boolean | hasModel(String language) |
| org.apache.tika.language.detect.LanguageDetector | loadModels() |
| org.apache.tika.language.detect.LanguageDetector | loadModels(Set<String> languages) |
| void | reset() |
| org.apache.tika.language.detect.LanguageDetector | setPriors(Map<String,Float> languageProbabilities) not supported |
public org.apache.tika.language.detect.LanguageDetector loadModels() throws IOException
Specified by: loadModels in class org.apache.tika.language.detect.LanguageDetector
Throws: IOException

public org.apache.tika.language.detect.LanguageDetector loadModels(Set<String> languages) throws IOException
Specified by: loadModels in class org.apache.tika.language.detect.LanguageDetector
Throws: IOException

public boolean hasModel(String language)
Specified by: hasModel in class org.apache.tika.language.detect.LanguageDetector

public org.apache.tika.language.detect.LanguageDetector setPriors(Map<String,Float> languageProbabilities) throws IOException
Specified by: setPriors in class org.apache.tika.language.detect.LanguageDetector
Parameters: languageProbabilities - Map from language to probability
Throws: IOException

public void reset()
Specified by: reset in class org.apache.tika.language.detect.LanguageDetector

public void addText(char[] cbuf, int off, int len)
Specified by: addText in class org.apache.tika.language.detect.LanguageDetector

public List<org.apache.tika.language.detect.LanguageResult> detectAll()
Specified by: detectAll in class org.apache.tika.language.detect.LanguageDetector

Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.