public class LemmatizerME extends Object implements Lemmatizer
| Modifier and Type | Field and Description |
|---|---|
| static int | DEFAULT_BEAM_SIZE |
| static int | LEMMA_NUMBER |
| Constructor and Description |
|---|
| LemmatizerME(LemmatizerModel model): Initializes the current instance with the provided model and the default beam size of 3. |
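The constructor description mentions a default beam size of 3. As a rough illustration of what a beam size controls in sequence decoding, here is a minimal, generic beam-search sketch. It is not the OpenNLP implementation: the class `BeamSketch`, its `Hypothesis` type, and the per-position score matrix are all hypothetical, and the scores are treated as independent per position for simplicity, whereas a real sequence model would typically condition on earlier decisions.

```java
import java.util.ArrayList;
import java.util.List;

// Generic beam-search toy for illustration; NOT the OpenNLP implementation.
class BeamSketch {

    // A partial label sequence plus its accumulated log-probability.
    static final class Hypothesis {
        final List<String> labels;
        final double logProb;

        Hypothesis(List<String> labels, double logProb) {
            this.labels = labels;
            this.logProb = logProb;
        }
    }

    // scores[i][j] is the assumed (made-up) probability of labels[j] at position i.
    static List<Hypothesis> decode(String[] labels, double[][] scores, int beamSize) {
        List<Hypothesis> beam = new ArrayList<>();
        beam.add(new Hypothesis(new ArrayList<>(), 0.0));
        for (double[] position : scores) {
            // Extend every surviving hypothesis with every label.
            List<Hypothesis> expanded = new ArrayList<>();
            for (Hypothesis h : beam) {
                for (int j = 0; j < labels.length; j++) {
                    List<String> next = new ArrayList<>(h.labels);
                    next.add(labels[j]);
                    expanded.add(new Hypothesis(next, h.logProb + Math.log(position[j])));
                }
            }
            // Sort best first, then prune to at most beamSize hypotheses.
            expanded.sort((a, b) -> Double.compare(b.logProb, a.logProb));
            beam = new ArrayList<>(expanded.subList(0, Math.min(beamSize, expanded.size())));
        }
        return beam;
    }

    public static void main(String[] args) {
        String[] labels = {"A", "B"};
        double[][] scores = {{0.6, 0.4}, {0.3, 0.7}};
        // With a beam of 3, three of the four possible sequences survive.
        for (Hypothesis h : decode(labels, scores, 3)) {
            System.out.println(h.labels + " " + Math.exp(h.logProb));
        }
    }
}
```

A larger beam keeps more candidate sequences alive at each step at the cost of more work; a beam of 1 reduces to greedy decoding.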
| Modifier and Type | Method and Description |
|---|---|
| static String[] | decodeLemmas(String[] toks, String[] preds): Decodes the lemma from the word and the induced lemma class. |
| static String[] | encodeLemmas(String[] toks, String[] lemmas) |
| List<List<String>> | lemmatize(List<String> toks, List<String> tags): Generates lemmas for each word and POS tag, returning a list of every possible lemma for each token and POS tag. |
| String[] | lemmatize(String[] toks, String[] tags): Generates lemmas for each word and POS tag, returning the result in an array. |
| String[][] | predictLemmas(int numLemmas, String[] toks, String[] tags): Predicts all possible lemmas (using a default upper bound). |
| String[] | predictSES(String[] toks, String[] tags): Predicts the Short Edit Script (automatically induced lemma class). |
| double[] | probs(): Returns an array with the probabilities of the last decoded sequence. |
| void | probs(double[] probs): Populates the specified array with the probabilities of the last decoded sequence. |
| Sequence[] | topKLemmaClasses(String[] sentence, String[] tags) |
| Sequence[] | topKLemmaClasses(String[] sentence, String[] tags, double minSequenceScore) |
| Sequence[] | topKSequences(String[] sentence, String[] tags) |
| Sequence[] | topKSequences(String[] sentence, String[] tags, double minSequenceScore) |
| static LemmatizerModel | train(String languageCode, ObjectStream<LemmaSample> samples, TrainingParameters trainParams, LemmatizerFactory posFactory) |
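The array-based methods above share a calling convention: tokens and POS tags are passed as parallel arrays, one lemma comes back per token, and `probs(double[] probs)` fills a caller-supplied buffer. The sketch below illustrates only that convention. `ToyLemmatizer` is a hypothetical stand-in that mirrors two of the documented signatures; its lookup table and confidence values are invented for the example and say nothing about how `LemmatizerME` computes its results.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in mirroring two documented signatures; NOT the OpenNLP class.
class ToyLemmatizer {

    // Invented lookup data keyed by "token/tag", for illustration only.
    private static final Map<String, String> TABLE = new HashMap<>();
    static {
        TABLE.put("cats/NNS", "cat");
        TABLE.put("were/VBD", "be");
        TABLE.put("running/VBG", "run");
    }

    // Scores remembered from the most recent lemmatize call.
    private double[] lastProbs = new double[0];

    // Same shape as lemmatize(String[] toks, String[] tags): parallel arrays in, one lemma per token out.
    String[] lemmatize(String[] toks, String[] tags) {
        if (toks.length != tags.length) {
            throw new IllegalArgumentException("toks and tags must have the same length");
        }
        String[] lemmas = new String[toks.length];
        lastProbs = new double[toks.length];
        for (int i = 0; i < toks.length; i++) {
            String hit = TABLE.get(toks[i] + "/" + tags[i]);
            lemmas[i] = hit != null ? hit : toks[i];  // fall back to the surface form
            lastProbs[i] = hit != null ? 1.0 : 0.5;   // made-up confidence values
        }
        return lemmas;
    }

    // Same shape as probs(double[] probs): the buffer must hold at least as many
    // entries as there were tokens in the previous lemmatize call.
    void probs(double[] probs) {
        System.arraycopy(lastProbs, 0, probs, 0, lastProbs.length);
    }

    public static void main(String[] args) {
        ToyLemmatizer lemmatizer = new ToyLemmatizer();
        String[] toks = {"cats", "were", "running"};
        String[] tags = {"NNS", "VBD", "VBG"};
        String[] lemmas = lemmatizer.lemmatize(toks, tags);

        double[] buffer = new double[toks.length];
        lemmatizer.probs(buffer);

        System.out.println(Arrays.toString(lemmas));
        System.out.println(Arrays.toString(buffer));
    }
}
```

With the real class, the instance would instead be created from a trained `LemmatizerModel` via the constructor listed above.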
public static final int LEMMA_NUMBER
public static final int DEFAULT_BEAM_SIZE
public LemmatizerME(LemmatizerModel model)
Parameters:
model - the model

public String[] lemmatize(String[] toks, String[] tags)
Specified by: lemmatize in interface Lemmatizer
Parameters:
toks - an array of the tokens
tags - an array of the POS tags

public List<List<String>> lemmatize(List<String> toks, List<String> tags)
Specified by: lemmatize in interface Lemmatizer
Parameters:
toks - a list of the tokens
tags - a list of the POS tags

public String[] predictSES(String[] toks, String[] tags)
Parameters:
toks - the array of tokens
tags - the array of POS tags

public String[][] predictLemmas(int numLemmas, String[] toks, String[] tags)
Parameters:
numLemmas - the default number of lemmas
toks - the tokens
tags - the POS tags

public static String[] decodeLemmas(String[] toks, String[] preds)
Parameters:
toks - the array of tokens
preds - the predicted lemma classes

public Sequence[] topKSequences(String[] sentence, String[] tags, double minSequenceScore)

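public void probs(double[] probs)
Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to lemmatize. The specified array should be at least as large as the number of tokens in the previous call to lemmatize.
Parameters:
probs - An array used to hold the probabilities of the last decoded sequence.

public double[] probs()
Returns an array with the probabilities of the last decoded sequence. The sequence was determined based on the previous call to lemmatize.
Returns:
An array with the same number of probabilities as tokens were sent to lemmatize when it was last called.

public static LemmatizerModel train(String languageCode, ObjectStream<LemmaSample> samples, TrainingParameters trainParams, LemmatizerFactory posFactory) throws IOException
Throws:
IOException

Copyright © 2017 The Apache Software Foundation. All rights reserved.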