Package opennlp.tools.sentdetect
Class SentenceDetectorME
java.lang.Object
opennlp.tools.sentdetect.SentenceDetectorME
- All Implemented Interfaces:
SentenceDetector
A sentence detector for splitting up raw text into sentences.
A maximum entropy model is used to evaluate end-of-sentence characters in a string to determine if they signify the end of a sentence.
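The idea described above, scanning for end-of-sentence characters and deciding at each one whether to split, can be sketched with the standard library alone. This is a minimal illustration and not the library's implementation: the class `EosCandidateSketch`, the candidate set `.!?`, and the hand-written `toyDecider` rule are all assumptions made for this example, with `toyDecider` standing in for the trained maximum entropy model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

class EosCandidateSketch {

    /** Characters treated as possible sentence ends (an assumption for this sketch). */
    private static final String EOS_CHARS = ".!?";

    /**
     * Splits text at every candidate character that the decider accepts.
     * The decider stands in for the trained model: it receives the text and
     * the candidate index and returns true for "split", false for "no split".
     */
    public static List<String> split(String text, BiPredicate<String, Integer> decider) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (EOS_CHARS.indexOf(text.charAt(i)) >= 0 && decider.test(text, i)) {
                String sentence = text.substring(start, i + 1).trim();
                if (!sentence.isEmpty()) {
                    sentences.add(sentence);
                }
                start = i + 1;
            }
        }
        // Keep trailing text that has no accepted end-of-sentence character.
        String rest = text.substring(start).trim();
        if (!rest.isEmpty()) {
            sentences.add(rest);
        }
        return sentences;
    }

    /**
     * Hypothetical toy rule: accept a candidate only if it ends the text or is
     * followed by whitespace, and it does not directly follow the letters "Dr".
     */
    public static boolean toyDecider(String text, int index) {
        boolean atEnd = index + 1 >= text.length();
        boolean followedBySpace = !atEnd && Character.isWhitespace(text.charAt(index + 1));
        boolean afterAbbreviation = index >= 2 && text.startsWith("Dr", index - 2);
        return (atEnd || followedBySpace) && !afterAbbreviation;
    }

    public static void main(String[] args) {
        String text = "Dr. Smith paid 3.50 dollars. Was it worth it? Yes!";
        for (String s : split(text, EosCandidateSketch::toyDecider)) {
            System.out.println(s);
        }
    }
}
```

A fixed rule like this breaks down quickly on real text, which is the motivation for letting a trained model make the split decision at each candidate.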
-
Field Summary
Fields
- SPLIT: Constant indicating a sentence split.
- NO_SPLIT: Constant indicating no sentence split.
-
Constructor Summary
Constructors
- SentenceDetectorME(String language): Initializes the sentence detector by downloading a default model.
- SentenceDetectorME(SentenceModel model): Initializes the current instance.
- SentenceDetectorME(SentenceModel model, Factory factory): Deprecated.
-
Method Summary
Methods
- double[] getSentenceProbabilities(): Returns the probabilities associated with the most recent call to sentDetect(CharSequence).
- String[] sentDetect(CharSequence s): Detects sentences in the given input CharSequence.
- Span[] sentPosDetect(CharSequence s): Detects the position of the first words of sentences in a CharSequence.
- static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, boolean useTokenEnd, Dictionary abbreviations, TrainingParameters mlParams): Deprecated.
- static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, SentenceDetectorFactory sdFactory, TrainingParameters mlParams)
-
Field Details
-
SPLIT
Constant indicating a sentence split.
-
NO_SPLIT
Constant indicating no sentence split.
-
-
Constructor Details
-
SentenceDetectorME
Initializes the sentence detector by downloading a default model.
- Parameters:
language - The language of the sentence detector.
- Throws:
IOException - Thrown if the model cannot be downloaded or saved.
-
SentenceDetectorME
Initializes the current instance.
- Parameters:
model - The SentenceModel.
-
SentenceDetectorME
Deprecated. Use a SentenceDetectorFactory to extend SentenceDetector functionality.
-
-
Method Details
-
sentDetect
Detects sentences in the given input CharSequence.
- Specified by:
sentDetect in interface SentenceDetector
- Parameters:
s - The CharSequence to be processed.
- Returns:
A string array containing the individual sentences as elements.
-
sentPosDetect
Detects the position of the first words of sentences in a CharSequence.
- Specified by:
sentPosDetect in interface SentenceDetector
- Parameters:
s - The CharSequence to be processed.
- Returns:
A Span array containing the positions of the end index of every sentence.
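To show how position-based results relate to sentence text, here is a minimal sketch that maps character offsets back to substrings. It deliberately avoids the library: `Offsets` is a hypothetical stand-in for a span, and the convention that `start` is inclusive and `end` is exclusive is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.List;

class SpanOffsetsSketch {

    /** Hypothetical stand-in for a span: start inclusive, end exclusive (assumed). */
    public static final class Offsets {
        final int start;
        final int end;

        public Offsets(int start, int end) {
            if (start < 0 || end < start) {
                throw new IllegalArgumentException("invalid offsets: " + start + ", " + end);
            }
            this.start = start;
            this.end = end;
        }
    }

    /** Recovers the text covered by each pair of offsets. */
    public static List<String> coveredText(CharSequence text, List<Offsets> spans) {
        List<String> out = new ArrayList<>();
        for (Offsets span : spans) {
            out.add(text.subSequence(span.start, span.end).toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "First sentence. Second one.";
        List<Offsets> spans = new ArrayList<>();
        spans.add(new Offsets(0, 15));   // covers "First sentence."
        spans.add(new Offsets(16, 27));  // covers "Second one."
        for (String sentence : coveredText(text, spans)) {
            System.out.println(sentence);
        }
    }
}
```

Working with offsets rather than strings keeps the original text untouched, which helps when later processing steps need to refer back to exact character positions.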
-
getSentenceProbabilities
public double[] getSentenceProbabilities()
Returns the probabilities associated with the most recent call to sentDetect(CharSequence).
- Returns:
The probability for each sentence returned by the most recent call to sentDetect(CharSequence). If not applicable, an empty array is returned.
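Since the probabilities are reported per sentence, one plausible use is to flag sentences whose boundary the detector was less sure about. The sketch below only illustrates handling two parallel arrays. The class `SentenceConfidenceSketch`, the method `belowThreshold`, the threshold, and the sample values in `main` are all hypothetical and do not come from a real model.

```java
import java.util.ArrayList;
import java.util.List;

class SentenceConfidenceSketch {

    /**
     * Returns the sentences whose probability is below the threshold.
     * An empty probability array is treated as "not applicable" and yields
     * an empty result; any other length mismatch is rejected.
     */
    public static List<String> belowThreshold(String[] sentences, double[] probabilities, double threshold) {
        List<String> flagged = new ArrayList<>();
        if (probabilities.length == 0) {
            return flagged;
        }
        if (sentences.length != probabilities.length) {
            throw new IllegalArgumentException("expected one probability per sentence");
        }
        for (int i = 0; i < sentences.length; i++) {
            if (probabilities[i] < threshold) {
                flagged.add(sentences[i]);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        String[] sentences = {"It cost approx.", "5 dollars."};
        double[] probabilities = {0.58, 0.97};
        System.out.println(belowThreshold(sentences, probabilities, 0.75));
    }
}
```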
-
train
public static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, boolean useTokenEnd, Dictionary abbreviations, TrainingParameters mlParams) throws IOException
Deprecated.
- Throws:
IOException
-
train
public static SentenceModel train(String languageCode, ObjectStream<SentenceSample> samples, SentenceDetectorFactory sdFactory, TrainingParameters mlParams) throws IOException
- Throws:
IOException
-