public class SimpleTokenizer extends java.lang.Object implements Tokenizer
Most punctuation is split from adjoining words. Verb contractions and the Anglo-Saxon genitive of nouns are split into their component morphemes, and each morpheme is tagged separately.

| Constructor and Description |
|---|
| SimpleTokenizer(): Constructor. |
| SimpleTokenizer(boolean splitContraction): Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| java.lang.String[] | split(java.lang.String text): Splits the string into a list of tokens. |
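
The behavior described above (punctuation separated from words, contractions and genitives optionally split into morphemes) can be illustrated with a small stand-alone sketch. This is not the library's implementation: `SketchTokenizer`, its two regular expressions, and the list of clitics are hypothetical and chosen only for illustration. It mirrors the documented shape of the API, namely a `boolean splitContraction` constructor argument and a `split(String)` method returning `String[]`. The real class may split contractions differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; not the documented SimpleTokenizer. */
class SketchTokenizer {
    // A token is either a word (letters/digits, possibly with internal
    // apostrophes) or any single non-whitespace symbol such as punctuation.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z0-9]+(?:'[A-Za-z]+)*|\\S");

    // Assumed clitic list: n't, 's, 're, 've, 'll, 'd, 'm at the end of a word.
    private static final Pattern CLITIC =
            Pattern.compile("(?i)^(.+?)(n't|'(?:s|re|ve|ll|d|m))$");

    private final boolean splitContraction;

    SketchTokenizer(boolean splitContraction) {
        this.splitContraction = splitContraction;
    }

    /** Splits the string into an array of tokens. */
    String[] split(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            String word = m.group();
            Matcher c = CLITIC.matcher(word);
            if (splitContraction && c.matches()) {
                tokens.add(c.group(1)); // stem, e.g. "is" or "Tom"
                tokens.add(c.group(2)); // clitic, e.g. "n't" or "'s"
            } else {
                tokens.add(word);
            }
        }
        return tokens.toArray(new String[0]);
    }
}
```

With this sketch, `new SketchTokenizer(true).split("She isn't here, it's Tom's.")` yields `She`, `is`, `n't`, `here`, `,`, `it`, `'s`, `Tom`, `'s`, `.`, while passing `false` keeps `isn't`, `it's`, and `Tom's` intact and still separates the comma and the final period.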