Package banner.tokenization
Class SimpleTokenizer
- java.lang.Object
  - banner.tokenization.SimpleTokenizer
All Implemented Interfaces: Tokenizer
public class SimpleTokenizer extends Object implements Tokenizer
Tokens output by this tokenizer consist of either a contiguous block of alphanumeric characters or a single punctuation mark. Consequently, any construction that contains an internal punctuation mark (such as a contraction or a real number) will span at least three tokens.
Author: Bob
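The tokenization rule described for this class can be illustrated with a minimal, self-contained sketch. This is not the BANNER source: the class name SimpleTokenizerSketch is hypothetical, it does not implement the Tokenizer interface, and it assumes that whitespace only separates tokens and is never emitted, which the description above does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the documented rule; not the actual SimpleTokenizer code.
public class SimpleTokenizerSketch {

    // Splits text into runs of alphanumeric characters and single punctuation marks.
    static List<String> getTokens(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start index of the current alphanumeric run, or -1 if none
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                if (start < 0) {
                    start = i; // a new alphanumeric run begins here
                }
            } else {
                if (start >= 0) {
                    tokens.add(text.substring(start, i)); // close the current run
                    start = -1;
                }
                // Assumption: whitespace is a separator only and produces no token.
                if (!Character.isWhitespace(c)) {
                    tokens.add(String.valueOf(c)); // each punctuation mark is its own token
                }
            }
        }
        if (start >= 0) {
            tokens.add(text.substring(start)); // flush a run that reaches the end of the text
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [It, isn, ', t, 3, ., 14]
        System.out.println(getTokens("It isn't 3.14"));
    }
}
```

Under these assumptions, the contraction "isn't" and the real number "3.14" each yield three tokens, matching the note in the class description.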
Constructor Summary
Constructors
Constructor          Description
SimpleTokenizer()
Method Summary
Modifier and Type    Method                         Description
List<String>         getTokens(String text)
void                 tokenize(Sentence sentence)    Tokenizes the given Sentence