Package de.l3s.boilerpipe.util
Class UnicodeTokenizer
- java.lang.Object
  - de.l3s.boilerpipe.util.UnicodeTokenizer
public class UnicodeTokenizer extends java.lang.Object

Tokenizes text according to Unicode word boundaries and strips off non-word characters.
Constructor Summary
Constructor           Description
UnicodeTokenizer()
Method Summary
Modifier and Type            Method                                   Description
static java.lang.String[]    tokenize(java.lang.CharSequence text)    Tokenizes the text and returns an array of tokens.
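Because tokenize is static, it is called as UnicodeTokenizer.tokenize(text) without creating an instance. The sketch below illustrates the general idea of splitting on Unicode word boundaries and dropping non-word segments, using only the JDK's java.text.BreakIterator. It is an approximation written for illustration, not boilerpipe's implementation: the class name TokenizerSketch and the rule "keep a segment only if it contains a letter or digit" are assumptions, and the library's exact output may differ.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for illustration; not part of boilerpipe.
final class TokenizerSketch {

    // Splits the text at word boundaries and keeps only segments that
    // contain at least one letter or digit (assumed filtering rule).
    static String[] tokenize(CharSequence text) {
        String s = text.toString();
        BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
        words.setText(s);

        List<String> tokens = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String candidate = s.substring(start, end);
            // Whitespace and punctuation segments fail this check and are dropped.
            if (candidate.codePoints().anyMatch(Character::isLetterOrDigit)) {
                tokens.add(candidate);
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Print one token per line for a small sample input.
        for (String token : tokenize("Hello, world! 42")) {
            System.out.println(token);
        }
    }
}
```

With the boilerpipe library on the classpath, the equivalent call would be String[] tokens = UnicodeTokenizer.tokenize("Hello, world! 42");.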