Package net.sf.okapi.steps.tokenization
Class Tokenizer
- java.lang.Object
  - net.sf.okapi.steps.tokenization.Tokenizer
public class Tokenizer extends Object
-
Field Summary
Fields (modifier and type, field):
- protected static TokenizationStep ts
-
Constructor Summary
Constructors:
- Tokenizer()
-
Method Summary
Methods (modifier and type, method, description):
- static Tokens tokenize(String string, LocaleId language, String... tokenNames)
- static Tokens tokenize(ITextUnit textUnit, LocaleId language, String... tokenNames)
- static Tokens tokenize(TextContainer textContainer, LocaleId language, String... tokenNames)
- static Tokens tokenize(TextFragment textFragment, LocaleId language, String... tokenNames)
- protected static Tokens tokenizeString(String text, LocaleId language, String... tokenNames): Extracts tokens from the given text.
-
Field Detail
-
ts
protected static TokenizationStep ts
-
Method Detail
-
tokenizeString
protected static Tokens tokenizeString(String text, LocaleId language, String... tokenNames)
Extracts tokens from the given text.
- Parameters:
  text - Text to tokenize.
  language - Language of the text.
  tokenNames - Optional list of token names. If omitted, all tokens will be extracted.
- Returns:
  A list of TokenType objects.
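The tokenNames contract described above (an empty varargs list means "extract everything") can be illustrated with a minimal, standalone sketch. This is not the Okapi implementation: the class TokenNameFilterSketch, the SketchToken type, the whitespace-based splitting, and the token names "WORD" and "NUMBER" are all assumptions made for illustration, and the language parameter is omitted because the sketch has no locale-specific rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in that mimics only the varargs filtering contract.
// It does not use or reproduce any Okapi class.
class TokenNameFilterSketch {

    /** Minimal token: a kind name plus the matched text (hypothetical type). */
    static final class SketchToken {
        final String name;
        final String text;

        SketchToken(String name, String text) {
            this.name = name;
            this.text = text;
        }
    }

    /**
     * Splits the text on whitespace and labels each piece "NUMBER" (digits only)
     * or "WORD" (anything else). Both names are assumptions for this sketch.
     * If tokenNames is empty, every token is returned.
     */
    static List<SketchToken> tokenizeString(String text, String... tokenNames) {
        // An empty varargs array means "no filter": keep every token kind.
        Set<String> wanted = new HashSet<>(Arrays.asList(tokenNames));
        List<SketchToken> result = new ArrayList<>();

        for (String piece : text.trim().split("\\s+")) {
            if (piece.isEmpty()) {
                continue; // splitting an empty input yields one empty piece
            }
            String name = piece.matches("\\d+") ? "NUMBER" : "WORD";
            if (wanted.isEmpty() || wanted.contains(name)) {
                result.add(new SketchToken(name, piece));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // No names given: all three tokens are kept.
        System.out.println(tokenizeString("order 66 shipped").size());
        // Only the hypothetical "NUMBER" kind is requested: one token is kept.
        System.out.println(tokenizeString("order 66 shipped", "NUMBER").size());
    }
}
```

Declaring the filter as a trailing String... parameter lets callers skip it entirely in the common "give me everything" case, which matches the signature shape of the tokenize overloads listed on this page.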
-
tokenize
public static Tokens tokenize(ITextUnit textUnit, LocaleId language, String... tokenNames)
-
tokenize
public static Tokens tokenize(TextContainer textContainer, LocaleId language, String... tokenNames)
-
tokenize
public static Tokens tokenize(TextFragment textFragment, LocaleId language, String... tokenNames)
-