public class RegExFilteredTokenizerFactory extends ModifyTokenTokenizerFactory implements Serializable
RegExFilteredTokenizerFactory modifies the tokens
returned by a base tokenizer factory's tokenizer by removing
those that do not match a regular expression pattern.
The pattern is used to create a Matcher
for each token. If the matcher matches, that is, if
Matcher.matches() returns true,
then the token is kept; otherwise, the token is removed.
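The keep-or-remove rule described above can be sketched using only java.util.regex. This is a minimal illustration, not the library's implementation: the class name RegexTokenFilterSketch and the helper filter are hypothetical, and the two-argument modifyToken only mirrors the documented contract (return the token on a match, null otherwise). Because Matcher.matches() must match the entire token, a token such as "x-ray" is rejected by the pattern [a-z]+ even though part of it matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the filtering rule; it does not use LingPipe.
class RegexTokenFilterSketch {

    // Returns the token when the WHOLE token matches the pattern,
    // and null otherwise, mirroring the documented modifyToken contract.
    static String modifyToken(Pattern pattern, String token) {
        return pattern.matcher(token).matches() ? token : null;
    }

    // Applies the rule to a token sequence, dropping tokens mapped to null.
    static List<String> filter(Pattern pattern, List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String result = modifyToken(pattern, token);
            if (result != null) {
                kept.add(result);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Pattern lowercaseWord = Pattern.compile("[a-z]+");
        List<String> tokens = Arrays.asList("the", "42", "cat", "x-ray", ",");
        System.out.println(filter(lowercaseWord, tokens)); // prints [the, cat]
    }
}
```

To keep tokens that merely contain a match, the pattern itself would need to allow surrounding text, for example .*[a-z]+.*, since the check uses matches() rather than find().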
| Constructor and Description |
|---|
| `RegExFilteredTokenizerFactory(TokenizerFactory factory, Pattern pattern)`: Construct a regular-expression filtered tokenizer factory from the specified base factory and regular expression pattern that accepted tokens must match. |
| Modifier and Type | Method and Description |
|---|---|
| `Pattern` | `getPattern()`: Returns the pattern for this regex-filtered tokenizer. |
| `String` | `modifyToken(String token)`: Returns the specified token if it matches this filter's pattern and `null` otherwise. |
| `String` | `toString()` |
Inherited methods: modify, modifyWhitespace, baseTokenizerFactory, tokenizer

public RegExFilteredTokenizerFactory(TokenizerFactory factory, Pattern pattern)
Parameters:
factory - Base tokenizer factory.
pattern - Pattern to match against tokens.

public Pattern getPattern()

public String modifyToken(String token)
Returns the specified token if it matches this filter's pattern and null otherwise.
Overrides: modifyToken in class ModifyTokenTokenizerFactory
Parameters:
token - Input token.
Returns: the specified token if it matches this filter's pattern and null otherwise.

public String toString()
Overrides: toString in class ModifyTokenTokenizerFactory

Copyright © 2016 Alias-i, Inc. All rights reserved.