public class StopTokenizerFactory extends ModifyTokenTokenizerFactory implements Serializable
A StopTokenizerFactory modifies a base tokenizer factory
by removing tokens in a specified stop set. When a token is
removed from the output of a tokenizer, so is the whitespace
immediately following it.
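The behavior described above can be sketched with a stand-alone model. This is not the library's implementation and uses none of its classes: the class name `StopFilterSketch`, the `filter` helper, and the parallel token and whitespace lists are all hypothetical, chosen only to show a stop token disappearing together with the whitespace that follows it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Stand-alone model of stop filtering; it does not use the library classes.
class StopFilterSketch {

    // Mirrors the modifyToken contract: return the token to keep it, null to remove it.
    static String modifyToken(String token, Set<String> stopSet) {
        return stopSet.contains(token) ? null : token;
    }

    // Assumption: tokens.get(i) is followed by whitespaces.get(i) in the text.
    // Returns the surviving tokens and whitespaces, concatenated in order.
    static String filter(List<String> tokens, List<String> whitespaces, Set<String> stopSet) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            String modified = modifyToken(tokens.get(i), stopSet);
            if (modified == null) {
                continue; // drop the token and the whitespace right after it
            }
            out.append(modified).append(whitespaces.get(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Set<String> stopSet = new HashSet<>(Arrays.asList("the", "of"));
        List<String> tokens = Arrays.asList("the", "end", "of", "days");
        List<String> whitespaces = Arrays.asList(" ", "  ", " ", "");
        // "the" and "of" are removed along with the single space after each.
        System.out.println("[" + filter(tokens, whitespaces, stopSet) + "]"); // prints [end  days]
    }
}
```

The two spaces after "end" survive because they follow a kept token; only the whitespace after each removed token is dropped.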
| Constructor and Description |
|---|
| StopTokenizerFactory(TokenizerFactory factory, Set<String> stopSet) Construct a tokenizer factory that removes tokens in the specified stop set from tokenizers produced by the specified base factory. |
| Modifier and Type | Method and Description |
|---|---|
| String | modifyToken(String token) Return a modified form of the specified token, or null to remove it. |
| Set<String> | stopSet() Returns an unmodifiable view of the stop set underlying this stop tokenizer factory. |
| String | toString() |
Methods inherited from superclasses: modify, modifyWhitespace, baseTokenizerFactory, tokenizer

public StopTokenizerFactory(TokenizerFactory factory, Set<String> stopSet)
Parameters:
factory - Base tokenizer factory.
stopSet - Set of stop tokens.

public Set<String> stopSet()
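What an unmodifiable view returned by stopSet() means for callers can be illustrated with plain `java.util` collections. The sketch below is an assumption-laden stand-in, not the library's code: `StopSetHolder` is a hypothetical class, and whether the real factory copies the set passed to its constructor is not stated in this page.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class UnmodifiableViewSketch {

    // Hypothetical holder standing in for a factory that owns a stop set.
    static final class StopSetHolder {
        private final Set<String> stopSet;

        StopSetHolder(Set<String> stopSet) {
            // Assumption: copy defensively so later caller changes do not leak in.
            this.stopSet = new HashSet<>(stopSet);
        }

        Set<String> stopSet() {
            // Read-only wrapper around the internal set; no copy is made here.
            return Collections.unmodifiableSet(stopSet);
        }
    }

    // Returns true if the view rejects an attempt to add an element.
    static boolean rejectsAdd(Set<String> view) {
        try {
            view.add("an");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        StopSetHolder holder = new StopSetHolder(new HashSet<>(Arrays.asList("the", "of")));
        Set<String> view = holder.stopSet();
        System.out.println(view.contains("the")); // prints true
        System.out.println(rejectsAdd(view));     // prints true
    }
}
```

Callers can read the view freely, but any mutating call such as `add` or `remove` throws `UnsupportedOperationException`.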
public String modifyToken(String token)
Description copied from class ModifyTokenTokenizerFactory: Return a modified form of the specified token, or null to remove it. The base implementation in that class simply returns the specified token.
Overrides: modifyToken in class ModifyTokenTokenizerFactory
Parameters:
token - Token to modify.
Returns: Modified form of the token, or null to remove it.

public String toString()
Overrides: toString in class ModifyTokenTokenizerFactory

Copyright © 2016 Alias-i, Inc. All rights reserved.