opennlp.tools.tokenize
Class TokenSampleStream
java.lang.Object
opennlp.tools.util.FilterObjectStream<java.lang.String,TokenSample>
opennlp.tools.tokenize.TokenSampleStream
- All Implemented Interfaces:
- ObjectStream<TokenSample>
public class TokenSampleStream
- extends FilterObjectStream<java.lang.String,TokenSample>
This class is a stream filter that reads string-encoded samples and creates
TokenSamples from them. Each input string is split into tokens wherever
whitespace or the special separator character sequence occurs.
Sample:
"token1 token2 token3<SPLIT>token4"
The tokens token1 and token2 are separated by whitespace; token3 and token4
are separated by the special character sequence, in this case the default
split sequence.
The separator sequence cannot be escaped, so it must not occur anywhere else in the input string.
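The splitting rule described above can be illustrated with a small standalone sketch. It does not use OpenNLP and is not the library's implementation: the class `SplitSketch` and its methods `tokens` and `plainText` are hypothetical names, and the separator string `<SPLIT>` is assumed here to be the default split sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SplitSketch {

    // Hypothetical helper (not part of OpenNLP): returns the token strings
    // encoded in one sample line.
    static List<String> tokens(String sample, String separator) {
        List<String> result = new ArrayList<>();
        // A token boundary is either the literal separator or a run of whitespace.
        String boundary = Pattern.quote(separator) + "|\\s+";
        for (String part : sample.split(boundary)) {
            if (!part.isEmpty()) {
                result.add(part);
            }
        }
        return result;
    }

    // Hypothetical helper: the sample text with the separator removed,
    // i.e. the text as it would look before tokenization.
    static String plainText(String sample, String separator) {
        return sample.replace(separator, "");
    }

    public static void main(String[] args) {
        String sample = "token1 token2 token3<SPLIT>token4";
        System.out.println(tokens(sample, "<SPLIT>"));    // [token1, token2, token3, token4]
        System.out.println(plainText(sample, "<SPLIT>")); // token1 token2 token3token4
    }
}
```

Because the separator is matched literally and never escaped, any accidental occurrence of it in the text would be treated as a token boundary, which is why it has to be chosen so that it does not appear in the data.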
Methods inherited from class java.lang.Object:
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
TokenSampleStream
public TokenSampleStream(ObjectStream<java.lang.String> sampleStrings,
java.lang.String separatorChars)
TokenSampleStream
public TokenSampleStream(ObjectStream<java.lang.String> sentences)
read
public TokenSample read()
throws java.io.IOException
- Description copied from interface: ObjectStream
- Returns the next object. Calling this method repeatedly until it returns
null will return each object from the underlying source exactly once.
- Returns:
- the next object or null to signal that the stream is exhausted
- Throws:
java.io.IOException
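The read-until-null contract described for read() can be sketched as a simple drain loop. The example below is standalone and does not depend on OpenNLP: the interface `Source` is a hypothetical stand-in that only mirrors the shape of `read()`, and `DrainSketch` and `drain` are illustrative names.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class DrainSketch {

    // Hypothetical stand-in for a stream with the same read() contract:
    // returns the next object, or null once the stream is exhausted.
    interface Source<T> {
        T read() throws IOException;
    }

    // Calls read() repeatedly until it returns null, collecting each object once.
    static <T> List<T> drain(Source<T> source) throws IOException {
        List<T> items = new ArrayList<>();
        T item;
        while ((item = source.read()) != null) {
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) throws IOException {
        Iterator<String> lines = Arrays.asList("first", "second").iterator();
        Source<String> source = () -> lines.hasNext() ? lines.next() : null;
        System.out.println(drain(source)); // [first, second]
    }
}
```

The same loop shape would apply to a TokenSampleStream, with TokenSample as the element type and IOException handled or declared by the caller.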
Copyright © 2011 The Apache Software Foundation. All Rights Reserved.