public class SentenceChunker extends Object implements Chunker, Serializable
The SentenceChunker class uses a
SentenceModel to implement sentence detection through
the chunk.Chunker interface. A sentence chunker is
constructed from a tokenizer factory and a sentence model. The
tokenizer factory creates tokens that it sends to the sentence
model. The types of the chunks produced are given by the constant
SENTENCE_CHUNK_TYPE.
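
The flow described above (tokenize, let a model pick sentence-final tokens, report character spans typed "S") can be sketched without LingPipe itself. What follows is a minimal, self-contained illustration, not the library's implementation. `ToySentenceSpans`, its whitespace tokenizer, the `Span` class, and the rule that a token ending in `.`, `?`, or `!` closes a sentence are all hypothetical stand-ins for a real TokenizerFactory, Chunk, and SentenceModel.

```java
import java.util.ArrayList;
import java.util.List;

public class ToySentenceSpans {

    // Mirrors the value documented for SENTENCE_CHUNK_TYPE.
    static final String SENTENCE_CHUNK_TYPE = "S";

    /** A typed character span [start, end); a stand-in, not LingPipe's Chunk. */
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** Hypothetical boundary rule standing in for a sentence model. */
    static boolean endsSentence(String token) {
        char last = token.charAt(token.length() - 1);
        return last == '.' || last == '?' || last == '!';
    }

    /**
     * Splits on whitespace while tracking offsets, then groups tokens into
     * sentence spans. Trailing tokens with no sentence-final token after
     * them are not reported in this toy version.
     */
    static List<Span> sentences(CharSequence text) {
        List<Span> spans = new ArrayList<>();
        int sentenceStart = -1; // offset of the first token of the open sentence
        int i = 0;
        int n = text.length();
        while (i < n) {
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++; // skip whitespace between tokens
            }
            if (i >= n) {
                break;
            }
            int tokenStart = i;
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++; // consume one token
            }
            String token = text.subSequence(tokenStart, i).toString();
            if (sentenceStart < 0) {
                sentenceStart = tokenStart;
            }
            if (endsSentence(token)) {
                spans.add(new Span(sentenceStart, i, SENTENCE_CHUNK_TYPE));
                sentenceStart = -1;
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "It rained. We stayed in.";
        for (Span s : sentences(text)) {
            System.out.println(s.type + " [" + s.start + "," + s.end + ") "
                + text.substring(s.start, s.end));
        }
    }
}
```

The spans index into the original text, which is why a faithful tokenizer matters: the offsets are only meaningful if the tokens line up with the characters that were passed in.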
A deserialized object is a SentenceChunker constructed
from the deserialized tokenizer factory and sentence model.

| Modifier and Type | Field and Description |
|---|---|
| static String | SENTENCE_CHUNK_TYPE: The type assigned to sentence chunks, namely "S". |

| Constructor and Description |
|---|
| SentenceChunker(TokenizerFactory tf, SentenceModel sm): Construct a sentence chunker from the specified tokenizer factory and sentence model. |

| Modifier and Type | Method and Description |
|---|---|
| Chunking | chunk(char[] cs, int start, int end): Return the chunking derived from the underlying sentence model over the tokenization of the specified character slice. |
| Chunking | chunk(CharSequence cSeq): Return the chunking derived from the underlying sentence model over the tokenization of the specified character sequence. |
| SentenceModel | sentenceModel(): Returns the sentence model for this chunker. |
| TokenizerFactory | tokenizerFactory(): Returns the tokenizer factory for this chunker. |

public static final String SENTENCE_CHUNK_TYPE
The type assigned to sentence chunks, namely "S".

public SentenceChunker(TokenizerFactory tf, SentenceModel sm)
tf - Tokenizer factory for chunker.
sm - Sentence model for chunker.

public TokenizerFactory tokenizerFactory()

public SentenceModel sentenceModel()

public Chunking chunk(CharSequence cSeq)
Warning: As described in the class documentation above, a tokenizer factory that produces tokenizers that do not reproduce the original sequence may cause the underlying character slice for the chunks to differ from the slice provided as an argument.
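
One way such a mismatch could arise is sketched below. The sketch assumes, purely for illustration, that chunk offsets are derived by summing the lengths of the tokens and whitespaces the tokenizer reports; this page does not confirm that SentenceChunker works this way. `OffsetDriftSketch`, `firstSentenceEnd`, and the tokenizer that rewrites `&` as `and` are hypothetical.

```java
import java.util.List;

public class OffsetDriftSketch {

    /**
     * Returns the end offset of the first sentence, computed only from the
     * lengths of tokens and whitespaces, where whites.get(i) precedes
     * tokens.get(i). Hypothetical rule: a token ending in '.' closes the
     * sentence. Returns -1 if no such token exists.
     */
    static int firstSentenceEnd(List<String> tokens, List<String> whites) {
        int pos = 0;
        for (int i = 0; i < tokens.size(); i++) {
            pos += whites.get(i).length() + tokens.get(i).length();
            if (tokens.get(i).endsWith(".")) {
                return pos;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "Tom & Ann left. Bye.";
        // One whitespace entry before each token, plus one after the last.
        List<String> whites = List.of("", " ", " ", " ", " ", "");
        // Tokens exactly as they appear in the text.
        List<String> faithful = List.of("Tom", "&", "Ann", "left.", "Bye.");
        // Tokens from a hypothetical tokenizer that rewrites "&" as "and".
        List<String> rewritten = List.of("Tom", "and", "Ann", "left.", "Bye.");

        int faithfulEnd = firstSentenceEnd(faithful, whites);
        int rewrittenEnd = firstSentenceEnd(rewritten, whites);

        // The first offset ends exactly at the sentence boundary; the second
        // overshoots because the rewritten token is longer than the original.
        System.out.println(faithfulEnd + ": " + text.substring(0, faithfulEnd));
        System.out.println(rewrittenEnd + ": " + text.substring(0, rewrittenEnd));
    }
}
```

Under that assumption, any tokenizer that lengthens, shortens, or drops tokens shifts every later offset, so the reported spans no longer describe the input that was passed in.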
public Chunking chunk(char[] cs, int start, int end)
See chunk(CharSequence) for more information.

Copyright © 2019 Alias-i, Inc. All rights reserved.