| Package | Description |
|---|---|
| com.aliasi.chunk | Classes for extracting meaningful chunks (spans) of text. |
| com.aliasi.crf | Classes and interfaces for conditional random fields. |
| com.aliasi.dict | Classes for handling dictionaries. |
| com.aliasi.sentences | Classes for sentence-boundary detection. |
| com.aliasi.test.unit.chunk | |
| com.aliasi.tokenizer | Classes for tokenizing character sequences. |
| Modifier and Type | Class | Description |
|---|---|---|
| class | ChunkingImpl | A ChunkingImpl provides a mutable, set-based implementation of the chunking interface. |
| Modifier and Type | Method | Description |
|---|---|---|
| Chunking | RegExChunker.chunk(char[] cs, int start, int end) | Return the chunking of the specified character slice. |
| Chunking | RescoringChunker.chunk(char[] cs, int start, int end) | Returns the first-best chunking for the specified character slice. |
| Chunking | HmmChunker.chunk(char[] cs, int start, int end) | Returns a chunking of the specified character slice. |
| Chunking | Chunker.chunk(char[] cs, int start, int end) | Return the chunking of the specified character slice. |
| Chunking | TokenShapeChunker.chunk(char[] cs, int start, int end) | Return the set of named-entity chunks derived from the underlying decoder over the tokenization of the specified character slice. |
| Chunking | RegExChunker.chunk(CharSequence cSeq) | Return the chunking of the specified character sequence. |
| Chunking | RescoringChunker.chunk(CharSequence cSeq) | Returns the first-best chunking for the specified character sequence. |
| Chunking | HmmChunker.chunk(CharSequence cSeq) | Returns a chunking of the specified character sequence. |
| Chunking | Chunker.chunk(CharSequence cSeq) | Return the chunking of the specified character sequence. |
| Chunking | TokenShapeChunker.chunk(CharSequence cSeq) | Return the set of named-entity chunks derived from the underlying decoder over the tokenization of the specified character sequence. |
| static Chunking | ChunkingImpl.merge(Chunking chunking1, Chunking chunking2) | Return the result of combining two chunkings into a single non-overlapping chunking. |
| Chunking | IoTagChunkCodec.toChunking(StringTagging tagging) | |
| Chunking | TagChunkCodec.toChunking(StringTagging tagging) | Return the result of decoding the specified tagging into a chunking. |
| Chunking | BioTagChunkCodec.toChunking(StringTagging tagging) | |
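
The `ChunkingImpl.merge` entry above combines two chunkings into a single non-overlapping chunking. The sketch below illustrates one possible way to resolve overlaps using plain Java types. The `MergeSketch` class, the `Span` record, and the policy that spans from the first list take precedence over overlapping spans from the second are all assumptions made for illustration; they are not taken from LingPipe and may differ from how `ChunkingImpl.merge` actually resolves conflicts.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of the LingPipe API.
final class MergeSketch {

    // A typed span [start, end) over some underlying text.
    record Span(int start, int end, String type) {
        boolean overlaps(Span other) {
            return start < other.end && other.start < end;
        }
    }

    // Assumed policy: keep every span of `first` (taken to be non-overlapping
    // already), then add each span of `second` that overlaps nothing kept so far.
    static List<Span> merge(List<Span> first, List<Span> second) {
        List<Span> kept = new ArrayList<>(first);
        for (Span candidate : second) {
            boolean clash = false;
            for (Span k : kept) {
                if (k.overlaps(candidate)) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                kept.add(candidate);
            }
        }
        // Return the surviving spans in text order.
        kept.sort(Comparator.comparingInt(Span::start));
        return kept;
    }

    public static void main(String[] args) {
        List<Span> first = List.of(new Span(0, 4, "PER"));
        List<Span> second = List.of(new Span(2, 6, "ORG"), new Span(10, 12, "LOC"));
        System.out.println(merge(first, second));
    }
}
```

In this sketch the `ORG` span is dropped because it overlaps the `PER` span from the first list, while the `LOC` span survives. A different precedence rule (for example, preferring longer spans) would only change the body of `merge`.
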
| Modifier and Type | Method | Description |
|---|---|---|
| Iterator<ScoredObject<Chunking>> | RescoringChunker.nBest(char[] cs, int start, int end, int maxNBest) | Returns the n-best chunkings of the specified character slice. |
| Iterator<ScoredObject<Chunking>> | HmmChunker.nBest(char[] cs, int start, int end, int maxNBest) | Returns a size-bounded iterator over scored objects with joint probability estimates of tags and tokens as scores and chunkings as objects. |
| Iterator<ScoredObject<Chunking>> | NBestChunker.nBest(char[] cs, int start, int end, int maxNBest) | Return an iterator over the scored chunkings of the specified character slice, in order of score. |
| Iterator<ScoredObject<Chunking>> | HmmChunker.nBestConditional(char[] cs, int start, int end, int maxNBest) | Returns a size-bounded iterator over scored objects with conditional probability estimates of tags and tokens as scores and chunkings as objects. |
| static ObjectHandler<Chunking> | TagChunkCodecAdapters.stringTaggingToChunking(TagChunkCodec codec, ObjectHandler<StringTagging> handler) | Return the chunking handler that converts chunkings to taggings using the specified codec. |
| static ObjectHandler<Chunking> | TagChunkCodecAdapters.taggingToChunking(TagChunkCodec codec, ObjectHandler<Tagging<String>> handler) | Return the chunking handler that converts chunkings to simple taggings using the specified codec. |
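
The `nBest` entries above take a `maxNBest` bound and return an iterator over scored chunkings. The following sketch shows, in plain Java, what a size-bounded iterator in score order can look like from the caller's side. The `NBestSketch` class and `Scored` record are hypothetical, and sorting a precomputed candidate list in descending score order is an assumption made for brevity; it says nothing about how the LingPipe chunkers actually compute their n-best output.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not part of the LingPipe API.
final class NBestSketch {

    // An object paired with a score (for example, a log probability).
    record Scored<T>(T object, double score) {}

    // Returns an iterator over at most maxNBest candidates, highest score first.
    static <T> Iterator<Scored<T>> nBest(List<Scored<T>> candidates, int maxNBest) {
        if (maxNBest < 0) {
            throw new IllegalArgumentException("maxNBest must be non-negative");
        }
        List<Scored<T>> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((Scored<T> s) -> s.score()).reversed());
        Iterator<Scored<T>> inner = sorted.iterator();
        return new Iterator<>() {
            private int returned = 0;

            @Override
            public boolean hasNext() {
                return returned < maxNBest && inner.hasNext();
            }

            @Override
            public Scored<T> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                returned++;
                return inner.next();
            }
        };
    }

    public static void main(String[] args) {
        List<Scored<String>> candidates = List.of(
                new Scored<>("analysis A", -3.0),
                new Scored<>("analysis B", -1.0),
                new Scored<>("analysis C", -2.0));
        Iterator<Scored<String>> it = nBest(candidates, 2);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

The bound is enforced inside the iterator, so a caller can stop early without the remaining candidates ever being returned.
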
| Modifier and Type | Method | Description |
|---|---|---|
| void | ChunkingEvaluation.addCase(Chunking referenceChunking, Chunking responseChunking) | Add an evaluation case consisting of a reference chunk set and a response chunk set. |
| static boolean | ChunkingImpl.equal(Chunking chunking1, Chunking chunking2) | Returns true if the specified chunkings are equal. |
| void | ChunkerEvaluator.handle(Chunking referenceChunking) | Handle the specified reference chunking. |
| void | TrainTokenShapeChunker.handle(Chunking chunking) | Add the specified chunking as a training event. |
| void | CharLmHmmChunker.handle(Chunking chunking) | Handle the specified chunking by tokenizing it, assigning tags and training the underlying hidden Markov model. |
| void | CharLmRescoringChunker.handle(Chunking chunking) | Trains this chunker with the specified chunking. |
| static int | ChunkingImpl.hashCode(Chunking chunking) | Returns the hash code for the specified chunking. |
| boolean | TagChunkCodec.isEncodable(Chunking chunking) | Returns true if the specified chunking may be encoded as a tagging then decoded back to the original chunking accurately. |
| static Chunking | ChunkingImpl.merge(Chunking chunking1, Chunking chunking2) | Return the result of combining two chunkings into a single non-overlapping chunking. |
| abstract double | RescoringChunker.rescore(Chunking chunking) | Returns the score for a chunking. |
| double | AbstractCharLmRescoringChunker.rescore(Chunking chunking) | Performs rescoring of the base chunking output using character language models. |
| StringTagging | IoTagChunkCodec.toStringTagging(Chunking chunking) | |
| StringTagging | TagChunkCodec.toStringTagging(Chunking chunking) | Return the string tagging that fully encodes the specified chunking. |
| StringTagging | BioTagChunkCodec.toStringTagging(Chunking chunking) | |
| Tagging<String> | IoTagChunkCodec.toTagging(Chunking chunking) | |
| Tagging<String> | TagChunkCodec.toTagging(Chunking chunking) | Return the tagging that partially encodes the specified chunking. |
| Tagging<String> | BioTagChunkCodec.toTagging(Chunking chunking) | |
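
The codec entries above describe encoding a chunking as a tagging, decoding it back, and checking whether that round trip is lossless (`isEncodable`). The sketch below walks through the general idea with a begin/inside/outside tagging scheme in plain Java. The `BioSketch` class, the `Span` record, and the `B_`/`I_`/`O` tag spellings are assumptions for illustration; the actual tag format and tokenization used by `BioTagChunkCodec` may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the LingPipe API.
final class BioSketch {

    // A typed span [start, end); used here for both tokens and chunks.
    record Span(int start, int end, String type) {}

    // Encode: one tag per token. The first token inside a chunk gets B_<type>,
    // later tokens inside it get I_<type>, everything else gets O.
    static String[] encode(List<Span> tokens, List<Span> chunks) {
        String[] tags = new String[tokens.size()];
        Arrays.fill(tags, "O");
        for (Span chunk : chunks) {
            boolean first = true;
            for (int i = 0; i < tokens.size(); i++) {
                Span t = tokens.get(i);
                if (t.start() >= chunk.start() && t.end() <= chunk.end()) {
                    tags[i] = (first ? "B_" : "I_") + chunk.type();
                    first = false;
                }
            }
        }
        return tags;
    }

    // Decode: each B_<type> tag opens a chunk, extended by following I_<type> tags.
    static List<Span> decode(List<Span> tokens, String[] tags) {
        List<Span> chunks = new ArrayList<>();
        int i = 0;
        while (i < tags.length) {
            if (!tags[i].startsWith("B_")) {
                i++;
                continue;
            }
            String type = tags[i].substring(2);
            int start = tokens.get(i).start();
            int end = tokens.get(i).end();
            i++;
            while (i < tags.length && tags[i].equals("I_" + type)) {
                end = tokens.get(i).end();
                i++;
            }
            chunks.add(new Span(start, end, type));
        }
        return chunks;
    }

    // Encodable here means the round trip reproduces the chunks exactly.
    // Assumes `chunks` is listed in text order.
    static boolean isEncodable(List<Span> tokens, List<Span> chunks) {
        return decode(tokens, encode(tokens, chunks)).equals(chunks);
    }

    public static void main(String[] args) {
        // Token offsets for the text "John lives in New York".
        List<Span> tokens = List.of(
                new Span(0, 4, "TOK"), new Span(5, 10, "TOK"), new Span(11, 13, "TOK"),
                new Span(14, 17, "TOK"), new Span(18, 22, "TOK"));
        List<Span> chunks = List.of(new Span(0, 4, "PER"), new Span(14, 22, "LOC"));
        System.out.println(Arrays.toString(encode(tokens, chunks)));
        System.out.println(isEncodable(tokens, chunks));
    }
}
```

A chunk whose boundaries fall inside a token cannot be recovered from per-token tags, which is the kind of case a round-trip check like `isEncodable` in this sketch rejects.
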
| Modifier and Type | Method | Description |
|---|---|---|
| static ObjectHandler<StringTagging> | TagChunkCodecAdapters.chunkingToStringTagging(TagChunkCodec codec, ObjectHandler<Chunking> handler) | Return the string tagging handler that converts string taggings to chunkings. |
| static ObjectHandler<Tagging<String>> | TagChunkCodecAdapters.chunkingToTagging(TagChunkCodec codec, ObjectHandler<Chunking> handler) | Returns the tagging handler that converts taggings to chunkings using the specified codec. |
| Modifier and Type | Method | Description |
|---|---|---|
| Chunking | ChainCrfChunker.chunk(char[] cs, int start, int end) | |
| Chunking | ChainCrfChunker.chunk(CharSequence cSeq) | |
| Modifier and Type | Method | Description |
|---|---|---|
| Iterator<ScoredObject<Chunking>> | ChainCrfChunker.nBest(char[] cs, int start, int end, int maxResults) | |
| Iterator<ScoredObject<Chunking>> | ChainCrfChunker.nBestConditional(char[] cs, int start, int end, int maxResults) | Returns an iterator over n-best chunkings with scores normalized to conditional probabilities of the output given the input string slice. |
| Modifier and Type | Method | Description |
|---|---|---|
| static ChainCrfChunker | ChainCrfChunker.estimate(Corpus<ObjectHandler<Chunking>> chunkingCorpus, TagChunkCodec codec, TokenizerFactory tokenizerFactory, ChainCrfFeatureExtractor<String> featureExtractor, boolean addInterceptFeature, int minFeatureCount, boolean cacheFeatureVectors, RegressionPrior prior, int priorBlockSize, AnnealingSchedule annealingSchedule, double minImprovement, int minEpochs, int maxEpochs, Reporter reporter) | Return the chain CRF-based chunker estimated from the specified corpus, which is converted to a tagging corpus using the specified coder/decoder and tokenizer factory, then passed to the chain CRF estimate method along with the rest of the arguments. |
| Modifier and Type | Method | Description |
|---|---|---|
| Chunking | ExactDictionaryChunker.chunk(char[] cs, int start, int end) | Returns the chunking for the specified character slice. |
| Chunking | ApproxDictionaryChunker.chunk(char[] cs, int start, int end) | Return the approximate dictionary-based chunking for the specified character slice. |
| Chunking | ExactDictionaryChunker.chunk(CharSequence cSeq) | Returns the chunking for the specified character sequence. |
| Chunking | ApproxDictionaryChunker.chunk(CharSequence cSeq) | Return the approximate dictionary-based chunking for the specified character sequence. |
| Modifier and Type | Method | Description |
|---|---|---|
| Chunking | SentenceChunker.chunk(char[] cs, int start, int end) | Return the chunking derived from the underlying sentence model over the tokenization of the specified character slice. |
| Chunking | SentenceChunker.chunk(CharSequence cSeq) | Return the chunking derived from the underlying sentence model over the tokenization of the specified character sequence. |
| Modifier and Type | Method | Description |
|---|---|---|
| void | SentenceEvaluation.addCase(Chunking referenceChunking, Chunking responseChunking) | Add the case corresponding to the specified reference and response chunkings. |
| void | SentenceEvaluator.handle(Chunking refChunking) | Handle the specified reference chunking by extracting its character sequence, producing a response chunking from the contained sentence chunker, and adding the reference and result to the evaluation. |
| static String | SentenceEvaluation.sentenceCaseToString(Chunking referenceChunking, Chunking responseChunking, int lineLength) | Given a pair of reference and response chunkings, returns a string showing their underlying character sequence, annotated with the sentence boundaries from each chunking, with line breaks inserted every lineLength characters. |
| Modifier and Type | Method | Description |
|---|---|---|
| static void | CharLmHmmChunkerTest.assertChunking(Chunker chunker, Chunking expectedChunking) | |
| static <C extends Chunker & Compilable> | CharLmHmmChunkerTest.assertChunkingCompile(C chunkerEstimator, Chunking expectedChunking) | |
| static void | CharLmHmmChunkerTest.assertEqualsChunking(Chunking expectedChunking, Chunking chunking) | |
| Modifier and Type | Method | Description |
|---|---|---|
| Chunking | TokenChunker.chunk(char[] cs, int start, int end) | Return the chunking produced by tokenizing the specified character array slice. |
| Chunking | TokenChunker.chunk(CharSequence cSeq) | Return the chunking produced by tokenizing the specified character sequence. |
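
The `TokenChunker` entries above pair a `char[]` slice overload with a `CharSequence` overload, as most chunkers on this page do. The sketch below shows one way such a pair can relate, using whitespace tokenization in plain Java. The `TokenChunkSketch` class, the `Span` record, the `TOKEN` type label, the whitespace rule, and the choice to report offsets relative to the start of the slice are all assumptions for illustration; LingPipe's `TokenChunker` delegates to a configured tokenizer factory and may behave differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the LingPipe API.
final class TokenChunkSketch {

    // A typed span [start, end) over the chunked character sequence.
    record Span(int start, int end, String type) {}

    // Each maximal run of non-whitespace characters becomes one chunk.
    static List<Span> chunk(CharSequence cSeq) {
        List<Span> chunks = new ArrayList<>();
        int n = cSeq.length();
        int i = 0;
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(cSeq.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one token.
            while (i < n && !Character.isWhitespace(cSeq.charAt(i))) {
                i++;
            }
            if (i > start) {
                chunks.add(new Span(start, i, "TOKEN"));
            }
        }
        return chunks;
    }

    // Slice overload: copy cs[start, end) and delegate, so offsets in the
    // result are relative to the start of the slice (an assumption here).
    static List<Span> chunk(char[] cs, int start, int end) {
        return chunk(new String(cs, start, end - start));
    }

    public static void main(String[] args) {
        System.out.println(chunk("  hello world"));
        System.out.println(chunk("xx a bc".toCharArray(), 2, 7));
    }
}
```

Delegating the slice overload to the sequence overload keeps the tokenization logic in one place, at the cost of one copy of the slice.
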
Copyright © 2019 Alias-i, Inc. All rights reserved.