public interface TagChunkCodec
A TagChunkCodec provides a means of coding chunkings as
taggings and decoding (string) taggings back to chunkings.
Each codec provides a method tagSet(Set) that returns the
complete set of tags used in the coding for a given set of chunk types.
Codecs also provide a variable-argument method legalTags(String...) that determines whether a sequence of tags is legal.
For a known set of chunk types, the followers of a tag can be
computed by iterating over the set of tags returned by tagSet() and checking whether each tag may legally follow it, using legalTags() for complete sequences or legalTagSubSequence() for pairs taken from the middle of a sequence.
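The follower construction described above can be sketched as follows. This is a minimal illustration, not LingPipe code: the class FollowerTable, the method followers, and the BIO-style rule in bioPair are all hypothetical, and the predicate stands in for whatever legality test a real codec exposes (for example, a lambda such as (a, b) -> codec.legalTagSubSequence(a, b)).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.BiPredicate;

// Hypothetical helper; the predicate stands in for a codec's legality test on a tag pair.
final class FollowerTable {

    // For every tag, collect the tags that may legally come immediately after it.
    static Map<String, Set<String>> followers(Set<String> tags,
                                              BiPredicate<String, String> legalPair) {
        Map<String, Set<String>> table = new LinkedHashMap<>();
        for (String prev : tags) {
            Set<String> next = new TreeSet<>();
            for (String candidate : tags) {
                if (legalPair.test(prev, candidate)) {
                    next.add(candidate);
                }
            }
            table.put(prev, next);
        }
        return table;
    }

    // Assumed BIO-style rule, for illustration only:
    // I_X may follow only B_X or I_X; any other tag may follow anything.
    static boolean bioPair(String prev, String next) {
        if (!next.startsWith("I_")) {
            return true;
        }
        String type = next.substring(2);
        return prev.equals("B_" + type) || prev.equals("I_" + type);
    }

    public static void main(String[] args) {
        // In practice this set would come from tagSet(chunkTypes).
        Set<String> tags = new TreeSet<>(Arrays.asList("O", "B_PER", "I_PER"));
        System.out.println(followers(tags, FollowerTable::bioPair));
    }
}
```

Passing the legality test in as a predicate keeps the loop independent of any particular tagging scheme, so the same helper would work for a codec with a different tag inventory.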
To validate whether a chunking may be successfully encoded as a
tagging and then decoded to the original chunking, use the method
isEncodable(Chunking). To validate whether a string
tagging may be successfully decoded to a chunking and then
reencoded to the original string tagging, use isDecodable(StringTagging).
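The round-trip idea behind isEncodable(Chunking) can be illustrated with a toy encoding. The sketch below is not LingPipe's implementation: RoundTripSketch, Span, and the encode/decode rules are hypothetical, and it assumes chunks are typed character spans, tokens are given by start and end offsets, and tags follow a simple BIO-style scheme.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Toy round-trip check; all names here are hypothetical, not LingPipe classes.
final class RoundTripSketch {

    // A chunk is a typed character span [start, end).
    static final class Span {
        final int start;
        final int end;
        final String type;
        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Span)) return false;
            Span s = (Span) o;
            return start == s.start && end == s.end && type.equals(s.type);
        }
        @Override public int hashCode() {
            return Objects.hash(start, end, type);
        }
    }

    static Set<Span> chunking(Span... spans) {
        return new LinkedHashSet<>(Arrays.asList(spans));
    }

    // Tokens fully inside a chunk get B_type (first token) or I_type (rest); others get O.
    static String[] encode(Set<Span> chunks, int[] tokenStarts, int[] tokenEnds) {
        String[] tags = new String[tokenStarts.length];
        Arrays.fill(tags, "O");
        for (Span c : chunks) {
            boolean first = true;
            for (int i = 0; i < tags.length; ++i) {
                if (tokenStarts[i] >= c.start && tokenEnds[i] <= c.end) {
                    tags[i] = (first ? "B_" : "I_") + c.type;
                    first = false;
                }
            }
        }
        return tags;
    }

    // Lenient toy decoder: B_ opens a chunk, I_ extends the open chunk, anything else closes it.
    static Set<Span> decode(String[] tags, int[] tokenStarts, int[] tokenEnds) {
        Set<Span> chunks = new LinkedHashSet<>();
        int start = -1;
        int end = -1;
        String type = null;
        for (int i = 0; i < tags.length; ++i) {
            if (tags[i].startsWith("I_") && type != null) {
                end = tokenEnds[i];
                continue;
            }
            if (type != null) chunks.add(new Span(start, end, type));
            type = null;
            if (tags[i].startsWith("B_")) {
                type = tags[i].substring(2);
                start = tokenStarts[i];
                end = tokenEnds[i];
            }
        }
        if (type != null) chunks.add(new Span(start, end, type));
        return chunks;
    }

    // Encodable here means: encoding then decoding reproduces the same set of chunks.
    static boolean isEncodable(Set<Span> chunks, int[] tokenStarts, int[] tokenEnds) {
        String[] tags = encode(chunks, tokenStarts, tokenEnds);
        return decode(tags, tokenStarts, tokenEnds).equals(chunks);
    }

    public static void main(String[] args) {
        int[] starts = {0, 5, 11};  // token offsets for "John lives here"
        int[] ends = {4, 10, 15};
        System.out.println(isEncodable(chunking(new Span(0, 4, "PER")), starts, ends));
        System.out.println(isEncodable(chunking(new Span(1, 4, "PER")), starts, ends));
    }
}
```

In this toy scheme a chunking fails the round trip when a chunk does not line up with token boundaries or when two chunks overlap, because the tag sequence cannot represent either situation.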
| Modifier and Type | Method and Description |
|---|---|
| `boolean` | `isDecodable(StringTagging tagging)` Returns `true` if the specified tagging may be decoded as a chunking then encoded back to the original tagging accurately. |
| `boolean` | `isEncodable(Chunking chunking)` Returns `true` if the specified chunking may be encoded as a tagging then decoded back to the original chunking accurately. |
| `boolean` | `legalTags(String... tags)` Returns `true` if the specified sequence of tags is a complete legal tag sequence. |
| `boolean` | `legalTagSubSequence(String... tags)` Returns `true` if the specified sequence of tags is a legal subsequence of tags. |
| `Iterator<Chunk>` | `nBestChunks(TagLattice<String> lattice, int[] tokenStarts, int[] tokenEnds, int maxResults)` Returns an iterator over chunks extracted in order of highest probability up to the specified maximum number of results. |
| `Set<String>` | `tagSet(Set<String> chunkTypes)` Returns the complete set of tags used by this codec for the specified set of chunk types. |
| `Chunking` | `toChunking(StringTagging tagging)` Returns the result of decoding the specified tagging into a chunking. |
| `StringTagging` | `toStringTagging(Chunking chunking)` Returns the string tagging that fully encodes the specified chunking. |
| `Tagging<String>` | `toTagging(Chunking chunking)` Returns the tagging that partially encodes the specified chunking. |

Tagging<String> toTagging(Chunking chunking)
Returns the tagging that partially encodes the specified chunking. For the full encoding, see toStringTagging(Chunking).
This method will typically be more efficient than toStringTagging(Chunking). Because StringTagging extends Tagging<String>, an implementation may simply delegate to toStringTagging(Chunking) and return the same value, but a direct implementation is often more efficient.
Parameters:
chunking - Chunking to encode.

StringTagging toStringTagging(Chunking chunking)
Returns the string tagging that fully encodes the specified chunking.
Parameters:
chunking - Chunking to encode.

Chunking toChunking(StringTagging tagging)
Returns the result of decoding the specified tagging into a chunking.
Parameters:
tagging - Tagging to decode.
Throws:
IllegalArgumentException - If the tag sequence is illegal.

Set<String> tagSet(Set<String> chunkTypes)
Returns the complete set of tags used by this codec for the specified set of chunk types. Modifying the returned set will not affect the codec.
Parameters:
chunkTypes - Set of types for chunks.

boolean legalTags(String... tags)
Returns true if the specified sequence of tags is a complete legal tag sequence. The companion method legalTagSubSequence(String...) tests whether a subsequence of tags is legal.
Parameters:
tags - Variable length array of tags.
Returns:
true if the specified sequence of tags is a complete legal tag sequence.

boolean legalTagSubSequence(String... tags)
Returns true if the specified sequence of tags is a legal subsequence of tags. See the companion method legalTags(String...) to test whether a complete sequence is legal.
A sequence of tags is a legal subsequence if a legal sequence may be created by adding more tags to the front and/or end of the specified sequence.
An empty sequence of tags always returns true. For a single input tag, the result indicates whether the tag itself is legal. For longer sequences, the tags must all be legal and their order must be legal.
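The difference between a legal subsequence and a complete legal sequence is easiest to see on a concrete scheme. The sketch below assumes a simple BIO-style tag set (O, B_type, I_type); the class BioLegality and its rules are hypothetical and chosen only for illustration, including the choice to treat an empty complete sequence as legal.

```java
// Toy BIO-style legality rules; hypothetical, not LingPipe's implementation.
final class BioLegality {

    // A tag is well formed if it is O, or B_/I_ followed by a non-empty type.
    static boolean isTag(String tag) {
        return tag.equals("O")
            || ((tag.startsWith("B_") || tag.startsWith("I_")) && tag.length() > 2);
    }

    // Legal as a subsequence: every tag is well formed and each I_X that has a
    // predecessor in the input is preceded by B_X or I_X.
    static boolean legalTagSubSequence(String... tags) {
        for (int i = 0; i < tags.length; ++i) {
            if (!isTag(tags[i])) {
                return false;
            }
            if (i > 0 && tags[i].startsWith("I_")) {
                String type = tags[i].substring(2);
                String prev = tags[i - 1];
                if (!prev.equals("B_" + type) && !prev.equals("I_" + type)) {
                    return false;
                }
            }
        }
        return true;
    }

    // Legal as a complete sequence: additionally, the first tag may not be I_X,
    // because no earlier tag can be added to open the chunk.
    static boolean legalTags(String... tags) {
        return legalTagSubSequence(tags)
            && (tags.length == 0 || !tags[0].startsWith("I_"));
    }

    public static void main(String[] args) {
        System.out.println(legalTagSubSequence("I_PER", "I_PER")); // extendable by a leading B_PER
        System.out.println(legalTags("I_PER", "I_PER"));           // cannot stand alone
        System.out.println(legalTagSubSequence("O", "I_PER"));     // no extension repairs this
    }
}
```

Under these rules the pair I_PER, I_PER is a legal subsequence because prepending B_PER yields a legal sequence, while O, I_PER is illegal in both senses because the fault lies between the two tags.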
Parameters:
tags - Sequence of tags to test.
Returns:
true if the sequence of tags is legal as a subsequence of some larger sequence.

boolean isEncodable(Chunking chunking)
Returns true if the specified chunking may be encoded as a tagging then decoded back to the original chunking accurately.
Parameters:
chunking - Chunking to test.
Returns:
true if encoding then decoding produces the specified chunking.

boolean isDecodable(StringTagging tagging)
Returns true if the specified tagging may be decoded as a chunking then encoded back to the original tagging accurately.
Parameters:
tagging - Tagging to test.
Returns:
true if decoding then encoding produces the specified tagging.

Iterator<Chunk> nBestChunks(TagLattice<String> lattice, int[] tokenStarts, int[] tokenEnds, int maxResults)
Returns an iterator over chunks extracted in order of highest probability up to the specified maximum number of results.
Parameters:
lattice - Lattice from which chunks are extracted.
maxResults - Maximum number of chunks to return.

Copyright © 2016 Alias-i, Inc. All rights reserved.