public class ChunkingImpl extends Object implements Chunking, Iterable<Chunk>
ChunkingImpl provides a mutable, set-based
implementation of the chunking interface. At construction time, a
character sequence or slice is specified. Chunks may then be added
using the add(Chunk) method.

| Constructor and Description |
|---|
| ChunkingImpl(char[] cs, int start, int end): Constructs a chunking implementation to hold chunks over the specified character slice. |
| ChunkingImpl(CharSequence cSeq): Constructs a chunking implementation to hold chunks over the specified character sequence. |

| Modifier and Type | Method and Description |
|---|---|
| void | add(Chunk chunk): Adds a chunk to this chunking. |
| void | addAll(Collection<Chunk> chunks): Adds all of the chunks in the specified collection to this chunking. |
| CharSequence | charSequence(): Returns the character sequence underlying this chunking. |
| Set<Chunk> | chunkSet(): Returns an unmodifiable view of the set of chunks for this chunking. |
| static boolean | equal(Chunking chunking1, Chunking chunking2): Returns true if the specified chunkings are equal. |
| boolean | equals(Object that): Returns true if the specified object is a chunking equal to this one. |
| int | hashCode(): Returns the hash code for this chunking. |
| static int | hashCode(Chunking chunking): Returns the hash code for the specified chunking. |
| Iterator<Chunk> | iterator(): Returns an unmodifiable iterator over the chunk set underlying this chunking implementation. |
| static Chunking | merge(Chunking chunking1, Chunking chunking2): Returns the result of combining two chunkings into a single non-overlapping chunking. |
| static boolean | overlap(Chunk chunk1, Chunk chunk2): Returns true if the chunks overlap in at least one character position. |
| String | toString(): Returns a string-based representation of this chunking. |
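
The behavior summarized in the tables above (a mutable chunk set, an unmodifiable view of it, and rejection of chunks that run past the text) can be illustrated with a small stand-in. This is only a sketch under stated assumptions: SetBackedChunkingSketch and Span are hypothetical names, not LingPipe classes, and storing the text as a String is an assumption about one reasonable way to hold the sequence or slice.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in, not the LingPipe class: a mutable, set-backed
// container of typed spans over a fixed piece of text.
final class SetBackedChunkingSketch implements Iterable<SetBackedChunkingSketch.Span> {

    // Hypothetical stand-in for a chunk: half-open span [start, end) plus a type.
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public boolean equals(Object that) {
            if (!(that instanceof Span)) return false;
            Span s = (Span) that;
            return start == s.start && end == s.end && type.equals(s.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end, type);
        }
    }

    // Assumption: the text is copied into a String at construction time.
    private final String text;
    private final Set<Span> spans = new LinkedHashSet<>();

    SetBackedChunkingSketch(CharSequence cSeq) {
        this.text = cSeq.toString();
    }

    // Slice constructor: start is inclusive, end is one past the last character.
    SetBackedChunkingSketch(char[] cs, int start, int end) {
        this.text = new String(cs, start, end - start);
    }

    // Rejects spans whose end point lies beyond the underlying text.
    void add(Span span) {
        if (span.end > text.length()) {
            throw new IllegalArgumentException(
                "Span end " + span.end + " is beyond text length " + text.length());
        }
        spans.add(span);
    }

    // Read-only view; later additions through add(Span) remain visible in it.
    Set<Span> chunkSet() {
        return Collections.unmodifiableSet(spans);
    }

    @Override
    public Iterator<Span> iterator() {
        return chunkSet().iterator();
    }

    CharSequence charSequence() {
        return text;
    }
}
```

Because the spans live in a set, adding an equal span twice leaves a single entry, and callers that try to modify the view returned by chunkSet() get an UnsupportedOperationException.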

Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait

Methods inherited from interface java.lang.Iterable: forEach, spliterator

public ChunkingImpl(CharSequence cSeq)
Parameters:
cSeq - Character sequence underlying the chunking.

public ChunkingImpl(char[] cs, int start, int end)
Parameters:
cs - Character array.
start - Index in array of first element in chunk.
end - Index in array of one past the last element in chunk.

public void addAll(Collection<Chunk> chunks)
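Adds all of the chunks in the specified collection to this chunking. If the collection contains an object that does not implement the Chunk interface, an illegal argument exception is thrown.
Parameters:
chunks - Chunks to add to this chunking.
Throws:
IllegalArgumentException - If the collection contains an object that does not implement Chunk.

public Iterator<Chunk> iterator()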

public void add(Chunk chunk)
Parameters:
chunk - Chunk to add to this chunking.
Throws:
IllegalArgumentException - If the end point is beyond the underlying character sequence.

public CharSequence charSequence()
Specified by:
charSequence in interface Chunking

public Set<Chunk> chunkSet()

public boolean equals(Object that)
Specified by:
equals in interface Chunking
Returns true if the specified object is a chunking equal to this one. Equality for chunkings is defined by character sequence yield equality and chunk set equality. Character sequences are tested for equality with Strings.equalCharSequence(CharSequence,CharSequence), and chunks are compared as sets with elements tested for equality using Chunk.equals(Object).
There is a utility implementation of this definition provided for chunkings in equal(Chunking,Chunking).

public int hashCode()
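Specified by:
hashCode in interface Chunking
Returns the hash code for this chunking. The hash code is defined by:
hashCode()
    = Strings.hashCode(charSequence())
    + 31 * chunkSet().hashCode()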
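
The hash code formula, together with the equality definition it has to agree with, can be sketched with stand-in types. This is a hedged illustration only: ChunkingHashSketch and its method names are hypothetical, chunks are represented by an arbitrary Set, and hashing the string yield via toString().hashCode() is an assumption standing in for Strings.hashCode(CharSequence).

```java
import java.util.Set;

// Hypothetical helper, not part of LingPipe.
final class ChunkingHashSketch {

    // Assumed stand-in for Strings.hashCode(CharSequence): hash of the string yield.
    static int yieldHash(CharSequence cs) {
        return cs.toString().hashCode();
    }

    // Character sequence hash plus 31 times the chunk set hash.
    static int chunkingHash(CharSequence cs, Set<?> chunkSet) {
        return yieldHash(cs) + 31 * chunkSet.hashCode();
    }

    // Equal yields and equal chunk sets; the CharSequence implementations may differ.
    static boolean chunkingEqual(CharSequence cs1, Set<?> chunks1,
                                 CharSequence cs2, Set<?> chunks2) {
        return cs1.toString().contentEquals(cs2) && chunks1.equals(chunks2);
    }
}
```

Under these assumptions, a String and a StringBuilder holding the same characters, paired with equal chunk sets, compare as equal and produce the same hash code.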
There is a utility implementation of this definition provided for chunkings in hashCode(Chunking).

public String toString()

public static boolean equal(Chunking chunking1, Chunking chunking2)
Returns true if the specified chunkings are equal. Chunking equality is defined in Chunking.equals(Object) to be equality of character sequence yields and equality of chunk sets.
Warning: Equality is unstable if the chunkings change.
Parameters:
chunking1 - First chunking.
chunking2 - Second chunking.
Returns:
true if the chunkings are equal.

public static int hashCode(Chunking chunking)
Returns the hash code for the specified chunking, as defined in Chunking.hashCode().
Warning: Hash codes are unstable if the chunkings change.
Parameters:
chunking - Chunking whose hash code is returned.

public static boolean overlap(Chunk chunk1, Chunk chunk2)
Returns true if the chunks overlap in at least one character position. Chunks chunk1 and chunk2 overlap if
chunk1.start() <= chunk2.start() < chunk1.end()
or
chunk2.start() <= chunk1.start() < chunk2.end()
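
The two inequalities above translate directly into a boolean test. The sketch below works on bare start and end offsets rather than Chunk objects; OverlapSketch is a hypothetical name used only for illustration.

```java
// Hypothetical helper, not part of LingPipe: the overlap condition on raw offsets.
final class OverlapSketch {

    // Spans are half-open: start is inclusive, end is exclusive.
    static boolean overlap(int start1, int end1, int start2, int end2) {
        return (start1 <= start2 && start2 < end1)
            || (start2 <= start1 && start1 < end2);
    }
}
```

With half-open spans, adjacent spans such as 0 to 3 and 3 to 5 do not overlap, while 0 to 5 and 3 to 8 do.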
Parameters:
chunk1 - First chunk to test.
chunk2 - Second chunk to test.
Returns:
true if the chunks overlap in at least one character position.

public static Chunking merge(Chunking chunking1, Chunking chunking2)
Returns the result of combining two chunkings into a single non-overlapping chunking. The chunks of the first chunking are sorted using Chunk.TEXT_ORDER_COMPARATOR, and then visited left to right, keeping chunks that don't overlap chunks appearing earlier in the order. Next, chunks are added from the second chunking in the same way: they are sorted, and then every chunk that is consistent with the existing chunks is added in order.
The returned chunking uses a string as its character sequence rather than copying the character sequence of one of the input chunkings.
Overall, this is an O(n log n) operation because of the sorting. It also allocates an array for each input chunking's chunks, as well as the string and the chunk set for the result.
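
The merge procedure described above can be sketched on plain spans. Several things here are assumptions rather than the library's code: MergeSketch and Span are hypothetical names, the ordering (by start, then by end) is a guess at what Chunk.TEXT_ORDER_COMPARATOR does, "chunks appearing earlier" is read as "chunks already kept", and the linear scan for conflicts is chosen for clarity, so it does not try to match the stated complexity. The check that both chunkings share a character sequence is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not the LingPipe implementation.
final class MergeSketch {

    // Hypothetical stand-in for a chunk: half-open span [start, end).
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return start + "-" + end;
        }
    }

    // Assumed text order: earlier start first, then earlier end.
    static final Comparator<Span> TEXT_ORDER =
        Comparator.comparingInt((Span s) -> s.start).thenComparingInt(s -> s.end);

    static boolean overlap(Span a, Span b) {
        return (a.start <= b.start && b.start < a.end)
            || (b.start <= a.start && a.start < b.end);
    }

    // Sorts a copy of the source, then keeps each span that clashes with nothing kept so far.
    private static void addConsistent(List<Span> source, List<Span> kept) {
        List<Span> sorted = new ArrayList<>(source);
        sorted.sort(TEXT_ORDER);
        for (Span candidate : sorted) {
            boolean clash = false;
            for (Span existing : kept) {
                if (overlap(candidate, existing)) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                kept.add(candidate);
            }
        }
    }

    // Spans from the first list take priority over spans from the second.
    static List<Span> merge(List<Span> first, List<Span> second) {
        List<Span> kept = new ArrayList<>();
        addConsistent(first, kept);
        addConsistent(second, kept);
        return kept;
    }
}
```

For example, merging the spans 5-8, 0-3, 2-6 with the spans 3-5, 7-9 keeps 0-3 and 5-8 from the first list (2-6 clashes with 0-3) and then adds only 3-5 from the second list (7-9 clashes with 5-8).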
Parameters:
chunking1 - First chunking to combine.
chunking2 - Second chunking to combine.
Throws:
IllegalArgumentException - If the chunkings are not over the same character sequence.

Copyright © 2019 Alias-i, Inc. All rights reserved.