public class CharSeqMultiCounter extends Object implements CharSeqCounter
A CharSeqMultiCounter combines the counts from a pair
of character sequence counters. Each method returns the value
that results from combining the counts in both counters.
Multi-counters are particularly useful in situations where a large or constant background counter must be updated several different ways simultaneously. For instance, a general 5-gram counter of a language trained over a lot of data might be combined with an 8-gram topic-specific model for use in a classifier.
More than two counters may be combined by combining them two at
a time. The best strategy is to combine them two at a time into a
balanced tree of counters, as done by the constructor CharSeqMultiCounter(CharSeqCounter[]). For instance, with
CharSeqCounter instances c1,
c2, c3, and c4, the balanced
construction of c1234 in:

    CharSeqCounter c12 = new CharSeqMultiCounter(c1,c2);
    CharSeqCounter c34 = new CharSeqMultiCounter(c3,c4);
    CharSeqCounter c1234 = new CharSeqMultiCounter(c12,c34);

is more efficient for many operations than the linear construction in:

    CharSeqCounter c12 = new CharSeqMultiCounter(c1,c2);
    CharSeqCounter c123 = new CharSeqMultiCounter(c12,c3);
    CharSeqCounter c1234 = new CharSeqMultiCounter(c123,c4);

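The difference between the two tree shapes can be illustrated with a minimal, self-contained sketch. The types below (CountSource, LeafCount, PairCount, BalancedTreeSketch) are hypothetical stand-ins written for this illustration and are not LingPipe classes. The sketch also assumes that combining two counts means adding them, which this page does not state explicitly. It shows only why a balanced tree keeps the nesting shallow: a query on a leaf passes through fewer intermediate pair nodes.

```java
// Hypothetical stand-in for a counter; NOT the LingPipe CharSeqCounter interface.
interface CountSource {
    long count(String s);
    int depth(); // nesting depth, used only to compare tree shapes
}

// Leaf counter: counts (possibly overlapping) occurrences of s in a fixed text.
final class LeafCount implements CountSource {
    private final String text;
    LeafCount(String text) { this.text = text; }
    public long count(String s) {
        long n = 0;
        for (int i = 0; i + s.length() <= text.length(); i++) {
            if (text.startsWith(s, i)) n++;
        }
        return n;
    }
    public int depth() { return 0; }
}

// Pair node. Assumption: combining counts means summing them.
final class PairCount implements CountSource {
    private final CountSource left;
    private final CountSource right;
    PairCount(CountSource left, CountSource right) {
        this.left = left;
        this.right = right;
    }
    public long count(String s) { return left.count(s) + right.count(s); }
    public int depth() { return 1 + Math.max(left.depth(), right.depth()); }
}

final class BalancedTreeSketch {
    // Balanced construction: split the array in half and recurse.
    static CountSource balanced(CountSource[] cs) {
        if (cs.length < 2) {
            throw new IllegalArgumentException("need at least two counters");
        }
        return balanced(cs, 0, cs.length);
    }

    private static CountSource balanced(CountSource[] cs, int from, int to) {
        if (to - from == 1) return cs[from];
        int mid = (from + to) >>> 1;
        return new PairCount(balanced(cs, from, mid), balanced(cs, mid, to));
    }

    // Linear construction: fold the counters in one at a time.
    static CountSource linear(CountSource[] cs) {
        CountSource acc = cs[0];
        for (int i = 1; i < cs.length; i++) {
            acc = new PairCount(acc, cs[i]);
        }
        return acc;
    }
}
```

Both shapes return the same counts under the summing assumption. For n leaves, the balanced tree has depth of about log2(n), while the linear chain has depth n - 1.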
Implementation Note: The methods numCharactersFollowing(char[],int,int), charactersFollowing(char[],int,int), and observedCharacters() all call the contained counters' CharSeqCounter.charactersFollowing(char[],int,int) methods and
then merge or count results. All other methods only perform
arithmetic on the results of the corresponding method calls on the
contained counters.
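The merge step mentioned in the implementation note can be sketched as a standard two-pointer union of sorted arrays. This is an illustration only, not the LingPipe source: the class name SortedCharMergeSketch and the method mergeSorted are hypothetical, and the sketch assumes each input is already sorted in ascending char order with no duplicates.

```java
import java.util.Arrays;

// Hypothetical helper; not part of LingPipe.
final class SortedCharMergeSketch {
    // Returns the sorted, duplicate-free union of two sorted,
    // duplicate-free char arrays.
    static char[] mergeSorted(char[] a, char[] b) {
        char[] out = new char[a.length + b.length];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                out[n++] = a[i++];
            } else if (a[i] > b[j]) {
                out[n++] = b[j++];
            } else {
                // Same character in both inputs: emit it once.
                out[n++] = a[i];
                i++;
                j++;
            }
        }
        while (i < a.length) out[n++] = a[i++];
        while (j < b.length) out[n++] = b[j++];
        return Arrays.copyOf(out, n); // trim unused tail
    }
}
```

Under this sketch, the number of distinct following characters is the length of the merged array, which is one way "merge or count" could be realized.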
| Constructor | Description |
|---|---|
| CharSeqMultiCounter(CharSeqCounter[] counters) | Construct a character sequence counter from the specified array of counters. |
| CharSeqMultiCounter(CharSeqCounter counter1, CharSeqCounter counter2) | Construct a multi-counter from the specified pair of counters. |

| Modifier and Type | Method | Description |
|---|---|---|
| char[] | charactersFollowing(char[] cs, int start, int end) | Returns, in Unicode order, the array of characters that have been observed following the specified character slice. |
| long | count(char[] cs, int start, int end) | Returns the count for the specified character sequence. |
| long | extensionCount(char[] cs, int start, int end) | Returns the sum of the counts of all character sequences one character longer than the specified character slice. |
| int | numCharactersFollowing(char[] cs, int start, int end) | Returns the number of characters that, when appended to the end of the specified character slice, produce an extended slice with a non-zero count. |
| char[] | observedCharacters() | Returns, in Unicode order, an array consisting of the characters with non-zero count. |

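The method contracts summarized above can be made concrete with a deliberately naive counter over a single training string. The class NaiveSliceCounterSketch is hypothetical and is not a LingPipe class. It assumes that a count is the number of (possibly overlapping) occurrences of the slice in the text, and that a slice is the half-open range from start (inclusive) to end (exclusive) of cs.

```java
import java.util.TreeSet;

// Hypothetical illustration of the slice-based method contracts; not LingPipe code.
final class NaiveSliceCounterSketch {
    private final String text;

    NaiveSliceCounterSketch(String text) { this.text = text; }

    // Number of occurrences of cs[start..end) in the text.
    long count(char[] cs, int start, int end) {
        String s = new String(cs, start, end - start);
        long n = 0;
        for (int i = 0; i + s.length() <= text.length(); i++) {
            if (text.startsWith(s, i)) n++;
        }
        return n;
    }

    // Sum of counts of all sequences one character longer than the slice,
    // i.e. occurrences of the slice that are followed by some character.
    long extensionCount(char[] cs, int start, int end) {
        String s = new String(cs, start, end - start);
        long n = 0;
        for (int i = 0; i + s.length() < text.length(); i++) {
            if (text.startsWith(s, i)) n++;
        }
        return n;
    }

    // Distinct characters observed right after the slice, in ascending char order.
    char[] charactersFollowing(char[] cs, int start, int end) {
        String s = new String(cs, start, end - start);
        TreeSet<Character> next = new TreeSet<>();
        for (int i = 0; i + s.length() < text.length(); i++) {
            if (text.startsWith(s, i)) next.add(text.charAt(i + s.length()));
        }
        char[] out = new char[next.size()];
        int n = 0;
        for (char c : next) out[n++] = c;
        return out;
    }

    // Number of distinct characters that extend the slice to a non-zero count.
    int numCharactersFollowing(char[] cs, int start, int end) {
        return charactersFollowing(cs, start, end).length;
    }
}
```

For example, with the text "abracadabra" and the one-character slice "a", this sketch gives a count of 5, an extension count of 4 (the final "a" has no successor), and the following characters b, c, and d.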

public CharSeqMultiCounter(CharSeqCounter[] counters)
Parameters:
counters - Array of counters to back the multi-counter.
Throws:
IllegalArgumentException - If the array of counters contains fewer than two elements.

public CharSeqMultiCounter(CharSeqCounter counter1, CharSeqCounter counter2)
Parameters:
counter1 - First counter in multi-counter.
counter2 - Second counter in multi-counter.

public long count(char[] cs,
int start,
int end)
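Description copied from interface: CharSeqCounter
Returns the count for the specified character sequence.
Specified by:
count in interface CharSeqCounter
Parameters:
cs - Underlying character array.
start - Index of first character in slice.
end - Index one past the last character in slice.

public long extensionCount(char[] cs,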
int start,
int end)
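Description copied from interface: CharSeqCounter
Returns the sum of the counts of all character sequences one character longer than the specified character slice.
Specified by:
extensionCount in interface CharSeqCounter
Parameters:
cs - Underlying character array.
start - Index of first character in slice.
end - Index one past the last character in slice.

public int numCharactersFollowing(char[] cs,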
int start,
int end)
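Description copied from interface: CharSeqCounter
Returns the number of characters that, when appended to the end of the specified character slice, produce an extended slice with a non-zero count. That is:

    numCharactersFollowing(cSlice) = | { c | count(cSlice.c) > 0 } |

where count(cSlice.c) represents the count of the character slice cSlice suffixed with the character c.
Specified by:
numCharactersFollowing in interface CharSeqCounter
Parameters:
cs - Underlying character array.
start - Index of first character in slice.
end - One plus index of last character in slice.

public char[] charactersFollowing(char[] cs,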
int start,
int end)
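Description copied from interface: CharSeqCounter
Returns, in Unicode order, the array of characters that have been observed following the specified character slice.
Specified by:
charactersFollowing in interface CharSeqCounter
Parameters:
cs - Underlying character array.
start - Index of first character in slice.
end - One plus index of last character in slice.

public char[] observedCharacters()
Description copied from interface: CharSeqCounter
Returns, in Unicode order, an array consisting of the characters with non-zero count. The result is the same as that of charactersFollowing(new char[0],0,0).
Specified by:
observedCharacters in interface CharSeqCounter

Copyright © 2019 Alias-i, Inc. All rights reserved.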