E - Type of tokens in the tagging.
public class TaggerEvaluator<E> extends Object implements ObjectHandler<Tagging<E>>
TaggerEvaluator provides evaluation for
first-best taggers implementing the Tagger interface.
The basis of evaluation is a gold-standard set of reference
taggings. The evaluation is of taggings produced by a system (or
other means), known as response taggings. Cases consisting of
reference and response taggings may be added to the evaluation
using addCase(Tagging,Tagging).
The evaluator takes a tagger as an argument in its constructor.
If the tagger is not null, the ObjectHandler method handle(Tagging) may be used to supply reference taggings for
which a response will be created using the tagger and then added
as a test case. The tagger may be reset using setTagger(Tagger),
which is useful for producing a single evaluation of different
taggers, such as for cross-validation.
The constructor also takes an argument determining whether inputs should be stored or not. If they are stored, then all input tokens will be available in the token-level evaluation. Tokens must be stored in order to compute unknown token accuracy.
The overall case-level accuracy, measuring how many inputs
received a completely correct set of tags, is returned by caseAccuracy().
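To make the case-level measure concrete, the following standalone sketch counts a case as correct only when every response tag matches the reference tag at the same position. It does not use LingPipe and is not the library's implementation; the class `CaseAccuracySketch` and its `caseAccuracy` helper are hypothetical names, and each tagging is reduced to a plain list of tag strings.

```java
import java.util.Arrays;
import java.util.List;

public class CaseAccuracySketch {

    // Fraction of cases whose response tags match the reference tags exactly.
    // Hypothetical helper for illustration; not part of LingPipe.
    static double caseAccuracy(List<List<String>> referenceTags,
                               List<List<String>> responseTags) {
        if (referenceTags.size() != responseTags.size()) {
            throw new IllegalArgumentException("Need one response per reference.");
        }
        if (referenceTags.isEmpty()) {
            return Double.NaN; // assumption: accuracy is undefined with no cases
        }
        int correct = 0;
        for (int i = 0; i < referenceTags.size(); ++i) {
            // List.equals compares length and every element in order.
            if (referenceTags.get(i).equals(responseTags.get(i))) {
                ++correct;
            }
        }
        return correct / (double) referenceTags.size();
    }

    public static void main(String[] args) {
        List<List<String>> refs = Arrays.asList(
            Arrays.asList("DT", "NN", "VBZ"),
            Arrays.asList("NNP", "VBD"));
        List<List<String>> resps = Arrays.asList(
            Arrays.asList("DT", "NN", "VBZ"),  // every tag correct
            Arrays.asList("NNP", "NN"));       // one tag wrong
        System.out.println(caseAccuracy(refs, resps)); // prints 0.5
    }
}
```

A single wrong tag makes the whole case incorrect, so case accuracy is never higher than token-level accuracy on the same data.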
The method inputSizeHistogram() returns a map from
integers to the number of reference taggings with that many
input tokens.
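The histogram described above can be sketched with a plain `java.util.Map`. This is an illustration only: LingPipe returns an `ObjectToCounterMap<Integer>`, whereas the hypothetical `InputSizeHistogramSketch` below uses a `TreeMap` so that it runs without the library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class InputSizeHistogramSketch {

    // Maps each input length to the number of cases with that many tokens.
    // Hypothetical helper for illustration; not part of LingPipe.
    static Map<Integer, Integer> histogram(List<List<String>> tokenLists) {
        Map<Integer, Integer> counts = new TreeMap<Integer, Integer>();
        for (List<String> tokens : tokenLists) {
            // Start at 1 for a new length, otherwise add 1 to the count.
            counts.merge(tokens.size(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> inputs = Arrays.asList(
            Arrays.asList("The", "dog", "barks"),
            Arrays.asList("Dogs", "bark"),
            Arrays.asList("Cats", "purr"));
        System.out.println(histogram(inputs)); // prints {2=2, 3=1}
    }
}
```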
The method lastCaseToString(Set) may be used to return a
string-based representation of the last case added. This method
requires a set of the known tokens, or null if known tokens
are not being tracked.
The primary results at the token level are returned as a
classifier evaluator by tokenEval(). The cases here
are individual tokens. For instance, if there were 100 test cases
of 15 tokens each, the classifier evaluator will
consider 15*100 = 1500 cases, one for each token. If the inputs
are stored, they will be passed on to this classifier evaluator and
available through the evaluator's methods.
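The arithmetic in this paragraph can be checked with a small standalone sketch that treats every token as its own case. The class `TokenLevelSketch` and its helpers are hypothetical and do not use LingPipe; a real evaluation would read these figures from the classifier evaluator returned by tokenEval().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TokenLevelSketch {

    // One token-level case per token, summed over all tagged inputs.
    static long numTokenCases(List<List<String>> referenceTags) {
        long n = 0L;
        for (List<String> tags : referenceTags) {
            n += tags.size();
        }
        return n;
    }

    // Fraction of tokens whose response tag equals the reference tag.
    static double tokenAccuracy(List<List<String>> referenceTags,
                                List<List<String>> responseTags) {
        if (referenceTags.size() != responseTags.size()) {
            throw new IllegalArgumentException("Need one response per reference.");
        }
        long total = 0L;
        long correct = 0L;
        for (int i = 0; i < referenceTags.size(); ++i) {
            List<String> ref = referenceTags.get(i);
            List<String> resp = responseTags.get(i);
            if (ref.size() != resp.size()) {
                throw new IllegalArgumentException("Tag lengths differ in case " + i);
            }
            for (int j = 0; j < ref.size(); ++j) {
                ++total;
                if (ref.get(j).equals(resp.get(j))) {
                    ++correct;
                }
            }
        }
        // Assumption: accuracy is undefined (NaN) when there are no tokens.
        return total == 0L ? Double.NaN : correct / (double) total;
    }

    public static void main(String[] args) {
        // 100 cases of 15 tokens each, as in the example in the text.
        List<List<String>> refs = new ArrayList<List<String>>();
        for (int i = 0; i < 100; ++i) {
            refs.add(Collections.nCopies(15, "NN"));
        }
        System.out.println(numTokenCases(refs)); // prints 1500
    }
}
```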
Accuracy for tokens not in a specified set, typically the tokens
used in training, is available through unknownTokenEval(Set).
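As a rough illustration of unknown-token accuracy, the sketch below scores only those tokens that are absent from a known token set. It handles a single tagged input for brevity and does not use LingPipe; `UnknownTokenSketch` and `unknownTokenAccuracy` are hypothetical names, and the sample tokens and tags are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UnknownTokenSketch {

    // Accuracy restricted to tokens that are NOT in the known token set.
    // Hypothetical helper for illustration; not part of LingPipe.
    static double unknownTokenAccuracy(List<String> tokens,
                                       List<String> referenceTags,
                                       List<String> responseTags,
                                       Set<String> knownTokens) {
        if (tokens.size() != referenceTags.size()
            || tokens.size() != responseTags.size()) {
            throw new IllegalArgumentException("Tokens and tags must align.");
        }
        int total = 0;
        int correct = 0;
        for (int i = 0; i < tokens.size(); ++i) {
            if (knownTokens.contains(tokens.get(i))) {
                continue; // known tokens are excluded from this evaluation
            }
            ++total;
            if (referenceTags.get(i).equals(responseTags.get(i))) {
                ++correct;
            }
        }
        // Assumption: accuracy is undefined (NaN) when no token is unknown.
        return total == 0 ? Double.NaN : correct / (double) total;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "zyzzyva", "sleeps", "quokka");
        List<String> refTags = Arrays.asList("DT", "NN", "VBZ", "NN");
        List<String> respTags = Arrays.asList("DT", "NN", "VBZ", "JJ");
        Set<String> known = new HashSet<String>(Arrays.asList("the", "sleeps"));
        // Unknown tokens: "zyzzyva" (tagged correctly) and "quokka" (tagged wrongly).
        System.out.println(unknownTokenAccuracy(tokens, refTags, respTags, known)); // prints 0.5
    }
}
```

The tokens themselves are needed to decide which positions count, which is why the evaluator requires stored inputs for this measure.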
| Constructor and Description |
|---|
| TaggerEvaluator(Tagger<E> tagger, boolean storeTokens): Construct a tagger evaluator using the specified tagger that stores inputs if the specified flag is true. |
| Modifier and Type | Method and Description |
|---|---|
| void | addCase(Tagging<E> referenceTagging, Tagging<E> responseTagging): Add a test case to this evaluator consisting of the specified reference and response taggings. |
| double | caseAccuracy(): Return the accuracy at the entire case level. |
| void | handle(Tagging<E> referenceTagging): Add a case for the specified reference tagging using the contained tagger to generate a response tagging. |
| ObjectToCounterMap<Integer> | inputSizeHistogram(): Returns a mapping from integers to the number of test cases with that many tokens. |
| String | lastCaseToString(Set<E> knownTokenSet): Return a string-based representation of the last case to be evaluated based on the specified known token set. |
| int | numCases(): Returns the number of cases for this evaluation. |
| long | numTokens(): Returns the number of tokens tested in the complete set of test cases. |
| void | setTagger(Tagger<E> tagger): Set the tagger for this evaluator to the specified value. |
| boolean | storeTokens(): Returns true if this evaluator stores input tokens. |
| Tagger<E> | tagger(): Return the tagger for this evaluator. |
| List<String> | tags(): Return the list of tags seen so far by this tagger evaluator in either references or responses. |
| BaseClassifierEvaluator<E> | tokenEval(): Returns the token-level evaluation for this tag evaluator. |
| BaseClassifierEvaluator<E> | unknownTokenEval(Set<E> knownTokenSet): Return the accuracy over unknown tokens as an instance of a classifier evaluator whose cases are individual tokens not in the specified known token set. |
public TaggerEvaluator(Tagger<E> tagger, boolean storeTokens)
Construct a tagger evaluator using the specified tagger that stores inputs if the specified flag is true.
tagger - Tagger to use for generating responses, or null if cases are added manually.
storeTokens - Flag set to true if the input tokens for cases are stored.
public Tagger<E> tagger()
public void setTagger(Tagger<E> tagger)
tagger - Tagger to use to generate responses.
public boolean storeTokens()
Returns true if this evaluator stores input tokens.
public void handle(Tagging<E> referenceTagging)
handle in interface ObjectHandler<Tagging<E>>
referenceTagging - Reference gold-standard tagging.
NullPointerException - If the underlying tagger is null.
public void addCase(Tagging<E> referenceTagging, Tagging<E> responseTagging)
referenceTagging - Reference gold-standard tags.
responseTagging - Response system tags.
IllegalArgumentException - If the token lengths are not the same in the two taggings.
public int numCases()
public long numTokens()
public List<String> tags()
public ObjectToCounterMap<Integer> inputSizeHistogram()
public double caseAccuracy()
public BaseClassifierEvaluator<E> unknownTokenEval(Set<E> knownTokenSet)
knownTokenSet - Set of known tokens to exclude from evaluation.
UnsupportedOperationException - If the inputs are not being stored.
public BaseClassifierEvaluator<E> tokenEval()
public String lastCaseToString(Set<E> knownTokenSet)
Return a string-based representation of the last case to be evaluated based on the specified known token set. If the set is null, known tokens are not distinguished.
knownTokenSet - Set of known tokens.
Copyright © 2016 Alias-i, Inc. All rights reserved.