E - the type of object classified
public class BernoulliClassifier<E> extends Object implements JointClassifier<E>, ObjectHandler<Classified<E>>, Serializable
BernoulliClassifier provides a feature-based
classifier where feature values are reduced to booleans based on a
specified threshold. Training events are supplied in the usual
way through the handle(Classified) method.
Given a feature threshold of t, any feature with
value strictly greater than the threshold t for a
given input is activated, and all other features are not activated
for that input.
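
The activation rule above can be sketched in a few lines of plain Java. This is an illustrative sketch only, not the library's implementation: the class name `ActivationSketch`, the method `activeFeatures`, and the use of a `Map<String, Double>` for feature values are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the documented API.
class ActivationSketch {

    // Returns the names of the features whose value is strictly
    // greater than the threshold; all other features are inactive.
    static Set<String> activeFeatures(Map<String, Double> featureValues,
                                      double threshold) {
        Set<String> active = new HashSet<>();
        for (Map.Entry<String, Double> entry : featureValues.entrySet()) {
            if (entry.getValue() > threshold) {
                active.add(entry.getKey());
            }
        }
        return active;
    }

    public static void main(String[] args) {
        Map<String, Double> values = new HashMap<>();
        values.put("a", 2.0);   // above the threshold: activated
        values.put("b", 0.0);   // equal to the threshold: not activated
        values.put("c", -1.0);  // below the threshold: not activated
        System.out.println(activeFeatures(values, 0.0));
    }
}
```

Note that a value exactly equal to the threshold does not activate the feature, because the comparison is strict.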
The likelihood of a feature in a category is estimated with the
training sample counts using add-one smoothing (also known as
Laplace smoothing, or a uniform Dirichlet prior). There is also
a term for the category distribution. Suppose F is
the complete set of features seen during training. Further suppose
that count(cat) is the number of training samples
for category cat, and count(cat,feat)
is the number of training instances of the specified category that
had the specified feature activated. Thus the contribution of
each feature is computed by:
p(+feat|cat) = (count(cat,feat) + 1) / (count(cat) + 2)
p(-feat|cat) = 1.0 - p(+feat|cat)
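
As a worked illustration of the add-one estimate, the sketch below computes both probabilities from raw counts. The class and method names (`SmoothingSketch`, `probActive`, `probInactive`) are hypothetical and chosen only for this example; the counts in `main` are made up.

```java
// Hypothetical helper, not part of the documented API.
class SmoothingSketch {

    // p(+feat|cat) with add-one smoothing: one pseudo-count for
    // "activated" and one for "not activated", hence the +1 and +2.
    static double probActive(int countCatFeat, int countCat) {
        return (countCatFeat + 1.0) / (countCat + 2.0);
    }

    // p(-feat|cat) is the complement of p(+feat|cat).
    static double probInactive(int countCatFeat, int countCat) {
        return 1.0 - probActive(countCatFeat, countCat);
    }

    public static void main(String[] args) {
        // Made-up counts: 8 training samples in the category,
        // 3 of which had the feature activated.
        System.out.println(probActive(3, 8));
        System.out.println(probInactive(3, 8));
        // A feature never seen with this category still gets
        // a nonzero probability of 1/10.
        System.out.println(probActive(0, 8));
    }
}
```

With 8 samples and 3 activations the estimate is 4/10; without smoothing it would be 3/8, and a feature never seen with the category would get probability zero.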
Assuming the total number of training instances is totalCount,
we use a simple maximum-likelihood estimate for the category probability:
p(cat) = count(cat) / totalCount
With these two definitions, the joint probability estimate for a category
cat given activated features
{f[0],...,f[n-1]} and unactivated features
{g[0],...,g[m-1]} is:
p(cat,{f[0],...,f[n-1]})
= p(cat)
* Π_{i < n} p(+f[i]|cat)
* Π_{j < m} p(-g[j]|cat)
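
The product above is easier to compute as a sum of logarithms. The following is a minimal sketch under stated assumptions, not the library's code: `JointSketch`, `log2Joint`, and the map and set parameters are hypothetical names, and the counts in `main` are invented for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the documented API.
class JointSketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Log (base 2) of p(cat, input) for a single category.
    // featCountsForCat maps a feature to count(cat,feat);
    // allFeatures is the set F of features seen in training;
    // active holds the features activated for the input.
    static double log2Joint(int countCat, int totalCount,
                            Map<String, Integer> featCountsForCat,
                            Set<String> allFeatures,
                            Set<String> active) {
        double sum = log2((double) countCat / totalCount); // log2 p(cat)
        for (String feat : allFeatures) {
            int c = featCountsForCat.getOrDefault(feat, 0);
            double pActive = (c + 1.0) / (countCat + 2.0);
            // Activated features contribute p(+feat|cat),
            // all other features contribute p(-feat|cat).
            sum += log2(active.contains(feat) ? pActive : 1.0 - pActive);
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("x", 2); // "x" was active in both samples of the category
        Set<String> all = new HashSet<>(Arrays.asList("x", "y"));
        Set<String> active = new HashSet<>(Arrays.asList("x"));
        // p(cat) = 2/4, p(+x|cat) = 3/4, p(-y|cat) = 3/4
        double lj = log2Joint(2, 4, counts, all, active);
        System.out.println(Math.pow(2.0, lj)); // approximately 0.28125
    }
}
```

Every feature in F contributes one term per category, whether it is activated or not.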
The JointClassification class requires log (base 2) estimates,
and is responsible for converting these to conditional estimates.
The scores in this case are just the log2 joint estimates.
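
To illustrate what converting log2 joint estimates to conditional estimates involves, the sketch below normalizes a set of joint scores so that they sum to one. It is an assumption about the arithmetic, not a copy of what JointClassification does internally; the names `ConditionalSketch` and `conditionals` are hypothetical.

```java
// Hypothetical helper, not part of the documented API.
class ConditionalSketch {

    // Converts log2 joint estimates, one per category, into
    // conditional estimates p(cat|input) = p(cat,input) / sum of joints.
    // Subtracting the maximum first avoids underflow for very
    // negative log values. Assumes at least one finite input.
    static double[] conditionals(double[] log2Joints) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : log2Joints) {
            max = Math.max(max, v);
        }
        double[] result = new double[log2Joints.length];
        double sum = 0.0;
        for (int i = 0; i < log2Joints.length; i++) {
            result[i] = Math.pow(2.0, log2Joints[i] - max);
            sum += result[i];
        }
        for (int i = 0; i < result.length; i++) {
            result[i] /= sum;
        }
        return result;
    }

    public static void main(String[] args) {
        // Joint estimates of 1/4 and 1/8 give conditionals of 2/3 and 1/3.
        double[] p = conditionals(new double[] { -2.0, -3.0 });
        System.out.println(p[0] + " " + p[1]);
    }
}
```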
The dynamic form of the estimator may be used for classification, but it is not very efficient. It loops over every feature for every category.
The serialized version of a Bernoulli classifier will
deserialize as an equivalent instance of
BernoulliClassifier. In order to serialize a
Bernoulli classifier, the feature extractor must be serializable.
Otherwise an exception will be raised during serialization.
Compilation is not yet implemented.
| Constructor | Description |
|---|---|
| BernoulliClassifier(FeatureExtractor<E> featureExtractor) | Construct a Bernoulli classifier with the specified feature extractor and the default feature activation threshold of 0.0. |
| BernoulliClassifier(FeatureExtractor<E> featureExtractor, double featureActivationThreshold) | Construct a Bernoulli classifier with the specified feature extractor and specified feature activation threshold. |
| Modifier and Type | Method | Description |
|---|---|---|
| String[] | categories() | Returns a copy of the list of the categories for this classifier. |
| JointClassification | classify(E input) | Classify the specified input using this Bernoulli classifier. |
| double | featureActivationThreshold() | Returns the feature activation threshold. |
| FeatureExtractor<E> | featureExtractor() | Return the feature extractor for this classifier. |
| void | handle(Classified<E> classified) | Handle the specified training classified object. |
public BernoulliClassifier(FeatureExtractor<E> featureExtractor)
featureExtractor - Feature extractor for classification.
public BernoulliClassifier(FeatureExtractor<E> featureExtractor, double featureActivationThreshold)
featureExtractor - Feature extractor for classification.
featureActivationThreshold - The threshold for feature activation (see the class documentation).
public double featureActivationThreshold()
public FeatureExtractor<E> featureExtractor()
public String[] categories()
public void handle(Classified<E> classified)
handle in interface ObjectHandler<Classified<E>>
classified - Classified object to handle as training data.
public JointClassification classify(E input)
classify in interface BaseClassifier<E>
classify in interface ConditionalClassifier<E>
classify in interface JointClassifier<E>
classify in interface RankedClassifier<E>
classify in interface ScoredClassifier<E>
input - Input to classify.
Copyright © 2016 Alias-i, Inc. All rights reserved.