Class HMM

  • All Implemented Interfaces:
    Classifier, Predictor, org.apache.hadoop.io.Writable

    public final class HMM
    extends AbstractClassifier
    implements org.apache.hadoop.io.Writable
    Hidden Markov Model implementation for multiple observations. It covers all three types of problems an HMM aims to solve: decoding, likelihood estimation, and learning (unsupervised or supervised).
    Author:
    thomas.jungblut
    • Constructor Detail

      • HMM

        public HMM()
      • HMM

        public HMM(int numVisibleStates,
                   int numHiddenStates)
    • Method Detail

      • estimateLikelihood

        public double estimateLikelihood(de.jungblut.math.DoubleVector[] observationSequence)
        Likelihood estimation on the current HMM. It estimates the likelihood of the given observation sequence, P( O | lambda ), where O is the observation sequence and lambda denotes the HMM's parameters. This is done by executing the forward algorithm with the given observations clamped to the visible states.
        Parameters:
        observationSequence - the given sequence of observations (features).
        Returns:
        the likelihood (not a probability!) of the given sequence.
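        The forward algorithm mentioned above can be sketched as follows. This is a minimal illustration on plain arrays, not this class's implementation: the class name ForwardSketch, the method name and all parameters are hypothetical, and it assumes exactly one visible state index per time step and a non-empty sequence (the real method takes DoubleVector features).

```java
final class ForwardSketch {

  /**
   * Hypothetical helper, not part of the documented API.
   * prior[i]         = P(first hidden state is i)
   * transition[i][j] = P(next hidden state is j | current hidden state is i)
   * emission[i][k]   = P(visible state k | hidden state i)
   * observations     = one visible state index per time step (non-empty)
   */
  static double likelihood(double[] prior, double[][] transition,
      double[][] emission, int[] observations) {
    int n = prior.length;
    // alpha[i] = joint probability of the observations so far and hidden state i
    double[] alpha = new double[n];
    for (int i = 0; i < n; i++) {
      alpha[i] = prior[i] * emission[i][observations[0]];
    }
    for (int t = 1; t < observations.length; t++) {
      double[] next = new double[n];
      for (int j = 0; j < n; j++) {
        double sum = 0d;
        for (int i = 0; i < n; i++) {
          sum += alpha[i] * transition[i][j];
        }
        next[j] = sum * emission[j][observations[t]];
      }
      alpha = next;
    }
    // summing over the final hidden states yields P(O | lambda)
    double total = 0d;
    for (double a : alpha) {
      total += a;
    }
    return total;
  }
}
```

        The sketch multiplies raw probabilities, so it can underflow on long sequences; a scaled or log-space variant avoids that.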
      • decode

        public de.jungblut.math.DoubleMatrix decode(de.jungblut.math.DoubleVector[] observationSequence,
                                                    de.jungblut.math.DoubleVector[] featuresPerHiddenState)
        Decodes the given observation sequence (features) with the current HMM. This discovers the best hidden state sequence Q by executing the Viterbi algorithm with the given observations and the HMM's parameters lambda. This is a proxy to ViterbiUtils decode(DoubleVector[], DoubleVector[]).
        Parameters:
        observationSequence - the given sequence of features.
        Returns:
        a matrix containing the predicted hidden state on each row vector.
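        For orientation, a textbook Viterbi decoder on plain arrays might look like the sketch below. It is not the ViterbiUtils implementation: the class name ViterbiSketch and all parameters are hypothetical, it assumes one visible state index per time step and a non-empty sequence, and it returns hidden state indices instead of a DoubleMatrix.

```java
final class ViterbiSketch {

  /**
   * Hypothetical helper, not part of the documented API.
   * Returns the most likely hidden state index for every time step.
   */
  static int[] decode(double[] prior, double[][] transition,
      double[][] emission, int[] observations) {
    int n = prior.length;
    int steps = observations.length;
    // score[t][j] = best log probability of any path ending in state j at time t
    double[][] score = new double[steps][n];
    // back[t][j] = predecessor of state j on that best path
    int[][] back = new int[steps][n];
    for (int i = 0; i < n; i++) {
      score[0][i] = Math.log(prior[i]) + Math.log(emission[i][observations[0]]);
    }
    for (int t = 1; t < steps; t++) {
      for (int j = 0; j < n; j++) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < n; i++) {
          double s = score[t - 1][i] + Math.log(transition[i][j]);
          if (s > bestScore) {
            bestScore = s;
            best = i;
          }
        }
        score[t][j] = bestScore + Math.log(emission[j][observations[t]]);
        back[t][j] = best;
      }
    }
    // pick the best final state, then follow the back pointers
    int last = 0;
    for (int i = 1; i < n; i++) {
      if (score[steps - 1][i] > score[steps - 1][last]) {
        last = i;
      }
    }
    int[] path = new int[steps];
    path[steps - 1] = last;
    for (int t = steps - 1; t > 0; t--) {
      path[t - 1] = back[t][path[t]];
    }
    return path;
  }
}
```

        Working in log space turns the products into sums, which keeps long sequences from underflowing.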
      • trainUnsupervised

        public void trainUnsupervised(de.jungblut.math.DoubleVector[] features,
                                      double epsilon,
                                      int maxIterations,
                                      boolean verbose)
        Trains the current model's parameters by executing the Baum-Welch expectation-maximization algorithm. TODO this should also be log-scaled for accuracy.
        Parameters:
        features - the visible state activations (only the non-zero entries of each vector are traversed, so the actual values do not matter).
        epsilon - the threshold on the absolute difference between the trained model and the model of the previous iteration. If the difference is smaller than this value, training stops.
        maxIterations - the maximum number of iterations. If the epsilon threshold is never reached, computation stops after this many iterations.
        verbose - when set to true, prints information about the expectation-maximization values per iteration.
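        How epsilon and maxIterations interact can be illustrated with a generic iteration loop. This sketch is an assumption about the usual shape of such a stopping rule, not this class's code: ConvergenceSketch, its method and the scalar "model" are hypothetical stand-ins for the HMM's parameter matrices.

```java
import java.util.function.DoubleUnaryOperator;

final class ConvergenceSketch {

  /**
   * Hypothetical helper, not part of the documented API.
   * Applies the update step until two consecutive values differ by less than
   * epsilon or maxIterations is reached. Returns the number of steps executed.
   */
  static int iterate(DoubleUnaryOperator step, double start,
      double epsilon, int maxIterations) {
    double current = start;
    int iterations = 0;
    while (iterations < maxIterations) {
      double next = step.applyAsDouble(current);
      iterations++;
      // stop early once the change between iterations falls below epsilon
      if (Math.abs(next - current) < epsilon) {
        break;
      }
      current = next;
    }
    return iterations;
  }
}
```

        With a step that halves its input, a start of 1.0 and an epsilon of 0.1, the loop stops after the fourth step; with maxIterations set to 2 it stops after the second step regardless of epsilon.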
      • trainSupervised

        public void trainSupervised(de.jungblut.math.DoubleVector[] features,
                                    de.jungblut.math.DoubleVector[] outcome)
        Trains the current model's parameters by executing a forward pass over the given observations (hidden and visible states). Probabilities are +1 smoothed while counting, so that no probability ends up being zero. This method is compatible with the Classifier#train method, so this model can be used as a simple classifier.
        Parameters:
        features - the visible state activations (only the non-zero entries of each vector are traversed, so the actual values do not matter).
        outcome - the outcome that was assigned to the given features. In the binary case this can be a single element vector (0d or 1d); in the multi-class case it is a vector whose index denotes the class (from zero to numHiddenStates, activation is again 0d or 1d). Note that in the multi-class case just a single state can be turned on, so the classes are mutually exclusive.
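        The +1 smoothing described above can be sketched for the transition probabilities as follows. This is an illustration on plain arrays under stated assumptions, not this class's implementation: SmoothedCountSketch and its parameters are hypothetical, and the labelled hidden states are given as indices instead of outcome vectors.

```java
import java.util.Arrays;

final class SmoothedCountSketch {

  /**
   * Hypothetical helper, not part of the documented API.
   * Estimates transition probabilities from a labelled hidden state sequence.
   * Every count starts at 1, so unseen transitions keep a non-zero probability.
   */
  static double[][] transitions(int[] hiddenStates, int numHiddenStates) {
    double[][] counts = new double[numHiddenStates][numHiddenStates];
    for (double[] row : counts) {
      Arrays.fill(row, 1d); // the +1 smoothing
    }
    // count each observed transition from one state to the next
    for (int t = 1; t < hiddenStates.length; t++) {
      counts[hiddenStates[t - 1]][hiddenStates[t]]++;
    }
    // normalize each row so that it sums to one
    for (double[] row : counts) {
      double sum = 0d;
      for (double c : row) {
        sum += c;
      }
      for (int j = 0; j < row.length; j++) {
        row[j] /= sum;
      }
    }
    return counts;
  }
}
```

        Emission and prior probabilities can be counted the same way, with the counts indexed by hidden and visible state or by the first hidden state respectively.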
      • train

        public void train(de.jungblut.math.DoubleVector[] features,
                          de.jungblut.math.DoubleVector[] outcome)
        Description copied from interface: Classifier
        Trains this classifier with the given features and the outcome.
        Specified by:
        train in interface Classifier
        Overrides:
        train in class AbstractClassifier
        Parameters:
        outcome - the outcome must have classes labeled as doubles. E.g. in the binary case you have a single element and decide between 0d and 1d. In higher dimensional cases each of these single elements is mapped to a dimension.
      • predict

        public de.jungblut.math.DoubleVector predict(de.jungblut.math.DoubleVector features)
        Description copied from interface: Predictor
        Classifies the given features.
        Specified by:
        predict in interface Predictor
        Returns:
        the vector that contains an indicator at the index of the class. Usually 0 or 1; in some cases it is a probability or an activation value.
      • predict

        public de.jungblut.math.DoubleVector predict(de.jungblut.math.DoubleVector features,
                                                     de.jungblut.math.DoubleVector previousOutcome)
      • getNumHiddenStates

        public int getNumHiddenStates()
      • getNumVisibleStates

        public int getNumVisibleStates()
      • getEmissionProbabilitiyMatrix

        public de.jungblut.math.DoubleMatrix getEmissionProbabilitiyMatrix()
      • getHiddenPriorProbability

        public de.jungblut.math.DoubleVector getHiddenPriorProbability()
      • getTransitionProbabilityMatrix

        public de.jungblut.math.DoubleMatrix getTransitionProbabilityMatrix()
      • write

        public void write(java.io.DataOutput out)
                   throws java.io.IOException
        Specified by:
        write in interface org.apache.hadoop.io.Writable
        Throws:
        java.io.IOException
      • readFields

        public void readFields(java.io.DataInput in)
                        throws java.io.IOException
        Specified by:
        readFields in interface org.apache.hadoop.io.Writable
        Throws:
        java.io.IOException