Package de.jungblut.nlp
Class MarkovChain
- java.lang.Object
  - de.jungblut.nlp.MarkovChain
public final class MarkovChain extends java.lang.Object
Markov chain that can "learn" the state transition probabilities from a given input and returns the probability of a given sequence of states.
- Author:
- thomas.jungblut
Method Summary
Modifier and Type | Method | Description
double | averageTransitionProbability(int[] sequence) |
int[] | completeStateSequence(com.google.common.base.Optional<java.util.Random> optionalRandom, int[] stateSequence, int... unsuppliedStateIndices) | Completes the given state sequence by picking the best next state on the transition probabilities (so a transition with a high probability is picked more often).
static MarkovChain | create(int numStates) | Creates a new Markov chain with the supplied number of states.
static MarkovChain | create(int numStates, de.jungblut.math.DoubleMatrix mat) | Creates a new Markov chain with the supplied number of states and its predefined transition matrix.
int | getNumStates() |
double | getProbabilityForSequence(int[] stateSequence) | Calculates the probability that the given sequence occurs.
de.jungblut.math.DoubleMatrix | getTransitionProbabilities() |
de.jungblut.math.DoubleVector | getTransitionProbabilities(int[] stateSequence) |
void | train(java.util.stream.Stream<int[]> states) | Trains the transition probabilities of the Markov chain.
Method Detail
-
train
public void train(java.util.stream.Stream<int[]> states)
Trains the transition probabilities of the Markov chain.
Each element of the stream is a sequence of states. The state values are nominal and must be lower than the number of provided states, because each value is used as an index from 0 to numStates-1. A sequence can have arbitrary length: a Markov chain only considers the transition between two states, so training measures the occurrence of each pair of consecutive states. For example, [ 1, 2, 3, 4 ] will measure the probabilities of [1,2], [2,3], [3,4].
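The pair counting described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class name TransitionCounter and the plain double[][] matrix are hypothetical, and whether MarkovChain applies smoothing or stores log probabilities is not stated here. The sketch only counts consecutive state pairs and normalizes each row.

```java
import java.util.List;

// Hypothetical helper, not part of de.jungblut.nlp.
final class TransitionCounter {

  /** Counts consecutive state pairs and normalizes each row to probabilities. */
  static double[][] estimate(int numStates, List<int[]> sequences) {
    double[][] counts = new double[numStates][numStates];
    for (int[] seq : sequences) {
      // A sequence of length n contributes n-1 consecutive pairs.
      for (int i = 0; i + 1 < seq.length; i++) {
        counts[seq[i]][seq[i + 1]]++;
      }
    }
    for (double[] row : counts) {
      double sum = 0;
      for (double c : row) {
        sum += c;
      }
      // Rows for states that were never left stay all zero.
      if (sum > 0) {
        for (int j = 0; j < row.length; j++) {
          row[j] /= sum;
        }
      }
    }
    return counts;
  }
}
```

With the sequences [1, 2, 3, 4] and [1, 3], state 1 is followed once by 2 and once by 3, so both transitions out of state 1 get probability 0.5 in this sketch.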
-
getProbabilityForSequence
public double getProbabilityForSequence(int[] stateSequence)
Calculates the probability that the given sequence occurs.
- Returns:
- a value between 0d and 1d, where a value close to 1d means the sequence is very likely to occur.
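As a rough illustration of what a sequence probability means for a first-order chain, the sketch below multiplies the transition probabilities of consecutive states. This is an assumption about the general technique, not a description of how MarkovChain computes its result: the class SequenceProbability is hypothetical, the matrix is a plain double[][], and the probability of the initial state is ignored.

```java
// Hypothetical helper, not part of de.jungblut.nlp.
final class SequenceProbability {

  /** Product of the transition probabilities between consecutive states. */
  static double of(double[][] transitions, int[] stateSequence) {
    double p = 1.0;
    for (int i = 0; i + 1 < stateSequence.length; i++) {
      p *= transitions[stateSequence[i]][stateSequence[i + 1]];
    }
    return p;
  }
}
```

For long sequences such a product quickly becomes very small, which is why implementations often sum log probabilities instead.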
-
averageTransitionProbability
public double averageTransitionProbability(int[] sequence)
- Returns:
- the average transition probability of the given sequence.
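The source does not say which kind of average is used. Assuming the arithmetic mean over all consecutive state pairs, a minimal sketch could look like the following. The class AverageTransition is hypothetical, and returning 0 for sequences without any transition is a choice made for this sketch only.

```java
// Hypothetical helper, not part of de.jungblut.nlp.
final class AverageTransition {

  /** Arithmetic mean of the transition probabilities along the sequence. */
  static double of(double[][] transitions, int[] sequence) {
    int pairs = sequence.length - 1;
    if (pairs <= 0) {
      return 0.0; // assumption: no transitions means no evidence
    }
    double sum = 0;
    for (int i = 0; i < pairs; i++) {
      sum += transitions[sequence[i]][sequence[i + 1]];
    }
    return sum / pairs;
  }
}
```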
-
getTransitionProbabilities
public de.jungblut.math.DoubleVector getTransitionProbabilities(int[] stateSequence)
- Returns:
- the transition probabilities for the states.
-
completeStateSequence
public int[] completeStateSequence(com.google.common.base.Optional<java.util.Random> optionalRandom, int[] stateSequence, int... unsuppliedStateIndices)
Completes the given state sequence by picking the next state based on the transition probabilities (so a transition with a high probability is picked more often). If the optional random is not provided, it picks the next state with the highest transition probability from the previous state (it predicts the next state based on the previous one).
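The two selection modes described above can be sketched for a single step as follows. This is a simplified illustration, not the library's code: the class NextStatePicker is hypothetical, it works on one row of a plain double[] of transition probabilities, and it uses java.util.Optional instead of Guava's Optional to stay self-contained.

```java
import java.util.Optional;
import java.util.Random;

// Hypothetical helper, not part of de.jungblut.nlp.
final class NextStatePicker {

  /**
   * Picks the next state from one row of transition probabilities.
   * Without a random: the state with the highest probability.
   * With a random: a state sampled in proportion to its probability.
   */
  static int pick(double[] row, Optional<Random> random) {
    if (!random.isPresent()) {
      int best = 0;
      for (int j = 1; j < row.length; j++) {
        if (row[j] > row[best]) {
          best = j;
        }
      }
      return best;
    }
    double r = random.get().nextDouble(); // uniform in [0, 1)
    double cumulative = 0;
    for (int j = 0; j < row.length; j++) {
      cumulative += row[j];
      if (r < cumulative) {
        return j;
      }
    }
    return row.length - 1; // guard against rounding when the row sums to slightly less than 1
  }
}
```

Sampling keeps low-probability transitions reachable, while the deterministic mode always returns the same prediction for the same previous state.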
-
getTransitionProbabilities
public de.jungblut.math.DoubleMatrix getTransitionProbabilities()
- Returns:
- the state transition probability matrix to export/serialize.
-
getNumStates
public int getNumStates()
- Returns:
- the number of states that were defined.
-
create
public static MarkovChain create(int numStates)
Creates a new Markov chain with the supplied number of states.
-
create
public static MarkovChain create(int numStates, de.jungblut.math.DoubleMatrix mat)
Creates a new Markov chain with the supplied number of states and its predefined transition matrix.