object --+
         |
         HiddenMarkovModel
Hidden Markov model class, a generative model for labelling sequence data. These models define the joint probability of a sequence of symbols and their labels (state transitions) as the product of the starting state probability, the probability of each state transition, and the probability of each observation being generated from its state. This is described in more detail in the module documentation.
This implementation is based on the HMM description in Chapter 8 of Huang, Acero and Hon, Spoken Language Processing.
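As a concrete illustration of this factorisation, the following standalone sketch computes the joint probability of a small labelled sequence. The toy weather parameters and the function name are invented for illustration and are not part of this class.

    # A minimal sketch of the joint probability Pr(O, S) described above:
    # Pr(O, S) = Pr(S_0) Pr(O_0 | S_0) * prod_t Pr(S_t | S_{t-1}) Pr(O_t | S_t)
    # The dictionaries below are illustrative toy parameters, not the class's API.

    priors = {"Rainy": 0.6, "Sunny": 0.4}
    trans  = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
              "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
    emit   = {"Rainy": {"walk": 0.1, "shop": 0.9},
              "Sunny": {"walk": 0.6, "shop": 0.4}}

    def joint_probability(symbols, labels):
        """Joint probability of the labelled sequence zip(symbols, labels)."""
        p = priors[labels[0]] * emit[labels[0]][symbols[0]]
        for t in range(1, len(symbols)):
            p *= trans[labels[t - 1]][labels[t]] * emit[labels[t]][symbols[t]]
        return p

    print(joint_probability(["walk", "shop"], ["Sunny", "Rainy"]))
    # 0.4 * 0.6 * 0.4 * 0.9 = 0.0864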
Creates a hidden Markov model parameterised by the states, transition probabilities, output probabilities and priors.
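The exact parameter types are not shown on this page; the following hypothetical container simply mirrors the parameters named above, to make the later sketches concrete. It is not the documented constructor signature.

    # Hypothetical minimal container mirroring the parameters named above.
    class ToyHMM:
        def __init__(self, states, transitions, outputs, priors):
            self.states = states            # list of state names
            self.transitions = transitions  # Pr(next_state | state)
            self.outputs = outputs          # Pr(symbol | state)
            self.priors = priors            # Pr(initial state)

    model = ToyHMM(
        states=["Rainy", "Sunny"],
        transitions={"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
                     "Sunny": {"Rainy": 0.4, "Sunny": 0.6}},
        outputs={"Rainy": {"walk": 0.1, "shop": 0.9},
                 "Sunny": {"walk": 0.6, "shop": 0.4}},
        priors={"Rainy": 0.6, "Sunny": 0.4},
    )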
Returns the probability of the given symbol sequence. If the sequence is labelled, then returns the joint probability of the symbol and state sequences. Otherwise, uses the forward algorithm to find the probability summed over all possible label sequences.
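A minimal sketch of the forward algorithm for the unlabelled case, summing the joint probability over all state sequences by dynamic programming. The toy parameters and names are invented for illustration and are not the class's implementation.

    priors = {"Rainy": 0.6, "Sunny": 0.4}
    trans  = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
              "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
    emit   = {"Rainy": {"walk": 0.1, "shop": 0.9},
              "Sunny": {"walk": 0.6, "shop": 0.4}}
    states = list(priors)

    def forward_probability(symbols):
        """Pr(O) summed over all state sequences, via the forward recursion."""
        alpha = {s: priors[s] * emit[s][symbols[0]] for s in states}
        for o in symbols[1:]:
            alpha = {s: sum(alpha[r] * trans[r][s] for r in states) * emit[s][o]
                     for s in states}
        return sum(alpha.values())

    print(forward_probability(["walk", "shop"]))  # 0.189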
Returns the log-probability of the given symbol sequence. If the sequence is labelled, then returns the joint log-probability of the symbol and state sequences. Otherwise, uses the forward algorithm to find the log of the probability summed over all possible label sequences.
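Probabilities of long sequences underflow quickly, which is presumably why a log-probability variant is provided. In log space the sums inside the forward algorithm become log-sum-exp operations; a minimal sketch of that helper (invented name, not part of the documented class):

    import math

    def logsumexp(log_values):
        """log(sum(exp(v) for v in log_values)), computed stably."""
        m = max(log_values)
        if m == float("-inf"):
            return m
        return m + math.log(sum(math.exp(v - m) for v in log_values))

    # e.g. combining two forward log-probabilities:
    print(logsumexp([math.log(0.06), math.log(0.24)]))  # log(0.30)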
Tags the sequence with the highest probability state sequence. This uses the best_path method to find the Viterbi path.
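For short toy sequences, the tagging behaviour can be checked by brute force: score every possible state sequence with the joint probability and keep the best. The names and parameters below are invented for illustration; the documented method instead uses the Viterbi path described further down.

    from itertools import product

    priors = {"Rainy": 0.6, "Sunny": 0.4}
    trans  = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
              "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
    emit   = {"Rainy": {"walk": 0.1, "shop": 0.9},
              "Sunny": {"walk": 0.6, "shop": 0.4}}
    states = list(priors)

    def joint(symbols, labels):
        p = priors[labels[0]] * emit[labels[0]][symbols[0]]
        for t in range(1, len(symbols)):
            p *= trans[labels[t - 1]][labels[t]] * emit[labels[t]][symbols[t]]
        return p

    def brute_force_tag(symbols):
        best = max(product(states, repeat=len(symbols)),
                   key=lambda labels: joint(symbols, labels))
        return list(zip(symbols, best))

    print(brute_force_tag(["walk", "shop"]))
    # [('walk', 'Sunny'), ('shop', 'Rainy')]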
Returns the state sequence of the optimal (most probable) path through the HMM. Uses the Viterbi algorithm to calculate this path by dynamic programming.
Returns the state sequence of the optimal (most probable) path through the HMM. Uses the Viterbi algorithm to calculate this path by dynamic programming. This uses a simple, direct method, and is included for teaching purposes.
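Both entries above describe the Viterbi computation. The following is a minimal log-space sketch with toy parameters and invented names, not the class's implementation.

    import math

    priors = {"Rainy": 0.6, "Sunny": 0.4}
    trans  = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
              "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
    emit   = {"Rainy": {"walk": 0.1, "shop": 0.9},
              "Sunny": {"walk": 0.6, "shop": 0.4}}
    states = list(priors)

    def viterbi(symbols):
        """Most probable state sequence for the given symbol sequence."""
        # delta[s] = best log-probability of any path ending in state s;
        # back[t][s] = the predecessor state that achieved it.
        delta = {s: math.log(priors[s]) + math.log(emit[s][symbols[0]])
                 for s in states}
        back = []
        for o in symbols[1:]:
            prev, delta, pointers = delta, {}, {}
            for s in states:
                r = max(states, key=lambda r: prev[r] + math.log(trans[r][s]))
                delta[s] = prev[r] + math.log(trans[r][s]) + math.log(emit[s][o])
                pointers[s] = r
            back.append(pointers)
        # Trace back from the best final state.
        last = max(states, key=lambda s: delta[s])
        path = [last]
        for pointers in reversed(back):
            path.append(pointers[path[-1]])
        return list(reversed(path))

    print(viterbi(["walk", "shop"]))  # ['Sunny', 'Rainy']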
Randomly sample the HMM to generate a sentence of a given length. This samples the prior distribution, then the observation and transition distributions for each subsequent observation and state. This will mostly generate unintelligible garbage, but can provide some amusement.
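A minimal sampling sketch under the same toy parameters, assuming a random number generator and a length are supplied; the names are invented and this is not the documented method's signature.

    import random

    priors = {"Rainy": 0.6, "Sunny": 0.4}
    trans  = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
              "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
    emit   = {"Rainy": {"walk": 0.1, "shop": 0.9},
              "Sunny": {"walk": 0.6, "shop": 0.4}}

    def sample_one(dist, rng):
        """Draw one key from a {outcome: probability} dictionary."""
        return rng.choices(list(dist), weights=list(dist.values()))[0]

    def random_sample(rng, length):
        """Sample a (symbol, state) sequence of the given length."""
        state = sample_one(priors, rng)
        sequence = []
        for _ in range(length):
            sequence.append((sample_one(emit[state], rng), state))
            state = sample_one(trans[state], rng)
        return sequence

    print(random_sample(random.Random(0), 5))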
Returns the entropy over labellings of the given sequence. This is given by:

    H(O) = - sum_S Pr(S | O) log Pr(S | O)

where the summation ranges over all state sequences S. Let Z = Pr(O) = sum_S Pr(S, O), where the summation again ranges over all state sequences and O is the observation sequence. The entropy can then be re-expressed as:

    H = - sum_S Pr(S | O) log [ Pr(S, O) / Z ]
      = log Z - sum_S Pr(S | O) log Pr(S, O)
      = log Z - sum_S Pr(S | O) [ log Pr(S_0) + sum_t log Pr(S_t | S_{t-1}) + sum_t log Pr(O_t | S_t) ]

The order of summation for the log terms can be flipped, allowing dynamic programming to be used to calculate the entropy. Specifically, using the forward and backward probabilities (alpha, beta) gives:

    H = log Z - sum_{s0}      alpha_0(s0) beta_0(s0) / Z * log Pr(s0)
              - sum_{t,si,sj} alpha_t(si) Pr(sj | si) Pr(O_{t+1} | sj) beta_{t+1}(sj) / Z * log Pr(sj | si)
              - sum_{t,st}    alpha_t(st) beta_t(st) / Z * log Pr(O_t | st)

This simply uses alpha and beta to find the probabilities of partial sequences, constrained to include the given state(s) at some point in time.
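For small examples, the identity above can be checked by brute force, computing H(O) directly from the definition by enumerating all state sequences; the point of the dynamic-programming form is to avoid this exponential enumeration. The names and parameters below are illustrative only.

    import math
    from itertools import product

    priors = {"Rainy": 0.6, "Sunny": 0.4}
    trans  = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
              "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
    emit   = {"Rainy": {"walk": 0.1, "shop": 0.9},
              "Sunny": {"walk": 0.6, "shop": 0.4}}
    states = list(priors)

    def joint(symbols, labels):
        p = priors[labels[0]] * emit[labels[0]][symbols[0]]
        for t in range(1, len(symbols)):
            p *= trans[labels[t - 1]][labels[t]] * emit[labels[t]][symbols[t]]
        return p

    def brute_force_entropy(symbols):
        """H(O) = -sum_S Pr(S | O) log Pr(S | O), by enumerating all S."""
        joints = {labels: joint(symbols, labels)
                  for labels in product(states, repeat=len(symbols))}
        z = sum(joints.values())                 # Z = Pr(O)
        return -sum((p / z) * math.log(p / z) for p in joints.values() if p > 0)

    print(brute_force_entropy(["walk", "shop"]))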
Return the forward probability matrix, a T by N array of log-probabilities, where T is the length of the sequence and N is the number of states. Each entry (t, s) gives the log-probability of being in state s at time t after observing the partial symbol sequence up to and including time t.
Return the backward probability matrix, a T by N array of log-probabilities, where T is the length of the sequence and N is the number of states. Each entry (t, s) gives the log-probability of being in state s at time t after observing the partial symbol sequence from t to T.
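A minimal sketch of the usual textbook backward recursion in probability space (the documented method works with log values, and its exact indexing convention is not shown on this page); the parameters and names are illustrative only.

    priors = {"Rainy": 0.6, "Sunny": 0.4}
    trans  = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
              "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
    emit   = {"Rainy": {"walk": 0.1, "shop": 0.9},
              "Sunny": {"walk": 0.6, "shop": 0.4}}
    states = list(priors)

    def backward_matrix(symbols):
        """beta[t][s] = Pr(symbols after time t | state s at time t)."""
        T = len(symbols)
        beta = [{s: 1.0 for s in states}]        # base case at t = T - 1
        for t in range(T - 2, -1, -1):
            nxt = beta[0]
            beta.insert(0, {s: sum(trans[s][r] * emit[r][symbols[t + 1]] * nxt[r]
                                   for r in states)
                            for s in states})
        return beta

    for row in backward_matrix(["walk", "shop"]):
        print(row)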
repr(x)