python-biopython  1.60
Bio.HMM.MarkovModel.HiddenMarkovModel Class Reference

Public Member Functions

def __init__
def get_blank_transitions
def get_blank_emissions
def transitions_from
def transitions_to
def viterbi

Public Attributes

 initial_prob
 transition_prob
 emission_prob

Private Member Functions

def _log_transform

Private Attributes

 _transition_pseudo
 _emission_pseudo
 _transitions_from
 _transitions_to

Detailed Description

Represent a hidden Markov model that can be used for state estimation.

Definition at line 435 of file MarkovModel.py.


Constructor & Destructor Documentation

def Bio.HMM.MarkovModel.HiddenMarkovModel.__init__ (   self,
  initial_prob,
  transition_prob,
  emission_prob,
  transition_pseudo,
  emission_pseudo 
)
Initialize a Markov Model.

Note: You should use the MarkovModelBuilder class instead of
instantiating this class directly.

Arguments:

o initial_prob -- A dictionary of initial probabilities for all states.

o transition_prob -- A dictionary of transition probabilities for all
possible transitions in the sequence.

o emission_prob -- A dictionary of emission probabilities for all
possible emissions from the sequence states.

o transition_pseudo -- Pseudo-counts to be used for the transitions,
when counting for purposes of estimating transition probabilities.

o emission_pseudo -- Pseudo-counts to be used for the emissions,
when counting for purposes of estimating emission probabilities.

Definition at line 439 of file MarkovModel.py.

00439     def __init__(self, initial_prob, transition_prob, emission_prob,
00440                  transition_pseudo, emission_pseudo):
00441         """Initialize a Markov Model.
00442 
00443         Note: You should use the MarkovModelBuilder class instead of
00444         instantiating this class directly.
00445 
00446         Arguments:
00447 
00448         o initial_prob - A dictionary of initial probabilities for all states.
00449 
00450         o transition_prob -- A dictionary of transition probabilities for all
00451         possible transitions in the sequence.
00452 
00453         o emission_prob -- A dictionary of emission probabilities for all
00454         possible emissions from the sequence states.
00455 
00456         o transition_pseudo -- Pseudo-counts to be used for the transitions,
00457         when counting for purposes of estimating transition probabilities.
00458 
00459         o emission_pseudo -- Pseudo-counts to be used for the emissions,
00460         when counting for purposes of estimating emission probabilities.
00461         """
00462 
00463         self.initial_prob = initial_prob
00464 
00465         self._transition_pseudo = transition_pseudo
00466         self._emission_pseudo = emission_pseudo
00467         
00468         self.transition_prob = transition_prob
00469         self.emission_prob = emission_prob
00470 
00471         # a dictionary of the possible transitions from each state
00472         # each key is a source state, mapped to a list of the destination states
00473         # that are reachable from the source state via a transition
00474         self._transitions_from = \
00475            _calculate_from_transitions(self.transition_prob)
00476 
00477         # a dictionary of the possible transitions to each state
00478         # each key is a destination state, mapped to a list of source states
00479         # from which the destination is reachable via a transition
00480         self._transitions_to = \
00481            _calculate_to_transitions(self.transition_prob)
00482 

Member Function Documentation

def Bio.HMM.MarkovModel.HiddenMarkovModel._log_transform (   self,
  probability 
) [private]
Return log transform of the given probability dictionary.

When calculating the Viterbi equation, add logs of probabilities rather
than multiplying probabilities, to avoid underflow errors. This method
returns a new dictionary with the same keys as the given dictionary
and log-transformed values.

Definition at line 635 of file MarkovModel.py.

00635 
00636     def _log_transform(self, probability):
00637         """Return log transform of the given probability dictionary.
00638 
00639         When calculating the Viterbi equation, add logs of probabilities rather
00640         than multiplying probabilities, to avoid underflow errors. This method
00641         returns a new dictionary with the same keys as the given dictionary
00642         and log-transformed values.
00643         """
00644         log_prob = copy.copy(probability)
00645         try:
00646             neg_inf = float("-inf")
00647         except ValueError:
00648             #On Python 2.5 or older that was handled in C code,
00649             #and failed on Windows XP 32bit
00650             neg_inf = - 1E400
00651         for key in log_prob:
00652             prob = log_prob[key]
00653             if prob > 0:
00654                 log_prob[key] = math.log(log_prob[key])
00655             else:
00656                 log_prob[key] = neg_inf
00657 
00658         return log_prob
00659     
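The transform itself is a simple per-entry mapping; a self-contained sketch of the same idea (not the Biopython method, which also carries the pre-Python-2.6 `-inf` workaround shown above):

```python
import math

# Sketch: log-transform a probability dictionary so that probabilities
# along a path can be summed instead of multiplied, avoiding underflow.
# Zero probabilities map to -infinity, so impossible events stay
# impossible after the transform.
def log_transform(probability):
    neg_inf = float("-inf")
    return {key: (math.log(value) if value > 0 else neg_inf)
            for key, value in probability.items()}
```

Summing the transformed values is then equivalent to multiplying the originals: `log_transform({"a": 0.5})["a"] + log_transform({"b": 0.5})["b"]` equals `math.log(0.25)` up to floating-point error.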

def Bio.HMM.MarkovModel.HiddenMarkovModel.get_blank_emissions (   self )
Get the starting default emissions for each sequence.

This returns a dictionary of the default emissions for each
letter. The dictionary is structured with keys as
(seq_letter, emission_letter) and values as the starting number
of emissions.

Definition at line 493 of file MarkovModel.py.

00493 
00494     def get_blank_emissions(self):
00495         """Get the starting default emmissions for each sequence.
00496         
00497         This returns a dictionary of the default emmissions for each
00498         letter. The dictionary is structured with keys as
00499         (seq_letter, emmission_letter) and values as the starting number
00500         of emmissions.
00501         """
00502         return self._emission_pseudo

def Bio.HMM.MarkovModel.HiddenMarkovModel.get_blank_transitions (   self )
Get the default transitions for the model.

Returns a dictionary of all of the default transitions between any
two letters in the sequence alphabet. The dictionary is structured
with keys as (letter1, letter2) and values as the starting number
of transitions.

Definition at line 483 of file MarkovModel.py.

00483 
00484     def get_blank_transitions(self):
00485         """Get the default transitions for the model.
00486 
00487         Returns a dictionary of all of the default transitions between any
00488         two letters in the sequence alphabet. The dictionary is structured
00489         with keys as (letter1, letter2) and values as the starting number
00490         of transitions.
00491         """
00492         return self._transition_pseudo
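Both getters simply hand back the pseudo-count dictionaries given to the constructor. For a hypothetical two-state model with emission letters "x" and "y" (made-up values, not from Biopython's test data), those dictionaries might look like:

```python
# Hypothetical pseudo-count dictionaries for states "A"/"B" and
# emission letters "x"/"y". Starting every count at 1 ensures that no
# transition or emission is estimated as having zero probability
# during training.
blank_transitions = {("A", "A"): 1, ("A", "B"): 1,
                     ("B", "A"): 1, ("B", "B"): 1}
blank_emissions = {("A", "x"): 1, ("A", "y"): 1,
                   ("B", "x"): 1, ("B", "y"): 1}
```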

def Bio.HMM.MarkovModel.HiddenMarkovModel.transitions_from (   self,
  state_letter 
)
Get all destination states to which there are transitions from the
state_letter source state.

This returns all letters which the given state_letter can transition
to. An empty list is returned if state_letter has no outgoing
transitions.

Definition at line 503 of file MarkovModel.py.

00503 
00504     def transitions_from(self, state_letter):
00505         """Get all destination states to which there are transitions from the
00506         state_letter source state.
00507 
00508         This returns all letters which the given state_letter can transition
00509         to. An empty list is returned if state_letter has no outgoing
00510         transitions.
00511         """
00512         if state_letter in self._transitions_from:
00513             return self._transitions_from[state_letter]
00514         else:
00515             return []

def Bio.HMM.MarkovModel.HiddenMarkovModel.transitions_to (   self,
  state_letter 
)
Get all source states from which there are transitions to the
state_letter destination state.

This returns all letters which the given state_letter is reachable
from. An empty list is returned if state_letter is unreachable.

Definition at line 516 of file MarkovModel.py.

00516 
00517     def transitions_to(self, state_letter):
00518         """Get all source states from which there are transitions to the
00519         state_letter destination state.
00520 
00521         This returns all letters which the given state_letter is reachable
00522         from. An empty list is returned if state_letter is unreachable.
00523         """
00524         if state_letter in self._transitions_to:
00525             return self._transitions_to[state_letter]
00526         else:
00527             return []
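Both lookups reduce to a dictionary query with an empty-list fallback; a standalone sketch of that pattern, written against a plain dict in place of the private `_transitions_from` map:

```python
# Sketch: the membership-test-with-empty-fallback used by
# transitions_from and transitions_to.
def transitions_from(from_map, state_letter):
    if state_letter in from_map:
        return from_map[state_letter]
    return []

# "A" can reach "A" and "B"; "B" has no outgoing transitions recorded
from_map = {"A": ["A", "B"]}
```

The empty-list fallback is what lets the Viterbi recursion iterate over predecessors without special-casing unreachable or absorbing states.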

def Bio.HMM.MarkovModel.HiddenMarkovModel.viterbi (   self,
  sequence,
  state_alphabet 
)
Calculate the most probable state path using the Viterbi algorithm.

This implements the Viterbi algorithm (see pgs 55-57 in Durbin et
al for a full explanation -- this is where I took my implementation
ideas from), to allow decoding of the state path, given a sequence
of emissions.

Arguments:

o sequence -- A Seq object with the emission sequence that we
want to decode.

o state_alphabet -- The alphabet of the possible state sequences
that can be generated.

Definition at line 528 of file MarkovModel.py.

00528 
00529     def viterbi(self, sequence, state_alphabet):
00530         """Calculate the most probable state path using the Viterbi algorithm.
00531 
00532         This implements the Viterbi algorithm (see pgs 55-57 in Durbin et
00533         al for a full explanation -- this is where I took my implementation
00534         ideas from), to allow decoding of the state path, given a sequence
00535         of emissions.
00536 
00537         Arguments:
00538 
00539         o sequence -- A Seq object with the emission sequence that we
00540         want to decode.
00541 
00542         o state_alphabet -- The alphabet of the possible state sequences
00543         that can be generated.
00544         """
00545 
00546         # calculate logarithms of the initial, transition, and emission probs
00547         log_initial = self._log_transform(self.initial_prob)
00548         log_trans = self._log_transform(self.transition_prob)
00549         log_emission = self._log_transform(self.emission_prob)
00550 
00551         viterbi_probs = {}
00552         pred_state_seq = {}
00553         state_letters = state_alphabet.letters
00554 
00555         # --- recursion
00556         # loop over the training sequence (i = 1 .. L)
00557         # NOTE: My index numbers are one less than what is given in Durbin
00558         # et al, since we are indexing the sequence going from 0 to
00559         # (Length - 1) not 1 to Length, like in Durbin et al.
00560         for i in range(0, len(sequence)):
00561             # loop over all of the possible i-th states in the state path
00562             for cur_state in state_letters:
00563                 # e_{l}(x_{i})
00564                 emission_part = log_emission[(cur_state, sequence[i])]
00565 
00566                 max_prob = 0
00567                 if i == 0:
00568                     # for the first state, use the initial probability rather
00569                     # than looking back to previous states
00570                     max_prob = log_initial[cur_state]
00571                 else:
00572                     # loop over all possible (i-1)-th previous states
00573                     possible_state_probs = {}
00574                     for prev_state in self.transitions_to(cur_state):
00575                         # a_{kl}
00576                         trans_part = log_trans[(prev_state, cur_state)]
00577 
00578                         # v_{k}(i - 1)
00579                         viterbi_part = viterbi_probs[(prev_state, i - 1)]
00580                         cur_prob = viterbi_part + trans_part
00581 
00582                         possible_state_probs[prev_state] = cur_prob
00583 
00584                     # calculate the viterbi probability using the max
00585                     max_prob = max(possible_state_probs.values())
00586 
00587                 # v_{k}(i)
00588                 viterbi_probs[(cur_state, i)] = (emission_part + max_prob)
00589 
00590                 if i > 0:
00591                     # get the most likely prev_state leading to cur_state
00592                     for state in possible_state_probs:
00593                         if possible_state_probs[state] == max_prob:
00594                             pred_state_seq[(i - 1, cur_state)] = state
00595                             break
00596                     
00597         # --- termination
00598         # calculate the probability of the state path
00599         # loop over all states
00600         all_probs = {}
00601         for state in state_letters:
00602             # v_{k}(L)
00603             all_probs[state] = viterbi_probs[(state, len(sequence) - 1)]
00604 
00605         state_path_prob = max(all_probs.values())
00606 
00607         # find the last pointer we need to trace back from
00608         last_state = ''
00609         for state in all_probs:
00610             if all_probs[state] == state_path_prob:
00611                 last_state = state
00612 
00613         assert last_state != '', "Didn't find the last state to trace from!"
00614                 
00615         # --- traceback
00616         traceback_seq = MutableSeq('', state_alphabet)
00617         
00618         loop_seq = range(1, len(sequence))
00619         loop_seq.reverse()
00620 
00621         # last_state is the last state in the most probable state sequence.
00622         # Compute that sequence by walking backwards in time. From the i-th
00623         # state in the sequence, find the (i-1)-th state as the most
00624         # probable state preceding the i-th state.
00625         state = last_state
00626         traceback_seq.append(state)
00627         for i in loop_seq:
00628             state = pred_state_seq[(i - 1, state)]
00629             traceback_seq.append(state)
00630 
00631         # put the traceback sequence in the proper orientation
00632         traceback_seq.reverse()
00633 
00634         return traceback_seq.toseq(), state_path_prob
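The recursion, termination, and traceback steps above can be condensed into a self-contained sketch that works on plain strings and dicts rather than Biopython's Seq and alphabet objects. The weather-style toy model in the usage note is an assumption for illustration; tie-breaking is simplified to whatever `max` picks first:

```python
import math

def _log(p):
    # log of a probability, with log(0) mapped to -infinity
    return math.log(p) if p > 0 else float("-inf")

def viterbi(sequence, states, initial, transitions, emissions):
    """Sketch of Viterbi decoding in log space (Durbin et al., pp. 55-57)."""
    v = {}     # v[(state, i)]: best log-probability of a path ending in state at position i
    back = {}  # back[(i, state)]: most probable predecessor of state at position i
    for i, symbol in enumerate(sequence):
        for state in states:
            e = _log(emissions[(state, symbol)])          # e_{l}(x_{i})
            if i == 0:
                # first position: use the initial probabilities
                v[(state, i)] = _log(initial[state]) + e
            else:
                # maximize v_{k}(i-1) + log a_{kl} over previous states k
                best_prev = max(states,
                                key=lambda k: v[(k, i - 1)] + _log(transitions[(k, state)]))
                v[(state, i)] = (v[(best_prev, i - 1)]
                                 + _log(transitions[(best_prev, state)]) + e)
                back[(i, state)] = best_prev
    # termination: best final state, then walk the back-pointers
    last = max(states, key=lambda k: v[(k, len(sequence) - 1)])
    path = [last]
    for i in range(len(sequence) - 1, 0, -1):
        path.append(back[(i, path[-1])])
    path.reverse()
    return "".join(path), v[(last, len(sequence) - 1)]
```

For example, with states rainy/sunny ("R"/"S"), observations walk/umbrella ("w"/"u"), and toy probabilities `initial = {"R": 0.6, "S": 0.4}`, `transitions = {("R","R"): 0.7, ("R","S"): 0.3, ("S","R"): 0.4, ("S","S"): 0.6}`, `emissions = {("R","w"): 0.1, ("R","u"): 0.9, ("S","w"): 0.8, ("S","u"): 0.2}`, decoding "wwu" yields the state path "SSR".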


Member Data Documentation

Bio.HMM.MarkovModel.HiddenMarkovModel._emission_pseudo [private]

Definition at line 465 of file MarkovModel.py.

Bio.HMM.MarkovModel.HiddenMarkovModel._transition_pseudo [private]

Definition at line 464 of file MarkovModel.py.

Bio.HMM.MarkovModel.HiddenMarkovModel._transitions_from [private]

Definition at line 473 of file MarkovModel.py.

Bio.HMM.MarkovModel.HiddenMarkovModel._transitions_to [private]

Definition at line 479 of file MarkovModel.py.

Bio.HMM.MarkovModel.HiddenMarkovModel.emission_prob

Definition at line 468 of file MarkovModel.py.

Bio.HMM.MarkovModel.HiddenMarkovModel.initial_prob

Definition at line 462 of file MarkovModel.py.

Bio.HMM.MarkovModel.HiddenMarkovModel.transition_prob

Definition at line 467 of file MarkovModel.py.


The documentation for this class was generated from the following file:

MarkovModel.py