
python-biopython  1.60
Bio.HMM.DynamicProgramming.ScaledDPAlgorithms Class Reference

Public Member Functions

def __init__
def forward_algorithm
def backward_algorithm

Private Member Functions

def _calculate_s_value
def _forward_recursion
def _backward_recursion

Private Attributes

 _s_values

Detailed Description

Implement forward and backward algorithms using a rescaling approach.

This scales the f and b variables, so that they remain within a
manageable numerical interval during calculations. This approach is
described in Durbin et al. on p 78.

This approach is a little more straightforward than log transformation
but may still give underflow errors for some types of models. In these
cases, the LogDPAlgorithms class should be used.

Definition at line 157 of file DynamicProgramming.py.
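The underflow problem this class addresses is easy to reproduce: a naive product of many small per-position probabilities reaches 0.0 in double precision, while rescaling at every step keeps the running value in range. A minimal sketch, independent of Biopython (the scale bookkeeping via `log_scale` is a hypothetical stand-in for the class's `_s_values` dictionary):

```python
import math

# toy illustration (nothing Biopython-specific): multiplying many small
# per-position probabilities underflows, while rescaling at every step
# keeps the running value in a safe numerical range
probs = [1e-5] * 100   # hypothetical per-position probabilities

naive = 1.0
for p in probs:
    naive *= p         # reaches 0.0 well before the loop ends

running = 1.0
log_scale = 0.0        # accumulate log(s_i) instead of the raw product
for p in probs:
    running *= p
    s = running        # choose s_i so the rescaled value becomes 1.0
    log_scale += math.log(s)
    running /= s

print(naive)           # 0.0 -- the naive product underflowed
print(log_scale)       # log P recovered from the accumulated scale factors
```

The true value here is 1e-500, far below the smallest representable double, so only the rescaled computation survives.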


Constructor & Destructor Documentation

def Bio.HMM.DynamicProgramming.ScaledDPAlgorithms.__init__ (   self,
  markov_model,
  sequence 
)
Initialize the scaled approach to calculating probabilities.
Arguments:

o markov_model -- The current Markov model we are working with.

o sequence -- A TrainingSequence object that must have a
set of emissions to work with.

Reimplemented from Bio.HMM.DynamicProgramming.AbstractDPAlgorithms.

Definition at line 168 of file DynamicProgramming.py.

    def __init__(self, markov_model, sequence):
        """Initialize the scaled approach to calculating probabilities.

        Arguments:

        o markov_model -- The current Markov model we are working with.

        o sequence -- A TrainingSequence object that must have a
        set of emissions to work with.
        """
        AbstractDPAlgorithms.__init__(self, markov_model, sequence)

        self._s_values = {}



Member Function Documentation

def Bio.HMM.DynamicProgramming.ScaledDPAlgorithms._backward_recursion (   self,
  cur_state,
  sequence_pos,
  backward_vars 
) [private]
Calculate the value of the backward recursion.

Arguments:

o cur_state -- The letter of the state we are calculating the
backward variable for.

o sequence_pos -- The position we are at in the training seq.

o backward_vars -- The current set of backward variables

Reimplemented from Bio.HMM.DynamicProgramming.AbstractDPAlgorithms.

Definition at line 270 of file DynamicProgramming.py.

    def _backward_recursion(self, cur_state, sequence_pos, backward_vars):
        """Calculate the value of the backward recursion.

        Arguments:

        o cur_state -- The letter of the state we are calculating the
        backward variable for.

        o sequence_pos -- The position we are at in the training seq.

        o backward_vars -- The current set of backward variables
        """
        # calculate the s value, if we haven't done so already (ie. during
        # a previous forward or backward recursion)
        if sequence_pos not in self._s_values:
            self._s_values[sequence_pos] = \
              self._calculate_s_value(sequence_pos, backward_vars)

        # loop over all of the possible states at the position
        state_pos_sum = 0
        have_transition = 0
        for second_state in self._mm.transitions_from(cur_state):
            have_transition = 1
            # e_{l}(x_{i + 1}) -- the emission belongs to the next state
            # (second_state), matching the recursion in Durbin et al
            seq_letter = self._seq.emissions[sequence_pos + 1]
            cur_emission_prob = self._mm.emission_prob[(second_state,
                                                        seq_letter)]

            # get the previous backward_var value
            # b_{l}(i + 1)
            prev_backward = backward_vars[(second_state, sequence_pos + 1)]

            # the transition probability -- a_{kl}
            cur_transition_prob = self._mm.transition_prob[(cur_state,
                                                            second_state)]

            state_pos_sum += (cur_emission_prob * prev_backward *
                              cur_transition_prob)

        # if we have a probability for a transition, return it
        if have_transition:
            return (state_pos_sum / float(self._s_values[sequence_pos]))
        # otherwise we have no probability (ie. we can't do this transition)
        # and return None
        else:
            return None


def Bio.HMM.DynamicProgramming.ScaledDPAlgorithms._calculate_s_value (   self,
  seq_pos,
  previous_vars 
) [private]
Calculate the next scaling variable for a sequence position.

This utilizes the approach of choosing s values such that the
sum of all of the scaled f values is equal to 1.

Arguments:

o seq_pos -- The current position we are at in the sequence.

o previous_vars -- All of the forward or backward variables
calculated so far.

Returns:

o The calculated scaling variable for the sequence item.

Definition at line 181 of file DynamicProgramming.py.

    def _calculate_s_value(self, seq_pos, previous_vars):
        """Calculate the next scaling variable for a sequence position.

        This utilizes the approach of choosing s values such that the
        sum of all of the scaled f values is equal to 1.

        Arguments:

        o seq_pos -- The current position we are at in the sequence.

        o previous_vars -- All of the forward or backward variables
        calculated so far.

        Returns:

        o The calculated scaling variable for the sequence item.
        """
        # all of the different letters the state can have
        state_letters = self._seq.states.alphabet.letters

        # loop over all of the possible states
        s_value = 0
        for main_state in state_letters:
            emission = self._mm.emission_prob[(main_state,
                                               self._seq.emissions[seq_pos])]

            # now sum over all of the previous vars and transitions
            trans_and_var_sum = 0
            for second_state in self._mm.transitions_from(main_state):
                # the value of the previous f or b value
                var_value = previous_vars[(second_state, seq_pos - 1)]

                # the transition probability
                trans_value = self._mm.transition_prob[(second_state,
                                                        main_state)]

                trans_and_var_sum += (var_value * trans_value)

            s_value += (emission * trans_and_var_sum)

        return s_value
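The loop above computes s_i = sum over states l of e_l(x_i) * sum over k of f_k(i-1) * a_{kl}. A hand-checkable illustration of the same double sum, using hypothetical toy numbers (plain dicts, not the Biopython model classes):

```python
# hypothetical toy numbers (not from Biopython) for
#   s_i = sum_l e_l(x_i) * sum_k f_k(i-1) * a_{kl}
emission = {("A", "x"): 0.9, ("B", "x"): 0.2}      # e_l(x_i)
transition = {("A", "A"): 0.7, ("A", "B"): 0.3,    # a_{kl}
              ("B", "A"): 0.4, ("B", "B"): 0.6}
prev_vars = {"A": 0.5, "B": 0.5}                   # f_k(i-1)

s_value = 0.0
for main_state in ("A", "B"):                      # the l index
    trans_and_var_sum = sum(prev_vars[k] * transition[(k, main_state)]
                            for k in ("A", "B"))
    s_value += emission[(main_state, "x")] * trans_and_var_sum

print(s_value)  # 0.9 * 0.55 + 0.2 * 0.45 = 0.585
```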


def Bio.HMM.DynamicProgramming.ScaledDPAlgorithms._forward_recursion (   self,
  cur_state,
  sequence_pos,
  forward_vars 
) [private]
Calculate the value of the forward recursion.

Arguments:

o cur_state -- The letter of the state we are calculating the
forward variable for.

o sequence_pos -- The position we are at in the training seq.

o forward_vars -- The current set of forward variables

Definition at line 223 of file DynamicProgramming.py.

    def _forward_recursion(self, cur_state, sequence_pos, forward_vars):
        """Calculate the value of the forward recursion.

        Arguments:

        o cur_state -- The letter of the state we are calculating the
        forward variable for.

        o sequence_pos -- The position we are at in the training seq.

        o forward_vars -- The current set of forward variables
        """
        # calculate the s value, if we haven't done so already (ie. during
        # a previous forward or backward recursion)
        if sequence_pos not in self._s_values:
            self._s_values[sequence_pos] = \
              self._calculate_s_value(sequence_pos, forward_vars)

        # e_{l}(x_{i})
        seq_letter = self._seq.emissions[sequence_pos]
        cur_emission_prob = self._mm.emission_prob[(cur_state, seq_letter)]
        # divide by the scaling value
        scale_emission_prob = (float(cur_emission_prob) /
                               float(self._s_values[sequence_pos]))

        # loop over all of the possible states at the position
        state_pos_sum = 0
        have_transition = 0
        for second_state in self._mm.transitions_from(cur_state):
            have_transition = 1

            # get the previous forward_var values
            # f_{k}(i - 1)
            prev_forward = forward_vars[(second_state, sequence_pos - 1)]

            # a_{kl}
            cur_trans_prob = self._mm.transition_prob[(second_state,
                                                       cur_state)]
            state_pos_sum += prev_forward * cur_trans_prob

        # if we have the possibility of having a transition
        # return the recursion value
        if have_transition:
            return (scale_emission_prob * state_pos_sum)
        else:
            return None
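The point of dividing by the scaling value in this recursion is that, when s_i is the total unscaled probability mass at position i (as `_calculate_s_value` computes it), the scaled forward values at that position sum to 1. A standalone sketch with hypothetical toy numbers (plain dicts, not the Biopython model classes):

```python
# standalone sketch with hypothetical toy numbers (plain dicts,
# not the Biopython model classes)
states = ("A", "B")
emission = {("A", "x"): 0.9, ("B", "x"): 0.2}
transition = {("A", "A"): 0.7, ("A", "B"): 0.3,
              ("B", "A"): 0.4, ("B", "B"): 0.6}
prev_forward = {"A": 0.5, "B": 0.5}   # scaled f_k(i-1), already sums to 1

# s_i as in _calculate_s_value: the total unscaled mass at position i
s_i = sum(emission[(l, "x")] *
          sum(prev_forward[k] * transition[(k, l)] for k in states)
          for l in states)

# scaled recursion: f_l(i) = e_l(x_i) / s_i * sum_k f_k(i-1) * a_{kl}
new_forward = {l: emission[(l, "x")] / s_i *
                  sum(prev_forward[k] * transition[(k, l)] for k in states)
               for l in states}

print(sum(new_forward.values()))  # the scaled values sum to 1
```

This normalization is exactly what keeps the forward variables in a manageable interval for long sequences.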


def Bio.HMM.DynamicProgramming.ScaledDPAlgorithms.backward_algorithm (   self )

Calculate sequence probability using the backward algorithm.

This implements the backward algorithm, as described on p58-59 of
Durbin et al.

Returns:

o A dictionary containing the backward variables. This has keys
of the form (state letter, position in the training sequence),
and values containing the calculated backward variable.

Definition at line 107 of file DynamicProgramming.py.

    def backward_algorithm(self):
        """Calculate sequence probability using the backward algorithm.

        This implements the backward algorithm, as described on p58-59 of
        Durbin et al.

        Returns:

        o A dictionary containing the backward variables. This has keys
        of the form (state letter, position in the training sequence),
        and values containing the calculated backward variable.
        """
        # all of the different letters that the state path can be in
        state_letters = self._seq.states.alphabet.letters

        # -- initialize the algorithm
        #
        # NOTE: My index numbers are one less than what is given in Durbin
        # et al, since we are indexing the sequence going from 0 to
        # (Length - 1) not 1 to Length, like in Durbin et al.
        #
        backward_var = {}

        first_letter = state_letters[0]
        # b_{k}(L) = a_{k0} for all k
        for state in state_letters:
            backward_var[(state, len(self._seq.emissions) - 1)] = \
              self._mm.transition_prob[(state, first_letter)]

        # -- recursion
        # first loop over the training sequence backwards
        # Recursion step: (i = L - 1 ... 1)
        all_indexes = range(len(self._seq.emissions) - 1)
        all_indexes.reverse()
        for i in all_indexes:
            # now loop over the letters in the state path
            for main_state in state_letters:
                # calculate the backward value using the appropriate
                # method to prevent underflow errors
                backward_value = self._backward_recursion(main_state, i,
                                                          backward_var)

                if backward_value is not None:
                    backward_var[(main_state, i)] = backward_value

        # skip the termination step to avoid recalculations -- you should
        # get sequence probabilities using the forward algorithm

        return backward_var
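For intuition, the backward recursion run to the front of the sequence recovers the same sequence probability as a forward pass. The standalone sketch below uses a simplified convention (b_k at the last position set to 1, an explicit initial distribution, no scaling and no a_{k0} end state, unlike this class); all model numbers are hypothetical:

```python
# hypothetical two-state toy model (not the Biopython API)
states = ("H", "L")
init = {"H": 0.5, "L": 0.5}                      # starting distribution
trans = {("H", "H"): 0.5, ("H", "L"): 0.5,       # a_{kl}
         ("L", "H"): 0.4, ("L", "L"): 0.6}
emit = {("H", "A"): 0.2, ("H", "G"): 0.8,        # e_k(x)
        ("L", "A"): 0.7, ("L", "G"): 0.3}
seq = "GGA"
L = len(seq)

# forward pass for reference: f_l(i) = e_l(x_i) * sum_k f_k(i-1) * a_{kl}
f = {s: init[s] * emit[(s, seq[0])] for s in states}
for x in seq[1:]:
    f = {l: emit[(l, x)] * sum(f[k] * trans[(k, l)] for k in states)
         for l in states}
forward_prob = sum(f.values())

# backward pass: b_k(L-1) = 1, then
#   b_k(i) = sum_l a_{kl} * e_l(x_{i+1}) * b_l(i+1)
b = {s: 1.0 for s in states}
for i in range(L - 2, -1, -1):
    b = {k: sum(trans[(k, l)] * emit[(l, seq[i + 1])] * b[l]
                for l in states)
         for k in states}

# termination: P(x) = sum_k init_k * e_k(x_0) * b_k(0)
backward_prob = sum(init[k] * emit[(k, seq[0])] * b[k] for k in states)

print(forward_prob, backward_prob)  # both equal P(x)
```

Note how the backward step multiplies the emission of the *next* state (the l index), which is what the `e_{l}(x_{i + 1})` comment in `_backward_recursion` refers to.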


def Bio.HMM.DynamicProgramming.ScaledDPAlgorithms.forward_algorithm (   self )

Calculate sequence probability using the forward algorithm.

This implements the forward algorithm, as described on p57-58 of
Durbin et al.

Returns:

o A dictionary containing the forward variables. This has keys of the
form (state letter, position in the training sequence), and values
containing the calculated forward variable.

o The calculated probability of the sequence.

Definition at line 42 of file DynamicProgramming.py.

    def forward_algorithm(self):
        """Calculate sequence probability using the forward algorithm.

        This implements the forward algorithm, as described on p57-58 of
        Durbin et al.

        Returns:

        o A dictionary containing the forward variables. This has keys of the
        form (state letter, position in the training sequence), and values
        containing the calculated forward variable.

        o The calculated probability of the sequence.
        """
        # all of the different letters that the state path can be in
        state_letters = self._seq.states.alphabet.letters

        # -- initialize the algorithm
        #
        # NOTE: My index numbers are one less than what is given in Durbin
        # et al, since we are indexing the sequence going from 0 to
        # (Length - 1) not 1 to Length, like in Durbin et al.
        #
        forward_var = {}
        # f_{0}(0) = 1
        forward_var[(state_letters[0], -1)] = 1
        # f_{k}(0) = 0, for k > 0
        for k in range(1, len(state_letters)):
            forward_var[(state_letters[k], -1)] = 0

        # -- now do the recursion step
        # loop over the training sequence
        # Recursion step: (i = 1 .. L)
        for i in range(len(self._seq.emissions)):
            # now loop over the letters in the state path
            for main_state in state_letters:
                # calculate the forward value using the appropriate
                # method to prevent underflow errors
                forward_value = self._forward_recursion(main_state, i,
                                                        forward_var)

                if forward_value is not None:
                    forward_var[(main_state, i)] = forward_value

        # -- termination step - calculate the probability of the sequence
        first_state = state_letters[0]
        seq_prob = 0

        for state_item in state_letters:
            # f_{k}(L)
            forward_value = forward_var[(state_item,
                                         len(self._seq.emissions) - 1)]
            # a_{k0}
            transition_value = self._mm.transition_prob[(state_item,
                                                         first_state)]

            seq_prob += forward_value * transition_value

        return forward_var, seq_prob
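To see that the scaling trick loses nothing, the standalone sketch below runs a plain forward pass, a scaled pass that renormalizes at every position, and a brute-force sum over every state path, all on a hypothetical toy model (plain dicts, not the Biopython classes; unlike this class it uses an explicit initial distribution and no a_{k0} end state). The product of the per-position scale factors recovers the sequence probability:

```python
import itertools

# hypothetical two-state toy model (not the Biopython API)
states = ("H", "L")
init = {"H": 0.5, "L": 0.5}                      # starting distribution
trans = {("H", "H"): 0.5, ("H", "L"): 0.5,       # a_{kl}
         ("L", "H"): 0.4, ("L", "L"): 0.6}
emit = {("H", "A"): 0.2, ("H", "G"): 0.8,        # e_k(x)
        ("L", "A"): 0.7, ("L", "G"): 0.3}
seq = "GGA"

# plain forward pass: f_l(i) = e_l(x_i) * sum_k f_k(i-1) * a_{kl}
f = {s: init[s] * emit[(s, seq[0])] for s in states}
for x in seq[1:]:
    f = {l: emit[(l, x)] * sum(f[k] * trans[(k, l)] for k in states)
         for l in states}
forward_prob = sum(f.values())

# scaled pass: divide by s_i (the total mass at position i);
# the sequence probability is then the product of the s_i values
scaled = {s: init[s] * emit[(s, seq[0])] for s in states}
s_product = 1.0
for i, x in enumerate(seq):
    if i > 0:
        scaled = {l: emit[(l, x)] *
                     sum(scaled[k] * trans[(k, l)] for k in states)
                  for l in states}
    s_i = sum(scaled.values())
    s_product *= s_i
    scaled = {k: v / s_i for k, v in scaled.items()}  # now sums to 1

# brute force: sum the joint probability over every state path
brute_prob = 0.0
for path in itertools.product(states, repeat=len(seq)):
    p = init[path[0]] * emit[(path[0], seq[0])]
    for i in range(1, len(seq)):
        p *= trans[(path[i - 1], path[i])] * emit[(path[i], seq[i])]
    brute_prob += p

print(forward_prob, s_product, brute_prob)  # all three agree
```

On a sequence this short all three numbers fit comfortably in a double; the scaled pass only pays off for long sequences, where the plain product would underflow as shown in the earlier sketch.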



Member Data Documentation

Bio.HMM.DynamicProgramming.ScaledDPAlgorithms._s_values [private]

Definition at line 179 of file DynamicProgramming.py.


The documentation for this class was generated from the following file:

DynamicProgramming.py