python-biopython
1.60

Public Member Functions  
def  __init__ 
def  get_markov_model 
def  set_initial_probabilities 
def  set_equal_probabilities 
def  set_random_initial_probabilities 
def  set_random_transition_probabilities 
def  set_random_emission_probabilities 
def  set_random_probabilities 
def  allow_all_transitions 
def  allow_transition 
def  destroy_transition 
def  set_transition_score 
def  set_transition_pseudocount 
def  set_emission_score 
def  set_emission_pseudocount 
Public Attributes  
initial_prob  
transition_prob  
emission_prob  
transition_pseudo  
emission_pseudo  
Static Public Attributes  
int  DEFAULT_PSEUDO = 1 
Private Member Functions  
def  _all_blank 
def  _all_pseudo 
Private Attributes  
_state_alphabet  
_emission_alphabet 
Interface to build up a Markov Model. This class separates the task of specifying a Markov Model from the model itself, in the hope of keeping the actual Markov Model classes smaller. Use this builder class to create Markov models instead of trying to instantiate a Markov Model directly.
Definition at line 73 of file MarkovModel.py.
def Bio.HMM.MarkovModel.MarkovModelBuilder.__init__  (  self,  
state_alphabet,  
emission_alphabet  
) 
Initialize a builder to create Markov Models.
Arguments:
o state_alphabet - An alphabet containing all of the letters that can appear in the states.
o emission_alphabet - An alphabet containing all of the letters for states that can be emitted by the HMM.
Definition at line 86 of file MarkovModel.py.
    def __init__(self, state_alphabet, emission_alphabet):
        """Initialize a builder to create Markov Models.

        Arguments:

        o state_alphabet - An alphabet containing all of the letters that
        can appear in the states

        o emission_alphabet - An alphabet containing all of the letters for
        states that can be emitted by the HMM.
        """
        self._state_alphabet = state_alphabet
        self._emission_alphabet = emission_alphabet

        # probabilities for the initial state, initialized by calling
        # set_initial_probabilities (required)
        self.initial_prob = {}

        # the probabilities for transitions and emissions
        # by default we have no transitions and all possible emissions
        self.transition_prob = {}
        self.emission_prob = self._all_blank(state_alphabet,
                                             emission_alphabet)

        # the default pseudocounts for transition and emission counting
        self.transition_pseudo = {}
        self.emission_pseudo = self._all_pseudo(state_alphabet,
                                                emission_alphabet)
def Bio.HMM.MarkovModel.MarkovModelBuilder._all_blank  (  self,  
first_alphabet,  
second_alphabet  
)  [private] 
Return a dictionary with all counts set to zero. This uses the letters in the first and second alphabet to create a dictionary whose keys are 2-tuples of the form (letter of first alphabet, letter of second alphabet). The values are all set to 0.
Definition at line 115 of file MarkovModel.py.
    def _all_blank(self, first_alphabet, second_alphabet):
        """Return a dictionary with all counts set to zero.

        This uses the letters in the first and second alphabet to create
        a dictionary with keys of two tuples organized as
        (letter of first alphabet, letter of second alphabet). The values
        are all set to 0.
        """
        all_blank = {}
        for first_state in first_alphabet.letters:
            for second_state in second_alphabet.letters:
                all_blank[(first_state, second_state)] = 0

        return all_blank
def Bio.HMM.MarkovModel.MarkovModelBuilder._all_pseudo  (  self,  
first_alphabet,  
second_alphabet  
)  [private] 
Return a dictionary with all counts set to a default value. This takes the letters in the first and second alphabet and creates a dictionary whose keys are 2-tuples of the form (letter of first alphabet, letter of second alphabet). The values are all set to the value of the class attribute DEFAULT_PSEUDO.
Definition at line 130 of file MarkovModel.py.
    def _all_pseudo(self, first_alphabet, second_alphabet):
        """Return a dictionary with all counts set to a default value.

        This takes the letters in first alphabet and second alphabet and
        creates a dictionary with keys of two tuples organized as:
        (letter of first alphabet, letter of second alphabet). The values
        are all set to the value of the class attribute DEFAULT_PSEUDO.
        """
        all_counts = {}
        for first_state in first_alphabet.letters:
            for second_state in second_alphabet.letters:
                all_counts[(first_state, second_state)] = self.DEFAULT_PSEUDO

        return all_counts
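Both private helpers build the same kind of table, a dictionary keyed by every (first letter, second letter) pair, and differ only in the fill value. The following is a minimal standalone sketch of that pattern, not the Bio.HMM code itself: it assumes plain strings of letters stand in for the alphabet objects, and the function name `pair_table` is hypothetical.

```python
from itertools import product

DEFAULT_PSEUDO = 1  # mirrors the class attribute documented on this page


def pair_table(first_letters, second_letters, value):
    """Return {(first, second): value} for every pair of letters.

    Hypothetical helper for illustration; not part of Bio.HMM.
    """
    return {pair: value for pair in product(first_letters, second_letters)}


# Two states ("A", "B") and three emitted letters ("x", "y", "z").
blank = pair_table("AB", "xyz", 0)
pseudo = pair_table("AB", "xyz", DEFAULT_PSEUDO)

print(len(blank))                              # one key per (state, letter) pair
print(blank[("A", "x")], pseudo[("B", "z")])   # fill values of each table
```

Seen this way, the emission tables created in `__init__` start out fully populated, while the transition tables start empty and are filled one pair at a time.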
def Bio.HMM.MarkovModel.MarkovModelBuilder.allow_all_transitions  (  self )
A convenience function to create transitions between all states. By default all transitions within the alphabet are disallowed; this is a way to change this to allow all possible transitions.
Definition at line 301 of file MarkovModel.py.
    def allow_all_transitions(self):
        """A convenience function to create transitions between all states.

        By default all transitions within the alphabet are disallowed; this
        is a way to change this to allow all possible transitions.
        """
        # first get all probabilities and pseudo counts set
        # to the default values
        all_probs = self._all_blank(self._state_alphabet,
                                    self._state_alphabet)

        all_pseudo = self._all_pseudo(self._state_alphabet,
                                      self._state_alphabet)

        # now set any probabilities and pseudo counts that
        # were previously set
        for set_key in self.transition_prob:
            all_probs[set_key] = self.transition_prob[set_key]

        for set_key in self.transition_pseudo:
            all_pseudo[set_key] = self.transition_pseudo[set_key]

        # finally reinitialize the transition probs and pseudo counts
        self.transition_prob = all_probs
        self.transition_pseudo = all_pseudo
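The listing above builds default tables for every state pair and then copies back any values that were already set. A rough standalone sketch of that merge follows, assuming a string of letters stands in for the state alphabet; the function name `allow_all` is hypothetical and the code is an illustration rather than the library's implementation.

```python
def allow_all(states, transition_prob, transition_pseudo, default_pseudo=1):
    """Return full (probability, pseudocount) tables over states x states.

    Entries already present in the input dictionaries are preserved;
    every other pair gets probability 0 and the default pseudocount.
    Hypothetical helper for illustration; not part of Bio.HMM.
    """
    all_probs = {(a, b): 0 for a in states for b in states}
    all_pseudo = {(a, b): default_pseudo for a in states for b in states}
    all_probs.update(transition_prob)      # keep previously set probabilities
    all_pseudo.update(transition_pseudo)   # keep previously set pseudocounts
    return all_probs, all_pseudo


# One transition was configured by hand before allowing everything.
probs, pseudo = allow_all("AB", {("A", "B"): 0.7}, {("A", "B"): 5})
print(probs)
print(pseudo)
```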
def Bio.HMM.MarkovModel.MarkovModelBuilder.allow_transition  (  self,  
from_state,  
to_state,  
probability = None,  
pseudocount = None  
) 
Set a transition as being possible between the two states. probability and pseudocount are optional arguments specifying the probability and pseudocount for the transition. If they are not supplied, the values are set to the defaults. Raises: KeyError - if the two states already have an allowed transition.
Definition at line 328 of file MarkovModel.py.
    def allow_transition(self, from_state, to_state, probability=None,
                         pseudocount=None):
        """Set a transition as being possible between the two states.

        probability and pseudocount are optional arguments
        specifying the probabilities and pseudo counts for the transition.
        If these are not supplied, then the values are set to the
        default values.

        Raises:
        KeyError - if the two states already have an allowed transition.
        """
        # check the sanity of adding these states
        for state in [from_state, to_state]:
            assert state in self._state_alphabet.letters, \
                   "State %s was not found in the sequence alphabet" % state

        # ensure that the states are not already set
        if ((from_state, to_state) not in self.transition_prob and
            (from_state, to_state) not in self.transition_pseudo):
            # set the initial probability
            if probability is None:
                probability = 0
            self.transition_prob[(from_state, to_state)] = probability

            # set the initial pseudocounts
            if pseudocount is None:
                pseudocount = self.DEFAULT_PSEUDO
            self.transition_pseudo[(from_state, to_state)] = pseudocount
        else:
            raise KeyError("Transition from %s to %s is already allowed."
                           % (from_state, to_state))
def Bio.HMM.MarkovModel.MarkovModelBuilder.destroy_transition  (  self,  
from_state,  
to_state  
) 
Restrict transitions between the two states. Raises: KeyError if the transition is not currently allowed.
Definition at line 360 of file MarkovModel.py.
    def destroy_transition(self, from_state, to_state):
        """Restrict transitions between the two states.

        Raises:
        KeyError if the transition is not currently allowed.
        """
        try:
            del self.transition_prob[(from_state, to_state)]
            del self.transition_pseudo[(from_state, to_state)]
        except KeyError:
            raise KeyError("Transition from %s to %s is already disallowed."
                           % (from_state, to_state))
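Taken together, allow_transition and destroy_transition amount to adding and removing one key in two parallel dictionaries. The sketch below illustrates that bookkeeping in isolation. The class `TransitionTable` and its method names are hypothetical, and it raises ValueError for an unknown state where the listing above uses an assert; treat it as an illustration of the idea, not as the library's code.

```python
class TransitionTable:
    """Hypothetical stand-in for the builder's transition bookkeeping."""

    DEFAULT_PSEUDO = 1

    def __init__(self, states):
        self.states = set(states)
        self.prob = {}     # (from_state, to_state) -> probability
        self.pseudo = {}   # (from_state, to_state) -> pseudocount

    def allow(self, from_state, to_state, probability=None, pseudocount=None):
        for state in (from_state, to_state):
            if state not in self.states:
                raise ValueError("State %s was not found" % state)
        key = (from_state, to_state)
        if key in self.prob or key in self.pseudo:
            raise KeyError("Transition from %s to %s is already allowed." % key)
        # fall back to the defaults when no value is supplied
        self.prob[key] = 0 if probability is None else probability
        self.pseudo[key] = (self.DEFAULT_PSEUDO if pseudocount is None
                            else pseudocount)

    def destroy(self, from_state, to_state):
        key = (from_state, to_state)
        if key not in self.prob:
            raise KeyError("Transition from %s to %s is already disallowed."
                           % key)
        del self.prob[key]
        del self.pseudo[key]


table = TransitionTable("AB")
table.allow("A", "B")            # defaults: probability 0, pseudocount 1
table.allow("B", "A", 0.5, 2)    # explicit probability and pseudocount
table.destroy("A", "B")          # only B -> A remains allowed
print(table.prob, table.pseudo)
```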
def Bio.HMM.MarkovModel.MarkovModelBuilder.get_markov_model  (  self )
Return the Markov model corresponding to the current parameters. Each Markov model returned by a call to this function is unique (i.e. they don't influence each other).
Definition at line 145 of file MarkovModel.py.
    def get_markov_model(self):
        """Return the markov model corresponding with the current parameters.

        Each markov model returned by a call to this function is unique
        (ie. they don't influence each other).
        """

        # user must set initial probabilities
        if not self.initial_prob:
            raise Exception("set_initial_probabilities must be called to " +
                            "fully initialize the Markov model")

        initial_prob = copy.deepcopy(self.initial_prob)
        transition_prob = copy.deepcopy(self.transition_prob)
        emission_prob = copy.deepcopy(self.emission_prob)
        transition_pseudo = copy.deepcopy(self.transition_pseudo)
        emission_pseudo = copy.deepcopy(self.emission_pseudo)

        return HiddenMarkovModel(initial_prob, transition_prob, emission_prob,
                                 transition_pseudo, emission_pseudo)
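The independence promised above comes from handing each model its own deep copy of the builder's dictionaries. A small standalone sketch of that effect, using plain dictionaries in place of the builder and the models (the variable names are illustrative only):

```python
import copy

# Stand-in for the builder's transition table.
builder_transitions = {("A", "B"): 0.5}

# Each "model" receives its own deep copy, as in the listing above.
model_one = copy.deepcopy(builder_transitions)
model_two = copy.deepcopy(builder_transitions)

model_one[("A", "B")] = 0.9             # change the first model only
builder_transitions[("B", "A")] = 0.1   # change the builder afterwards

print(model_one)
print(model_two)            # unaffected by either change
print(builder_transitions)  # unaffected by the change to model_one
```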
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_emission_pseudocount  (  self,  
seq_state,  
emission_state,  
count  
) 
Set the default pseudocount for an emission. To avoid computational problems, it is helpful to be able to set a 'default' pseudocount to start with for estimating transition and emission probabilities (see p62 in Durbin et al for more discussion on this). By default, all emissions have a pseudocount of 1. Raises: KeyError if the emission from the given state is not allowed.
Definition at line 417 of file MarkovModel.py.
    def set_emission_pseudocount(self, seq_state, emission_state, count):
        """Set the default pseudocount for an emission.

        To avoid computational problems, it is helpful to be able to
        set a 'default' pseudocount to start with for estimating
        transition and emission probabilities (see p62 in Durbin et al
        for more discussion on this). By default, all emissions have
        a pseudocount of 1.

        Raises:
        KeyError if the emission from the given state is not allowed.
        """
        if (seq_state, emission_state) in self.emission_pseudo:
            self.emission_pseudo[(seq_state, emission_state)] = count
        else:
            raise KeyError("Emission of %s from %s is not allowed."
                           % (emission_state, seq_state))
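The "computational problems" mentioned above are typically zero probabilities for events that never occur in the training data. The code that consumes these pseudocounts is not shown on this page, so the following is only a sketch of the usual technique (add the pseudocount to each observed count, then normalize per state) and not a description of how Bio.HMM uses them. The function name `estimate_with_pseudocounts` and the example data are hypothetical.

```python
def estimate_with_pseudocounts(counts, pseudo):
    """Turn raw counts into per-state probabilities with pseudocounts added.

    Both arguments are dictionaries keyed by (state, symbol); the keys of
    `pseudo` define which emissions are allowed.
    Hypothetical helper for illustration; not part of Bio.HMM.
    """
    # add the pseudocount to every allowed (state, symbol) entry
    smoothed = {key: counts.get(key, 0) + pc for key, pc in pseudo.items()}
    # total smoothed count for each emitting state
    totals = {}
    for (state, _symbol), value in smoothed.items():
        totals[state] = totals.get(state, 0) + value
    # normalize so that the emissions of each state sum to 1
    return {key: value / float(totals[key[0]])
            for key, value in smoothed.items()}


pseudo = {("S", "x"): 1, ("S", "y"): 1}   # default pseudocount of 1
counts = {("S", "x"): 3}                   # "y" was never observed
probs = estimate_with_pseudocounts(counts, pseudo)
print(probs)
```

With these numbers the unobserved emission keeps a non-zero probability of 1/5, where raw counts alone would have given it 0.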
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_emission_score  (  self,  
seq_state,  
emission_state,  
probability  
) 
Set the probability of an emission from a particular state. Raises: KeyError if the emission from the given state is not allowed.
Definition at line 405 of file MarkovModel.py.
    def set_emission_score(self, seq_state, emission_state, probability):
        """Set the probability of an emission from a particular state.

        Raises:
        KeyError if the emission from the given state is not allowed.
        """
        if (seq_state, emission_state) in self.emission_prob:
            self.emission_prob[(seq_state, emission_state)] = probability
        else:
            raise KeyError("Emission of %s from %s is not allowed."
                           % (emission_state, seq_state))
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_equal_probabilities  (  self )
Reset all probabilities to be an average value. Resets the values of all initial probabilities, all allowed transitions, and all allowed emissions to 1 divided by the number of possible elements. This is useful if you just want to initialize a Markov Model to starting values (i.e. if you have no prior notions of what the probabilities should be, or if you are just feeling too lazy to calculate them :).
Warning 1 - this will reset all currently set probabilities.
Warning 2 - this just sets all probabilities for transitions and emissions to total up to 1, so it doesn't ensure that the sum of each set of transitions adds up to 1.
Definition at line 207 of file MarkovModel.py.
    def set_equal_probabilities(self):
        """Reset all probabilities to be an average value.

        Resets the values of all initial probabilities and all allowed
        transitions and all allowed emissions to be equal to 1 divided by the
        number of possible elements.

        This is useful if you just want to initialize a Markov Model to
        starting values (ie. if you have no prior notions of what the
        probabilities should be - or if you are just feeling too lazy
        to calculate them :).

        Warning 1 - this will reset all currently set probabilities.

        Warning 2 - This just sets all probabilities for transitions and
        emissions to total up to 1, so it doesn't ensure that the sum of
        each set of transitions adds up to 1.
        """

        # set initial state probabilities
        # (the divisor here is the number of allowed transitions,
        # not the number of states)
        new_initial_prob = float(1) / float(len(self.transition_prob))
        for state in self._state_alphabet.letters:
            self.initial_prob[state] = new_initial_prob

        # set the transitions
        new_trans_prob = float(1) / float(len(self.transition_prob))
        for key in self.transition_prob:
            self.transition_prob[key] = new_trans_prob

        # set the emissions
        new_emission_prob = float(1) / float(len(self.emission_prob))
        for key in self.emission_prob:
            self.emission_prob[key] = new_emission_prob
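Warning 2 is easiest to see with numbers. The sketch below assigns 1 divided by the number of allowed transitions to every transition, as the listing does, and then sums the outgoing probabilities per state. It is a standalone illustration on a made-up three-transition table, not a call into Bio.HMM.

```python
from collections import defaultdict

# Three allowed transitions: two leave state "A", one leaves state "B".
transition_prob = {("A", "A"): 0, ("A", "B"): 0, ("B", "A"): 0}

# Same rule as above: every allowed transition gets 1 / (number of transitions).
uniform = 1.0 / len(transition_prob)
transition_prob = {key: uniform for key in transition_prob}

# Sum the outgoing probability of each state.
outgoing = defaultdict(float)
for (from_state, _to_state), prob in transition_prob.items():
    outgoing[from_state] += prob

print(sum(transition_prob.values()))  # the whole table totals (about) 1
print(dict(outgoing))                 # but the per-state sums are 2/3 and 1/3
```

A model whose rows must each sum to 1 therefore needs its transitions renormalized per state after this call, unless every state has the same number of outgoing transitions and that number equals the number of states.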
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_initial_probabilities  (  self,  
initial_prob  
) 
Set initial state probabilities. initial_prob is a dictionary mapping states to probabilities. Suppose, for example, that the state alphabet is ['A', 'B']. Call set_initial_probabilities({'A': 1}) to guarantee that the initial state will be 'A'. Call set_initial_probabilities({'A': 0.5, 'B': 0.5}) to make each initial state equally probable.
This method must now be called in order to use the Markov model because the calculation of initial probabilities has changed incompatibly; the previous calculation was incorrect.
If initial probabilities are set for all states, then they should add up to 1. Otherwise the sum should be <= 1. The residual probability is divided up evenly between all the states for which the initial probability has not been set. For example, calling set_initial_probabilities({}) results in P('A') = 0.5 and P('B') = 0.5, for the above example.
Definition at line 166 of file MarkovModel.py.
    def set_initial_probabilities(self, initial_prob):
        """Set initial state probabilities.

        initial_prob is a dictionary mapping states to probabilities.
        Suppose, for example, that the state alphabet is ['A', 'B']. Call
        set_initial_probabilities({'A': 1}) to guarantee that the initial
        state will be 'A'. Call set_initial_probabilities({'A': 0.5, 'B': 0.5})
        to make each initial state equally probable.

        This method must now be called in order to use the Markov model
        because the calculation of initial probabilities has changed
        incompatibly; the previous calculation was incorrect.

        If initial probabilities are set for all states, then they should add up
        to 1. Otherwise the sum should be <= 1. The residual probability is
        divided up evenly between all the states for which the initial
        probability has not been set. For example, calling
        set_initial_probabilities({}) results in P('A') = 0.5 and P('B') = 0.5,
        for the above example.
        """
        self.initial_prob = copy.copy(initial_prob)

        # ensure that all referenced states are valid
        for state in initial_prob:
            assert state in self._state_alphabet.letters, \
                   "State %s was not found in the sequence alphabet" % state

        # distribute the residual probability, if any
        num_states_not_set =\
            len(self._state_alphabet.letters) - len(self.initial_prob)
        if num_states_not_set < 0:
            raise Exception("Initial probabilities can't exceed # of states")
        prob_sum = sum(self.initial_prob.values())
        if prob_sum > 1.0:
            raise Exception("Total initial probability cannot exceed 1.0")
        if num_states_not_set > 0:
            prob = (1.0 - prob_sum) / num_states_not_set
            for state in self._state_alphabet.letters:
                if not state in self.initial_prob:
                    self.initial_prob[state] = prob
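The residual rule described above can be tried out in isolation. This is a minimal sketch under the assumption that a string of letters stands in for the state alphabet; the function name `fill_initial` is hypothetical, and it raises ValueError where the listing uses an assert or a bare Exception.

```python
def fill_initial(states, initial_prob):
    """Return initial probabilities for every state in `states`.

    States missing from `initial_prob` share the residual probability
    (1 minus the sum of the supplied values) evenly.
    Hypothetical helper for illustration; not part of Bio.HMM.
    """
    filled = dict(initial_prob)
    for state in filled:
        if state not in states:
            raise ValueError("State %s was not found" % state)
    total = sum(filled.values())
    if total > 1.0:
        raise ValueError("Total initial probability cannot exceed 1.0")
    unset = [state for state in states if state not in filled]
    if unset:
        share = (1.0 - total) / len(unset)
        for state in unset:
            filled[state] = share
    return filled


print(fill_initial("AB", {}))             # both states share the whole mass
print(fill_initial("AB", {"A": 1}))       # nothing is left over for "B"
print(fill_initial("ABC", {"A": 0.5}))    # "B" and "C" split the remaining 0.5
```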
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_random_emission_probabilities  (  self )
Set all allowed emission probabilities to a randomly generated distribution. Returns the dictionary containing the emission probabilities.
Definition at line 270 of file MarkovModel.py.
    def set_random_emission_probabilities(self):
        """Set all allowed emission probabilities to a randomly generated
        distribution. Returns the dictionary containing the emission
        probabilities.
        """

        if not self.emission_prob:
            raise Exception("No emissions have been allowed yet. " +
                            "Allow some or all emissions.")

        emissions = _calculate_emissions(self.emission_prob)
        for state in emissions:
            freqs = _gen_random_array(len(emissions[state]))
            for symbol in emissions[state]:
                self.emission_prob[(state, symbol)] = freqs.pop()

        return self.emission_prob
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_random_initial_probabilities  (  self )
Set all initial state probabilities to a randomly generated distribution. Returns the dictionary containing the initial probabilities.
Definition at line 242 of file MarkovModel.py.
    def set_random_initial_probabilities(self):
        """Set all initial state probabilities to a randomly generated distribution.
        Returns the dictionary containing the initial probabilities.
        """
        initial_freqs = _gen_random_array(len(self._state_alphabet.letters))
        for state in self._state_alphabet.letters:
            self.initial_prob[state] = initial_freqs.pop()

        return self.initial_prob
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_random_probabilities  (  self )
Set all probabilities to randomly generated numbers. Resets probabilities of all initial states, transitions, and emissions to random values.
Definition at line 289 of file MarkovModel.py.
    def set_random_probabilities(self):
        """Set all probabilities to randomly generated numbers.

        Resets probabilities of all initial states, transitions, and
        emissions to random values.
        """
        self.set_random_initial_probabilities()
        self.set_random_transition_probabilities()
        self.set_random_emission_probabilities()
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_random_transition_probabilities  (  self )
Set all allowed transition probabilities to a randomly generated distribution. Returns the dictionary containing the transition probabilities.
Definition at line 252 of file MarkovModel.py.
    def set_random_transition_probabilities(self):
        """Set all allowed transition probabilities to a randomly generated distribution.
        Returns the dictionary containing the transition probabilities.
        """

        if not self.transition_prob:
            raise Exception("No transitions have been allowed yet. " +
                            "Allow some or all transitions by calling " +
                            "allow_transition or allow_all_transitions first.")

        transitions_from = _calculate_from_transitions(self.transition_prob)
        for from_state in transitions_from.keys():
            freqs = _gen_random_array(len(transitions_from[from_state]))
            for to_state in transitions_from[from_state]:
                self.transition_prob[(from_state, to_state)] = freqs.pop()

        return self.transition_prob
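The random setters group the allowed pairs by their first element and draw one random distribution per group. The module-level helpers they call (`_gen_random_array`, `_calculate_from_transitions`, `_calculate_emissions`) are not shown on this page, so the sketch below substitutes hypothetical stand-ins. It assumes the random array is a list of random numbers normalized to sum to 1, and it illustrates the approach rather than reproducing the library's code.

```python
import random
from collections import defaultdict


def random_distribution(n, rng):
    """Return n random numbers normalized so they sum to 1 (assumed behaviour)."""
    draws = [rng.random() for _ in range(n)]
    total = sum(draws)
    return [draw / total for draw in draws]


def randomize_transitions(transition_prob, rng):
    """Give each from-state a random distribution over its allowed targets."""
    targets = defaultdict(list)
    for from_state, to_state in transition_prob:
        targets[from_state].append(to_state)
    randomized = {}
    for from_state, to_states in targets.items():
        freqs = random_distribution(len(to_states), rng)
        for to_state, freq in zip(to_states, freqs):
            randomized[(from_state, to_state)] = freq
    return randomized


rng = random.Random(0)  # seeded only to make the sketch repeatable
allowed = {("A", "A"): 0, ("A", "B"): 0, ("B", "A"): 0}
result = randomize_transitions(allowed, rng)
print(result)
```

Because the draw is made per from-state, each state's outgoing probabilities sum to 1, and a state with a single allowed transition always receives probability 1 for it.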
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_transition_pseudocount  (  self,  
from_state,  
to_state,  
count  
) 
Set the default pseudocount for a transition. To avoid computational problems, it is helpful to be able to set a 'default' pseudocount to start with for estimating transition and emission probabilities (see p62 in Durbin et al for more discussion on this). By default, all transitions have a pseudocount of 1. Raises: KeyError if the transition is not allowed.
Definition at line 385 of file MarkovModel.py.
    def set_transition_pseudocount(self, from_state, to_state, count):
        """Set the default pseudocount for a transition.

        To avoid computational problems, it is helpful to be able to
        set a 'default' pseudocount to start with for estimating
        transition and emission probabilities (see p62 in Durbin et al
        for more discussion on this). By default, all transitions have
        a pseudocount of 1.

        Raises:
        KeyError if the transition is not allowed.
        """
        if (from_state, to_state) in self.transition_pseudo:
            self.transition_pseudo[(from_state, to_state)] = count
        else:
            raise KeyError("Transition from %s to %s is not allowed."
                           % (from_state, to_state))
def Bio.HMM.MarkovModel.MarkovModelBuilder.set_transition_score  (  self,  
from_state,  
to_state,  
probability  
) 
Set the probability of a transition between two states. Raises: KeyError if the transition is not allowed.
Definition at line 373 of file MarkovModel.py.
    def set_transition_score(self, from_state, to_state, probability):
        """Set the probability of a transition between two states.

        Raises:
        KeyError if the transition is not allowed.
        """
        if (from_state, to_state) in self.transition_prob:
            self.transition_prob[(from_state, to_state)] = probability
        else:
            raise KeyError("Transition from %s to %s is not allowed."
                           % (from_state, to_state))
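Bio.HMM.MarkovModel.MarkovModelBuilder._emission_alphabet  [private] 
Definition at line 98 of file MarkovModel.py.
Bio.HMM.MarkovModel.MarkovModelBuilder._state_alphabet  [private] 
Definition at line 97 of file MarkovModel.py.
int Bio.HMM.MarkovModel.MarkovModelBuilder.DEFAULT_PSEUDO = 1 [static] 
Definition at line 84 of file MarkovModel.py.
Bio.HMM.MarkovModel.MarkovModelBuilder.emission_prob 
Definition at line 107 of file MarkovModel.py.
Bio.HMM.MarkovModel.MarkovModelBuilder.emission_pseudo 
Definition at line 112 of file MarkovModel.py.
Bio.HMM.MarkovModel.MarkovModelBuilder.initial_prob 
Definition at line 102 of file MarkovModel.py.
Bio.HMM.MarkovModel.MarkovModelBuilder.transition_prob 
Definition at line 106 of file MarkovModel.py.
Bio.HMM.MarkovModel.MarkovModelBuilder.transition_pseudo 
Definition at line 111 of file MarkovModel.py.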