Biopython 1.60

Public Member Functions  
def  __init__ 
def  get_blank_transitions 
def  get_blank_emissions 
def  transitions_from 
def  transitions_to 
def  viterbi 
Public Attributes  
initial_prob  
transition_prob  
emission_prob  
Private Member Functions  
def  _log_transform 
Private Attributes  
_transition_pseudo  
_emission_pseudo  
_transitions_from  
_transitions_to 
Represent a hidden Markov model that can be used for state estimation.
Definition at line 435 of file MarkovModel.py.
def Bio.HMM.MarkovModel.HiddenMarkovModel.__init__(self, initial_prob, transition_prob, emission_prob, transition_pseudo, emission_pseudo)
Initialize a Markov Model.

Note: You should use the MarkovModelBuilder class instead of instantiating this class directly.

Arguments:

o initial_prob -- A dictionary of initial probabilities for all states.

o transition_prob -- A dictionary of transition probabilities for all possible transitions in the sequence.

o emission_prob -- A dictionary of emission probabilities for all possible emissions from the sequence states.

o transition_pseudo -- Pseudocounts to be used for the transitions, when counting for purposes of estimating transition probabilities.

o emission_pseudo -- Pseudocounts to be used for the emissions, when counting for purposes of estimating emission probabilities.
Definition at line 439 of file MarkovModel.py.
    def __init__(self, initial_prob, transition_prob, emission_prob,
                 transition_pseudo, emission_pseudo):
        """Initialize a Markov Model.

        Note: You should use the MarkovModelBuilder class instead of
        instantiating this class directly.

        Arguments:

        o initial_prob -- A dictionary of initial probabilities for all states.

        o transition_prob -- A dictionary of transition probabilities for all
        possible transitions in the sequence.

        o emission_prob -- A dictionary of emission probabilities for all
        possible emissions from the sequence states.

        o transition_pseudo -- Pseudocounts to be used for the transitions,
        when counting for purposes of estimating transition probabilities.

        o emission_pseudo -- Pseudocounts to be used for the emissions,
        when counting for purposes of estimating emission probabilities.
        """
        self.initial_prob = initial_prob

        self._transition_pseudo = transition_pseudo
        self._emission_pseudo = emission_pseudo

        self.transition_prob = transition_prob
        self.emission_prob = emission_prob

        # a dictionary of the possible transitions from each state
        # each key is a source state, mapped to a list of the destination
        # states that are reachable from the source state via a transition
        self._transitions_from = \
            _calculate_from_transitions(self.transition_prob)

        # a dictionary of the possible transitions to each state
        # each key is a destination state, mapped to a list of source states
        # from which the destination is reachable via a transition
        self._transitions_to = \
            _calculate_to_transitions(self.transition_prob)
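The probability dictionaries passed to the constructor share a simple shape: initial probabilities are keyed by state letter, while transition and emission probabilities are keyed by (source_state, destination_state) and (state, emitted_symbol) tuples respectively. A minimal standalone sketch (the two-state model and all names below are invented for illustration, not taken from Biopython):

```python
# Toy two-state model: states 'R' and 'S', emitting 'walk' or 'shop'.
initial_prob = {'R': 0.5, 'S': 0.5}

# Keys are (source_state, destination_state) tuples.
transition_prob = {('R', 'R'): 0.7, ('R', 'S'): 0.3,
                   ('S', 'R'): 0.4, ('S', 'S'): 0.6}

# Keys are (state, emitted_symbol) tuples.
emission_prob = {('R', 'walk'): 0.1, ('R', 'shop'): 0.9,
                 ('S', 'walk'): 0.8, ('S', 'shop'): 0.2}

# Sanity check: each state's outgoing transition probabilities sum to 1.
for src in initial_prob:
    total = sum(p for (s, _), p in transition_prob.items() if s == src)
    assert abs(total - 1.0) < 1e-9
```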
def Bio.HMM.MarkovModel.HiddenMarkovModel._log_transform(self, probability) [private]
Return the log transform of the given probability dictionary. When calculating the Viterbi equation, add logs of probabilities rather than multiplying probabilities, to avoid underflow errors. This method returns a new dictionary with the same keys as the given dictionary and log-transformed values.
Definition at line 635 of file MarkovModel.py.
    def _log_transform(self, probability):
        """Return log transform of the given probability dictionary.

        When calculating the Viterbi equation, add logs of probabilities rather
        than multiplying probabilities, to avoid underflow errors. This method
        returns a new dictionary with the same keys as the given dictionary
        and log-transformed values.
        """
        log_prob = copy.copy(probability)
        try:
            neg_inf = float("-inf")
        except ValueError:
            # On Python 2.5 or older that was handled in C code,
            # and failed on Windows XP 32 bit
            neg_inf = -1E400
        for key in log_prob:
            prob = log_prob[key]
            if prob > 0:
                log_prob[key] = math.log(log_prob[key])
            else:
                log_prob[key] = neg_inf

        return log_prob
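The effect of this transform can be reproduced with plain dictionaries. This standalone sketch (independent of Biopython) shows both the transform and why summing logs is safer than multiplying raw probabilities: a long product of small probabilities underflows to zero, while the equivalent sum of logs stays finite.

```python
import math

def log_transform(probability):
    # Map each probability to its natural log; zero probabilities
    # become -inf so they can never win a max() comparison later.
    return {k: (math.log(p) if p > 0 else float("-inf"))
            for k, p in probability.items()}

probs = {'a': 0.5, 'b': 0.25, 'c': 0.0}
logs = log_transform(probs)

# Multiplying 1000 probabilities of 1e-5 underflows to exactly 0.0,
# but the equivalent running sum of logs remains a finite number.
product = 1.0
log_sum = 0.0
for _ in range(1000):
    product *= 1e-5
    log_sum += math.log(1e-5)
print(product, log_sum)   # 0.0 vs. roughly -11512.9
```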
def Bio.HMM.MarkovModel.HiddenMarkovModel.get_blank_emissions(self)

Get the starting default emissions for each sequence. This returns a dictionary of the default emissions for each letter. The dictionary is structured with keys as (seq_letter, emission_letter) and values as the starting number of emissions.
Definition at line 493 of file MarkovModel.py.
    def get_blank_emissions(self):
        """Get the starting default emissions for each sequence.

        This returns a dictionary of the default emissions for each
        letter. The dictionary is structured with keys as
        (seq_letter, emission_letter) and values as the starting number
        of emissions.
        """
        return self._emission_pseudo
def Bio.HMM.MarkovModel.HiddenMarkovModel.get_blank_transitions(self)

Get the default transitions for the model. Returns a dictionary of all of the default transitions between any two letters in the sequence alphabet. The dictionary is structured with keys as (letter1, letter2) and values as the starting number of transitions.
Definition at line 483 of file MarkovModel.py.
    def get_blank_transitions(self):
        """Get the default transitions for the model.

        Returns a dictionary of all of the default transitions between any
        two letters in the sequence alphabet. The dictionary is structured
        with keys as (letter1, letter2) and values as the starting number
        of transitions.
        """
        return self._transition_pseudo
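These pseudocount dictionaries are typically used to seed count tables before training, so that a transition or emission never observed in the training data still ends up with a small nonzero probability. A hedged sketch of that pattern (this is not the actual Biopython trainer code; `train_transition_counts` and its inputs are invented for illustration):

```python
def train_transition_counts(blank_transitions, state_paths):
    # Start from the pseudocounts rather than from zeros, so that a
    # transition absent from the training data keeps a small count.
    counts = dict(blank_transitions)
    for path in state_paths:
        for src, dest in zip(path, path[1:]):
            counts[(src, dest)] = counts.get((src, dest), 0) + 1
    return counts

# A pseudocount of 1 for every possible transition (illustrative values).
blank = {('R', 'R'): 1, ('R', 'S'): 1, ('S', 'R'): 1, ('S', 'S'): 1}
counts = train_transition_counts(blank, ['RRS', 'SSR'])
print(counts)
```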
def Bio.HMM.MarkovModel.HiddenMarkovModel.transitions_from(self, state_letter)
Get all destination states to which there are transitions from the state_letter source state. This returns all letters which the given state_letter can transition to. An empty list is returned if state_letter has no outgoing transitions.
Definition at line 503 of file MarkovModel.py.
    def transitions_from(self, state_letter):
        """Get all destination states to which there are transitions from the
        state_letter source state.

        This returns all letters which the given state_letter can transition
        to. An empty list is returned if state_letter has no outgoing
        transitions.
        """
        if state_letter in self._transitions_from:
            return self._transitions_from[state_letter]
        else:
            return []
def Bio.HMM.MarkovModel.HiddenMarkovModel.transitions_to(self, state_letter)
Get all source states from which there are transitions to the state_letter destination state. This returns all letters which the given state_letter is reachable from. An empty list is returned if state_letter is unreachable.
Definition at line 516 of file MarkovModel.py.
    def transitions_to(self, state_letter):
        """Get all source states from which there are transitions to the
        state_letter destination state.

        This returns all letters which the given state_letter is reachable
        from. An empty list is returned if state_letter is unreachable.
        """
        if state_letter in self._transitions_to:
            return self._transitions_to[state_letter]
        else:
            return []
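The `_transitions_from` and `_transitions_to` dictionaries consulted by these two methods can be derived directly from the keys of the transition probability dictionary. This standalone sketch mirrors what the module's private `_calculate_from_transitions` / `_calculate_to_transitions` helpers compute (the function bodies here are a reconstruction for illustration, not the Biopython source):

```python
def calculate_from_transitions(transition_prob):
    # Map each source state to the list of states it can reach.
    from_map = {}
    for src, dest in transition_prob:
        from_map.setdefault(src, []).append(dest)
    return from_map

def calculate_to_transitions(transition_prob):
    # Map each destination state to the states that can reach it.
    to_map = {}
    for src, dest in transition_prob:
        to_map.setdefault(dest, []).append(src)
    return to_map

trans = {('R', 'R'): 0.7, ('R', 'S'): 0.3, ('S', 'S'): 1.0}
print(calculate_from_transitions(trans))
print(calculate_to_transitions(trans))
```

Note that 'S' never transitions back to 'R' in this toy model, so `calculate_to_transitions(trans)['R']` contains only 'R' itself.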
def Bio.HMM.MarkovModel.HiddenMarkovModel.viterbi(self, sequence, state_alphabet)
Calculate the most probable state path using the Viterbi algorithm.

This implements the Viterbi algorithm (see pgs 55-57 in Durbin et al for a full explanation -- this is where I took my implementation ideas from), to allow decoding of the state path, given a sequence of emissions.

Arguments:

o sequence -- A Seq object with the emission sequence that we want to decode.

o state_alphabet -- The alphabet of the possible state sequences that can be generated.
Definition at line 528 of file MarkovModel.py.
    def viterbi(self, sequence, state_alphabet):
        """Calculate the most probable state path using the Viterbi algorithm.

        This implements the Viterbi algorithm (see pgs 55-57 in Durbin et
        al for a full explanation -- this is where I took my implementation
        ideas from), to allow decoding of the state path, given a sequence
        of emissions.

        Arguments:

        o sequence -- A Seq object with the emission sequence that we
        want to decode.

        o state_alphabet -- The alphabet of the possible state sequences
        that can be generated.
        """

        # calculate logarithms of the initial, transition, and emission probs
        log_initial = self._log_transform(self.initial_prob)
        log_trans = self._log_transform(self.transition_prob)
        log_emission = self._log_transform(self.emission_prob)

        viterbi_probs = {}
        pred_state_seq = {}
        state_letters = state_alphabet.letters

        # --- recursion
        # loop over the training sequence (i = 1 .. L)
        # NOTE: My index numbers are one less than what is given in Durbin
        # et al, since we are indexing the sequence going from 0 to
        # (Length - 1) not 1 to Length, like in Durbin et al.
        for i in range(0, len(sequence)):
            # loop over all of the possible i-th states in the state path
            for cur_state in state_letters:
                # e_{l}(x_{i})
                emission_part = log_emission[(cur_state, sequence[i])]

                max_prob = 0
                if i == 0:
                    # for the first state, use the initial probability rather
                    # than looking back to previous states
                    max_prob = log_initial[cur_state]
                else:
                    # loop over all possible (i-1)-th previous states
                    possible_state_probs = {}
                    for prev_state in self.transitions_to(cur_state):
                        # a_{kl}
                        trans_part = log_trans[(prev_state, cur_state)]

                        # v_{k}(i - 1)
                        viterbi_part = viterbi_probs[(prev_state, i - 1)]
                        cur_prob = viterbi_part + trans_part

                        possible_state_probs[prev_state] = cur_prob

                    # calculate the viterbi probability using the max
                    max_prob = max(possible_state_probs.values())

                # v_{k}(i)
                viterbi_probs[(cur_state, i)] = (emission_part + max_prob)

                if i > 0:
                    # get the most likely prev_state leading to cur_state
                    for state in possible_state_probs:
                        if possible_state_probs[state] == max_prob:
                            pred_state_seq[(i - 1, cur_state)] = state
                            break

        # --- termination
        # calculate the probability of the state path
        # loop over all states
        all_probs = {}
        for state in state_letters:
            # v_{k}(L)
            all_probs[state] = viterbi_probs[(state, len(sequence) - 1)]

        state_path_prob = max(all_probs.values())

        # find the last pointer we need to trace back from
        last_state = ''
        for state in all_probs:
            if all_probs[state] == state_path_prob:
                last_state = state

        assert last_state != '', "Didn't find the last state to trace from!"

        # --- traceback
        traceback_seq = MutableSeq('', state_alphabet)

        loop_seq = range(1, len(sequence))
        loop_seq.reverse()

        # last_state is the last state in the most probable state sequence.
        # Compute that sequence by walking backwards in time. From the i-th
        # state in the sequence, find the (i-1)-th state as the most
        # probable state preceding the i-th state.
        state = last_state
        traceback_seq.append(state)
        for i in loop_seq:
            state = pred_state_seq[(i - 1, state)]
            traceback_seq.append(state)

        # put the traceback sequence in the proper orientation
        traceback_seq.reverse()

        return traceback_seq.toseq(), state_path_prob
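The recursion/termination/traceback structure of the method can be exercised on a toy model with plain dictionaries, without any Biopython objects. This is a hedged re-implementation for illustration (the two-state model, the symbol names, and the `viterbi` helper below are all invented, not part of the library API):

```python
import math

def viterbi(sequence, states, initial, trans, emit):
    # All scores are kept in log space, as in the method above.
    log = lambda p: math.log(p) if p > 0 else float("-inf")
    v = {}      # v[(state, i)]: best log-probability ending in state at i
    back = {}   # back[(i, state)]: best predecessor of state at position i
    for i, symbol in enumerate(sequence):
        for cur in states:
            e = log(emit[(cur, symbol)])
            if i == 0:
                # first position: use the initial probabilities
                v[(cur, i)] = log(initial[cur]) + e
            else:
                # recursion: maximize over all possible previous states
                scores = {prev: v[(prev, i - 1)] + log(trans[(prev, cur)])
                          for prev in states}
                best = max(scores, key=scores.get)
                v[(cur, i)] = scores[best] + e
                back[(i, cur)] = best
    # termination: pick the best final state, then walk the back-pointers
    last = max(states, key=lambda s: v[(s, len(sequence) - 1)])
    path = [last]
    for i in range(len(sequence) - 1, 0, -1):
        path.append(back[(i, path[-1])])
    path.reverse()
    return ''.join(path), v[(last, len(sequence) - 1)]

initial = {'R': 0.5, 'S': 0.5}
trans = {('R', 'R'): 0.7, ('R', 'S'): 0.3, ('S', 'R'): 0.4, ('S', 'S'): 0.6}
emit = {('R', 'w'): 0.1, ('R', 's'): 0.9, ('S', 'w'): 0.8, ('S', 's'): 0.2}
path, logp = viterbi('ssw', 'RS', initial, trans, emit)
print(path)   # 'RRS'
```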
Definition at line 465 of file MarkovModel.py.
Definition at line 464 of file MarkovModel.py.
Definition at line 473 of file MarkovModel.py.
Definition at line 479 of file MarkovModel.py.
Definition at line 468 of file MarkovModel.py.
Definition at line 462 of file MarkovModel.py.
Definition at line 467 of file MarkovModel.py.