
python-biopython  1.60
Bio.HMM.Trainer.AbstractTrainer Class Reference


Public Member Functions

def __init__
def log_likelihood
def estimate_params
def ml_estimator

Private Attributes

 _markov_model

Detailed Description

Provide generic functionality needed in all trainers.

Definition at line 42 of file Trainer.py.


Constructor & Destructor Documentation

def Bio.HMM.Trainer.AbstractTrainer.__init__ (   self,
  markov_model 
)

Reimplemented in Bio.HMM.Trainer.KnownStateTrainer, and Bio.HMM.Trainer.BaumWelchTrainer.

Definition at line 45 of file Trainer.py.

00045 
00046     def __init__(self, markov_model):
00047         self._markov_model = markov_model



Member Function Documentation

def Bio.HMM.Trainer.AbstractTrainer.estimate_params (   self,
  transition_counts,
  emission_counts 
)
Get maximum likelihood estimates of the transition and emission probabilities.

Arguments:

o transition_counts -- A dictionary with the total number of counts
of transitions between two states.

o emission_counts -- A dictionary with the total number of counts
of emissions of a particular emission letter by a state letter.

This then returns the maximum likelihood estimators for the
transitions and emissions, estimated by formulas 3.18 in
Durbin et al:

a_{kl} = A_{kl} / sum(A_{kl'})
e_{k}(b) = E_{k}(b) / sum(E_{k}(b'))

Returns:
Transition and emission dictionaries containing the maximum
likelihood estimators.

Definition at line 63 of file Trainer.py.

00063 
00064     def estimate_params(self, transition_counts, emission_counts):
00065         """Get maximum likelihood estimates of the transition and emission probabilities.
00066 
00067         Arguments:
00068         
00069         o transition_counts -- A dictionary with the total number of counts
00070         of transitions between two states.
00071 
00072         o emission_counts -- A dictionary with the total number of counts
00073         of emissions of a particular emission letter by a state letter.
00074 
00075         This then returns the maximum likelihood estimators for the
00076         transitions and emissions, estimated by formulas 3.18 in
00077         Durbin et al:
00078 
00079         a_{kl} = A_{kl} / sum(A_{kl'})
00080         e_{k}(b) = E_{k}(b) / sum(E_{k}(b'))
00081 
00082         Returns:
00083         Transition and emission dictionaries containing the maximum
00084         likelihood estimators.
00085         """
00086         # now calculate the information
00087         ml_transitions = self.ml_estimator(transition_counts)
00088         ml_emissions = self.ml_estimator(emission_counts)
00089 
00090         return ml_transitions, ml_emissions
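The formulas above can be illustrated with a small standalone sketch. It is not the Biopython implementation: the two functions below are simplified stand-ins written for this example, and the counts are hypothetical. The only assumption taken from the listing is that each dictionary key is a tuple whose first element is the state being normalized over.

```python
from collections import defaultdict


def ml_estimator(counts):
    """Divide each count by the total over keys sharing its first element."""
    totals = defaultdict(float)
    for key, count in counts.items():
        totals[key[0]] += count
    return {key: count / totals[key[0]] for key, count in counts.items()}


def estimate_params(transition_counts, emission_counts):
    """Return (transition, emission) maximum likelihood estimates."""
    return ml_estimator(transition_counts), ml_estimator(emission_counts)


# Hypothetical counts for a two-state model with states "F" and "L"
# emitting the letters "H" and "T".
transition_counts = {("F", "F"): 8, ("F", "L"): 2, ("L", "F"): 1, ("L", "L"): 3}
emission_counts = {("F", "H"): 5, ("F", "T"): 5, ("L", "H"): 9, ("L", "T"): 1}

transitions, emissions = estimate_params(transition_counts, emission_counts)
print(transitions[("F", "L")])  # A_FL / (A_FF + A_FL) = 2 / 10
print(emissions[("L", "H")])    # E_L(H) / (E_L(H) + E_L(T)) = 9 / 10
```

Within each state, the estimated transition probabilities sum to one, as do the estimated emission probabilities.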


def Bio.HMM.Trainer.AbstractTrainer.log_likelihood (   self,
  probabilities 
)
Calculate the log likelihood of the training seqs.

Arguments:

o probabilities -- A list of the probabilities of each training
sequence under the current parameters, calculated using the forward
algorithm.

Definition at line 48 of file Trainer.py.

00048 
00049     def log_likelihood(self, probabilities):
00050         """Calculate the log likelihood of the training seqs.
00051 
00052         Arguments:
00053 
00054         o probabilities -- A list of the probabilities of each training
00055         sequence under the current parameters, calculated using the forward
00056         algorithm.
00057         """
00058         total_likelihood = 0
00059         for probability in probabilities:
00060             total_likelihood += math.log(probability)
00061 
00062         return total_likelihood
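A short standalone sketch shows why the method sums logarithms instead of multiplying the probabilities. The function below mirrors the listing but is not part of Biopython, and the probability values are made up for illustration. Note that math.log raises ValueError for a probability of zero, so a sequence the model cannot generate would need separate handling.

```python
import math


def log_likelihood(probabilities):
    """Sum the natural logs of the per-sequence probabilities."""
    return sum(math.log(probability) for probability in probabilities)


# Hypothetical forward-algorithm probabilities for two long sequences.
probabilities = [1e-200, 1e-200]

# Multiplying directly underflows to zero in double precision.
product = 1.0
for probability in probabilities:
    product *= probability

# Summing the logs keeps the value representable.
total = log_likelihood(probabilities)
print(product, total)
```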
                 


def Bio.HMM.Trainer.AbstractTrainer.ml_estimator (   self,
  counts 
)
Calculate the maximum likelihood estimator.

This can calculate maximum likelihoods for both transitions
and emissions.

Arguments:

o counts -- A dictionary of the counts for each item.

See estimate_params for a description of the formula used for
calculation.

Definition at line 91 of file Trainer.py.

00091 
00092     def ml_estimator(self, counts):
00093         """Calculate the maximum likelihood estimator.
00094 
00095         This can calculate maximum likelihoods for both transitions
00096         and emissions.
00097 
00098         Arguments:
00099 
00100         o counts -- A dictionary of the counts for each item.
00101 
00102         See estimate_params for a description of the formula used for
00103         calculation.
00104         """
00105         # get an ordered list of all items
00106         all_ordered = counts.keys()
00107         all_ordered.sort()
00108         
00109         ml_estimation = {}
00110 
00111         # the total counts for the current letter we are on
00112         cur_letter = None
00113         cur_letter_counts = 0
00114         
00115         for cur_item in all_ordered:
00116             # if we are on a new letter (ie. the first letter of the tuple)
00117             if cur_item[0] != cur_letter:
00118                 # set the new letter we are working with
00119                 cur_letter = cur_item[0]
00120 
00121                 # count up the total counts for this letter
00122                 cur_letter_counts = counts[cur_item]
00123                 
00124                 # add counts for all other items with the same first letter
00125                 cur_position = all_ordered.index(cur_item) + 1
00126 
00127                 # keep adding while we have the same first letter or until
00128                 # we get to the end of the ordered list
00129                 while (cur_position < len(all_ordered) and
00130                        all_ordered[cur_position][0] == cur_item[0]):
00131                     cur_letter_counts += counts[all_ordered[cur_position]]
00132                     cur_position += 1
00133             # otherwise we've already got the total counts for this letter
00134             else:
00135                 pass
00136 
00137             # now calculate the ml and add it to the estimation
00138             cur_ml = float(counts[cur_item]) / float(cur_letter_counts)
00139             ml_estimation[cur_item] = cur_ml
00140 
00141         return ml_estimation
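The listing calls counts.keys() and then sorts the result in place, which relies on keys() returning a list, as it does on Python 2. On Python 3, keys() returns a view object with no sort method. The sketch below is one possible port, assuming the same tuple-keyed input; it is not the library's code, and it uses itertools.groupby in place of the manual index scan.

```python
from itertools import groupby


def ml_estimator(counts):
    """Normalize counts within each group of keys sharing a first element."""
    ml_estimation = {}

    # sorted() accepts any iterable, so it works on Python 2 and Python 3.
    all_ordered = sorted(counts)

    # groupby yields one run of consecutive keys per first element,
    # which is why the keys must be sorted first.
    for _first, group in groupby(all_ordered, key=lambda item: item[0]):
        group_keys = list(group)
        group_total = float(sum(counts[key] for key in group_keys))
        for key in group_keys:
            ml_estimation[key] = counts[key] / group_total

    return ml_estimation


# Hypothetical transition counts between states "A" and "B".
counts = {("A", "A"): 3, ("A", "B"): 1, ("B", "A"): 2, ("B", "B"): 2}
estimates = ml_estimator(counts)
print(estimates)
```

As in the original listing, a group whose counts sum to zero raises ZeroDivisionError, so callers that may pass all-zero rows would need to guard against that case.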
            



Member Data Documentation

Bio.HMM.Trainer.AbstractTrainer._markov_model

Definition at line 46 of file Trainer.py.


The documentation for this class was generated from the following file: