Back to index

python-biopython  1.60
__init__.py
Go to the documentation of this file.
00001 # Copyright 2000-2009 by Iddo Friedberg.  All rights reserved.
00002 # This code is part of the Biopython distribution and governed by its
00003 # license.  Please see the LICENSE file that should have been included
00004 # as part of this package.
00005 #
00006 # Iddo Friedberg idoerg@cc.huji.ac.il
00007 
00008 """Substitution matrices, log odds matrices, and operations on them.
00009 
00010 General:
00011 -------
00012 
00013 This module provides a class and a few routines for generating
00014 substitution matrices, similar ot BLOSUM or PAM matrices, but based on
00015 user-provided data.
00016 The class used for these matrices is SeqMat
00017 
00018 Matrices are implemented as a dictionary. Each index contains a 2-tuple,
00019 which are the two residue/nucleotide types replaced. The value differs
00020 according to the matrix's purpose: e.g in a log-odds frequency matrix, the
00021 value would be log(Pij/(Pi*Pj)) where:
00022 Pij: frequency of substitution of letter (residue/nucletide) i by j 
00023 Pi, Pj: expected frequencies of i and j, respectively.
00024 
00025 Usage:
00026 -----
00027 The following section is layed out in the order by which most people wish
00028 to generate a log-odds matrix. Of course, interim matrices can be
00029 generated and investigated. Most people just want a log-odds matrix,
00030 that's all.
00031 
00032 Generating an Accepted Replacement Matrix:
00033 -----------------------------------------
00034 Initially, you should generate an accepted replacement matrix (ARM)
00035 from your data. The values in ARM are the _counted_ number of
00036 replacements according to your data. The data could be a set of pairs
00037 or multiple alignments. So for instance if Alanine was replaced by
00038 Cysteine 10 times, and Cysteine by Alanine 12 times, the corresponding
00039 ARM entries would be:
00040 ['A','C']: 10,
00041 ['C','A'] 12
00042 As order doesn't matter, user can already provide only one entry:
00043 ['A','C']: 22 
00044 A SeqMat instance may be initialized with either a full (first
00045 method of counting: 10, 12) or half (the latter method, 22) matrices. A
00046 Full protein alphabet matrix would be of the size 20x20 = 400. A Half
00047 matrix of that alphabet would be 20x20/2 + 20/2 = 210. That is because
00048 same-letter entries don't change. (The matrix diagonal). Given an
00049 alphabet size of N:
00050 Full matrix size:N*N
00051 Half matrix size: N(N+1)/2
00052  
00053 If you provide a full matrix, the constructore will create a half-matrix
00054 automatically.
00055 If you provide a half-matrix, make sure of a (low, high) sorted order in
00056 the keys: there should only be 
00057 a ('A','C') not a ('C','A').
00058 
00059 Internal functions:
00060 
00061 Generating the observed frequency matrix (OFM):
00062 ----------------------------------------------
00063 Use: OFM = _build_obs_freq_mat(ARM)
00064 The OFM is generated from the ARM, only instead of replacement counts, it
00065 contains replacement frequencies.
00066 
00067 Generating an expected frequency matrix (EFM):
00068 ---------------------------------------------
00069 Use: EFM = _build_exp_freq_mat(OFM,exp_freq_table)
00070 exp_freq_table: should be a freqTableC instantiation. See freqTable.py for
00071 detailed information. Briefly, the expected frequency table has the
00072 frequencies of appearance for each member of the alphabet
00073 
00074 Generating a substitution frequency matrix (SFM):
00075 ------------------------------------------------
00076 Use: SFM = _build_subs_mat(OFM,EFM)
00077 Accepts an OFM, EFM. Provides the division product of the corresponding
00078 values. 
00079 
00080 Generating a log-odds matrix (LOM):
00081 ----------------------------------
00082 Use: LOM=_build_log_odds_mat(SFM[,logbase=10,factor=10.0,roundit=1])
00083 Accepts an SFM. logbase: base of the logarithm used to generate the
00084 log-odds values. factor: factor used to multiply the log-odds values.
00085 roundit: default - true. Whether to round the values.
00086 Each entry is generated by log(LOM[key])*factor
00087 And rounded if required.
00088 
00089 External:
00090 ---------
00091 In most cases, users will want to generate a log-odds matrix only, without
00092 explicitly calling the OFM --> EFM --> SFM stages. The function
00093 build_log_odds_matrix does that. User provides an ARM and an expected
00094 frequency table. The function returns the log-odds matrix.
00095 
00096 Methods for subtraction, addition and multiplication of matrices:
00097 -----------------------------------------------------------------
00098 * Generation of an expected frequency table from an observed frequency
00099   matrix.
00100 * Calculation of linear correlation coefficient between two matrices.
00101 * Calculation of relative entropy is now done using the
00102   _make_relative_entropy method and is stored in the member
00103   self.relative_entropy
00104 * Calculation of entropy is now done using the _make_entropy method and
00105   is stored in the member self.entropy.
00106 * Jensen-Shannon distance between the distributions from which the
00107   matrices are derived. This is a distance function based on the
00108   distribution's entropies.
00109 """
00110 
00111 
00112 import re
00113 import sys
00114 import copy
00115 import math
00116 import warnings
00117 
00118 # BioPython imports
00119 import Bio
00120 from Bio import Alphabet
00121 from Bio.SubsMat import FreqTable
00122 
00123 log = math.log
00124 # Matrix types
00125 NOTYPE = 0
00126 ACCREP = 1
00127 OBSFREQ = 2
00128 SUBS = 3
00129 EXPFREQ = 4
00130 LO = 5
00131 EPSILON = 0.00000000000001
00132 
00133 
00134 class SeqMat(dict):
00135    """A Generic sequence matrix class
00136    The key is a 2-tuple containing the letter indices of the matrix. Those
00137    should be sorted in the tuple (low, high). Because each matrix is dealt
00138    with as a half-matrix."""
00139 
00140    def _alphabet_from_matrix(self):
00141       ab_dict = {}
00142       s = ''
00143       for i in self:
00144          ab_dict[i[0]] = 1
00145          ab_dict[i[1]] = 1
00146       for i in sorted(ab_dict):
00147          s += i
00148       self.alphabet.letters = s
00149 
00150    def __init__(self, data=None, alphabet=None, mat_name='', build_later=0):
00151       # User may supply:
00152       # data: matrix itself
00153       # mat_name: its name. See below.
00154       # alphabet: an instance of Bio.Alphabet, or a subclass. If not
00155       # supplied, constructor builds its own from that matrix."""
00156       # build_later: skip the matrix size assertion. User will build the
00157       # matrix after creating the instance. Constructor builds a half matrix
00158       # filled with zeroes.
00159 
00160       assert type(mat_name) == type('')
00161 
00162       # "data" may be:
00163       # 1) None --> then self.data is an empty dictionary
00164       # 2) type({}) --> then self takes the items in data
00165       # 3) An instance of SeqMat
00166       # This whole creation-during-execution is done to avoid changing
00167       # default values, the way Python does because default values are
00168       # created when the function is defined, not when it is created.
00169       if data:
00170           try:
00171               self.update(data)
00172           except ValueError:
00173               raise ValueError, "Failed to store data in a dictionary"
00174       if alphabet == None:
00175          alphabet = Alphabet.Alphabet()
00176       assert Alphabet.generic_alphabet.contains(alphabet)
00177       self.alphabet = alphabet
00178 
00179       # If passed alphabet is empty, use the letters in the matrix itself
00180       if not self.alphabet.letters:
00181          self._alphabet_from_matrix()
00182       # Assert matrix size: half or full
00183       if not build_later:
00184          N = len(self.alphabet.letters)
00185          assert len(self) == N**2 or len(self) == N*(N+1)/2
00186       self.ab_list = list(self.alphabet.letters)
00187       self.ab_list.sort()
00188       # Names: a string like "BLOSUM62" or "PAM250"
00189       self.mat_name = mat_name
00190       if build_later:
00191          self._init_zero()
00192       else:
00193          # Convert full to half
00194          self._full_to_half()
00195          self._correct_matrix()
00196       self.sum_letters = {}
00197       self.relative_entropy = 0
00198 
00199    def _correct_matrix(self):
00200       keylist = self.keys()
00201       for key in keylist:
00202          if key[0] > key[1]:
00203             self[(key[1],key[0])] = self[key]
00204             del self[key]
00205 
00206    def _full_to_half(self):
00207       """
00208       Convert a full-matrix to a half-matrix
00209       """
00210       # For instance: two entries ('A','C'):13 and ('C','A'):20 will be summed
00211       # into ('A','C'): 33 and the index ('C','A') will be deleted
00212       # alphabet.letters:('A','A') and ('C','C') will remain the same.
00213 
00214       N = len(self.alphabet.letters)
00215       # Do nothing if this is already a half-matrix
00216       if len(self) == N*(N+1)/2:
00217          return
00218       for i in self.ab_list:
00219          for j in self.ab_list[:self.ab_list.index(i)+1]:
00220             if i != j:
00221                self[j,i] = self[j,i] + self[i,j]
00222                del self[i,j]
00223 
00224    def _init_zero(self):
00225       for i in self.ab_list:
00226          for j in self.ab_list[:self.ab_list.index(i)+1]:
00227             self[j,i] = 0.
00228 
00229    def make_entropy(self):
00230       self.entropy = 0
00231       for i in self:
00232          if self[i] > EPSILON:
00233             self.entropy += self[i]*log(self[i])/log(2)
00234       self.entropy = -self.entropy
00235 
00236    def sum(self):
00237       result = {}
00238       for letter in self.alphabet.letters:
00239           result[letter] = 0.0
00240       for pair, value in self.iteritems():
00241           i1, i2 = pair
00242           if i1==i2:
00243               result[i1] += value
00244           else:
00245               result[i1] += value / 2
00246               result[i2] += value / 2
00247       return result
00248 
00249    def print_full_mat(self,f=None,format="%4d",topformat="%4s",
00250               alphabet=None,factor=1,non_sym=None):
00251       f = f or sys.stdout 
00252       # create a temporary dictionary, which holds the full matrix for
00253       # printing
00254       assert non_sym == None or type(non_sym) == type(1.) or \
00255       type(non_sym) == type(1)
00256       full_mat = copy.copy(self)
00257       for i in self:
00258          if i[0] != i[1]:
00259             full_mat[(i[1],i[0])] = full_mat[i]
00260       if not alphabet:
00261          alphabet = self.ab_list
00262       topline = ''
00263       for i in alphabet:
00264          topline = topline + topformat % i
00265       topline = topline + '\n'
00266       f.write(topline)
00267       for i in alphabet:
00268          outline = i
00269          for j in alphabet:
00270             if alphabet.index(j) > alphabet.index(i) and non_sym is not None:
00271                val = non_sym
00272             else:
00273                val = full_mat[i,j]
00274                val *= factor
00275             if val <= -999:
00276                cur_str = '  ND' 
00277             else:
00278                cur_str = format % val
00279             
00280             outline = outline+cur_str
00281          outline = outline+'\n'
00282          f.write(outline)
00283 
00284    def print_mat(self,f=None,format="%4d",bottomformat="%4s",
00285               alphabet=None,factor=1):
00286       """Print a nice half-matrix. f=sys.stdout to see on the screen
00287       User may pass own alphabet, which should contain all letters in the
00288       alphabet of the matrix, but may be in a different order. This
00289       order will be the order of the letters on the axes"""
00290       
00291       f = f or sys.stdout
00292       if not alphabet:
00293          alphabet = self.ab_list
00294       bottomline = ''
00295       for i in alphabet:
00296          bottomline = bottomline + bottomformat % i
00297       bottomline = bottomline + '\n'
00298       for i in alphabet:
00299          outline = i
00300          for j in alphabet[:alphabet.index(i)+1]:
00301             try:
00302                val = self[j,i]
00303             except KeyError:
00304                val = self[i,j]
00305             val *= factor
00306             if val == -999:
00307                cur_str = '  ND' 
00308             else:
00309                cur_str = format % val
00310             
00311             outline = outline+cur_str
00312          outline = outline+'\n'
00313          f.write(outline)
00314       f.write(bottomline)
00315 
00316    def __str__(self):
00317       """Print a nice half-matrix."""
00318       output = ""
00319       alphabet = self.ab_list
00320       n = len(alphabet)
00321       for i in range(n):
00322          c1 = alphabet[i]
00323          output += c1
00324          for j in range(i+1):
00325             c2 = alphabet[j]
00326             try:
00327                val = self[c2,c1]
00328             except KeyError:
00329                val = self[c1,c2]
00330             if val == -999:
00331                output += '  ND' 
00332             else:
00333                output += "%4d" % val
00334          output += '\n'
00335       output += '%4s' * n % tuple(alphabet) + "\n"
00336       return output
00337 
00338    def __sub__(self,other):
00339       """ returns a number which is the subtraction product of the two matrices"""
00340       mat_diff = 0
00341       for i in self:
00342          mat_diff += (self[i] - other[i])
00343       return mat_diff
00344 
00345    def __mul__(self,other):
00346       """ returns a matrix for which each entry is the multiplication product of the
00347       two matrices passed"""
00348       new_mat = copy.copy(self)
00349       for i in self:
00350          new_mat[i] *= other[i]
00351       return new_mat
00352 
00353    def __add__(self, other):
00354       new_mat = copy.copy(self)
00355       for i in self:
00356          new_mat[i] += other[i]
00357       return new_mat
00358 
00359 class AcceptedReplacementsMatrix(SeqMat):
00360     """Accepted replacements matrix"""
00361 
00362 class ObservedFrequencyMatrix(SeqMat):
00363     """Observed frequency matrix"""
00364 
00365 class ExpectedFrequencyMatrix(SeqMat):
00366     """Expected frequency matrix"""
00367 
00368 
00369 class SubstitutionMatrix(SeqMat):
00370     """Substitution matrix"""
00371 
00372     def calculate_relative_entropy(self, obs_freq_mat):
00373         """Calculate and return the relative entropy with respect to an
00374         observed frequency matrix"""
00375         relative_entropy = 0.
00376         for key, value in self.iteritems():
00377             if value > EPSILON:
00378                 relative_entropy += obs_freq_mat[key]*log(value)
00379         relative_entropy /= log(2)
00380         return relative_entropy
00381 
00382 
00383 class LogOddsMatrix(SeqMat):
00384     """Log odds matrix"""
00385 
00386     def calculate_relative_entropy(self,obs_freq_mat):
00387         """Calculate and return the relative entropy with respect to an
00388         observed frequency matrix"""
00389         relative_entropy = 0.
00390         for key, value in self.iteritems():
00391             relative_entropy += obs_freq_mat[key]*value/log(2)
00392         return relative_entropy
00393 
00394 
00395 def _build_obs_freq_mat(acc_rep_mat):
00396    """
00397    build_obs_freq_mat(acc_rep_mat):
00398    Build the observed frequency matrix, from an accepted replacements matrix
00399    The acc_rep_mat matrix should be generated by the user.
00400    """
00401    # Note: acc_rep_mat should already be a half_matrix!!
00402    total = float(sum(acc_rep_mat.values()))
00403    obs_freq_mat = ObservedFrequencyMatrix(alphabet=acc_rep_mat.alphabet,
00404                                           build_later=1)
00405    for i in acc_rep_mat:
00406       obs_freq_mat[i] = acc_rep_mat[i]/total
00407    return obs_freq_mat
00408 
00409 def _exp_freq_table_from_obs_freq(obs_freq_mat):
00410    exp_freq_table = {}
00411    for i in obs_freq_mat.alphabet.letters:
00412       exp_freq_table[i] = 0.
00413    for i in obs_freq_mat:
00414       if i[0] == i[1]:
00415          exp_freq_table[i[0]] += obs_freq_mat[i]
00416       else:
00417          exp_freq_table[i[0]] += obs_freq_mat[i] / 2.
00418          exp_freq_table[i[1]] += obs_freq_mat[i] / 2.
00419    return FreqTable.FreqTable(exp_freq_table,FreqTable.FREQ)
00420 
00421 def _build_exp_freq_mat(exp_freq_table):
00422    """Build an expected frequency matrix
00423    exp_freq_table: should be a FreqTable instance
00424    """
00425    exp_freq_mat = ExpectedFrequencyMatrix(alphabet=exp_freq_table.alphabet,
00426                                           build_later=1)
00427    for i in exp_freq_mat:
00428       if i[0] == i[1]:
00429          exp_freq_mat[i] = exp_freq_table[i[0]]**2
00430       else:
00431          exp_freq_mat[i] = 2.0*exp_freq_table[i[0]]*exp_freq_table[i[1]]
00432    return exp_freq_mat
00433 #
00434 # Build the substitution matrix
00435 #
00436 def _build_subs_mat(obs_freq_mat,exp_freq_mat):
00437    """ Build the substitution matrix """
00438    if obs_freq_mat.ab_list != exp_freq_mat.ab_list:
00439       raise ValueError("Alphabet mismatch in passed matrices")
00440    subs_mat = SubstitutionMatrix(obs_freq_mat)
00441    for i in obs_freq_mat:
00442       subs_mat[i] = obs_freq_mat[i]/exp_freq_mat[i]
00443    return subs_mat
00444 
00445 #
00446 # Build a log-odds matrix
00447 #
00448 def _build_log_odds_mat(subs_mat,logbase=2,factor=10.0,round_digit=0,keep_nd=0):
00449    """_build_log_odds_mat(subs_mat,logbase=10,factor=10.0,round_digit=1):
00450    Build a log-odds matrix
00451    logbase=2: base of logarithm used to build (default 2)
00452    factor=10.: a factor by which each matrix entry is multiplied
00453    round_digit: roundoff place after decimal point
00454    keep_nd: if true, keeps the -999 value for non-determined values (for which there
00455    are no substitutions in the frequency substitutions matrix). If false, plants the
00456    minimum log-odds value of the matrix in entries containing -999
00457    """
00458    lo_mat = LogOddsMatrix(subs_mat)
00459    for key, value in subs_mat.iteritems():
00460       if value < EPSILON:
00461          lo_mat[key] = -999
00462       else:
00463          lo_mat[key] = round(factor*log(value)/log(logbase),round_digit)
00464    mat_min = min(lo_mat.values())
00465    if not keep_nd:
00466       for i in lo_mat:
00467          if lo_mat[i] <= -999:
00468             lo_mat[i] = mat_min
00469    return lo_mat
00470 
00471 #
00472 # External function. User provides an accepted replacement matrix, and,
00473 # optionally the following: expected frequency table, log base, mult. factor,
00474 # and rounding factor. Generates a log-odds matrix, calling internal SubsMat
00475 # functions.
00476 #
00477 def make_log_odds_matrix(acc_rep_mat,exp_freq_table=None,logbase=2,
00478                     factor=1.,round_digit=9,keep_nd=0):
00479    obs_freq_mat = _build_obs_freq_mat(acc_rep_mat)
00480    if not exp_freq_table:
00481       exp_freq_table = _exp_freq_table_from_obs_freq(obs_freq_mat)
00482    exp_freq_mat = _build_exp_freq_mat(exp_freq_table)
00483    subs_mat = _build_subs_mat(obs_freq_mat, exp_freq_mat)
00484    lo_mat = _build_log_odds_mat(subs_mat,logbase,factor,round_digit,keep_nd)
00485    return lo_mat
00486 
00487 def observed_frequency_to_substitution_matrix(obs_freq_mat):
00488    exp_freq_table = _exp_freq_table_from_obs_freq(obs_freq_mat)
00489    exp_freq_mat = _build_exp_freq_mat(exp_freq_table)
00490    subs_mat = _build_subs_mat(obs_freq_mat, exp_freq_mat)
00491    return subs_mat
00492 
00493 def read_text_matrix(data_file):
00494    matrix = {}
00495    tmp = data_file.read().split("\n")
00496    table=[]
00497    for i in tmp: 
00498       table.append(i.split())
00499    # remove records beginning with ``#''
00500    for rec in table[:]:
00501       if (rec.count('#') > 0):
00502          table.remove(rec)
00503 
00504    # remove null lists
00505    while (table.count([]) > 0):
00506       table.remove([])
00507    # build a dictionary
00508    alphabet = table[0]
00509    j = 0
00510    for rec in table[1:]:
00511       # print j
00512       row = alphabet[j]
00513       # row = rec[0]
00514       if re.compile('[A-z\*]').match(rec[0]):
00515          first_col = 1
00516       else:
00517          first_col = 0
00518       i = 0
00519       for field in rec[first_col:]:
00520          col = alphabet[i]
00521          matrix[(row,col)] = float(field)
00522          i += 1
00523       j += 1
00524    # delete entries with an asterisk
00525    for i in matrix.keys():
00526       if '*' in i: del(matrix[i])
00527    ret_mat = SeqMat(matrix)
00528    return ret_mat
00529 
00530 diagNO = 1
00531 diagONLY = 2
00532 diagALL = 3
00533 
00534 def two_mat_relative_entropy(mat_1,mat_2,logbase=2,diag=diagALL):
00535    rel_ent = 0.
00536    key_list_1 = sorted(mat_1)
00537    key_list_2 = sorted(mat_2)
00538    key_list = []
00539    sum_ent_1 = 0.; sum_ent_2 = 0.
00540    for i in key_list_1:
00541       if i in key_list_2:
00542          key_list.append(i)
00543    if len(key_list_1) != len(key_list_2):
00544       sys.stderr.write("Warning:first matrix has more entries than the second\n")
00545    if key_list_1 != key_list_2:
00546       sys.stderr.write("Warning: indices not the same between matrices\n")
00547    for key in key_list:
00548       if diag == diagNO and key[0] == key[1]:
00549          continue
00550       if diag == diagONLY and key[0] != key[1]:
00551          continue
00552       if mat_1[key] > EPSILON and mat_2[key] > EPSILON:
00553          sum_ent_1 += mat_1[key]
00554          sum_ent_2 += mat_2[key]
00555          
00556    for key in key_list:
00557       if diag == diagNO and key[0] == key[1]:
00558          continue
00559       if diag == diagONLY and key[0] != key[1]:
00560          continue
00561       if mat_1[key] > EPSILON and mat_2[key] > EPSILON:
00562          val_1 = mat_1[key] / sum_ent_1
00563          val_2 = mat_2[key] / sum_ent_2
00564 #         rel_ent += mat_1[key] * log(mat_1[key]/mat_2[key])/log(logbase)
00565          rel_ent += val_1 * log(val_1/val_2)/log(logbase)
00566    return rel_ent
00567 
00568 ## Gives the linear correlation coefficient between two matrices
00569 def two_mat_correlation(mat_1, mat_2):
00570     try:
00571         import numpy
00572     except ImportError:
00573         raise ImportError, "Please install Numerical Python (numpy) if you want to use this function"
00574     values = []
00575     assert mat_1.ab_list == mat_2.ab_list
00576     for ab_pair in mat_1:
00577        try:
00578           values.append((mat_1[ab_pair], mat_2[ab_pair]))
00579        except KeyError:
00580           raise ValueError, "%s is not a common key" % ab_pair
00581     correlation_matrix = numpy.corrcoef(values, rowvar=0)
00582     correlation = correlation_matrix[0,1]
00583     return correlation
00584 
00585 # Jensen-Shannon Distance
00586 # Need to input observed frequency matrices
00587 def two_mat_DJS(mat_1,mat_2,pi_1=0.5,pi_2=0.5):
00588    assert mat_1.ab_list == mat_2.ab_list
00589    assert pi_1 > 0 and pi_2 > 0 and pi_1< 1 and pi_2 <1
00590    assert not (pi_1 + pi_2 - 1.0 > EPSILON)
00591    sum_mat = SeqMat(build_later=1)
00592    sum_mat.ab_list = mat_1.ab_list
00593    for i in mat_1:
00594       sum_mat[i] = pi_1 * mat_1[i] + pi_2 * mat_2[i]
00595    sum_mat.make_entropy()
00596    mat_1.make_entropy()
00597    mat_2.make_entropy()
00598    # print mat_1.entropy, mat_2.entropy
00599    dJS = sum_mat.entropy - pi_1 * mat_1.entropy - pi_2 *mat_2.entropy
00600    return dJS
00601       
00602 """
00603 This isn't working yet. Boo hoo!
00604 def two_mat_print(mat_1, mat_2, f=None,alphabet=None,factor_1=1, factor_2=1,
00605                   format="%4d",bottomformat="%4s",topformat="%4s",
00606                   topindent=7*" ", bottomindent=1*" "):
00607    f = f or sys.stdout
00608    if not alphabet:
00609       assert mat_1.ab_list == mat_2.ab_list
00610       alphabet = mat_1.ab_list
00611    len_alphabet = len(alphabet)
00612    print_mat = {}
00613    topline = topindent
00614    bottomline = bottomindent
00615    for i in alphabet:
00616       bottomline += bottomformat % i
00617       topline += topformat % alphabet[len_alphabet-alphabet.index(i)-1]
00618    topline += '\n'
00619    bottomline += '\n'
00620    f.write(topline)
00621    for i in alphabet:
00622       for j in alphabet:
00623          print_mat[i,j] = -999
00624    diag_1 = {}; diag_2 = {}
00625    for i in alphabet:
00626       for j in alphabet[:alphabet.index(i)+1]:
00627          if i == j:
00628             diag_1[i] = mat_1[(i,i)] 
00629             diag_2[i] = mat_2[(alphabet[len_alphabet-alphabet.index(i)-1],
00630                    alphabet[len_alphabet-alphabet.index(i)-1])]
00631          else:
00632             if i > j:
00633                key = (j,i)
00634             else:
00635                key = (i,j)
00636             mat_2_key = [alphabet[len_alphabet-alphabet.index(key[0])-1],
00637                    alphabet[len_alphabet-alphabet.index(key[1])-1]]
00638             # print mat_2_key
00639             mat_2_key.sort(); mat_2_key = tuple(mat_2_key)
00640             # print key ,"||",  mat_2_key
00641             print_mat[key] = mat_2[mat_2_key] 
00642             print_mat[(key[1],key[0])] = mat_1[key]
00643    for i in alphabet:
00644       outline = i
00645       for j in alphabet:
00646          if i == j:
00647             if diag_1[i] == -999:
00648                val_1 = ' ND'
00649             else:
00650                val_1 = format % (diag_1[i]*factor_1)
00651             if diag_2[i] == -999:
00652                val_2 = ' ND'
00653             else:
00654                val_2 = format % (diag_2[i]*factor_2)
00655             cur_str = val_1 + "  " + val_2
00656          else:
00657             if print_mat[(i,j)] == -999:
00658                val = ' ND'
00659             elif alphabet.index(i) > alphabet.index(j):
00660                val = format % (print_mat[(i,j)]*factor_1)
00661             else:
00662                val = format % (print_mat[(i,j)]*factor_2)
00663             cur_str = val
00664          outline += cur_str
00665       outline += bottomformat % (alphabet[len_alphabet-alphabet.index(i)-1] +
00666                                  '\n')
00667       f.write(outline)
00668    f.write(bottomline)
00669 """