python-biopython  1.60
Bio.NeuralNetwork.Gene.Schema.Schema Class Reference

Public Member Functions

def __init__
def encode_motif
def find_ambiguous
def num_ambiguous
def find_matches
def num_matches
def all_unambiguous

Private Attributes

 _ambiguity_info
 _motif_cache

Detailed Description

Deal with motifs that have ambiguity characters in them.

This motif class allows specific ambiguity characters and tries to
speed up finding motifs using regular expressions.

This is likely to be a replacement for the Schema representation,
since it allows multiple ambiguity characters to be used.

Definition at line 33 of file Schema.py.


Constructor & Destructor Documentation

def Bio.NeuralNetwork.Gene.Schema.Schema.__init__ (   self,
  ambiguity_info 
)
Initialize with ambiguity information.

Arguments:

o ambiguity_info - A dictionary which maps letters in the motifs to
the ambiguous characters which they might represent. For example,
{'R' : 'AG'} specifies that Rs in the motif can match an A or a G.
All letters in the motif must be represented in the ambiguity_info
dictionary.

Definition at line 42 of file Schema.py.

    def __init__(self, ambiguity_info):
        """Initialize with ambiguity information.

        Arguments:

        o ambiguity_info - A dictionary which maps letters in the motifs to
        the ambiguous characters which they might represent. For example,
        {'R' : 'AG'} specifies that Rs in the motif can match an A or a G.
        All letters in the motif must be represented in the ambiguity_info
        dictionary.
        """
        self._ambiguity_info = ambiguity_info

        # a cache of all encoded motifs
        self._motif_cache = {}
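
As a rough illustration of the shape ambiguity_info is expected to have, the sketch below builds a small DNA table and checks that a motif is fully covered by it. Only the 'R' entry comes from the docstring above; the other entries and the helper uncovered_letters are hypothetical and are not part of Schema.py.

```python
# Hypothetical ambiguity table. Only 'R' -> 'AG' is taken from the docstring;
# the remaining entries are assumptions made for this example.
ambiguity_info = {
    "A": "A",   # unambiguous letters map to themselves
    "C": "C",
    "G": "G",
    "T": "T",
    "R": "AG",  # R can match an A or a G
    "Y": "CT",  # assumed entry, added for illustration
}


def uncovered_letters(motif, info):
    """Return the motif letters that have no entry in the ambiguity table."""
    return sorted(set(motif) - set(info))


print(uncovered_letters("GART", ambiguity_info))  # []
print(uncovered_letters("GANT", ambiguity_info))  # ['N']
```

A motif containing an uncovered letter such as 'N' would trigger the KeyError raised by encode_motif and find_ambiguous.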


Member Function Documentation

def Bio.NeuralNetwork.Gene.Schema.Schema.all_unambiguous (   self )
Return a listing of all unambiguous letters allowed in motifs.

Definition at line 138 of file Schema.py.

    def all_unambiguous(self):
        """Return a listing of all unambiguous letters allowed in motifs.
        """
        all_letters = sorted(self._ambiguity_info)
        unambig_letters = []

        for letter in all_letters:
            possible_matches = self._ambiguity_info[letter]
            if len(possible_matches) == 1:
                unambig_letters.append(letter)

        return unambig_letters

def Bio.NeuralNetwork.Gene.Schema.Schema.encode_motif (   self,
  motif 
)
Encode the passed motif as a regular expression pattern object.

Arguments:

o motif - The motif we want to encode. This should be a string.

Returns:
A compiled regular expression pattern object that can be used
for searching strings.

Definition at line 58 of file Schema.py.

    def encode_motif(self, motif):
        """Encode the passed motif as a regular expression pattern object.

        Arguments:

        o motif - The motif we want to encode. This should be a string.

        Returns:
        A compiled regular expression pattern object that can be used
        for searching strings.
        """
        regexp_string = ""

        for motif_letter in motif:
            try:
                letter_matches = self._ambiguity_info[motif_letter]
            except KeyError:
                raise KeyError("No match information for letter %s"
                               % motif_letter)

            if len(letter_matches) > 1:
                regexp_match = "[" + letter_matches + "]"
            elif len(letter_matches) == 1:
                regexp_match = letter_matches
            else:
                raise ValueError("Unexpected match information %s"
                                 % letter_matches)

            regexp_string += regexp_match

        return re.compile(regexp_string)
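
To make the encoding concrete, here is a standalone sketch of the same idea written as a plain function, so it runs without Biopython installed. The function takes the ambiguity table as an explicit argument, which is an assumption of this example rather than the signature of the method above; the table contents are also illustrative.

```python
import re


def encode_motif(motif, ambiguity_info):
    """Standalone sketch of the encoding step (not the Biopython method)."""
    parts = []
    for letter in motif:
        matches = ambiguity_info[letter]  # raises KeyError for unknown letters
        if len(matches) > 1:
            # an ambiguous letter becomes a regex character class
            parts.append("[" + matches + "]")
        elif len(matches) == 1:
            # an unambiguous letter is emitted as-is
            parts.append(matches)
        else:
            raise ValueError("Unexpected match information %r" % matches)
    return re.compile("".join(parts))


# Illustrative table; only 'R' -> 'AG' appears in the documentation above.
info = {"A": "A", "G": "G", "T": "T", "R": "AG"}
pattern = encode_motif("GRT", info)
print(pattern.pattern)               # G[AG]T
print(pattern.findall("GATGGTGCT"))  # ['GAT', 'GGT']
```

Like the original, this sketch does not escape the letters, so it assumes the table contains no regular expression metacharacters.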

def Bio.NeuralNetwork.Gene.Schema.Schema.find_ambiguous (   self,
  motif 
)
Return the location of ambiguous items in the motif.

This just checks through the motif and compares each letter
against the ambiguity information. If a letter stands for multiple
items, it is ambiguous.

Definition at line 90 of file Schema.py.

    def find_ambiguous(self, motif):
        """Return the location of ambiguous items in the motif.

        This just checks through the motif and compares each letter
        against the ambiguity information. If a letter stands for multiple
        items, it is ambiguous.
        """
        ambig_positions = []
        for motif_letter_pos in range(len(motif)):
            motif_letter = motif[motif_letter_pos]
            try:
                letter_matches = self._ambiguity_info[motif_letter]
            except KeyError:
                raise KeyError("No match information for letter %s"
                               % motif_letter)

            if len(letter_matches) > 1:
                ambig_positions.append(motif_letter_pos)

        return ambig_positions
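
The positions returned are zero-based indices into the motif string. A minimal standalone sketch of the same check, assuming an illustrative ambiguity table and a free function rather than the method above:

```python
def find_ambiguous(motif, ambiguity_info):
    """Return zero-based positions of letters that stand for several items."""
    return [pos for pos, letter in enumerate(motif)
            if len(ambiguity_info[letter]) > 1]


# Illustrative table; 'Y' -> 'CT' is an assumption made for this example.
info = {"A": "A", "G": "G", "T": "T", "R": "AG", "Y": "CT"}
print(find_ambiguous("GRTY", info))  # [1, 3]
print(find_ambiguous("GAT", info))   # []
```

Taking the length of the first result gives the same count that num_ambiguous reports.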

def Bio.NeuralNetwork.Gene.Schema.Schema.find_matches (   self,
  motif,
  query 
)
Return all non-overlapping motif matches in the query string.

This utilizes the regular expression findall function, and will
return a list of all non-overlapping occurrences in query that
match the ambiguous motif.

Definition at line 117 of file Schema.py.

    def find_matches(self, motif, query):
        """Return all non-overlapping motif matches in the query string.

        This utilizes the regular expression findall function, and will
        return a list of all non-overlapping occurrences in query that
        match the ambiguous motif.
        """
        try:
            motif_pattern = self._motif_cache[motif]
        except KeyError:
            motif_pattern = self.encode_motif(motif)
            self._motif_cache[motif] = motif_pattern

        return motif_pattern.findall(query)
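
Because findall resumes scanning after the end of each match, overlapping occurrences are not reported. The sketch below shows the effect using only the standard re module. The pattern is what a motif such as "RAR" would encode to under the assumed mapping R -> "AG"; the lookahead variant is shown purely for contrast and is not something this class does.

```python
import re

# Pattern that the motif "RAR" would encode to, assuming R -> "AG".
pattern = re.compile("[AG]A[AG]")
query = "AAAAA"

# findall continues after each match, so only one hit is reported here.
non_overlapping = pattern.findall(query)
print(non_overlapping)  # ['AAA']

# For contrast only: a zero-width lookahead reports overlapping hits too.
overlapping = re.findall("(?=([AG]A[AG]))", query)
print(overlapping)  # ['AAA', 'AAA', 'AAA']
```

Counts returned by num_matches therefore reflect the non-overlapping total, which can be lower than the number of positions at which the motif fits.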

def Bio.NeuralNetwork.Gene.Schema.Schema.num_ambiguous (   self,
  motif 
)
Return the number of ambiguous letters in a given motif.

Definition at line 111 of file Schema.py.

    def num_ambiguous(self, motif):
        """Return the number of ambiguous letters in a given motif.
        """
        ambig_positions = self.find_ambiguous(motif)
        return len(ambig_positions)

def Bio.NeuralNetwork.Gene.Schema.Schema.num_matches (   self,
  motif,
  query 
)
Find the number of non-overlapping times motif occurs in query.

Definition at line 132 of file Schema.py.

    def num_matches(self, motif, query):
        """Find the number of non-overlapping times motif occurs in query.
        """
        all_matches = self.find_matches(motif, query)
        return len(all_matches)


Member Data Documentation

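Bio.NeuralNetwork.Gene.Schema.Schema._ambiguity_info [private]
Definition at line 53 of file Schema.py.

Bio.NeuralNetwork.Gene.Schema.Schema._motif_cache [private]
Definition at line 56 of file Schema.py.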


The documentation for this class was generated from the following file: