python-biopython  1.60
Bio.NeuralNetwork.Gene.Signature.SignatureFinder Class Reference

Public Member Functions

def __init__
def find

Private Member Functions

def _get_signature_dict
def _add_sig

Private Attributes

 _alphabet_strict

Detailed Description

Find Signatures in a group of sequence records.

In this simple implementation, a signature is defined as
two motifs separated by a gap. Finding more complicated signatures
would need something a lot smarter than this.

Definition at line 16 of file Signature.py.


Constructor & Destructor Documentation

def Bio.NeuralNetwork.Gene.Signature.SignatureFinder.__init__(self, alphabet_strict = 1)
Initialize a finder to get signatures.

Arguments:

o alphabet_strict - Specify whether signatures should be required
to have all letters in the signature be consistent with the
alphabet of the original sequence. This requires that all Seqs
used have a consistent alphabet. This helps protect against getting
useless signatures full of ambiguity signals.

Definition at line 23 of file Signature.py.
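
As a rough illustration of what the strict check guards against, the sketch below tests whether every letter of a motif belongs to a fixed set of letters. The helper name within_alphabet and the plain-string alphabet DNA_LETTERS are assumptions made for this example; they are not part of Bio.NeuralNetwork.Gene.Signature, which works with Seq alphabets instead.

```python
def within_alphabet(motif, letters):
    """Return True if every character of motif is one of letters."""
    return all(ch in letters for ch in motif)


# Hypothetical unambiguous DNA alphabet, written as a plain string.
DNA_LETTERS = "GATC"

print(within_alphabet("AGC", DNA_LETTERS))  # True: all letters are allowed
print(within_alphabet("ANC", DNA_LETTERS))  # False: N is an ambiguity code
```

With a check like this, a motif containing ambiguity codes such as N would be rejected rather than counted as a signature half.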

    def __init__(self, alphabet_strict = 1):
        """Initialize a finder to get signatures.

        Arguments:

        o alphabet_strict - Specify whether signatures should be required
        to have all letters in the signature be consistent with the
        alphabet of the original sequence. This requires that all Seqs
        used have a consistent alphabet. This helps protect against getting
        useless signatures full of ambiguity signals.
        """
        self._alphabet_strict = alphabet_strict


Member Function Documentation

def Bio.NeuralNetwork.Gene.Signature.SignatureFinder._add_sig(self, sig_dict, sig_to_add) [private]
Add a signature to the given dictionary.

Definition at line 101 of file Signature.py.

    def _add_sig(self, sig_dict, sig_to_add):
        """Add a signature to the given dictionary.
        """
        # increment the count of the signature if it is already present
        if sig_to_add in sig_dict:
            sig_dict[sig_to_add] += 1
        # otherwise add it to the dictionary
        else:
            sig_dict[sig_to_add] = 1

        return sig_dict
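
The counting pattern can be tried outside the class. The sketch below is a standalone rewrite, not the library code: add_sig is a hypothetical free function using dict.get, and the sample signatures are made up. It also checks that the result matches what collections.Counter would produce for the same input.

```python
from collections import Counter


def add_sig(sig_dict, sig_to_add):
    """Increment the count for sig_to_add, starting from zero if absent."""
    sig_dict[sig_to_add] = sig_dict.get(sig_to_add, 0) + 1
    return sig_dict


# Made-up signatures: each one is a (first motif, second motif) tuple.
sample_sigs = [("AG", "GC"), ("AA", "AG"), ("AG", "GC")]

counts = {}
for sig in sample_sigs:
    counts = add_sig(counts, sig)

print(counts[("AG", "GC")])                # 2
print(counts == dict(Counter(sample_sigs)))  # True
```

Tuples work as dictionary keys here because they are hashable, which is why a signature can be stored as a pair of motif strings.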

def Bio.NeuralNetwork.Gene.Signature.SignatureFinder._get_signature_dict(self, seq_records, sig_size, max_gap) [private]
Return a dictionary with all signatures and their counts.

This internal function does all of the hard work for the
find method.

Definition at line 54 of file Signature.py.

    def _get_signature_dict(self, seq_records, sig_size, max_gap):
        """Return a dictionary with all signatures and their counts.

        This internal function does all of the hard work for the
        find method.
        """
        if self._alphabet_strict:
            alphabet = seq_records[0].seq.alphabet
        else:
            alphabet = None

        # loop through all records to find signatures
        all_sigs = {}
        for seq_record in seq_records:
            # if we are working with alphabets, make sure we are consistent
            if alphabet is not None:
                assert seq_record.seq.alphabet == alphabet, \
                       "Working with alphabet %s and got %s" % \
                       (alphabet, seq_record.seq.alphabet)

            # now start finding signatures in the sequence
            largest_sig_size = sig_size * 2 + max_gap
            for start in range(len(seq_record.seq) - (largest_sig_size - 1)):
                # find the first part of the signature
                first_sig = seq_record.seq[start:start + sig_size].tostring()

                # now find all of the second parts of the signature
                for second in range(start + 1, (start + 1) + max_gap):
                    second_sig = seq_record.seq[second: second + sig_size].tostring()

                    # if we are being alphabet strict, make sure both parts
                    # of the sig fall within the specified alphabet
                    if alphabet is not None:
                        first_seq = Seq(first_sig, alphabet)
                        second_seq = Seq(second_sig, alphabet)
                        if _verify_alphabet(first_seq) \
                        and _verify_alphabet(second_seq):
                            all_sigs = self._add_sig(all_sigs,
                                                     (first_sig, second_sig))

                    # if we are not being strict, just add the motif
                    else:
                        all_sigs = self._add_sig(all_sigs,
                                                 (first_sig, second_sig))

        return all_sigs
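
The nested loops can be followed on plain Python strings. The sketch below mirrors the same index ranges but is a simplified stand-in, not the library code: signature_counts is a hypothetical name, it takes strings instead of SeqRecord objects, and it skips the alphabet check entirely.

```python
def signature_counts(sequences, sig_size, max_gap):
    """Count (first, second) motif pairs using the same index ranges."""
    counts = {}
    largest_sig_size = sig_size * 2 + max_gap
    for seq in sequences:
        # range() is empty when the sequence is shorter than the window
        for start in range(len(seq) - (largest_sig_size - 1)):
            first = seq[start:start + sig_size]
            # second motif start positions, as in the listing
            for second in range(start + 1, (start + 1) + max_gap):
                pair = (first, seq[second:second + sig_size])
                counts[pair] = counts.get(pair, 0) + 1
    return counts


result = signature_counts(["AAGCTT"], sig_size=2, max_gap=1)
print(result)
```

For this input the window spans 5 letters, so only start positions 0 and 1 are visited, and each contributes one pair.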

def Bio.NeuralNetwork.Gene.Signature.SignatureFinder.find(self, seq_records, signature_size, max_gap)
Find all signatures in a group of sequences.

Arguments:

o seq_records - A list of SeqRecord objects we'll use the sequences
from to find signatures.

o signature_size - The size of each half of a signature (i.e., if this
is set at 3, then the signature could be AGC-----GAC)

o max_gap - The maximum gap size between two parts of a signature.

Definition at line 36 of file Signature.py.
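
To see how signature_size and max_gap interact, the sketch below computes the largest span a signature can cover and how many start positions fit in a sequence of a given length, using the same arithmetic as the _get_signature_dict listing. The function name window_stats is hypothetical and exists only for this example.

```python
def window_stats(seq_len, signature_size, max_gap):
    """Return (largest signature span, number of start positions)."""
    largest_sig_size = signature_size * 2 + max_gap
    # Mirrors range(len(seq) - (largest_sig_size - 1)); never negative.
    starts = max(0, seq_len - (largest_sig_size - 1))
    return largest_sig_size, starts


print(window_stats(20, 3, 5))  # (11, 10)
print(len("AGC-----GAC"))      # 11, the span of the example signature
print(window_stats(8, 3, 5))   # (11, 0): sequence too short for any window
```

With signature_size set to 3 and max_gap set to 5, the span of 11 matches the length of the AGC-----GAC example, and sequences shorter than that span yield no signatures.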

    def find(self, seq_records, signature_size, max_gap):
        """Find all signatures in a group of sequences.

        Arguments:

        o seq_records - A list of SeqRecord objects we'll use the sequences
        from to find signatures.

        o signature_size - The size of each half of a signature (i.e., if this
        is set at 3, then the signature could be AGC-----GAC)

        o max_gap - The maximum gap size between two parts of a signature.
        """
        sig_info = self._get_signature_dict(seq_records, signature_size,
                                            max_gap)

        return PatternRepository(sig_info)


Member Data Documentation

Bio.NeuralNetwork.Gene.Signature.SignatureFinder._alphabet_strict [private]

Definition at line 34 of file Signature.py.


The documentation for this class was generated from the following file: