
python-biopython  1.60
Bio.AlignIO.PhylipIO.PhylipIterator Class Reference

Public Member Functions

def next

Static Public Attributes

 id_width = _PHYLIP_ID_WIDTH

Private Member Functions

def _is_header
def _split_id

Private Attributes

 _header

Detailed Description

Reads a Phylip alignment file returning a MultipleSeqAlignment iterator.

Record identifiers are limited to at most 10 characters.

It only copes with interlaced PHYLIP files. Sequential files in which the
sequences are split over multiple lines will not work.

For more information on the file format, please see:
http://evolution.genetics.washington.edu/phylip/doc/sequence.html
http://evolution.genetics.washington.edu/phylip/doc/main.html#inputfiles
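
To make the interlaced layout concrete, here is a minimal sketch that builds a small interlaced file in memory. The helper `format_interlaced` and its `block_size` parameter are hypothetical, introduced only for illustration; this is not Biopython's writer. It assumes IDs are padded or truncated to exactly 10 characters and that IDs appear only in the first block.

```python
def format_interlaced(records, block_size=10, id_width=10):
    """Return an interlaced PHYLIP string for (id, sequence) pairs.

    Hypothetical helper for illustration only; not part of Biopython.
    """
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("All sequences must have the same length")
    # Header line: number of sequences, then alignment length.
    lines = [" %d %d" % (len(records), length)]
    for start in range(0, length, block_size):
        if start:
            lines.append("")  # blank line between blocks
        for name, seq in records:
            chunk = seq[start:start + block_size]
            if start == 0:
                # Only the first block carries the fixed-width ID column.
                lines.append(name[:id_width].ljust(id_width) + chunk)
            else:
                lines.append(chunk)
    return "\n".join(lines) + "\n"


example = format_interlaced([("Alpha", "ACGTACGTACGTAC"),
                             ("Beta", "ACGTTCGTACGAAC")])
print(example)
```

In the printed result, every record contributes one line per block, which is the arrangement this iterator expects. A sequential file would instead list all lines of the first record before moving on to the second.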

Definition at line 150 of file PhylipIO.py.


Member Function Documentation

def Bio.AlignIO.PhylipIO.PhylipIterator._is_header (   self,
  line 
) [private]

Definition at line 166 of file PhylipIO.py.

    def _is_header(self, line):
        line = line.strip()
        parts = filter(None, line.split())
        if len(parts)!=2:
            return False # First line should have two integers
        try:
            number_of_seqs = int(parts[0])
            length_of_seqs = int(parts[1])
            return True
        except ValueError:
            return False # First line should have two integers
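
The check above can be tried outside the class. The following is a standalone sketch written for Python 3 (the listing itself is Python 2 code, where `filter` returns a list); the free function `is_header` is a hypothetical stand-in for the private method, not a Biopython API.

```python
def is_header(line):
    """Return True if the line looks like a PHYLIP header: exactly two integers."""
    parts = line.split()  # split() with no arguments already drops empty fields
    if len(parts) != 2:
        return False
    try:
        int(parts[0])
        int(parts[1])
    except ValueError:
        return False
    return True


for candidate in [" 5 42", "5 42 extra", "Alpha     ACGT", "ACGT 42"]:
    print(repr(candidate), is_header(candidate))
```

Only the first candidate is accepted; the others have either the wrong number of fields or a non-integer field.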

def Bio.AlignIO.PhylipIO.PhylipIterator._split_id (   self,
  line 
) [private]
Extracts the sequence ID from a Phylip line, returning a tuple
containing:

    (sequence_id, sequence_residues)

The first 10 characters of the line are the sequence ID; the
remainder is sequence data.

Reimplemented in Bio.AlignIO.PhylipIO.RelaxedPhylipIterator.

Definition at line 178 of file PhylipIO.py.

    def _split_id(self, line):
        """
        Extracts the sequence ID from a Phylip line, returning a tuple
        containing:

            (sequence_id, sequence_residues)

        The first 10 characters of the line are the sequence ID; the
        remainder is sequence data.
        """
        seq_id = line[:self.id_width].strip()
        seq = line[self.id_width:].strip().replace(' ', '')
        return seq_id, seq
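
A short standalone sketch shows what the fixed-width split means in practice. The function `split_id` and the constant `ID_WIDTH` are hypothetical names for illustration; the value 10 is assumed to match `_PHYLIP_ID_WIDTH`, as the description above suggests.

```python
ID_WIDTH = 10  # assumption: same value as _PHYLIP_ID_WIDTH


def split_id(line, id_width=ID_WIDTH):
    """Split a first-block line into (sequence_id, sequence_residues)."""
    seq_id = line[:id_width].strip()
    seq = line[id_width:].strip().replace(" ", "")
    return seq_id, seq


print(split_id("Alpha     ACGT ACGT"))
print(split_id("Homo_sapiensACGT"))
```

The second call illustrates the main pitfall of the strict format: a name longer than 10 characters is cut at the column boundary, so the ID becomes `Homo_sapie` and the leftover `ns` is read as sequence data.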

def Bio.AlignIO.PhylipIO.PhylipIterator.next (   self )

Reimplemented in Bio.AlignIO.PhylipIO.SequentialPhylipIterator.

Definition at line 192 of file PhylipIO.py.

    def next(self):
        handle = self.handle

        try:
            #Header we saved from when we were parsing
            #the previous alignment.
            line = self._header
            del self._header
        except AttributeError:
            line = handle.readline()

        if not line:
            raise StopIteration
        line = line.strip()
        parts = filter(None, line.split())
        if len(parts)!=2:
            raise ValueError("First line should have two integers")
        try:
            number_of_seqs = int(parts[0])
            length_of_seqs = int(parts[1])
        except ValueError:
            raise ValueError("First line should have two integers")

        assert self._is_header(line)

        if self.records_per_alignment is not None \
        and self.records_per_alignment != number_of_seqs:
            raise ValueError("Found %i records in this alignment, told to expect %i" \
                             % (number_of_seqs, self.records_per_alignment))

        ids = []
        seqs = []

        # By default, expects STRICT truncation / padding to 10 characters.
        # Does not require any whitespace between name and seq.
        for i in xrange(number_of_seqs):
            line = handle.readline().rstrip()
            sequence_id, s = self._split_id(line)
            ids.append(sequence_id)
            if "." in s:
                raise ValueError("PHYLIP format no longer allows dots in sequence")
            seqs.append([s])

        #Look for further blocks
        line=""
        while True:
            #Skip any blank lines between blocks...
            while ""==line.strip():
                line = handle.readline()
                if not line : break #end of file
            if not line : break #end of file

            if self._is_header(line):
                #Looks like the start of a concatenated alignment
                self._header = line
                break

            #print "New block..."
            for i in xrange(number_of_seqs):
                s = line.strip().replace(" ","")
                if "." in s:
                    raise ValueError("PHYLIP format no longer allows dots in sequence")
                seqs[i].append(s)
                line = handle.readline()
                if (not line) and i+1 < number_of_seqs:
                    raise ValueError("End of file mid-block")
            if not line : break #end of file

        records = (SeqRecord(Seq("".join(s), self.alphabet), \
                             id=i, name=i, description=i) \
                   for (i,s) in zip(ids, seqs))
        return MultipleSeqAlignment(records, self.alphabet)
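
The block-by-block reassembly performed by `next` can be sketched without Biopython. The function `read_interlaced` below is a hypothetical, simplified Python 3 illustration: it handles a single alignment only (no concatenated alignments and no saved header), returns plain tuples instead of a `MultipleSeqAlignment`, and adds a length check against the header that the listing above does not perform itself.

```python
import io


def read_interlaced(handle, id_width=10):
    """Parse one interlaced PHYLIP alignment into a list of (id, sequence) pairs."""
    header = handle.readline().split()
    if len(header) != 2:
        raise ValueError("First line should have two integers")
    number_of_seqs, length_of_seqs = int(header[0]), int(header[1])

    ids, chunks = [], []
    # First block: fixed-width ID column followed by sequence data.
    for _ in range(number_of_seqs):
        line = handle.readline().rstrip()
        ids.append(line[:id_width].strip())
        chunks.append([line[id_width:].replace(" ", "")])

    # Later blocks: sequence data only, in the same record order.
    rest = [line.strip().replace(" ", "") for line in handle if line.strip()]
    if len(rest) % number_of_seqs:
        raise ValueError("End of file mid-block")
    for index, data in enumerate(rest):
        chunks[index % number_of_seqs].append(data)

    records = [(name, "".join(parts)) for name, parts in zip(ids, chunks)]
    for name, seq in records:
        if "." in seq:
            raise ValueError("PHYLIP format no longer allows dots in sequence")
        if len(seq) != length_of_seqs:  # extra check, not in the listing above
            raise ValueError("Sequence %r has length %d, expected %d"
                             % (name, len(seq), length_of_seqs))
    return records


sample = io.StringIO(
    " 2 14\n"
    + "Alpha".ljust(10) + "ACGTACGTAC\n"
    + "Beta".ljust(10) + "ACGTTCGTAC\n"
    + "\n"
    + "GTAC\n"
    + "GAAC\n"
)
records = read_interlaced(sample)
print(records)
```

Each record's chunks are collected per block and joined once at the end, mirroring the `seqs[i].append(s)` and `"".join(s)` steps in the method.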


Member Data Documentation

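Bio.AlignIO.PhylipIO.PhylipIterator._header [private]

Reimplemented in Bio.AlignIO.PhylipIO.SequentialPhylipIterator.

Definition at line 246 of file PhylipIO.py.

Bio.AlignIO.PhylipIO.PhylipIterator.id_width = _PHYLIP_ID_WIDTH [static]

Definition at line 164 of file PhylipIO.py.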


The documentation for this class was generated from the following file: