python-biopython  1.60
Bio.SeqIO.PirIO Namespace Reference

Functions

def PirIterator

Variables

dictionary _pir_alphabets
string filename = "../../Tests/NBRF/%s.pir"
list records = list(PirIterator(open(filename)))
int count = 0
list parts = record.description.split()

Function Documentation

def Bio.SeqIO.PirIO.PirIterator(handle)
Generator function to iterate over PIR (NBRF) records (as SeqRecord objects).

handle - input file

Each record starts with a '>XX;identifier' line, where XX is a two-letter
sequence type listed in _pir_alphabets. The next line is used as the
description. The lines after that, up to the next '>' line or the end of
the file, are joined into the sequence, which must end with a '*' terminator.

The identifier is used as both the id and the name of the record, and the
sequence type is stored in record.annotations["PIR-type"]. Any text before
the first record is skipped.

Definition at line 106 of file PirIO.py.

def PirIterator(handle):
    """Generator function to iterate over PIR records (as SeqRecord objects).

    handle - input file

    Seq, SeqRecord and _pir_alphabets are module-level names in PirIO.py.
    """
    #Skip any text before the first record (e.g. blank lines, comments)
    while True:
        line = handle.readline()
        if line == "":
            return #Premature end of file, or just empty?
        if line[0] == ">":
            break

    while True:
        if line[0] != ">":
            raise ValueError(
                "Records in PIR files should start with '>' character")
        pir_type = line[1:3]
        #Slice rather than index so a truncated header gives a ValueError
        if pir_type not in _pir_alphabets or line[3:4] != ";":
            raise ValueError(
                "Records should start with '>XX;' "
                "where XX is a valid sequence type")
        identifier = line[4:].strip()
        description = handle.readline().strip()

        lines = []
        line = handle.readline()
        while True:
            if not line:
                break
            if line[0] == ">":
                break
            #Remove trailing whitespace, and any internal spaces
            lines.append(line.rstrip().replace(" ",""))
            line = handle.readline()
        seq = "".join(lines)
        #endswith also copes with an empty sequence (seq[-1] would fail)
        if not seq.endswith("*"):
            #Note the * terminator is present on nucleotide sequences too,
            #it is not a stop codon!
            raise ValueError(
                "Sequences in PIR files should include a * terminator!")

        #Return the record and then continue...
        record = SeqRecord(Seq(seq[:-1], _pir_alphabets[pir_type]),
                           id = identifier, name = identifier,
                           description = description)
        record.annotations["PIR-type"] = pir_type
        yield record

        if not line : return #StopIteration
    assert False, "Should not reach this line"
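
The record layout that PirIterator expects can be tried out without Biopython installed. The following is a minimal sketch under stated assumptions, not the library's implementation: iter_pir_records is a hypothetical helper that yields plain tuples instead of SeqRecord objects, it does not check the type code against _pir_alphabets, and the sample records are invented for illustration.

```python
import io


def iter_pir_records(handle):
    """Yield (pir_type, identifier, description, sequence) tuples.

    Hypothetical stand-alone helper, not part of Biopython.
    """
    line = handle.readline()
    # Skip any text before the first record header.
    while line and not line.startswith(">"):
        line = handle.readline()
    while line:
        # Header looks like ">P1;IDENT": type code, ';', identifier.
        if line[3:4] != ";":
            raise ValueError("Records should start with '>XX;'")
        pir_type = line[1:3]
        identifier = line[4:].strip()
        # The line after the header is the free-text description.
        description = handle.readline().strip()
        chunks = []
        line = handle.readline()
        while line and not line.startswith(">"):
            # Remove trailing whitespace and any internal spaces.
            chunks.append(line.rstrip().replace(" ", ""))
            line = handle.readline()
        seq = "".join(chunks)
        if not seq.endswith("*"):
            raise ValueError(
                "Sequences in PIR files should include a * terminator!")
        # The '*' is a terminator, not a stop codon, so drop it.
        yield pir_type, identifier, description, seq[:-1]


# Invented sample data, for illustration only.
sample = io.StringIO(
    "text before the first record is skipped\n"
    ">P1;EXAMPLE1\n"
    "An invented protein record\n"
    "MKV LAA\n"
    "GGT*\n"
    ">DL;EXAMPLE2\n"
    "An invented DNA record\n"
    "ACGT ACGT*\n"
)
records = list(iter_pir_records(sample))
```

Here records holds two tuples: the first sequence is joined across its two lines with the internal space removed, and both sequences lose their trailing '*'.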


Variable Documentation

dictionary Bio.SeqIO.PirIO._pir_alphabets

Initial value:
{"P1" : generic_protein,
 "F1" : generic_protein,
 "D1" : generic_dna,
 "DL" : generic_dna,
 "DC" : generic_dna,
 "RL" : generic_rna,
 "RC" : generic_rna,
 "N3" : generic_rna,
 "XX" : single_letter_alphabet,
 }

Definition at line 94 of file PirIO.py.
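
The way PirIterator uses this table to validate a record header can be sketched on its own. This is an illustrative sketch only: PIR_KINDS and parse_pir_header are hypothetical names, and the string labels stand in for the alphabet objects (with "unknown" assumed here for single_letter_alphabet).

```python
# Coarse molecule kind per PIR type code, mirroring the keys of _pir_alphabets.
# The string labels are this sketch's stand-ins for the alphabet objects.
PIR_KINDS = {
    "P1": "protein", "F1": "protein",
    "D1": "DNA", "DL": "DNA", "DC": "DNA",
    "RL": "RNA", "RC": "RNA", "N3": "RNA",
    "XX": "unknown",
}


def parse_pir_header(line):
    """Split a '>XX;identifier' line into (code, kind, identifier).

    Hypothetical helper, not part of Biopython.
    """
    if not line.startswith(">"):
        raise ValueError(
            "Records in PIR files should start with '>' character")
    code = line[1:3]
    # Slicing keeps a truncated header from raising IndexError.
    if code not in PIR_KINDS or line[3:4] != ";":
        raise ValueError(
            "Records should start with '>XX;' "
            "where XX is a valid sequence type")
    return code, PIR_KINDS[code], line[4:].strip()


header = parse_pir_header(">DL;EXAMPLE2\n")
```

A header with a code that is not in the table, such as ">ZZ;foo", raises ValueError, matching the check in PirIterator.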

int Bio.SeqIO.PirIO.count = 0

Definition at line 182 of file PirIO.py.

string Bio.SeqIO.PirIO.filename = "../../Tests/NBRF/%s.pir"

Definition at line 176 of file PirIO.py.

list Bio.SeqIO.PirIO.parts = record.description.split()

Definition at line 185 of file PirIO.py.

list Bio.SeqIO.PirIO.records = list(PirIterator(open(filename)))

Definition at line 181 of file PirIO.py.