
python-biopython  1.60
Bio.SeqIO.IgIO Namespace Reference

Functions

def IgIterator

Variables

string path = "../../Tests/IntelliGenetics/"
file handle = open(os.path.join(path, filename))

Function Documentation

def Bio.SeqIO.IgIO.IgIterator(handle, alphabet = single_letter_alphabet)
Iterate over IntelliGenetics records (as SeqRecord objects).

handle - input file handle
alphabet - optional alphabet (defaults to single_letter_alphabet)

The optional free format file header lines (which start with two
semi-colons) are ignored.

The free format commentary lines at the start of each record (which
start with a semi-colon) are recorded as a single string with embedded
newline characters in the SeqRecord's annotations dictionary under the
key 'comment'. The leading semi-colon and any surrounding whitespace
are removed from each comment line.
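
The two comment rules above can be illustrated with a minimal standalone
sketch. It uses only the standard library and is not Biopython code: the
helper name read_leading_comments and the sample text are hypothetical,
made up for this illustration, and the helper returns plain strings
rather than a SeqRecord.

```python
import io

# Invented sample data, not taken from the Biopython test suite.
SAMPLE = """\
;; file header, ignored by the parser
;; another header line
; first comment line
;second comment line
SEQ_A
ACGT ACGT
TTGA1
"""


def read_leading_comments(handle):
    """Return (comment, title) for the first record read from handle.

    Hypothetical helper that mirrors the header and comment rules only.
    """
    # Skip the optional file header (lines starting with two semi-colons).
    line = handle.readline()
    while line.startswith(";;"):
        line = handle.readline()
    # Collect the record's comment lines, dropping the leading ';'
    # and any surrounding whitespace.
    comment_lines = []
    while line.startswith(";"):
        comment_lines.append(line[1:].strip())
        line = handle.readline()
    # The first non-comment line is the record title.
    return "\n".join(comment_lines), line.rstrip()


comment, title = read_leading_comments(io.StringIO(SAMPLE))
print(repr(comment))  # 'first comment line\nsecond comment line'
print(title)          # SEQ_A
```

In this sketch the joined comment string corresponds to what the parser
stores under annotations['comment'], and the title corresponds to the
value used for the record's id and name.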

Definition at line 19 of file IgIO.py.

def IgIterator(handle, alphabet = single_letter_alphabet):
    """Iterate over IntelliGenetics records (as SeqRecord objects).

    handle - input file
    alphabet - optional alphabet

    The optional free format file header lines (which start with two
    semi-colons) are ignored.

    The free format commentary lines at the start of each record (which
    start with a semi-colon) are recorded as a single string with embedded
    new line characters in the SeqRecord's annotations dictionary under the
    key 'comment'.
    """
    #Skip any file header text before the first record (;; lines)
    while True:
        line = handle.readline()
        if not line : break #Premature end of file, or just empty?
        if not line.startswith(";;") : break

    while line:
        #Now iterate over the records
        if line[0] != ";":
            raise ValueError( \
                  "Records should start with ';' and not:\n%s" % repr(line))

        #Try and agree with SeqRecord convention from the GenBank parser,
        #(and followed in the SwissProt parser) which stores the comments
        #as a long string with newlines under annotations key 'comment'.

        #Note some examples use "; ..." and others ";..."
        comment_lines = []
        while line.startswith(";"):
            #TODO - Extract identifier from lines like "LOCUS\tB_SF2"?
            comment_lines.append(line[1:].strip())
            line = handle.readline()
        title = line.rstrip()

        seq_lines = []
        while True:
            line = handle.readline()
            if not line:
                break
            if line[0] == ";":
                break
            #Remove trailing whitespace, and any internal spaces
            seq_lines.append(line.rstrip().replace(" ",""))
        seq_str = "".join(seq_lines)
        if seq_str.endswith("1"):
            #Remove the optional terminator (digit one)
            seq_str = seq_str[:-1]
        if "1" in seq_str:
            raise ValueError(\
                "Potential terminator digit one found within sequence.")

        #Return the record and then continue...
        record = SeqRecord(Seq(seq_str, alphabet),
                           id = title, name = title)
        record.annotations['comment'] = "\n".join(comment_lines)
        yield record

    #We should be at the end of the file now
    assert not line
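
The sequence clean-up step in the listing (joining lines, dropping the
optional trailing terminator, rejecting any other digit one) can be
tried in isolation. The following is a sketch under stated assumptions:
join_ig_sequence is a hypothetical helper written for this illustration,
not a function in Bio.SeqIO.IgIO, and it returns a plain string instead
of a Seq object.

```python
def join_ig_sequence(lines):
    """Join raw sequence lines into one string (hypothetical helper)."""
    # Remove trailing whitespace and any internal spaces from each line.
    seq = "".join(line.rstrip().replace(" ", "") for line in lines)
    # A single trailing '1' is the optional terminator and is dropped.
    if seq.endswith("1"):
        seq = seq[:-1]
    # Any '1' still present is treated as a malformed record.
    if "1" in seq:
        raise ValueError(
            "Potential terminator digit one found within sequence.")
    return seq


print(join_ig_sequence(["ACGT ACGT\n", "TTGA1\n"]))  # ACGTACGTTTGA
```

Only one trailing '1' is removed, so input such as "AC11" still raises
ValueError, which matches the order of the two checks in the listing.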


Variable Documentation

file Bio.SeqIO.IgIO.handle = open(os.path.join(path, filename))

Definition at line 94 of file IgIO.py.

string Bio.SeqIO.IgIO.path = "../../Tests/IntelliGenetics/"

Definition at line 87 of file IgIO.py.