
python-biopython  1.60
Bio.GenBank.Scanner._ImgtScanner Class Reference
Inherits Bio.GenBank.Scanner.EmblScanner, which in turn inherits Bio.GenBank.Scanner.InsdcScanner.


Public Member Functions

def parse_features
def parse_footer
def set_handle
def find_start
def parse_header
def parse_feature
def feed
def parse
def parse_records
def parse_cds_features

Public Attributes

 line
 debug
 handle

Static Public Attributes

list FEATURE_START_MARKERS
string RECORD_START = "ID   "
int HEADER_WIDTH = 5
list FEATURE_END_MARKERS = ["XX"]
int FEATURE_QUALIFIER_INDENT = 21
string FEATURE_QUALIFIER_SPACER = "FT"
list SEQUENCE_HEADERS = ["SQ", "CO"]

Detailed Description

For extracting chunks of information in IMGT (EMBL like) files (PRIVATE).

IMGT files are like EMBL files but in order to allow longer feature types
the features should be indented by 25 characters not 21 characters. In
practice the IMGT flat files tend to use either 21 or 25 characters, so we
must cope with both.
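A minimal sketch of why one routine can cope with both indent widths: slicing at the narrower 21-character indent and then stripping discards the extra padding left by a 25-character indent. The helper name `qualifier_text` and the sample lines are hypothetical, and this is Python 3 rather than the library's own code.

```python
EMBL_INDENT = 21  # matches FEATURE_QUALIFIER_INDENT listed above

def qualifier_text(line, indent=EMBL_INDENT):
    """Return the payload of an FT continuation line (hypothetical helper).

    Slicing at the narrower indent and stripping removes the extra
    padding that a 25-character IMGT indent leaves behind.
    """
    return line[indent:].strip()

# Two made-up continuation lines, indented by 21 and 25 characters.
line_21 = "FT" + " " * 19 + '/note="example"'
line_25 = "FT" + " " * 23 + '/note="example"'

print(qualifier_text(line_21))
print(qualifier_text(line_25))
```

Both calls yield the same qualifier text, which is the property the scanner relies on.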

This is private to encourage use of Bio.SeqIO rather than Bio.GenBank.

Definition at line 783 of file Scanner.py.


Member Function Documentation

def Bio.GenBank.Scanner.InsdcScanner.feed (   self,
  handle,
  consumer,
  do_features = True 
) [inherited]
Feed a set of data into the consumer.

This method is intended for use with the "old" code in Bio.GenBank

Arguments:
handle - A handle with the information to parse.
consumer - The consumer that should be informed of events.
do_features - Boolean, should the features be parsed?
      Skipping the features can be much faster.

Return values:
True  - Parsed a record
False - Did not find a record

Definition at line 367 of file Scanner.py.

00367 
00368     def feed(self, handle, consumer, do_features=True):
00369         """Feed a set of data into the consumer.
00370 
00371         This method is intended for use with the "old" code in Bio.GenBank
00372 
00373         Arguments:
00374         handle - A handle with the information to parse.
00375         consumer - The consumer that should be informed of events.
00376         do_features - Boolean, should the features be parsed?
00377                       Skipping the features can be much faster.
00378 
00379         Return values:
00380         True  - Parsed a record
00381         False - Did not find a record
00382         """        
00383         #Should work with both EMBL and GenBank files provided the
00384         #equivalent Bio.GenBank._FeatureConsumer methods are called...
00385         self.set_handle(handle)
00386         if not self.find_start():
00387             #Could not find (another) record
00388             consumer.data=None
00389             return False
00390                        
00391         #We use the above class methods to parse the file into a simplified format.
00392         #The first line, header lines and any misc lines after the features will be
00393         #dealt with by GenBank / EMBL specific derived classes.
00394 
00395         #First line and header:
00396         self._feed_first_line(consumer, self.line)
00397         self._feed_header_lines(consumer, self.parse_header())
00398 
00399         #Features (common to both EMBL and GenBank):
00400         if do_features:
00401             self._feed_feature_table(consumer, self.parse_features(skip=False))
00402         else:
00403             self.parse_features(skip=True) # ignore the data
00404         
00405         #Footer and sequence
00406         misc_lines, sequence_string = self.parse_footer()
00407         self._feed_misc_lines(consumer, misc_lines)
00408 
00409         consumer.sequence(sequence_string)
00410         #Calls to consumer.base_number() do nothing anyway
00411         consumer.record_end("//")
00412 
00413         assert self.line == "//"
00414 
00415         #And we are done
00416         return True

def Bio.GenBank.Scanner.InsdcScanner.find_start (   self) [inherited]
Read in lines until the ID/LOCUS line is found, which is returned.

Any preamble (such as the header used by the NCBI on *.seq.gz archives)
will be ignored.
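The search can be sketched as a simple read loop. This is a simplified Python 3 illustration, not the library code: it assumes the record marker is "ID" padded with spaces to the five-character header width, and the sample record line is made up.

```python
import io

def find_start(handle, record_start="ID   "):
    """Return the first line that begins a record, or None at end of file."""
    width = len(record_start)
    while True:
        line = handle.readline()
        if not line:
            return None  # end of file, no (further) record
        if line[:width] == record_start:
            return line
        # Anything else (blank lines, "//", archive preamble) is skipped.

# Made-up input: preamble, a blank line and a stray "//" before the record.
handle = io.StringIO("Archive preamble\n\n//\nID   EXAMPLE; SV 1\nXX\n")
print(repr(find_start(handle)))
```

Unlike the real method, this sketch does not keep a pending line in `self.line` between calls.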

Definition at line 66 of file Scanner.py.

00066 
00067     def find_start(self):
00068         """Read in lines until the ID/LOCUS line is found, which is returned.
00069         
00070         Any preamble (such as the header used by the NCBI on *.seq.gz archives)
00071         will be ignored."""
00072         while True:
00073             if self.line:
00074                 line = self.line
00075                 self.line = ""
00076             else:
00077                 line = self.handle.readline()
00078             if not line:
00079                 if self.debug : print "End of file"
00080                 return None
00081             if line[:self.HEADER_WIDTH]==self.RECORD_START:
00082                 if self.debug > 1: print "Found the start of a record:\n" + line
00083                 break
00084             line = line.rstrip()
00085             if line == "//":
00086                 if self.debug > 1: print "Skipping // marking end of last record"
00087             elif line == "":
00088                 if self.debug > 1: print "Skipping blank line before record"
00089             else:
00090                 #Ignore any header before the first ID/LOCUS line.
00091                 if self.debug > 1:
00092                         print "Skipping header line before record:\n" + line
00093         self.line = line
00094         return line

def Bio.GenBank.Scanner.InsdcScanner.parse (   self,
  handle,
  do_features = True 
) [inherited]
Returns a SeqRecord (with SeqFeatures if do_features=True)

See also the method parse_records() for use on multi-record files.

Definition at line 417 of file Scanner.py.

00417 
00418     def parse(self, handle, do_features=True):
00419         """Returns a SeqRecord (with SeqFeatures if do_features=True)
00420 
00421         See also the method parse_records() for use on multi-record files.
00422         """
00423         from Bio.GenBank import _FeatureConsumer
00424         from Bio.GenBank.utils import FeatureValueCleaner
00425 
00426         consumer = _FeatureConsumer(use_fuzziness = 1, 
00427                     feature_cleaner = FeatureValueCleaner())
00428 
00429         if self.feed(handle, consumer, do_features):
00430             return consumer.data
00431         else:
00432             return None
00433 
    

def Bio.GenBank.Scanner.InsdcScanner.parse_cds_features (   self,
  handle,
  alphabet = generic_protein,
  tags2id = ('protein_id','locus_tag','product') 
) [inherited]
Returns SeqRecord object iterator

Each CDS feature becomes a SeqRecord.

alphabet - Used for any sequence found in a translation field.
tags2id  - Tuple of three strings, the feature keys to use
   for the record id, name and description.

This method is intended for use in Bio.SeqIO
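As a rough illustration of what `tags2id` controls, the sketch below picks the record id, name and description out of a dictionary of CDS qualifiers. The helper `pick_identifiers` is hypothetical and returns None for a missing key, whereas the real method leaves the SeqRecord defaults in place; the sample values are taken from the GenBank feature quoted later on this page.

```python
def pick_identifiers(annotations, tags2id=("protein_id", "locus_tag", "product")):
    """Map three qualifier keys to (id, name, description).

    Hypothetical helper: a missing key yields None in that position.
    """
    record_id, name, description = (annotations.get(tag) for tag in tags2id)
    return record_id, name, description

# Qualifier values with their quotes already removed.
annotations = {
    "protein_id": "NP_963295.1",
    "locus_tag": "NEQ001",
    "product": "hypothetical protein",
}

print(pick_identifiers(annotations))
# Swapping the first two tags uses the locus tag as the record id instead.
print(pick_identifiers(annotations, tags2id=("locus_tag", "protein_id", "product")))
```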

Definition at line 454 of file Scanner.py.

00454     def parse_cds_features(self, handle, alphabet=generic_protein,
00455                            tags2id=('protein_id','locus_tag','product')):
00456         """Returns SeqRecord object iterator
00457 
00458         Each CDS feature becomes a SeqRecord.
00459 
00460         alphabet - Used for any sequence found in a translation field.
00461         tags2id  - Tuple of three strings, the feature keys to use
00462                    for the record id, name and description.
00463 
00464         This method is intended for use in Bio.SeqIO
00465         """
00466         self.set_handle(handle)
00467         while self.find_start():
00468             #Got an EMBL or GenBank record...
00469             self.parse_header() # ignore header lines!
00470             feature_tuples = self.parse_features()
00471             #self.parse_footer() # ignore footer lines!
00472             while True:
00473                 line = self.handle.readline()
00474                 if not line : break
00475                 if line[:2]=="//" : break
00476             self.line = line.rstrip()
00477 
00478             #Now go through those features...
00479             for key, location_string, qualifiers in feature_tuples:
00480                 if key=="CDS":
00481                     #Create SeqRecord
00482                     #================
00483                     #SeqRecord objects cannot be created with annotations, they
00484                     #must be added afterwards.  So create an empty record and
00485                     #then populate it:
00486                     record = SeqRecord(seq=None)
00487                     annotations = record.annotations
00488 
00489                     #Should we add a location object to the annotations?
00490                     #I *think* that only makes sense for SeqFeatures with their
00491                     #sub features...
00492                     annotations['raw_location'] = location_string.replace(' ','')
00493 
00494                     for (qualifier_name, qualifier_data) in qualifiers:
00495                         if qualifier_data is not None \
00496                         and qualifier_data[0]=='"' and qualifier_data[-1]=='"':
00497                             #Remove quotes
00498                             qualifier_data = qualifier_data[1:-1]
00499                         #Append the data to the annotation qualifier...
00500                         if qualifier_name == "translation":
00501                             assert record.seq is None, "Multiple translations!"
00502                             record.seq = Seq(qualifier_data.replace("\n",""), alphabet)
00503                         elif qualifier_name == "db_xref":
00504                             #It's a list, possibly empty.  It's safe to append
00505                             record.dbxrefs.append(qualifier_data)
00506                         else:
00507                             if qualifier_data is not None:
00508                                 qualifier_data = qualifier_data.replace("\n"," ").replace("  "," ")
00509                             try:
00510                                 annotations[qualifier_name] += " " + qualifier_data
00511                             except KeyError:
00512                                 #Not an addition to existing data, its the first bit
00513                                 annotations[qualifier_name]= qualifier_data
00514                         
00515                     #Fill in the ID, Name, Description
00516                     #=================================
00517                     try:
00518                         record.id = annotations[tags2id[0]]
00519                     except KeyError:
00520                         pass
00521                     try:
00522                         record.name = annotations[tags2id[1]]
00523                     except KeyError:
00524                         pass
00525                     try:
00526                         record.description = annotations[tags2id[2]]
00527                     except KeyError:
00528                         pass
00529 
00530                     yield record
00531 

def Bio.GenBank.Scanner.InsdcScanner.parse_feature (   self,
  feature_key,
  lines 
) [inherited]
Expects a feature as a list of strings, returns a tuple (key, location, qualifiers)

For example given this GenBank feature:

     CDS             complement(join(490883..490885,1..879))
                     /locus_tag="NEQ001"
                     /note="conserved hypothetical [Methanococcus jannaschii];
                     COG1583:Uncharacterized ACR; IPR001472:Bipartite nuclear
                     localization signal; IPR002743: Protein of unknown
                     function DUF57"
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="NP_963295.1"
                     /db_xref="GI:41614797"
                     /db_xref="GeneID:2732620"
                     /translation="MRLLLELKALNSIDKKQLSNYLIQGFIYNILKNTEYSWLHNWKK
                     EKYFNFTLIPKKDIIENKRYYLIISSPDKRFIEVLHNKIKDLDIITIGLAQFQLRKTK
                     KFDPKLRFPWVTITPIVLREGKIVILKGDKYYKVFVKRLEELKKYNLIKKKEPILEEP
                     IEISLNQIKDGWKIIDVKDRYYDFRNKSFSAFSNWLRDLKEQSLRKYNNFCGKNFYFE
                     EAIFEGFTFYKTVSIRIRINRGEAVYIGTLWKELNVYRKLDKEEREFYKFLYDCGLGS
                     LNSMGFGFVNTKKNSAR"

The input should then be key="CDS", with the rest of the data as a list of strings
lines=["complement(join(490883..490885,1..879))", ..., "LNSMGFGFVNTKKNSAR"]
where the leading spaces and trailing newlines have been removed.

Returns tuple containing: (key as string, location string, qualifiers as list)
as follows for this example:

key = "CDS", string
location = "complement(join(490883..490885,1..879))", string
qualifiers = list of string tuples:

[('locus_tag', '"NEQ001"'),
 ('note', '"conserved hypothetical [Methanococcus jannaschii];\nCOG1583:..."'),
 ('codon_start', '1'),
 ('transl_table', '11'),
 ('product', '"hypothetical protein"'),
 ('protein_id', '"NP_963295.1"'),
 ('db_xref', '"GI:41614797"'),
 ('db_xref', '"GeneID:2732620"'),
 ('translation', '"MRLLLELKALNSIDKKQLSNYLIQGFIYNILKNTEYSWLHNWKK\nEKYFNFT..."')]

In the above example, the "note" and "translation" were edited for compactness,
and they would contain multiple new line characters (displayed above as \n)

If a qualifier is quoted (in this case, everything except codon_start and
transl_table) then the quotes are NOT removed.

Note that no whitespace is removed.
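The qualifier handling described above can be approximated with the sketch below. It is deliberately simplified and is not the library's algorithm: it treats every line starting with "/" as a new qualifier, so unlike the real method it would misread a quoted continuation line that happens to begin with "/". The function name `split_qualifiers` is hypothetical.

```python
def split_qualifiers(lines):
    """Turn stripped qualifier lines into (key, value) tuples.

    Simplified: a line starting with '/' opens a qualifier and any other
    line continues the previous one. Quotes are kept, as documented above.
    """
    qualifiers = []
    for line in lines:
        if line.startswith("/"):
            key, sep, value = line[1:].partition("=")
            # A bare flag such as /pseudo has no "=" and therefore no value.
            qualifiers.append((key, value if sep else None))
        else:
            key, value = qualifiers[-1]
            qualifiers[-1] = (key, value + "\n" + line)
    return qualifiers

lines = [
    '/locus_tag="NEQ001"',
    '/note="conserved hypothetical;',
    'second line"',
    "/codon_start=1",
    "/pseudo",
]
for pair in split_qualifiers(lines):
    print(pair)
```

The multi-line note keeps its embedded newline and its quotes, matching the tuple layout shown in the example output above.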

Definition at line 192 of file Scanner.py.

00192 
00193     def parse_feature(self, feature_key, lines):
00194         """Expects a feature as a list of strings, returns a tuple (key, location, qualifiers)
00195 
00196         For example given this GenBank feature:
00197 
00198              CDS             complement(join(490883..490885,1..879))
00199                              /locus_tag="NEQ001"
00200                              /note="conserved hypothetical [Methanococcus jannaschii];
00201                              COG1583:Uncharacterized ACR; IPR001472:Bipartite nuclear
00202                              localization signal; IPR002743: Protein of unknown
00203                              function DUF57"
00204                              /codon_start=1
00205                              /transl_table=11
00206                              /product="hypothetical protein"
00207                              /protein_id="NP_963295.1"
00208                              /db_xref="GI:41614797"
00209                              /db_xref="GeneID:2732620"
00210                              /translation="MRLLLELKALNSIDKKQLSNYLIQGFIYNILKNTEYSWLHNWKK
00211                              EKYFNFTLIPKKDIIENKRYYLIISSPDKRFIEVLHNKIKDLDIITIGLAQFQLRKTK
00212                              KFDPKLRFPWVTITPIVLREGKIVILKGDKYYKVFVKRLEELKKYNLIKKKEPILEEP
00213                              IEISLNQIKDGWKIIDVKDRYYDFRNKSFSAFSNWLRDLKEQSLRKYNNFCGKNFYFE
00214                              EAIFEGFTFYKTVSIRIRINRGEAVYIGTLWKELNVYRKLDKEEREFYKFLYDCGLGS
00215                              LNSMGFGFVNTKKNSAR"
00216 
00217         Then should give input key="CDS" and the rest of the data as a list of strings
00218         lines=["complement(join(490883..490885,1..879))", ..., "LNSMGFGFVNTKKNSAR"]
00219         where the leading spaces and trailing newlines have been removed.
00220 
00221         Returns tuple containing: (key as string, location string, qualifiers as list)
00222         as follows for this example:
00223 
00224         key = "CDS", string
00225         location = "complement(join(490883..490885,1..879))", string
00226         qualifiers = list of string tuples:
00227 
00228         [('locus_tag', '"NEQ001"'),
00229          ('note', '"conserved hypothetical [Methanococcus jannaschii];\nCOG1583:..."'),
00230          ('codon_start', '1'),
00231          ('transl_table', '11'),
00232          ('product', '"hypothetical protein"'),
00233          ('protein_id', '"NP_963295.1"'),
00234          ('db_xref', '"GI:41614797"'),
00235          ('db_xref', '"GeneID:2732620"'),
00236          ('translation', '"MRLLLELKALNSIDKKQLSNYLIQGFIYNILKNTEYSWLHNWKK\nEKYFNFT..."')]
00237 
00238         In the above example, the "note" and "translation" were edited for compactness,
00239         and they would contain multiple new line characters (displayed above as \n)
00240 
00241         If a qualifier is quoted (in this case, everything except codon_start and
00242         transl_table) then the quotes are NOT removed.
00243 
00244         Note that no whitespace is removed.
00245         """
00246         #Skip any blank lines
00247         iterator = iter(filter(None, lines))
00248         try:
00249             line = iterator.next()
00250 
00251             feature_location = line.strip()
00252             while feature_location[-1:]==",":
00253                 #Multiline location, still more to come!
00254                 line = iterator.next()
00255                 feature_location += line.strip()
00256 
00257             qualifiers=[]
00258 
00259             for i, line in enumerate(iterator):
00260                 # check for extra wrapping of the location closing parentheses
00261                 if i == 0 and line.startswith(")"):
00262                     feature_location += line.strip()
00263                 elif line[0]=="/":
00264                     #New qualifier
00265                     i = line.find("=")
00266                     key = line[1:i] #does not work if i==-1
00267                     value = line[i+1:] #we ignore 'value' if i==-1
00268                     if i==-1:
00269                         #Qualifier with no key, e.g. /pseudo
00270                         key = line[1:]
00271                         qualifiers.append((key,None))
00272                     elif not value:
00273                         #ApE can output /note=
00274                         qualifiers.append((key,""))
00275                     elif value[0]=='"':
00276                         #Quoted...
00277                         if value[-1]!='"' or value!='"':
00278                             #No closing quote on the first line...
00279                             while value[-1] != '"':
00280                                 value += "\n" + iterator.next()
00281                         else:
00282                             #One single line (quoted)
00283                             assert value == '"'
00284                             if self.debug : print "Quoted line %s:%s" % (key, value)
00285                         #DO NOT remove the quotes...
00286                         qualifiers.append((key,value))
00287                     else:
00288                         #Unquoted
00289                         #if debug : print "Unquoted line %s:%s" % (key,value)
00290                         qualifiers.append((key,value))
00291                 else:
00292                     #Unquoted continuation
00293                     assert len(qualifiers) > 0
00294                     assert key==qualifiers[-1][0]
00295                     #if debug : print "Unquoted Cont %s:%s" % (key, line)
00296                     qualifiers[-1] = (key, qualifiers[-1][1] + "\n" + line)
00297             return (feature_key, feature_location, qualifiers)
00298         except StopIteration:
00299             #Bummer
00300             raise ValueError("Problem with '%s' feature:\n%s" \
00301                               % (feature_key, "\n".join(lines)))

def Bio.GenBank.Scanner._ImgtScanner.parse_features (   self,
  skip = False 
)
Return list of tuples for the features (if present)

Each feature is returned as a tuple (key, location, qualifiers)
where key and location are strings (e.g. "CDS" and
"complement(join(490883..490885,1..879))") while qualifiers
is a list of two-element tuples (feature qualifier keys and values).

Assumes you have already read to the start of the features table.

Reimplemented from Bio.GenBank.Scanner.InsdcScanner.
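The listing for this method splits each feature line on whitespace and, when a long IMGT feature key runs straight into the location, falls back to fixed columns. A minimal Python 3 sketch of that fallback follows; the function name `split_feature_line` is hypothetical, and the second sample line is the one quoted in the source comments.

```python
def split_feature_line(line):
    """Split an FT line into (feature_key, location_start)."""
    try:
        # Usual case: key and location are separated by whitespace.
        key, location = line[2:].strip().split()
    except ValueError:
        # Key and location run together: fall back to fixed columns,
        # assuming the 25-character IMGT indent.
        key = line[2:25].strip()
        location = line[25:].strip()
    return key, location

print(split_feature_line("FT   CDS             1..879\n"))
print(split_feature_line("FT   TRANSMEMBRANE-REGION2163..2240\n"))
```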

Definition at line 799 of file Scanner.py.

00799 
00800     def parse_features(self, skip=False):
00801         """Return list of tuples for the features (if present)
00802 
00803         Each feature is returned as a tuple (key, location, qualifiers)
00804         where key and location are strings (e.g. "CDS" and
00805         "complement(join(490883..490885,1..879))") while qualifiers
00806         is a list of two string tuples (feature qualifier keys and values).
00807 
00808         Assumes you have already read to the start of the features table.
00809         """
00810         if self.line.rstrip() not in self.FEATURE_START_MARKERS:
00811             if self.debug : print "Didn't find any feature table"
00812             return []
00813         
00814         while self.line.rstrip() in self.FEATURE_START_MARKERS:
00815             self.line = self.handle.readline()
00816 
00817         bad_position_re = re.compile(r'([0-9]+)>{1}')
00818         
00819         features = []
00820         line = self.line
00821         while True:
00822             if not line:
00823                 raise ValueError("Premature end of line during features table")
00824             if line[:self.HEADER_WIDTH].rstrip() in self.SEQUENCE_HEADERS:
00825                 if self.debug : print "Found start of sequence"
00826                 break
00827             line = line.rstrip()
00828             if line == "//":
00829                 raise ValueError("Premature end of features table, marker '//' found")
00830             if line in self.FEATURE_END_MARKERS:
00831                 if self.debug : print "Found end of features"
00832                 line = self.handle.readline()
00833                 break
00834             if line[2:self.FEATURE_QUALIFIER_INDENT].strip() == "":
00835                 #This is an empty feature line between qualifiers. Empty
00836                 #feature lines within qualifiers are handled below (ignored).
00837                 line = self.handle.readline()
00838                 continue
00839 
00840             if skip:
00841                 line = self.handle.readline()
00842                 while line[:self.FEATURE_QUALIFIER_INDENT] == self.FEATURE_QUALIFIER_SPACER:
00843                     line = self.handle.readline()
00844             else:
00845                 assert line[:2] == "FT"
00846                 try:
00847                     feature_key, location_start = line[2:].strip().split()
00848                 except ValueError:
00849                     #e.g. "FT   TRANSMEMBRANE-REGION2163..2240\n"
00850                     #Assume indent of 25 as per IMGT spec, with the location
00851                     #start in column 26 (one-based).
00852                     feature_key = line[2:25].strip()
00853                     location_start = line[25:].strip()
00854                 feature_lines = [location_start]
00855                 line = self.handle.readline()
00856                 while line[:self.FEATURE_QUALIFIER_INDENT] == self.FEATURE_QUALIFIER_SPACER \
00857                 or line.rstrip() == "" : # cope with blank lines in the midst of a feature
00858                     #Use strip to remove any harmless trailing white space AND any leading
00859                     #white space (copes with 21 or 25 indents and other variants)
00860                     assert line[:2] == "FT"
00861                     feature_lines.append(line[self.FEATURE_QUALIFIER_INDENT:].strip())
00862                     line = self.handle.readline()
00863                 feature_key, location, qualifiers = \
00864                                 self.parse_feature(feature_key, feature_lines)
00865                 #Try to handle known problems with IMGT locations here:
00866                 if ">" in location:
00867                     #Nasty hack for common IMGT bug, should be >123 not 123>
00868                     #in a location string. At least here the meaning is clear, 
00869                     #and since it is so common I don't want to issue a warning
00870                     #warnings.warn("Feature location %s is invalid, "
00871                     #              "moving greater than sign before position"
00872                     #              % location)
00873                     location = bad_position_re.sub(r'>\1',location)
00874                 features.append((feature_key, location, qualifiers))
00875         self.line = line
00876         return features
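The location repair near the end of the listing can be tried in isolation. The sketch below reuses the same regular-expression idea (written as `>` rather than `>{1}`, which is equivalent) to move a misplaced greater-than sign in front of the position; the wrapper name `fix_location` is hypothetical.

```python
import re

# Digits followed by ">" mark the misplaced sign, e.g. "2240>".
bad_position_re = re.compile(r"([0-9]+)>")

def fix_location(location):
    """Rewrite positions such as '123>' as '>123'."""
    return bad_position_re.sub(r">\1", location)

print(fix_location("2163..2240>"))
print(fix_location("1..879"))
```

Locations without a trailing `>` pass through unchanged.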

def Bio.GenBank.Scanner.EmblScanner.parse_footer (   self) [inherited]
Returns a tuple containing a list of any misc strings, and the sequence.

Reimplemented from Bio.GenBank.Scanner.InsdcScanner.
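The sequence half of the returned tuple is built by dropping the running base count at the end of each line and then removing the spaces between blocks. A minimal Python 3 sketch of that step, with a hypothetical function name and made-up sequence lines:

```python
def join_sequence(lines):
    """Concatenate EMBL-style sequence lines, dropping the trailing counter."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line == "//":
            break  # end-of-record marker
        # rsplit(None, 1) splits off the last whitespace-separated token,
        # which is the running base count at the end of each line.
        chunks.append(line.rsplit(None, 1)[0])
    return "".join(chunks).replace(" ", "")

seq_lines = [
    "     acgtacgtac gtacgtacgt        20",
    "     acgt        24",
    "//",
]
print(join_sequence(seq_lines))
```

The sketch omits the real method's checks for blank lines and premature end of file.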

Definition at line 543 of file Scanner.py.

00543 
00544     def parse_footer(self):
00545         """returns a tuple containing a list of any misc strings, and the sequence"""
00546         assert self.line[:self.HEADER_WIDTH].rstrip() in self.SEQUENCE_HEADERS, \
00547             "Eh? '%s'" % self.line
00548 
00549         #Note that the SQ line can be split into several lines...
00550         misc_lines = []
00551         while self.line[:self.HEADER_WIDTH].rstrip() in self.SEQUENCE_HEADERS:
00552             misc_lines.append(self.line)
00553             self.line = self.handle.readline()
00554             if not self.line:
00555                 raise ValueError("Premature end of file")
00556             self.line = self.line.rstrip()
00557 
00558         assert self.line[:self.HEADER_WIDTH] == " " * self.HEADER_WIDTH \
00559                or self.line.strip() == '//', repr(self.line)
00560         
00561         seq_lines = []
00562         line = self.line
00563         while True:
00564             if not line:
00565                 raise ValueError("Premature end of file in sequence data")
00566             line = line.strip()
00567             if not line:
00568                 raise ValueError("Blank line in sequence data")
00569             if line=='//':
00570                 break
00571             assert self.line[:self.HEADER_WIDTH] == " " * self.HEADER_WIDTH, \
00572                    repr(self.line)
00573             #Remove trailing number now, remove spaces later
00574             seq_lines.append(line.rsplit(None,1)[0])
00575             line = self.handle.readline()
00576         self.line = line
00577         return (misc_lines, "".join(seq_lines).replace(" ", ""))

def Bio.GenBank.Scanner.InsdcScanner.parse_header (   self) [inherited]
Return list of strings making up the header

New line characters are removed.

Assumes you have just read in the ID/LOCUS line.

Definition at line 95 of file Scanner.py.

00095 
00096     def parse_header(self):
00097         """Return list of strings making up the header
00098 
00099         New line characters are removed.
00100 
00101         Assumes you have just read in the ID/LOCUS line.
00102         """
00103         assert self.line[:self.HEADER_WIDTH]==self.RECORD_START, \
00104                "Not at start of record"
00105         
00106         header_lines = []
00107         while True:
00108             line = self.handle.readline()
00109             if not line:
00110                 raise ValueError("Premature end of line during sequence data")
00111             line = line.rstrip()
00112             if line in self.FEATURE_START_MARKERS:
00113                 if self.debug : print "Found header table"
00114                 break
00115             #if line[:self.HEADER_WIDTH]==self.FEATURE_START_MARKER[:self.HEADER_WIDTH]:
00116             #    if self.debug : print "Found header table (?)"
00117             #    break
00118             if line[:self.HEADER_WIDTH].rstrip() in self.SEQUENCE_HEADERS:
00119                 if self.debug : print "Found start of sequence"
00120                 break
00121             if line == "//":
00122                 raise ValueError("Premature end of sequence data marker '//' found")
00123             header_lines.append(line)
00124         self.line = line
00125         return header_lines

def Bio.GenBank.Scanner.InsdcScanner.parse_records (   self,
  handle,
  do_features = True 
) [inherited]
Returns a SeqRecord object iterator

Each record (from the ID/LOCUS line to the // line) becomes a SeqRecord

The SeqRecord objects include SeqFeatures if do_features=True

This method is intended for use in Bio.SeqIO
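The iteration pattern is a generator that keeps calling a single-record parser until it reports that nothing is left. In the Python 3 sketch below, `parse_one` is a toy stand-in for `parse()` that returns a list of lines instead of a SeqRecord, and the input records are made up.

```python
import io

def parse_one(handle):
    """Toy stand-in for parse(): return one record's lines, or None."""
    lines = []
    for line in iter(handle.readline, ""):
        line = line.rstrip()
        if line == "//":
            return lines  # end of this record
        if line:
            lines.append(line)
    return None  # no complete record left

def parse_records(handle):
    """Yield records one at a time until the parser returns None."""
    while True:
        record = parse_one(handle)
        if record is None:
            break
        yield record

handle = io.StringIO("ID   A\n//\nID   B\n//\n")
for record in parse_records(handle):
    print(record)
```

Because the handle keeps its position between calls, each call to the single-record parser resumes where the previous record ended.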

Definition at line 434 of file Scanner.py.

00434 
00435     def parse_records(self, handle, do_features=True):
00436         """Returns a SeqRecord object iterator
00437 
00438         Each record (from the ID/LOCUS line to the // line) becomes a SeqRecord
00439 
00440         The SeqRecord objects include SeqFeatures if do_features=True
00441         
00442         This method is intended for use in Bio.SeqIO
00443         """
00444         #This is a generator function
00445         while True:
00446             record = self.parse(handle, do_features)
00447             if record is None : break
00448             assert record.id is not None
00449             assert record.name != "<unknown name>"
00450             assert record.description != "<unknown description>"
00451             yield record

def Bio.GenBank.Scanner.InsdcScanner.set_handle (   self,
  handle 
) [inherited]

Definition at line 62 of file Scanner.py.

00062 
00063     def set_handle(self, handle):
00064         self.handle = handle
00065         self.line = ""


Member Data Documentation

Bio.GenBank.Scanner.InsdcScanner.debug [inherited]

Definition at line 59 of file Scanner.py.

list Bio.GenBank.Scanner.EmblScanner.FEATURE_END_MARKERS = ["XX"] [static, inherited]

Reimplemented from Bio.GenBank.Scanner.InsdcScanner.

Definition at line 538 of file Scanner.py.

int Bio.GenBank.Scanner.EmblScanner.FEATURE_QUALIFIER_INDENT = 21 [static, inherited]

Reimplemented from Bio.GenBank.Scanner.InsdcScanner.

Definition at line 539 of file Scanner.py.

string Bio.GenBank.Scanner.EmblScanner.FEATURE_QUALIFIER_SPACER [static, inherited]

Reimplemented from Bio.GenBank.Scanner.InsdcScanner.

Definition at line 540 of file Scanner.py.

list Bio.GenBank.Scanner._ImgtScanner.FEATURE_START_MARKERS [static]

Initial value:
["FH   Key             Location/Qualifiers",
 "FH   Key             Location/Qualifiers (from EMBL)",
 "FH   Key                 Location/Qualifiers",
 "FH"]

Reimplemented from Bio.GenBank.Scanner.EmblScanner.

Definition at line 794 of file Scanner.py.

Bio.GenBank.Scanner.InsdcScanner.handle [inherited]

Definition at line 63 of file Scanner.py.

int Bio.GenBank.Scanner.EmblScanner.HEADER_WIDTH = 5 [static, inherited]

Reimplemented from Bio.GenBank.Scanner.InsdcScanner.

Definition at line 536 of file Scanner.py.

Bio.GenBank.Scanner._ImgtScanner.line

Reimplemented from Bio.GenBank.Scanner.EmblScanner.

Definition at line 814 of file Scanner.py.

string Bio.GenBank.Scanner.EmblScanner.RECORD_START = "ID   " [static, inherited]

Reimplemented from Bio.GenBank.Scanner.InsdcScanner.

Definition at line 535 of file Scanner.py.

list Bio.GenBank.Scanner.EmblScanner.SEQUENCE_HEADERS = ["SQ", "CO"] [static, inherited]

Reimplemented from Bio.GenBank.Scanner.InsdcScanner.

Definition at line 541 of file Scanner.py.


The documentation for this class was generated from the following file: