python-biopython  1.60
BioSQL.BioSeqDatabase.BioSeqDatabase Class Reference

Public Member Functions

def __init__
def __repr__
def get_Seq_by_id
def get_Seq_by_acc
def get_Seq_by_ver
def get_Seqs_by_acc
def get_all_primary_ids
def __getitem__
def __delitem__
def __len__
def __contains__
def __iter__
def keys
def values
def items
def iterkeys
def itervalues
def iteritems
def keys
def values
def items
def lookup
def get_Seq_by_primary_id
def load

Public Attributes

 adaptor
 name
 dbid

Detailed Description

Represents a namespace (sub-database) within the BioSQL database.

That is, one row in the biodatabase table and all rows in the bioentry
table associated with it.

Definition at line 442 of file BioSeqDatabase.py.
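
The relationship between the two tables can be sketched with an in-memory SQLite database. The cut-down schema, the namespace names, the sample rows and the namespace_size helper are assumptions made for illustration; the real BioSQL tables carry more columns.

```python
import sqlite3

# Hypothetical, cut-down versions of the two BioSQL tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE biodatabase (biodatabase_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE bioentry ("
    " bioentry_id INTEGER PRIMARY KEY, biodatabase_id INTEGER, accession TEXT)"
)
conn.executemany("INSERT INTO biodatabase VALUES (?, ?)",
                 [(1, "orchids"), (2, "plasmids")])
conn.executemany("INSERT INTO bioentry VALUES (?, ?, ?)",
                 [(1, 1, "Z78533"), (2, 1, "Z78532"), (3, 2, "X77802")])

def namespace_size(conn, name):
    """Count the bioentry rows that belong to the named namespace."""
    # One row in biodatabase identifies the namespace ...
    dbid = conn.execute(
        "SELECT biodatabase_id FROM biodatabase WHERE name=?", (name,)
    ).fetchone()[0]
    # ... and every bioentry row carrying that id is part of it.
    return conn.execute(
        "SELECT COUNT(bioentry_id) FROM bioentry WHERE biodatabase_id=?", (dbid,)
    ).fetchone()[0]

print(namespace_size(conn, "orchids"))
print(namespace_size(conn, "plasmids"))
```

The second query has the same shape as the one used by __len__ in the listing further down, with SQLite's ? placeholders in place of %s.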


Constructor & Destructor Documentation

def BioSQL.BioSeqDatabase.BioSeqDatabase.__init__ (   self,
  adaptor,
  name 
)

Definition at line 448 of file BioSeqDatabase.py.

    def __init__(self, adaptor, name):
        self.adaptor = adaptor
        self.name = name
        self.dbid = self.adaptor.fetch_dbid_by_dbname(name)


Member Function Documentation

def BioSQL.BioSeqDatabase.BioSeqDatabase.__contains__ (   self,
  value
)

Check if a primary (internal) id is in this namespace (sub database).

Definition at line 534 of file BioSeqDatabase.py.

    def __contains__(self, value):
        """Check if a primary (internal) id is in this namespace (sub database)."""
        sql = "SELECT COUNT(bioentry_id) FROM bioentry " + \
              "WHERE biodatabase_id=%s AND bioentry_id=%s;"
        #The bioentry_id field is an integer in the schema.
        #PostgreSQL will throw an error if we use a non integer in the query.
        try:
            bioentry_id = int(value)
        except ValueError:
            return False
        return bool(self.adaptor.execute_and_fetch_col0(sql,
                                                  (self.dbid, bioentry_id))[0])
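
The membership check can be tried against an in-memory SQLite database. This is only a sketch: the two-column bioentry table, the sample rows and the contains helper are assumptions for illustration, and SQLite's ? placeholders stand in for the %s style used in the listing.

```python
import sqlite3

# Hypothetical, minimal stand-in for the BioSQL bioentry table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY, biodatabase_id INTEGER)")
conn.executemany("INSERT INTO bioentry VALUES (?, ?)", [(1, 10), (2, 10), (3, 20)])

def contains(conn, dbid, value):
    """Return True if value is a bioentry_id within namespace dbid."""
    try:
        bioentry_id = int(value)  # the column is an integer
    except ValueError:
        return False  # a non-numeric key can never match
    sql = ("SELECT COUNT(bioentry_id) FROM bioentry "
           "WHERE biodatabase_id=? AND bioentry_id=?")
    (count,) = conn.execute(sql, (dbid, bioentry_id)).fetchone()
    return bool(count)

print(contains(conn, 10, 1))         # id 1 is in namespace 10
print(contains(conn, 10, "2"))       # numeric strings are converted first
print(contains(conn, 10, 3))         # id 3 exists, but in namespace 20
print(contains(conn, 10, "X77802"))  # an accession is not a primary id
```

Converting to int before querying means an accession string simply reports "not present" rather than triggering a database error.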
    
def BioSQL.BioSeqDatabase.BioSeqDatabase.__delitem__ (   self,
  key
)

Remove an entry and all its annotation.

Definition at line 519 of file BioSeqDatabase.py.

    def __delitem__(self, key):
        """Remove an entry and all its annotation."""
        if key not in self:
            raise KeyError(key)
        #Assuming this will automatically cascade to the other tables...
        sql = "DELETE FROM bioentry " + \
              "WHERE biodatabase_id=%s AND bioentry_id=%s;"
        self.adaptor.execute(sql, (self.dbid, key))
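
The comment in the listing assumes that the delete cascades to dependent tables. A minimal sketch of that behaviour with SQLite follows; the simplified two-table schema is an assumption for illustration, and SQLite only enforces the cascade once foreign keys are switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY, biodatabase_id INTEGER)")
# Hypothetical, simplified child table that points back at bioentry.
conn.execute(
    "CREATE TABLE biosequence ("
    " bioentry_id INTEGER REFERENCES bioentry(bioentry_id) ON DELETE CASCADE,"
    " seq TEXT)"
)
conn.execute("INSERT INTO bioentry VALUES (1, 10)")
conn.execute("INSERT INTO biosequence VALUES (1, 'ACGT')")

# Deleting the parent row removes the dependent row as well.
conn.execute("DELETE FROM bioentry WHERE biodatabase_id=? AND bioentry_id=?", (10, 1))
remaining = conn.execute("SELECT COUNT(*) FROM biosequence").fetchone()[0]
print(remaining)
```

If the schema in use does not declare cascading deletes, the dependent rows would have to be removed explicitly.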

def BioSQL.BioSeqDatabase.BioSeqDatabase.__getitem__ (   self,
  key
)

Definition at line 516 of file BioSeqDatabase.py.

    def __getitem__(self, key):
        return BioSeq.DBSeqRecord(self.adaptor, key)

def BioSQL.BioSeqDatabase.BioSeqDatabase.__iter__ (   self )

Iterate over ids (which may not be meaningful outside this database).

Definition at line 547 of file BioSeqDatabase.py.

    def __iter__(self):
        """Iterate over ids (which may not be meaningful outside this database)."""
        #TODO - Iterate over the cursor, much more efficient
        return iter(self.adaptor.list_bioentry_ids(self.dbid))

def BioSQL.BioSeqDatabase.BioSeqDatabase.__len__ (   self )

Number of records in this namespace (sub database).

Definition at line 528 of file BioSeqDatabase.py.

    def __len__(self):
        """Number of records in this namespace (sub database)."""
        sql = "SELECT COUNT(bioentry_id) FROM bioentry " + \
              "WHERE biodatabase_id=%s;"
        return int(self.adaptor.execute_and_fetch_col0(sql, (self.dbid,))[0])

def BioSQL.BioSeqDatabase.BioSeqDatabase.__repr__ (   self )

Definition at line 453 of file BioSeqDatabase.py.

    def __repr__(self):
        return "BioSeqDatabase(%r, %r)" % (self.adaptor, self.name)
        
def BioSQL.BioSeqDatabase.BioSeqDatabase.get_all_primary_ids (   self )

All the primary_ids of the sequences in the database (OBSOLETE).

These may be ids (display style) or accession numbers or
something else completely different - they *are not*
meaningful outside of this database implementation.

Please use .keys() instead of .get_all_primary_ids()

Definition at line 501 of file BioSeqDatabase.py.

    def get_all_primary_ids(self):
        """All the primary_ids of the sequences in the database (OBSOLETE).

        These may be ids (display style) or accession numbers or
        something else completely different - they *are not*
        meaningful outside of this database implementation.

        Please use .keys() instead of .get_all_primary_ids()
        """
        import warnings
        warnings.warn("Use bio_seq_database.keys() instead of "
                      "bio_seq_database.get_all_primary_ids()",
                      PendingDeprecationWarning)
        return self.keys()
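
The deprecation wrapper pattern used here can be exercised in isolation. In this sketch the Namespace class and its fixed id list are hypothetical stand-ins; only the use of warnings.warn with PendingDeprecationWarning mirrors the listing.

```python
import warnings

class Namespace(object):
    """Hypothetical stand-in that uses the same wrapper pattern."""

    def keys(self):
        return [1, 2, 3]

    def get_all_primary_ids(self):
        # Warn, then delegate to the preferred method.
        warnings.warn("Use keys() instead of get_all_primary_ids()",
                      PendingDeprecationWarning)
        return self.keys()

with warnings.catch_warnings(record=True) as caught:
    # PendingDeprecationWarning is ignored by the default filters,
    # so it has to be enabled explicitly to be observed.
    warnings.simplefilter("always")
    ids = Namespace().get_all_primary_ids()

print(ids)
print(caught[0].category.__name__)
```

Because the warning is hidden by default, callers of the obsolete method may never notice it unless they run with warnings enabled.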

def BioSQL.BioSeqDatabase.BioSeqDatabase.get_Seq_by_acc (   self,
  name
)

Gets a DBSeqRecord object by accession number.

Example: seq_rec = db.get_Seq_by_acc('X77802')

The name of this method is misleading since it returns a DBSeqRecord
rather than a DBSeq object, and presumably was to mirror BioPerl.

Definition at line 467 of file BioSeqDatabase.py.

    def get_Seq_by_acc(self, name):
        """Gets a DBSeqRecord object by accession number.

        Example: seq_rec = db.get_Seq_by_acc('X77802')

        The name of this method is misleading since it returns a DBSeqRecord
        rather than a DBSeq object, and presumably was to mirror BioPerl.
        """
        seqid = self.adaptor.fetch_seqid_by_accession(self.dbid, name)
        return BioSeq.DBSeqRecord(self.adaptor, seqid)

def BioSQL.BioSeqDatabase.BioSeqDatabase.get_Seq_by_id (   self,
  name
)

Gets a DBSeqRecord object by its name.

Example: seq_rec = db.get_Seq_by_id('ROA1_HUMAN')

The name of this method is misleading since it returns a DBSeqRecord
rather than a DBSeq object, and presumably was to mirror BioPerl.

Definition at line 456 of file BioSeqDatabase.py.

    def get_Seq_by_id(self, name):
        """Gets a DBSeqRecord object by its name.

        Example: seq_rec = db.get_Seq_by_id('ROA1_HUMAN')

        The name of this method is misleading since it returns a DBSeqRecord
        rather than a DBSeq object, and presumably was to mirror BioPerl.
        """
        seqid = self.adaptor.fetch_seqid_by_display_id(self.dbid, name)
        return BioSeq.DBSeqRecord(self.adaptor, seqid)

def BioSQL.BioSeqDatabase.BioSeqDatabase.get_Seq_by_primary_id (   self,
  seqid
)

Get a DBSeqRecord by the primary (internal) id (OBSOLETE).

Rather than db.get_Seq_by_primary_id(my_id) use db[my_id]

The name of this method is misleading since it returns a DBSeqRecord
rather than a DBSeq object, and presumably was to mirror BioPerl.

Definition at line 607 of file BioSeqDatabase.py.

    def get_Seq_by_primary_id(self, seqid):
        """Get a DBSeqRecord by the primary (internal) id (OBSOLETE).

        Rather than db.get_Seq_by_primary_id(my_id) use db[my_id]

        The name of this method is misleading since it returns a DBSeqRecord
        rather than a DBSeq object, and presumably was to mirror BioPerl.
        """
        import warnings
        warnings.warn("Use bio_seq_database[my_id] instead of "
                      "bio_seq_database.get_Seq_by_primary_id(my_id)",
                      PendingDeprecationWarning)
        return self[seqid]

def BioSQL.BioSeqDatabase.BioSeqDatabase.get_Seq_by_ver (   self,
  name
)

Gets a DBSeqRecord object by version number.

Example: seq_rec = db.get_Seq_by_ver('X77802.1')

The name of this method is misleading since it returns a DBSeqRecord
rather than a DBSeq object, and presumably was to mirror BioPerl.

Definition at line 478 of file BioSeqDatabase.py.

    def get_Seq_by_ver(self, name):
        """Gets a DBSeqRecord object by version number.

        Example: seq_rec = db.get_Seq_by_ver('X77802.1')

        The name of this method is misleading since it returns a DBSeqRecord
        rather than a DBSeq object, and presumably was to mirror BioPerl.
        """
        seqid = self.adaptor.fetch_seqid_by_version(self.dbid, name)
        return BioSeq.DBSeqRecord(self.adaptor, seqid)

def BioSQL.BioSeqDatabase.BioSeqDatabase.get_Seqs_by_acc (   self,
  name
)

Gets a list of DBSeqRecord objects by accession number.

Example: seq_recs = db.get_Seqs_by_acc('X77802')

The name of this method is misleading since it returns a list of
DBSeqRecord objects rather than a list of DBSeq objects, and presumably
was to mirror BioPerl.

Definition at line 489 of file BioSeqDatabase.py.

    def get_Seqs_by_acc(self, name):
        """Gets a list of DBSeqRecord objects by accession number.

        Example: seq_recs = db.get_Seqs_by_acc('X77802')

        The name of this method is misleading since it returns a list of
        DBSeqRecord objects rather than a list of DBSeq objects, and presumably
        was to mirror BioPerl.
        """
        seqids = self.adaptor.fetch_seqids_by_accession(self.dbid, name)
        return [BioSeq.DBSeqRecord(self.adaptor, seqid) for seqid in seqids]

def BioSQL.BioSeqDatabase.BioSeqDatabase.items (   self )

List of (id, DBSeqRecord) for the namespace (sub database).

Definition at line 562 of file BioSeqDatabase.py.

        def items(self):
            """List of (id, DBSeqRecord) for the namespace (sub database)."""
            return [(key, self[key]) for key in self.keys()]
        

def BioSQL.BioSeqDatabase.BioSeqDatabase.items (   self )

Iterate over (id, DBSeqRecord) for the namespace (sub database).

Definition at line 590 of file BioSeqDatabase.py.

        def items(self):
            """Iterate over (id, DBSeqRecord) for the namespace (sub database)."""
            for key in self:
                yield key, self[key]
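
The iterator form of items() only relies on __iter__ and __getitem__, which a small stand-in class can illustrate. The Namespace class below and its dictionary of record strings are hypothetical; a real namespace would fetch each record from the database instead.

```python
class Namespace(object):
    """Hypothetical stand-in: ids map to record strings held in a dict."""

    def __init__(self, records):
        self._records = records

    def __iter__(self):
        # Sorted only to give the sketch a stable order.
        return iter(sorted(self._records))

    def __getitem__(self, key):
        return self._records[key]

    def items(self):
        # Lazily pair each id with its record, one lookup per id.
        for key in self:
            yield key, self[key]

ns = Namespace({2: "rec-2", 1: "rec-1"})
pairs = ns.items()
first = next(pairs)   # only the first record has been looked up so far
rest = list(pairs)    # the remaining lookups happen here
print(first)
print(rest)
```

Compared with the list-returning variant, the generator avoids building every record up front, which matters when each lookup is a database round trip.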

def BioSQL.BioSeqDatabase.BioSeqDatabase.iteritems (   self )

Iterate over (id, DBSeqRecord) for the namespace (sub database).

Definition at line 575 of file BioSeqDatabase.py.

        def iteritems(self):
            """Iterate over (id, DBSeqRecord) for the namespace (sub database)."""
            for key in self:
                yield key, self[key]

def BioSQL.BioSeqDatabase.BioSeqDatabase.iterkeys (   self )

Iterate over ids (which may not be meaningful outside this database).

Definition at line 566 of file BioSeqDatabase.py.

        def iterkeys(self):
            """Iterate over ids (which may not be meaningful outside this database)."""
            return iter(self)
    
def BioSQL.BioSeqDatabase.BioSeqDatabase.itervalues (   self )

Iterate over DBSeqRecord objects in the namespace (sub database).

Definition at line 570 of file BioSeqDatabase.py.

        def itervalues(self):
            """Iterate over DBSeqRecord objects in the namespace (sub database)."""
            for key in self:
                yield self[key]
            

def BioSQL.BioSeqDatabase.BioSeqDatabase.keys (   self )

List of ids which may not be meaningful outside this database.

Definition at line 554 of file BioSeqDatabase.py.

        def keys(self):
            """List of ids which may not be meaningful outside this database."""
            return self.adaptor.list_bioentry_ids(self.dbid)

def BioSQL.BioSeqDatabase.BioSeqDatabase.keys (   self )

Iterate over ids (which may not be meaningful outside this database).

Definition at line 581 of file BioSeqDatabase.py.

        def keys(self):
            """Iterate over ids (which may not be meaningful outside this database)."""
            return iter(self)
            


def BioSQL.BioSeqDatabase.BioSeqDatabase.load (   self,
  record_iterator,
  fetch_NCBI_taxonomy = False 
)

Load a set of SeqRecords into the BioSQL database.

record_iterator is either a list of SeqRecord objects, or an
iterator object that returns SeqRecord objects (such as the
output from the Bio.SeqIO.parse() function), which will be
used to populate the database.

fetch_NCBI_taxonomy is a boolean flag allowing or preventing
connection to the taxonomic database on the NCBI server
(via Bio.Entrez) to fetch a detailed taxonomy for each
SeqRecord.

Example:

    from Bio import SeqIO
    count = db.load(SeqIO.parse(open(filename), format))

Returns the number of records loaded.

Definition at line 621 of file BioSeqDatabase.py.

    def load(self, record_iterator, fetch_NCBI_taxonomy=False):
        """Load a set of SeqRecords into the BioSQL database.

        record_iterator is either a list of SeqRecord objects, or an
        iterator object that returns SeqRecord objects (such as the
        output from the Bio.SeqIO.parse() function), which will be
        used to populate the database.

        fetch_NCBI_taxonomy is a boolean flag allowing or preventing
        connection to the taxonomic database on the NCBI server
        (via Bio.Entrez) to fetch a detailed taxonomy for each
        SeqRecord.

        Example:
        from Bio import SeqIO
        count = db.load(SeqIO.parse(open(filename), format))

        Returns the number of records loaded.
        """
        db_loader = Loader.DatabaseLoader(self.adaptor, self.dbid, \
                                          fetch_NCBI_taxonomy)
        num_records = 0
        global _POSTGRES_RULES_PRESENT
        for cur_record in record_iterator:
            num_records += 1
            #Hack to work around BioSQL Bug 2839 - If using PostgreSQL and
            #the RULES are present check for a duplicate record before loading
            if _POSTGRES_RULES_PRESENT:
                #Recreate what the Loader's _load_bioentry_table will do:
                if cur_record.id.count(".") == 1:
                    accession, version = cur_record.id.split('.')
                    try:
                        version = int(version)
                    except ValueError:
                        accession = cur_record.id
                        version = 0
                else:
                    accession = cur_record.id
                    version = 0
                gi = cur_record.annotations.get("gi", None)
                sql = "SELECT bioentry_id FROM bioentry WHERE (identifier " + \
                      "= '%s' AND biodatabase_id = '%s') OR (accession = " + \
                      "'%s' AND version = '%s' AND biodatabase_id = '%s')"
                self.adaptor.execute(sql % (gi, self.dbid, accession, version, self.dbid))
                if self.adaptor.cursor.fetchone():
                    raise self.adaptor.conn.IntegrityError("Duplicate record "
                                     "detected: record has not been inserted")
            #End of hack
            db_loader.load_seqrecord(cur_record)
        return num_records
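
The accession/version handling inside the duplicate check can be isolated as a small function. The name split_accession_version is hypothetical; the branching follows the listing above, including the fallback to version 0.

```python
def split_accession_version(record_id):
    """Split 'X77802.1' into ('X77802', 1); otherwise fall back to version 0."""
    if record_id.count(".") == 1:
        accession, version = record_id.split(".")
        try:
            return accession, int(version)
        except ValueError:
            pass  # the suffix is not a number: treat the whole id as the accession
    return record_id, 0

print(split_accession_version("X77802.1"))
print(split_accession_version("ROA1_HUMAN"))
print(split_accession_version("a.b"))
print(split_accession_version("1.2.3"))
```

Only ids with exactly one dot and a numeric suffix are split; everything else is kept whole with version 0.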

def BioSQL.BioSeqDatabase.BioSeqDatabase.lookup (   self,
  kwargs
)

Definition at line 595 of file BioSeqDatabase.py.

    def lookup(self, **kwargs):
        if len(kwargs) != 1:
            raise TypeError("single key/value parameter expected")
        k, v = kwargs.items()[0]
        if k not in _allowed_lookups:
            raise TypeError("lookup() expects one of %s, not %r" % \
                            (repr(_allowed_lookups.keys())[1:-1], repr(k)))
        lookup_name = _allowed_lookups[k]
        lookup_func = getattr(self.adaptor, lookup_name)
        seqid = lookup_func(self.dbid, v)
        return BioSeq.DBSeqRecord(self.adaptor, seqid)
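
The keyword dispatch in lookup() can be sketched with a stub adaptor. The ALLOWED mapping, the StubAdaptor class and its canned ids are assumptions for illustration, since the real _allowed_lookups dictionary is not shown on this page. The sketch also unpacks the single keyword pair rather than indexing kwargs.items(), because indexing only works where items() returns a list (Python 2).

```python
# Hypothetical mapping from keyword name to adaptor method name.
ALLOWED = {
    "accession": "fetch_seqid_by_accession",
    "display_id": "fetch_seqid_by_display_id",
}

class StubAdaptor(object):
    """Stand-in adaptor returning canned ids instead of querying a database."""

    def fetch_seqid_by_accession(self, dbid, name):
        return {"X77802": 7}[name]

    def fetch_seqid_by_display_id(self, dbid, name):
        return {"ROA1_HUMAN": 9}[name]

def lookup(adaptor, dbid, **kwargs):
    if len(kwargs) != 1:
        raise TypeError("single key/value parameter expected")
    (key, value), = kwargs.items()  # exactly one pair, on Python 2 and 3
    if key not in ALLOWED:
        raise TypeError("lookup() expects one of %s, not %r"
                        % (", ".join(sorted(ALLOWED)), key))
    # Resolve the adaptor method by name and call it.
    return getattr(adaptor, ALLOWED[key])(dbid, value)

print(lookup(StubAdaptor(), 1, accession="X77802"))
print(lookup(StubAdaptor(), 1, display_id="ROA1_HUMAN"))
```

The sketch returns the raw id; the method in the listing goes one step further and wraps it in a BioSeq.DBSeqRecord.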
        
def BioSQL.BioSeqDatabase.BioSeqDatabase.values (   self )

List of DBSeqRecord objects in the namespace (sub database).

Definition at line 558 of file BioSeqDatabase.py.

        def values(self):
            """List of DBSeqRecord objects in the namespace (sub database)."""
            return [self[key] for key in self.keys()]
    

def BioSQL.BioSeqDatabase.BioSeqDatabase.values (   self )

Iterate over DBSeqRecord objects in the namespace (sub database).

Definition at line 585 of file BioSeqDatabase.py.

        def values(self):
            """Iterate over DBSeqRecord objects in the namespace (sub database)."""
            for key in self:
                yield self[key]
    


Member Data Documentation

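BioSQL.BioSeqDatabase.BioSeqDatabase.adaptor

Definition at line 449 of file BioSeqDatabase.py.

BioSQL.BioSeqDatabase.BioSeqDatabase.dbid

Definition at line 451 of file BioSeqDatabase.py.

BioSQL.BioSeqDatabase.BioSeqDatabase.name

Definition at line 450 of file BioSeqDatabase.py.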


The documentation for this class was generated from the following file: