python-biopython  1.60
Bio.SeqUtils.CheckSum Namespace Reference

Functions

def crc32
def _init_table_h
def crc64
def gcg
def seguid

Variables

tuple _table_h = _init_table_h()
string str_light_chain_one = "QSALTQPASVSGSPGQSITISCTGTSSDVGSYNLVSWYQQHPGK"
string str_light_chain_two = "QSALTQPASVSGSPGQSITISCTGTSSDVGSYNLVSWYQQHPGK"

Function Documentation

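def Bio.SeqUtils.CheckSum._init_table_h()

Builds the 256-entry table of high-order 32-bit words that crc64 looks up for each input byte.

Definition at line 28 of file CheckSum.py.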

def _init_table_h():
    _table_h = []
    for i in range(256):
        # l and part_h are the low and high 32-bit halves of a 64-bit value
        l = i
        part_h = 0
        for j in range(8):
            rflag = l & 1
            l >>= 1
            # Carry the bit shifted out of the high half into the low half
            if part_h & 1: l |= (1 << 31)
            part_h >>= 1
            if rflag: part_h ^= 0xd8000000
        _table_h.append(part_h)
    return _table_h

def Bio.SeqUtils.CheckSum.crc32(seq)

Returns the crc32 checksum for a sequence (string or Seq object).

Definition at line 16 of file CheckSum.py.

def crc32(seq):
    """Returns the crc32 checksum for a sequence (string or Seq object)."""
    #NOTE - On Python 2 returns a signed int, on Python 3 it is unsigned
    #Docs suggest should use crc32(x) & 0xffffffff for consistency.
    #TODO - Should we return crc32(x) & 0xffffffff here?
    try:
        #Assume it's a Seq object
        return _crc32(_as_bytes(seq.tostring()))
    except AttributeError:
        #Assume it's a string/unicode
        return _crc32(_as_bytes(seq))
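
The comments in the listing note that the raw result is signed on Python 2 and unsigned on Python 3, and that masking with 0xffffffff would make it consistent. A minimal sketch of that masking, using only the standard library and assuming the sequence is plain ASCII text; the helper name crc32_unsigned is hypothetical and not part of Bio.SeqUtils.CheckSum:

```python
import binascii


def crc32_unsigned(seq):
    """Hypothetical helper: CRC-32 of a sequence as an unsigned 32-bit int."""
    # str() covers plain strings and objects that render as their sequence;
    # ASCII encoding is an assumption that holds for typical sequence data.
    data = str(seq).encode("ascii")
    # The mask yields the same non-negative value on Python 2 and Python 3.
    return binascii.crc32(data) & 0xFFFFFFFF


checksum = crc32_unsigned("QSALTQPASVSGSPGQSITISCTGTSSDVGSYNLVSWYQQHPGK")
print("%08X" % checksum)
```

Formatting the masked value with %08X gives a fixed-width hexadecimal string, which is convenient when comparing checksums produced on different Python versions.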

def Bio.SeqUtils.CheckSum.crc64(s)

Returns the crc64 checksum for a sequence (string or Seq object).

Definition at line 45 of file CheckSum.py.

def crc64(s):
    """Returns the crc64 checksum for a sequence (string or Seq object)."""
    crcl = 0
    crch = 0
    for c in s:
        shr = (crch & 0xFF) << 24
        temp1h = crch >> 8
        temp1l = (crcl >> 8) | shr
        idx  = (crcl ^ ord(c)) & 0xFF
        crch = temp1h ^ _table_h[idx]
        crcl = temp1l

    return "CRC-%08X%08X" % (crch, crcl)

def Bio.SeqUtils.CheckSum.gcg(seq)

Returns the GCG checksum (int) for a sequence (string or Seq object).

Given a nucleotide or amino-acid sequence (or any string),
returns the GCG checksum (int), the checksum used by the GCG program.
seq type = str.
Based on BioPerl GCG_checksum. Adapted by Sebastian Bassi
with the help of John Lenton, Pablo Ziliani, and Gabriel Genellina.
All sequences are converted to uppercase.

Definition at line 60 of file CheckSum.py.

def gcg(seq):
    """Returns the GCG checksum (int) for a sequence (string or Seq object).

    Given a nucleotide or amino-acid sequence (or any string),
    returns the GCG checksum (int). Checksum used by GCG program.
    seq type = str.
    Based on BioPerl GCG_checksum. Adapted by Sebastian Bassi
    with the help of John Lenton, Pablo Ziliani, and Gabriel Genellina.
    All sequences are converted to uppercase """
    try:
        #Assume it's a Seq object
        seq = seq.tostring()
    except AttributeError:
        #Assume it's a string
        pass
    index = checksum = 0
    for char in seq:
        index += 1
        checksum += index * ord(char.upper())
        #Position weights run from 1 to 57, then start again at 1
        if index == 57: index = 0
    return checksum % 10000
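
The loop weights each character code by its position, cycling the weight from 1 to 57, and keeps the last four decimal digits of the sum. A small worked sketch for plain strings follows; gcg_checksum is a hypothetical stand-alone name used here so the example runs without Biopython:

```python
def gcg_checksum(seq):
    """Hypothetical re-implementation of the listing's loop for plain strings."""
    checksum = 0
    for position, char in enumerate(seq):
        # Weights cycle 1, 2, ..., 57 and then restart at 1.
        weight = position % 57 + 1
        checksum += weight * ord(char.upper())
    return checksum % 10000


# "ACGT": 65*1 + 67*2 + 71*3 + 84*4 = 748
print(gcg_checksum("ACGT"))
# Lowercase input gives the same result because of the upper() call.
print(gcg_checksum("acgt"))
```

Because the sum is reduced modulo 10000, the result always lies between 0 and 9999, so distinct sequences can share a GCG checksum.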

def Bio.SeqUtils.CheckSum.seguid(seq)

Returns the SEGUID (string) for a sequence (string or Seq object).

Given a nucleotide or amino-acid sequence (or any string),
returns the SEGUID string (a SEquence Globally Unique IDentifier).
seq type = str.
For more information about SEGUID, see:
http://bioinformatics.anl.gov/seguid/
DOI: 10.1002/pmic.200600032

Definition at line 82 of file CheckSum.py.

def seguid(seq):
    """Returns the SEGUID (string) for a sequence (string or Seq object).

    Given a nucleotide or amino-acid sequence (or any string),
    returns the SEGUID string (A SEquence Globally Unique IDentifier).
    seq type = str.
    For more information about SEGUID, see:
    http://bioinformatics.anl.gov/seguid/
    DOI: 10.1002/pmic.200600032 """
    try:
        #Python 2.5 sha1 is in hashlib
        import hashlib
        m = hashlib.sha1()
    except:
        #For older versions
        import sha
        m = sha.new()
    import base64
    try:
        #Assume it's a Seq object
        seq = seq.tostring()
    except AttributeError:
        #Assume it's a string
        pass
    m.update(_as_bytes(seq.upper()))
    try:
        #For Python 3+
        return base64.encodebytes(m.digest()).decode().replace("\n","").rstrip("=")
    except AttributeError:
        pass
    try:
        #For Python 2.5+
        return base64.b64encode(m.digest()).rstrip("=")
    except:
        #For older versions
        import os
        #Note: Using os.linesep doesn't work on Windows,
        #where os.linesep= "\r\n" but the encoded string
        #contains "\n" but not "\r\n"
        return base64.encodestring(m.digest()).replace("\n","").rstrip("=")
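
Most of the listing is version fallbacks; the computation itself is a SHA-1 digest of the uppercased sequence, Base64-encoded with the trailing "=" padding removed. A compact sketch for Python 3 only, assuming ASCII sequence text; seguid_sketch is a hypothetical name, not the library function:

```python
import base64
import hashlib


def seguid_sketch(seq):
    """Hypothetical Python 3 condensation of the SEGUID steps in the listing."""
    # Uppercase first so that case differences do not change the identifier.
    data = str(seq).upper().encode("ascii")
    digest = hashlib.sha1(data).digest()  # 20 raw bytes
    # b64encode emits no newlines, so only the "=" padding needs stripping.
    return base64.b64encode(digest).decode("ascii").rstrip("=")


print(seguid_sketch("QSALTQPASVSGSPGQSITISCTGTSSDVGSYNLVSWYQQHPGK"))
```

A 20-byte digest encodes to 28 Base64 characters including one "=", so the identifier is always 27 characters long.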

Variable Documentation

tuple Bio.SeqUtils.CheckSum._table_h = _init_table_h()

Definition at line 43 of file CheckSum.py.

string Bio.SeqUtils.CheckSum.str_light_chain_one = "QSALTQPASVSGSPGQSITISCTGTSSDVGSYNLVSWYQQHPGK"

Definition at line 126 of file CheckSum.py.

string Bio.SeqUtils.CheckSum.str_light_chain_two = "QSALTQPASVSGSPGQSITISCTGTSSDVGSYNLVSWYQQHPGK"

Definition at line 130 of file CheckSum.py.