python-biopython  1.60
test_prodoc.TestProdocParse Class Reference

Public Member Functions

def test_parse_pdoc

Detailed Description

Definition at line 837 of file test_prodoc.py.


Member Function Documentation

Definition at line 839 of file test_prodoc.py.

00839 
00840     def test_parse_pdoc(self):
00841         "Parsing an excerpt of prosite.doc"
00842         filename = os.path.join('Prosite', 'Doc', 'prosite.excerpt.doc')
00843         handle = open(filename)
00844         records = Prodoc.parse(handle)
00845 
00846         # Testing the first parsed record
00847         record = next(records)
00848         self.assertEqual(record.accession, "PDOC00000")
00849         self.assertEqual(len(record.prosite_refs), 0)
00850         self.assertEqual(record.text, """\
00851 **********************************
00852 *** PROSITE documentation file ***
00853 **********************************
00854 
00855 Release 20.43 of 10-Feb-2009.
00856 
00857 PROSITE is developed by the Swiss Institute of Bioinformatics (SIB) under
00858 the responsability of Amos Bairoch and Nicolas Hulo.
00859 
00860 This release was prepared by: Nicolas Hulo, Virginie Bulliard, Petra
00861 Langendijk-Genevaux and Christian Sigrist with the help of Edouard
00862 de Castro, Lorenzo Cerutti, Corinne Lachaize and Amos Bairoch.
00863 
00864 
00865 See: http://www.expasy.org/prosite/
00866 Email: prosite@expasy.org
00867 
00868 Acknowledgements:
00869 
00870  - To all those mentioned in this document who have reviewed the entry(ies)
00871    for which they are listed as experts. With specific thanks to Rein Aasland,
00872    Mark Boguski, Peer Bork, Josh Cherry, Andre Chollet, Frank Kolakowski,
00873    David Landsman, Bernard Henrissat, Eugene Koonin, Steve Henikoff, Manuel
00874    Peitsch and Jonathan Reizer.
00875  - Jim Apostolopoulos is the author of the PDOC00699 entry.
00876  - Brigitte Boeckmann is the author of the PDOC00691, PDOC00703, PDOC00829,
00877    PDOC00796, PDOC00798, PDOC00799, PDOC00906, PDOC00907, PDOC00908,
00878    PDOC00912, PDOC00913, PDOC00924, PDOC00928, PDOC00929, PDOC00955,
00879    PDOC00961, PDOC00966, PDOC00988 and PDOC50020 entries.
00880  - Jean-Louis Boulay is the author of the PDOC01051, PDOC01050, PDOC01052,
00881    PDOC01053 and PDOC01054 entries.
00882  - Ryszard Brzezinski is the author of the PDOC60000 entry.
00883  - Elisabeth Coudert is the author of the PDOC00373 entry.
00884  - Kirill Degtyarenko is the author of the PDOC60001 entry.
00885  - Christian Doerig is the author of the PDOC01049 entry.
00886  - Kay Hofmann is the author of the PDOC50003, PDOC50006, PDOC50007 and
00887    PDOC50017 entries.
00888  - Chantal Hulo is the author of the PDOC00987 entry.
00889  - Karine Michoud is the author of the PDOC01044 and PDOC01042 entries.
00890  - Yuri Panchin is the author of the PDOC51013 entry.
00891  - S. Ramakumar is the author of the PDOC51052, PDOC60004, PDOC60010,
00892    PDOC60011, PDOC60015, PDOC60016, PDOC60018, PDOC60020, PDOC60021,
00893    PDOC60022, PDOC60023, PDOC60024, PDOC60025, PDOC60026, PDOC60027,
00894    PDOC60028, PDOC60029 and PDOC60030 entries.
00895  - Keith Robison is the author of the PDOC00830 and PDOC00861 entries.
00896 
00897    ------------------------------------------------------------------------
00898    PROSITE is copyright.   It  is  produced  by  the  Swiss  Institute   of
00899    Bioinformatics (SIB). There are no restrictions on its use by non-profit
00900    institutions as long as its  content is in no way modified. Usage by and
00901    for commercial  entities requires a license agreement.   For information
00902    about  the  licensing  scheme   send  an  email to license@isb-sib.ch or
00903    see: http://www.expasy.org/prosite/prosite_license.htm.
00904    ------------------------------------------------------------------------
00905 
00906 """)
00907 
00908         # Testing the second parsed record
00909         record = next(records)
00910         self.assertEqual(record.accession, "PDOC00001")
00911         self.assertEqual(len(record.prosite_refs), 1)
00912         self.assertEqual(record.prosite_refs[0], ("PS00001", "ASN_GLYCOSYLATION"))
00913         self.assertEqual(record.text, """\
00914 ************************
00915 * N-glycosylation site *
00916 ************************
00917 
00918 It has been known for a long time [1] that potential N-glycosylation sites are
00919 specific to the consensus sequence Asn-Xaa-Ser/Thr.  It must be noted that the
00920 presence of the consensus  tripeptide  is  not sufficient  to conclude that an
00921 asparagine residue is glycosylated, due to  the fact that the  folding of  the
00922 protein plays an important  role in the  regulation of N-glycosylation [2]. It
00923 has been shown [3] that  the  presence of proline between Asn and Ser/Thr will
00924 inhibit N-glycosylation; this  has  been confirmed by a recent [4] statistical
00925 analysis of glycosylation sites, which also  shows that about 50% of the sites
00926 that have a proline C-terminal to Ser/Thr are not glycosylated.
00927 
00928 It must also  be noted that there  are  a few  reported cases of glycosylation
00929 sites with the pattern Asn-Xaa-Cys; an  experimentally demonstrated occurrence
00930 of such a non-standard site is found in the plasma protein C [5].
00931 
00932 -Consensus pattern: N-{P}-[ST]-{P}
00933                     [N is the glycosylation site]
00934 -Last update: May 1991 / Text revised.
00935 
00936 """)
00937         self.assertEqual(record.references[ 0].number, "1")
00938         self.assertEqual(record.references[ 0].authors, "Marshall R.D.")
00939         self.assertEqual(record.references[ 0].citation, """\
00940 "Glycoproteins."
00941 Annu. Rev. Biochem. 41:673-702(1972).
00942 PubMed=4563441; DOI=10.1146/annurev.bi.41.070172.003325""")
00943         self.assertEqual(record.references[ 1].number, "2")
00944         self.assertEqual(record.references[ 1].authors, "Pless D.D., Lennarz W.J.")
00945         self.assertEqual(record.references[ 1].citation, """\
00946 "Enzymatic conversion of proteins to glycoproteins."
00947 Proc. Natl. Acad. Sci. U.S.A. 74:134-138(1977).
00948 PubMed=264667""")
00949         self.assertEqual(record.references[ 2].number, "3")
00950         self.assertEqual(record.references[ 2].authors, "Bause E.")
00951         self.assertEqual(record.references[ 2].citation, """\
00952 "Structural requirements of N-glycosylation of proteins. Studies with
00953 proline peptides as conformational probes."
00954 Biochem. J. 209:331-336(1983).
00955 PubMed=6847620""")
00956         self.assertEqual(record.references[ 3].number, "4")
00957         self.assertEqual(record.references[ 3].authors, "Gavel Y., von Heijne G.")
00958         self.assertEqual(record.references[ 3].citation, """\
00959 "Sequence differences between glycosylated and non-glycosylated
00960 Asn-X-Thr/Ser acceptor sites: implications for protein engineering."
00961 Protein Eng. 3:433-442(1990).
00962 PubMed=2349213""")
00963         self.assertEqual(record.references[ 4].number, "5")
00964         self.assertEqual(record.references[ 4].authors, "Miletich J.P., Broze G.J. Jr.")
00965         self.assertEqual(record.references[ 4].citation, """\
00966 "Beta protein C is not glycosylated at asparagine 329. The rate of
00967 translation may influence the frequency of usage at
00968 asparagine-X-cysteine sites."
00969 J. Biol. Chem. 265:11397-11404(1990).
00970 PubMed=1694179""")
00971 
00972         # Testing the third parsed record
00973         record = next(records)
00974         self.assertEqual(record.accession, "PDOC00004")
00975         self.assertEqual(len(record.prosite_refs), 1)
00976         self.assertEqual(record.prosite_refs[0], ("PS00004", "CAMP_PHOSPHO_SITE"))
00977         self.assertEqual(record.text, """\
00978 ****************************************************************
00979 * cAMP- and cGMP-dependent protein kinase phosphorylation site *
00980 ****************************************************************
00981 
00982 There has been a  number of studies  relative to the  specificity of cAMP- and
00983 cGMP-dependent protein kinases [1,2,3].  Both types of kinases appear to share
00984 a preference  for  the  phosphorylation  of serine or threonine residues found
00985 close to at least  two consecutive N-terminal  basic residues. It is important
00986 to note that there are quite a number of exceptions to this rule.
00987 
00988 -Consensus pattern: [RK](2)-x-[ST]
00989                     [S or T is the phosphorylation site]
00990 -Last update: June 1988 / First entry.
00991 
00992 """)
00993 
00994         self.assertEqual(record.references[ 0].number, "1")
00995         self.assertEqual(record.references[ 0].authors, "Fremisco J.R., Glass D.B., Krebs E.G.")
00996         self.assertEqual(record.references[ 0].citation, """\
00997 J. Biol. Chem. 255:4240-4245(1980).""")
00998         self.assertEqual(record.references[ 1].number, "2")
00999         self.assertEqual(record.references[ 1].authors, "Glass D.B., Smith S.B.")
01000         self.assertEqual(record.references[ 1].citation, """\
01001 "Phosphorylation by cyclic GMP-dependent protein kinase of a synthetic
01002 peptide corresponding to the autophosphorylation site in the enzyme."
01003 J. Biol. Chem. 258:14797-14803(1983).
01004 PubMed=6317673""")
01005         self.assertEqual(record.references[ 2].number, "3")
01006         self.assertEqual(record.references[ 2].authors, "Glass D.B., el-Maghrabi M.R., Pilkis S.J.")
01007         self.assertEqual(record.references[ 2].citation, """\
01008 "Synthetic peptides corresponding to the site phosphorylated in
01009 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase as substrates of
01010 cyclic nucleotide-dependent protein kinases."
01011 J. Biol. Chem. 261:2987-2993(1986).
01012 PubMed=3005275""")
01013 
01014         # Testing the fourth parsed record
01015         record = next(records)
01016         self.assertEqual(record.accession, "PDOC60030")
01017         self.assertEqual(len(record.prosite_refs), 1)
01018         self.assertEqual(record.prosite_refs[0], ("PS60030", "BACTERIOCIN_IIA"))
01019         self.assertEqual(record.text, """\
01020 ******************************************
01021 * Bacteriocin class IIa family signature *
01022 ******************************************
01023 
01024 Many Gram-positive  bacteria  produce  ribosomally  synthesized  antimicrobial
01025 peptides, often  termed  bacteriocins. One important and well studied class of
01026 bacteriocins is the class IIa or pediocin-like bacteriocins produced by lactic
01027 acid bacteria.  All  class  IIa  bacteriocins  are produced by food-associated
01028 strains, isolated  from  a  variety of food products of industrial and natural
01029 origins, including  meat  products,  dairy  products and vegetables. Class IIa
01030 bacteriocins are all cationic, display anti-Listeria activity, and kill target
01031 cells by permeabilizing the cell membrane [1-3].
01032 
01033 Class IIa  bacteriocins  contain  between  37  and 48 residues. Based on their
01034 primary structures,  the  peptide  chains  of  class  IIa  bacteriocins may be
01035 divided roughly into two regions: a hydrophilic, cationic and highly conserved
01036 N-terminal region,  and  a  less  conserved hydrophobic/amphiphilic C-terminal
01037 region. The  N-terminal  region  contains  the conserved Y-G-N-G-V/L 'pediocin
01038 box' motif  and  two conserved cysteine residues joined by a disulfide bridge.
01039 It forms  a  three-stranded antiparallel beta-sheet supported by the conserved
01040 disulfide bridge  (see <PDB:1OG7>). This cationic N-terminal beta-sheet domain
01041 mediates binding of the class IIa bacteriocin to the target cell membrane. The
01042 C-terminal region forms a hairpin-like domain (see <PDB:1OG7>) that penetrates
01043 into the  hydrophobic  part  of  the  target  cell membrane, thereby mediating
01044 leakage through  the  membrane.  The  two domains are joined by a hinge, which
01045 enables movement of the domains relative to each other [2,3].
01046 
01047 Some proteins  known  to belong to the class IIa bacteriocin family are listed
01048 below:
01049 
01050  - Pediococcus acidilactici pediocin PA-1.
01051  - Leuconostoc mesenteroides mesentericin Y105.
01052  - Carnobacterium piscicola carnobacteriocin B2.
01053  - Lactobacillus sake sakacin P.
01054  - Enterococcus faecium enterocin A.
01055  - Enterococcus faecium enterocin P.
01056  - Leuconostoc gelidum leucocin A.
01057  - Lactobacillus curvatus curvacin A.
01058  - Listeria innocua listeriocin 743A.
01059 
01060 The pattern  we  developed  for  the  class  IIa bacteriocin family covers the
01061 'pediocin box' motif.
01062 
01063 -Conserved pattern: Y-G-N-G-[VL]-x-C-x(4)-C
01064 -Sequences known to belong to this class detected by the pattern: ALL.
01065 -Other sequence(s) detected in Swiss-Prot: NONE.
01066 
01067 -Expert(s) to contact by email:
01068            Ramakumar S.; ramak@physics.iisc.ernet.in
01069 
01070 -Last update: March 2006 / First entry.
01071 
01072 """)
01073 
01074         self.assertEqual(record.references[ 0].number, "1")
01075         self.assertEqual(record.references[ 0].authors, "Ennahar S., Sonomoto K., Ishizaki A.")
01076         self.assertEqual(record.references[ 0].citation, """\
01077 "Class IIa bacteriocins from lactic acid bacteria: antibacterial
01078 activity and food preservation."
01079 J. Biosci. Bioeng. 87:705-716(1999).
01080 PubMed=16232543""")
01081         self.assertEqual(record.references[ 1].number, "2")
01082         self.assertEqual(record.references[ 1].authors, "Johnsen L., Fimland G., Nissen-Meyer J.")
01083         self.assertEqual(record.references[ 1].citation, """\
01084 "The C-terminal domain of pediocin-like antimicrobial peptides (class
01085 IIa bacteriocins) is involved in specific recognition of the
01086 C-terminal part of cognate immunity proteins and in determining the
01087 antimicrobial spectrum."
01088 J. Biol. Chem. 280:9243-9250(2005).
01089 PubMed=15611086; DOI=10.1074/jbc.M412712200""")
01090         self.assertEqual(record.references[ 2].number, "3")
01091         self.assertEqual(record.references[ 2].authors, "Fimland G., Johnsen L., Dalhus B., Nissen-Meyer J.")
01092         self.assertEqual(record.references[ 2].citation, """\
01093 "Pediocin-like antimicrobial peptides (class IIa bacteriocins) and
01094 their immunity proteins: biosynthesis, structure, and mode of
01095 action."
01096 J. Pept. Sci. 11:688-696(2005).
01097 PubMed=16059970; DOI=10.1002/psc.699""")
01098         handle.close()
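
The test consumes Prodoc.parse as an iterator, pulling one record at a time and checking its accession, prosite_refs and text. The following is a minimal, self-contained sketch of that consumption pattern. It does not use Biopython: parse_pdoc and SAMPLE are hypothetical stand-ins, records are plain dicts rather than Prodoc record objects, references are not parsed, and the {PDOC...} / {PS...; NAME} / {BEGIN} / {END} layout is an assumption about the prosite.doc format, not something shown in the listing above.

```python
import io


def parse_pdoc(handle):
    """Yield one dict per record (hypothetical, simplified stand-in parser)."""
    record = None
    in_text = False
    for line in handle:
        stripped = line.rstrip("\n")
        if stripped.startswith("{PDOC"):
            # Assumed record header, e.g. "{PDOC00001}".
            record = {"accession": stripped[1:-1], "prosite_refs": [], "text": ""}
            in_text = False
        elif record is None:
            continue  # ignore anything before the first record header
        elif stripped == "{BEGIN}":
            in_text = True
        elif stripped == "{END}":
            yield record
            record = None
            in_text = False
        elif in_text:
            record["text"] += line  # keep the free text verbatim
        elif stripped.startswith("{PS"):
            # Assumed cross-reference line, e.g. "{PS00001; ASN_GLYCOSYLATION}".
            accession, name = stripped[1:-1].split("; ")
            record["prosite_refs"].append((accession, name))


# Made-up two-record sample in the assumed layout.
SAMPLE = """\
{PDOC00001}
{PS00001; ASN_GLYCOSYLATION}
{BEGIN}
************************
* N-glycosylation site *
************************
{END}
{PDOC00004}
{PS00004; CAMP_PHOSPHO_SITE}
{BEGIN}
-Consensus pattern: [RK](2)-x-[ST]
{END}
"""

records = parse_pdoc(io.StringIO(SAMPLE))
first = next(records)   # records are produced lazily, one per call
second = next(records)
```

Because the parser is a generator, nothing is read from the handle until next() is called, so the handle has to stay open until the last record has been consumed.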

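The expected record texts in the test quote three PROSITE patterns: N-{P}-[ST]-{P}, [RK](2)-x-[ST] and Y-G-N-G-[VL]-x-C-x(4)-C. As a rough illustration of how such patterns read, the sketch below translates them into Python regular expressions. The helper prosite_to_regex is hypothetical and not part of Biopython, and it assumes only the subset of the syntax that appears in these three patterns (single residues, x, [..], {..} and a fixed repeat count); ranges, anchors and other PROSITE features are not handled.

```python
import re

# One pattern element: a residue, "x", "[...]" or "{...}", with an optional "(n)".
_ELEMENT = re.compile(
    r"^(?P<body>[A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((?P<count>\d+)\))?$"
)


def prosite_to_regex(pattern):
    """Translate a small subset of PROSITE pattern syntax into a regex string."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.match(element)
        if match is None:
            raise ValueError("unsupported pattern element: %r" % element)
        body = match.group("body")
        if body == "x":
            regex = "."                      # any residue
        elif body.startswith("{"):
            regex = "[^" + body[1:-1] + "]"  # any residue except those listed
        else:
            regex = body                     # literal residue or [..] alternatives
        if match.group("count"):
            regex += "{" + match.group("count") + "}"  # fixed repeat count
        parts.append(regex)
    return "".join(parts)


# N-glycosylation pattern from PDOC00001: Asn, not Pro, Ser/Thr, not Pro.
glyco = re.compile(prosite_to_regex("N-{P}-[ST]-{P}"))
```

With this helper, glyco.search("AANGSAA") finds a match, while a proline directly after the asparagine, as in "AANPSAA", prevents one.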
The documentation for this class was generated from the following file:

test_prodoc.py