python-biopython  1.60
test_Medline.TestMedline Class Reference

Public Member Functions

def test_read
def test_parse

Detailed Description

Definition at line 11 of file

Member Function Documentation

Definition at line 47 of file

00048     def test_parse(self):
00049         handle = open("Medline/pubmed_result2.txt")
00050         records = Medline.parse(handle)
00051         record = next(records)
00052         self.assertEqual(record["PMID"], "16403221")
00053         self.assertEqual(record["OWN"], "NLM")
00054         self.assertEqual(record["STAT"], "MEDLINE")
00055         self.assertEqual(record["DA"], "20060220")
00056         self.assertEqual(record["DCOM"], "20060314")
00057         self.assertEqual(record["PUBM"], "Electronic")
00058         self.assertEqual(record["IS"], "1471-2105 (Electronic)")
00059         self.assertEqual(record["VI"], "7")
00060         self.assertEqual(record["DP"], "2006")
00061         self.assertEqual(record["TI"], "A high level interface to SCOP and ASTRAL implemented in python.")
00062         self.assertEqual(record["PG"], "10")
00063         self.assertEqual(record["AB"], "BACKGROUND: Benchmarking algorithms in structural bioinformatics often involves the construction of datasets of proteins with given sequence and structural properties. The SCOP database is a manually curated structural classification which groups together proteins on the basis of structural similarity. The ASTRAL compendium provides non redundant subsets of SCOP domains on the basis of sequence similarity such that no two domains in a given subset share more than a defined degree of sequence similarity. Taken together these two resources provide a 'ground truth' for assessing structural bioinformatics algorithms. We present a small and easy to use API written in python to enable construction of datasets from these resources. RESULTS: We have designed a set of python modules to provide an abstraction of the SCOP and ASTRAL databases. The modules are designed to work as part of the Biopython distribution. Python users can now manipulate and use the SCOP hierarchy from within python programs, and use ASTRAL to return sequences of domains in SCOP, as well as clustered representations of SCOP from ASTRAL. CONCLUSION: The modules make the analysis and generation of datasets for use in structural genomics easier and more principled.")
00064         self.assertEqual(record["AD"], "Bioinformatics, Institute of Cell and Molecular Science, School of Medicine and Dentistry, Queen Mary, University of London, London EC1 6BQ, UK.")
00065         self.assertEqual(record["FAU"], ["Casbon, James A", "Crooks, Gavin E", "Saqi, Mansoor A S"])
00066         self.assertEqual(record["AU"], ["Casbon JA", "Crooks GE", "Saqi MA"])
00067         self.assertEqual(record["LA"], ["eng"])
00068         self.assertEqual(record["PT"], ["Evaluation Studies", "Journal Article"])
00069         self.assertEqual(record["DEP"], "20060110")
00070         self.assertEqual(record["PL"], "England")
00071         self.assertEqual(record["TA"], "BMC Bioinformatics")
00072         self.assertEqual(record["JT"], "BMC bioinformatics")
00073         self.assertEqual(record["JID"], "100965194")
00074         self.assertEqual(record["SB"], "IM")
00075         self.assertEqual(record["MH"], ["*Database Management Systems", "*Databases, Protein", "Information Storage and Retrieval/*methods", "Programming Languages", "Sequence Alignment/*methods", "Sequence Analysis, Protein/*methods", "Sequence Homology, Amino Acid", "*Software", "*User-Computer Interface"])
00076         self.assertEqual(record["PMC"], "PMC1373603")
00077         self.assertEqual(record["EDAT"], "2006/01/13 09:00")
00078         self.assertEqual(record["MHDA"], "2006/03/15 09:00")
00079         self.assertEqual(record["PHST"], ["2005/06/17 [received]", "2006/01/10 [accepted]", "2006/01/10 [aheadofprint]"])
00080         self.assertEqual(record["AID"], ["1471-2105-7-10 [pii]", "10.1186/1471-2105-7-10 [doi]"])
00081         self.assertEqual(record["PST"], "epublish")
00082         self.assertEqual(record["SO"], "BMC Bioinformatics. 2006 Jan 10;7:10.")
00083         record = next(records)
00084         self.assertEqual(record["PMID"], "16377612")
00085         self.assertEqual(record["OWN"], "NLM")
00086         self.assertEqual(record["STAT"], "MEDLINE")
00087         self.assertEqual(record["DA"], "20060223")
00088         self.assertEqual(record["DCOM"], "20060418")
00089         self.assertEqual(record["LR"], "20061115")
00090         self.assertEqual(record["PUBM"], "Print-Electronic")
00091         self.assertEqual(record["IS"], "1367-4803 (Print)")
00092         self.assertEqual(record["VI"], "22")
00093         self.assertEqual(record["IP"], "5")
00094         self.assertEqual(record["DP"], "2006 Mar 1")
00095         self.assertEqual(record["TI"], "GenomeDiagram: a python package for the visualization of large-scale genomic data.")
00096         self.assertEqual(record["PG"], "616-7")
00097         self.assertEqual(record["AB"], "SUMMARY: We present GenomeDiagram, a flexible, open-source Python module for the visualization of large-scale genomic, comparative genomic and other data with reference to a single chromosome or other biological sequence. GenomeDiagram may be used to generate publication-quality vector graphics, rastered images and in-line streamed graphics for webpages. The package integrates with datatypes from the BioPython project, and is available for Windows, Linux and Mac OS X systems. AVAILABILITY: GenomeDiagram is freely available as source code (under GNU Public License) at, and requires Python 2.3 or higher, and recent versions of the ReportLab and BioPython packages. SUPPLEMENTARY INFORMATION: A user manual, example code and images are available at")
00098         self.assertEqual(record["AD"], "Plant Pathogen Programme, Scottish Crop Research Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK.")
00099         self.assertEqual(record["FAU"], ["Pritchard, Leighton", "White, Jennifer A", "Birch, Paul R J", "Toth, Ian K"])
00100         self.assertEqual(record["AU"], ["Pritchard L", "White JA", "Birch PR", "Toth IK"])
00101         self.assertEqual(record["LA"], ["eng"])
00102         self.assertEqual(record["PT"], ["Journal Article", "Research Support, Non-U.S. Gov't"])
00103         self.assertEqual(record["DEP"], "20051223")
00104         self.assertEqual(record["PL"], "England")
00105         self.assertEqual(record["TA"], "Bioinformatics")
00106         self.assertEqual(record["JT"], "Bioinformatics (Oxford, England)")
00107         self.assertEqual(record["JID"], "9808944")
00108         self.assertEqual(record["SB"], "IM")
00109         self.assertEqual(record["MH"], ["Chromosome Mapping/*methods", "*Computer Graphics", "*Database Management Systems", "*Databases, Genetic", "Information Storage and Retrieval/methods", "*Programming Languages", "*Software", "*User-Computer Interface"])
00110         self.assertEqual(record["EDAT"], "2005/12/27 09:00")
00111         self.assertEqual(record["MHDA"], "2006/04/19 09:00")
00112         self.assertEqual(record["PHST"], ["2005/12/23 [aheadofprint]"])
00113         self.assertEqual(record["AID"], ["btk021 [pii]", "10.1093/bioinformatics/btk021 [doi]"])
00114         self.assertEqual(record["PST"], "ppublish")
00115         self.assertEqual(record["SO"], "Bioinformatics. 2006 Mar 1;22(5):616-7. Epub 2005 Dec 23.")
00116         record = next(records)
00117         self.assertEqual(record["PMID"], "14871861")
00118         self.assertEqual(record["OWN"], "NLM")
00119         self.assertEqual(record["STAT"], "MEDLINE")
00120         self.assertEqual(record["DA"], "20040611")
00121         self.assertEqual(record["DCOM"], "20050104")
00122         self.assertEqual(record["LR"], "20061115")
00123         self.assertEqual(record["PUBM"], "Print-Electronic")
00124         self.assertEqual(record["IS"], "1367-4803 (Print)")
00125         self.assertEqual(record["VI"], "20")
00126         self.assertEqual(record["IP"], "9")
00127         self.assertEqual(record["DP"], "2004 Jun 12")
00128         self.assertEqual(record["TI"], "Open source clustering software.")
00129         self.assertEqual(record["PG"], "1453-4")
00130         self.assertEqual(record["AB"], "SUMMARY: We have implemented k-means clustering, hierarchical clustering and self-organizing maps in a single multipurpose open-source library of C routines, callable from other C and C++ programs. Using this library, we have created an improved version of Michael Eisen's well-known Cluster program for Windows, Mac OS X and Linux/Unix. In addition, we generated a Python and a Perl interface to the C Clustering Library, thereby combining the flexibility of a scripting language with the speed of C. AVAILABILITY: The C Clustering Library and the corresponding Python C extension module Pycluster were released under the Python License, while the Perl module Algorithm::Cluster was released under the Artistic License. The GUI code Cluster 3.0 for Windows, Macintosh and Linux/Unix, as well as the corresponding command-line program, were released under the same license as the original Cluster code. The complete source code is available at Alternatively, Algorithm::Cluster can be downloaded from CPAN, while Pycluster is also available as part of the Biopython distribution.")
00131         self.assertEqual(record["AD"], "Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639 Japan.")
00132         self.assertEqual(record["FAU"], ["de Hoon, M J L", "Imoto, S", "Nolan, J", "Miyano, S"])
00133         self.assertEqual(record["AU"], ["de Hoon MJ", "Imoto S", "Nolan J", "Miyano S"])
00134         self.assertEqual(record["LA"], ["eng"])
00135         self.assertEqual(record["PT"], ["Comparative Study", "Evaluation Studies", "Journal Article", "Validation Studies"])
00136         self.assertEqual(record["DEP"], "20040210")
00137         self.assertEqual(record["PL"], "England")
00138         self.assertEqual(record["TA"], "Bioinformatics")
00139         self.assertEqual(record["JT"], "Bioinformatics (Oxford, England)")
00140         self.assertEqual(record["JID"], "9808944")
00141         self.assertEqual(record["SB"], "IM")
00142         self.assertEqual(record["MH"], ["*Algorithms", "*Cluster Analysis", "Gene Expression Profiling/*methods", "Pattern Recognition, Automated/methods", "*Programming Languages", "Sequence Alignment/*methods", "Sequence Analysis, DNA/*methods", "*Software"])
00143         self.assertEqual(record["EDAT"], "2004/02/12 05:00")
00144         self.assertEqual(record["MHDA"], "2005/01/05 09:00")
00145         self.assertEqual(record["PHST"], ["2004/02/10 [aheadofprint]"])
00146         self.assertEqual(record["AID"], ["10.1093/bioinformatics/bth078 [doi]", "bth078 [pii]"])
00147         self.assertEqual(record["PST"], "ppublish")
00148         self.assertEqual(record["SO"], "Bioinformatics. 2004 Jun 12;20(9):1453-4. Epub 2004 Feb 10.")
00149         record = next(records)
00150         self.assertEqual(record["PMID"], "14630660")
00151         self.assertEqual(record["OWN"], "NLM")
00152         self.assertEqual(record["STAT"], "MEDLINE")
00153         self.assertEqual(record["DA"], "20031121")
00154         self.assertEqual(record["DCOM"], "20040722")
00155         self.assertEqual(record["LR"], "20061115")
00156         self.assertEqual(record["PUBM"], "Print")
00157         self.assertEqual(record["IS"], "1367-4803 (Print)")
00158         self.assertEqual(record["VI"], "19")
00159         self.assertEqual(record["IP"], "17")
00160         self.assertEqual(record["DP"], "2003 Nov 22")
00161         self.assertEqual(record["TI"], "PDB file parser and structure class implemented in Python.")
00162         self.assertEqual(record["PG"], "2308-10")
00163         self.assertEqual(record["AB"], "The biopython project provides a set of bioinformatics tools implemented in Python. Recently, biopython was extended with a set of modules that deal with macromolecular structure. Biopython now contains a parser for PDB files that makes the atomic information available in an easy-to-use but powerful data structure. The parser and data structure deal with features that are often left out or handled inadequately by other packages, e.g. atom and residue disorder (if point mutants are present in the crystal), anisotropic B factors, multiple models and insertion codes. In addition, the parser performs some sanity checking to detect obvious errors. AVAILABILITY: The Biopython distribution (including source code and documentation) is freely available (under the Biopython license) from")
00164         self.assertEqual(record["AD"], "Department of Cellular and Molecular Interactions, Vlaams Interuniversitair Instituut voor Biotechnologie and Computational Modeling Lab, Department of Computer Science, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium.")
00165         self.assertEqual(record["FAU"], ["Hamelryck, Thomas", "Manderick, Bernard"])
00166         self.assertEqual(record["AU"], ["Hamelryck T", "Manderick B"])
00167         self.assertEqual(record["LA"], ["eng"])
00168         self.assertEqual(record["PT"], ["Comparative Study", "Evaluation Studies", "Journal Article", "Research Support, Non-U.S. Gov't", "Validation Studies"])
00169         self.assertEqual(record["PL"], "England")
00170         self.assertEqual(record["TA"], "Bioinformatics")
00171         self.assertEqual(record["JT"], "Bioinformatics (Oxford, England)")
00172         self.assertEqual(record["JID"], "9808944")
00173         self.assertEqual(record["RN"], ["0 (Macromolecular Substances)"])
00174         self.assertEqual(record["SB"], "IM")
00175         self.assertEqual(record["MH"], ["Computer Simulation", "Database Management Systems/*standards", "*Databases, Protein", "Information Storage and Retrieval/*methods/*standards", "Macromolecular Substances", "*Models, Molecular", "*Programming Languages", "Protein Conformation", "*Software"])
00176         self.assertEqual(record["EDAT"], "2003/11/25 05:00")
00177         self.assertEqual(record["MHDA"], "2004/07/23 05:00")
00178         self.assertEqual(record["PST"], "ppublish")
00179         self.assertEqual(record["SO"], "Bioinformatics. 2003 Nov 22;19(17):2308-10.")
00180         self.assertRaises(StopIteration, next, records)
00181         handle.close()
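
The assertions in this test depend on two behaviours: single-valued fields such as PMID and TI come back as strings while repeatable fields such as AU and MH come back as lists, and the record iterator raises StopIteration once the file is exhausted. The following is a minimal sketch of that behaviour in plain Python, not Biopython's implementation. The names `parse_records`, `LIST_KEYS`, and `SAMPLE` are hypothetical, the set of list-valued keys is an assumed subset, and the sample text is a shortened illustration rather than the contents of the test files.

```python
import io

# Assumed subset of tags that may repeat within one record (stored as lists).
LIST_KEYS = {"AU", "FAU", "MH", "PT", "LA"}


def parse_records(handle):
    """Yield one dict per blank-line-separated record (simplified sketch)."""
    record, key = {}, None
    for line in handle:
        line = line.rstrip()
        if not line:
            # A blank line closes the current record, if any.
            if record:
                yield record
            record, key = {}, None
            continue
        if line.startswith("      ") and key is not None:
            # Continuation line: append to the value of the previous tag.
            value = line.strip()
            if key in LIST_KEYS:
                record[key][-1] += " " + value
            else:
                record[key] += " " + value
            continue
        # Tag occupies the first four columns, followed by "- " and the value.
        key = line[:4].strip()
        value = line[6:]
        if key in LIST_KEYS:
            record.setdefault(key, []).append(value)
        else:
            record[key] = value
    if record:
        # Emit the last record when the input does not end with a blank line.
        yield record


# Hypothetical two-record sample in the tagged layout described above.
SAMPLE = (
    "PMID- 12230038\n"
    "TI  - The Bio* toolkits--a brief\n"
    "      overview.\n"
    "AU  - Mangalam H\n"
    "\n"
    "PMID- 16403221\n"
    "AU  - Casbon JA\n"
    "AU  - Crooks GE\n"
)

records = parse_records(io.StringIO(SAMPLE))
first = next(records)
second = next(records)
print(first["TI"])
print(second["AU"])
```

Because `parse_records` is a generator, a third call to `next(records)` raises StopIteration, which is the condition the final assertion of the test checks.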

Definition at line 13 of file

00014     def test_read(self):
00015         handle = open("Medline/pubmed_result1.txt")
00016         record = Medline.read(handle)
00017         handle.close()
00018         self.assertEqual(record["PMID"], "12230038")
00019         self.assertEqual(record["OWN"], "NLM")
00020         self.assertEqual(record["STAT"], "MEDLINE")
00021         self.assertEqual(record["DA"], "20020916")
00022         self.assertEqual(record["DCOM"], "20030606")
00023         self.assertEqual(record["LR"], "20041117")
00024         self.assertEqual(record["PUBM"], "Print")
00025         self.assertEqual(record["IS"], "1467-5463 (Print)")
00026         self.assertEqual(record["VI"], "3")
00027         self.assertEqual(record["IP"], "3")
00028         self.assertEqual(record["DP"], "2002 Sep")
00029         self.assertEqual(record["TI"], "The Bio* toolkits--a brief overview.")
00030         self.assertEqual(record["PG"], "296-302")
00031         self.assertEqual(record["AB"], "Bioinformatics research is often difficult to do with commercial software. The Open Source BioPerl, BioPython and Biojava projects provide toolkits with multiple functionality that make it easier to create customised pipelines or analysis. This review briefly compares the quirks of the underlying languages and the functionality, documentation, utility and relative advantages of the Bio counterparts, particularly from the point of view of the beginning biologist programmer.")
00032         self.assertEqual(record["AD"], "tacg Informatics, Irvine, CA 92612, USA.")
00033         self.assertEqual(record["FAU"], ["Mangalam, Harry"])
00034         self.assertEqual(record["AU"], ["Mangalam H"])
00035         self.assertEqual(record["LA"], ["eng"])
00036         self.assertEqual(record["PT"], ["Journal Article"])
00037         self.assertEqual(record["PL"], "England")
00038         self.assertEqual(record["TA"], "Brief Bioinform")
00039         self.assertEqual(record["JT"], "Briefings in bioinformatics")
00040         self.assertEqual(record["JID"], "100912837")
00041         self.assertEqual(record["SB"], "IM")
00042         self.assertEqual(record["MH"], ["*Computational Biology", "Computer Systems", "Humans", "Internet", "*Programming Languages", "*Software", "User-Computer Interface"])
00043         self.assertEqual(record["EDAT"], "2002/09/17 10:00")
00044         self.assertEqual(record["MHDA"], "2003/06/07 05:00")
00045         self.assertEqual(record["PST"], "ppublish")
00046         self.assertEqual(record["SO"], "Brief Bioinform. 2002 Sep;3(3):296-302.")

The documentation for this class was generated from the following file: