
python-biopython  1.60
Bio.PDB.PDBParser.PDBParser Class Reference


Public Member Functions

def __init__
def get_structure
def get_header
def get_trailer

Public Attributes

 structure_builder
 header
 trailer
 line_counter
 PERMISSIVE
 QUIET

Private Member Functions

def _parse
def _get_header
def _parse_coordinates
def _handle_PDB_exception

Detailed Description

Parse a PDB file and return a Structure object.

Definition at line 26 of file PDBParser.py.


Constructor & Destructor Documentation

def Bio.PDB.PDBParser.PDBParser.__init__ (   self,
  PERMISSIVE = True,
  get_header = False,
  structure_builder = None,
  QUIET = False 
)
The PDB parser calls a number of standard methods in an aggregated
StructureBuilder object. Normally this object is instantiated by the
PDBParser object itself, but if the user provides their own StructureBuilder
object, the latter is used instead.

Arguments:

o PERMISSIVE - Evaluated as a Boolean. If false, exceptions in
constructing the SMCRA (Structure/Model/Chain/Residue/Atom) data structure
are fatal. If true (DEFAULT), the exceptions are caught and reported as
warnings, but some residues or atoms will be missing.
These exceptions are due to problems in the PDB file.

o get_header - accepted in the signature, but not used by the constructor.

o structure_builder - an optional user-implemented StructureBuilder object.
If given, it is used as-is in place of a default StructureBuilder.

o QUIET - Evaluated as a Boolean. If true, warnings issued in constructing
the SMCRA data will be suppressed. If false (DEFAULT), they will be shown.
These warnings might be indicative of problems in the PDB file.

Definition at line 32 of file PDBParser.py.

    def __init__(self, PERMISSIVE=True, get_header=False,
                 structure_builder=None, QUIET=False):
        """
        The PDB parser calls a number of standard methods in an aggregated
        StructureBuilder object. Normally this object is instantiated by the
        PDBParser object itself, but if the user provides their own StructureBuilder
        object, the latter is used instead.

        Arguments:

        o PERMISSIVE - Evaluated as a Boolean. If false, exceptions in
        constructing the SMCRA data structure are fatal. If true (DEFAULT),
        the exceptions are caught, but some residues or atoms will be missing.
        These exceptions are due to problems in the PDB file.

        o structure_builder - an optional user-implemented StructureBuilder object.

        o QUIET - Evaluated as a Boolean. If true, warnings issued in constructing
        the SMCRA data will be suppressed. If false (DEFAULT), they will be shown.
        These warnings might be indicative of problems in the PDB file.
        """
        if structure_builder is not None:
            self.structure_builder=structure_builder
        else:
            self.structure_builder=StructureBuilder()
        self.header=None
        self.trailer=None
        self.line_counter=0
        self.PERMISSIVE=bool(PERMISSIVE)
        self.QUIET=bool(QUIET)


Member Function Documentation

def Bio.PDB.PDBParser.PDBParser._get_header (   self,
  header_coords_trailer 
) [private]

Definition at line 110 of file PDBParser.py.

    def _get_header(self, header_coords_trailer):
        "Get the header of the PDB file, return the rest."
        structure_builder=self.structure_builder
        i = 0
        for i in range(0, len(header_coords_trailer)):
            structure_builder.set_line_counter(i+1)
            line=header_coords_trailer[i]
            record_type=line[0:6]
            if(record_type=='ATOM  ' or record_type=='HETATM' or record_type=='MODEL '):
                break
        header=header_coords_trailer[0:i]
        # Return the rest of the coords+trailer for further processing
        self.line_counter=i
        coords_trailer=header_coords_trailer[i:]
        header_dict=_parse_pdb_header_list(header)
        return header_dict, coords_trailer
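
The header split performed by _get_header can be illustrated with a small standalone sketch. It is not the library code: split_header is a hypothetical name, it does not call _parse_pdb_header_list, and unlike the listing above it treats a file with no coordinate records as all header.

```python
def split_header(lines):
    """Split PDB lines at the first ATOM, HETATM or MODEL record.

    Returns (header_lines, remaining_lines). Hypothetical helper for
    illustration only; it is not part of Bio.PDB.
    """
    coordinate_records = ("ATOM  ", "HETATM", "MODEL ")
    for index, line in enumerate(lines):
        # The record type is always the first six columns of a PDB line.
        if line[0:6] in coordinate_records:
            return lines[:index], lines[index:]
    # Assumption of this sketch: no coordinate record means all header.
    return lines, []


pdb_lines = [
    "HEADER    HYDROLASE\n",
    "TITLE     EXAMPLE\n",
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N\n",
    "END\n",
]
header, rest = split_header(pdb_lines)
print(len(header), "header lines,", len(rest), "remaining lines")
```

Comparing the six-character record type against padded strings such as 'ATOM  ' mirrors the listing, which relies on the fixed-column layout of PDB records.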
    


def Bio.PDB.PDBParser.PDBParser._handle_PDB_exception (   self,
  message,
  line_counter 
) [private]
This method catches an exception that occurs in the StructureBuilder
object (if PERMISSIVE), or raises it again, this time adding the 
PDB line number to the error message.

Definition at line 269 of file PDBParser.py.

    def _handle_PDB_exception(self, message, line_counter):
        """
        This method catches an exception that occurs in the StructureBuilder
        object (if PERMISSIVE), or raises it again, this time adding the
        PDB line number to the error message.
        """
        message="%s at line %i." % (message, line_counter)
        if self.PERMISSIVE:
            # just print a warning - some residues/atoms may be missing
            warnings.warn("PDBConstructionException: %s\n"
                          "Exception ignored.\n"
                          "Some atoms or residues may be missing in the data structure."
                          % message, PDBConstructionWarning)
        else:
            # exceptions are fatal - raise again with new message (including line nr)
            raise PDBConstructionException(message)
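
The permissive and strict behaviours described for _handle_PDB_exception can be sketched without Biopython. In the sketch below, ConstructionError, ConstructionWarning and handle_problem are hypothetical stand-ins for PDBConstructionException, PDBConstructionWarning and the method itself, and the warning text is simplified.

```python
import warnings


class ConstructionError(Exception):
    """Hypothetical stand-in for PDBConstructionException."""


class ConstructionWarning(Warning):
    """Hypothetical stand-in for PDBConstructionWarning."""


def handle_problem(message, line_number, permissive=True):
    """Warn (permissive) or raise (strict), adding the line number."""
    message = "%s at line %i." % (message, line_number)
    if permissive:
        # Parsing continues; the caller is expected to supply a default value.
        warnings.warn(message, ConstructionWarning)
    else:
        # Strict mode: the problem is fatal.
        raise ConstructionError(message)


# Permissive mode: the problem is recorded as a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    handle_problem("Invalid or missing occupancy", 12, permissive=True)
print(caught[0].message)

# Strict mode: the same problem raises an exception.
try:
    handle_problem("Invalid or missing occupancy", 12, permissive=False)
except ConstructionError as error:
    print(error)
```

In the listing, the caller falls back to a default (for example an occupancy of 0.0) after the warning, which is why some data may be missing or defaulted in permissive mode.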


def Bio.PDB.PDBParser.PDBParser._parse (   self,
  header_coords_trailer 
) [private]

Definition at line 103 of file PDBParser.py.

    def _parse(self, header_coords_trailer):
        "Parse the PDB file."
        # Extract the header; return the rest of the file
        self.header, coords_trailer=self._get_header(header_coords_trailer)
        # Parse the atomic data; return the PDB file trailer
        self.trailer=self._parse_coordinates(coords_trailer)
    


def Bio.PDB.PDBParser.PDBParser._parse_coordinates (   self,
  coords_trailer 
) [private]

Definition at line 127 of file PDBParser.py.

    def _parse_coordinates(self, coords_trailer):
        "Parse the atomic data in the PDB file."
        local_line_counter=0
        structure_builder=self.structure_builder
        current_model_id=0
        # Flag we have an open model
        model_open=0
        current_chain_id=None
        current_segid=None
        current_residue_id=None
        current_resname=None
        for i in range(0, len(coords_trailer)):
            line=coords_trailer[i]
            record_type=line[0:6]
            global_line_counter=self.line_counter+local_line_counter+1
            structure_builder.set_line_counter(global_line_counter)
            if(record_type=='ATOM  ' or record_type=='HETATM'):
                # Initialize the Model - there was no explicit MODEL record
                if not model_open:
                    structure_builder.init_model(current_model_id)
                    current_model_id+=1
                    model_open=1
                fullname=line[12:16]
                # get rid of whitespace in atom names
                split_list=fullname.split()
                if len(split_list)!=1:
                    # atom name has internal spaces, e.g. " N B ", so
                    # we do not strip spaces
                    name=fullname
                else:
                    # atom name is like " CA ", so we can strip spaces
                    name=split_list[0]
                altloc=line[16:17]
                resname=line[17:20]
                chainid=line[21:22]
                try:
                    serial_number=int(line[6:11])
                except:
                    serial_number=0
                resseq=int(line[22:26].split()[0])   # sequence identifier
                icode=line[26:27]           # insertion code
                if record_type=='HETATM':       # hetero atom flag
                    if resname=="HOH" or resname=="WAT":
                        hetero_flag="W"
                    else:
                        hetero_flag="H"
                else:
                    hetero_flag=" "
                residue_id=(hetero_flag, resseq, icode)
                # atomic coordinates
                try:
                    x=float(line[30:38])
                    y=float(line[38:46])
                    z=float(line[46:54])
                except:
                    #Should we allow parsing to continue in permissive mode?
                    #If so what coordinates should we default to?  Easier to abort!
                    raise PDBConstructionException(\
                        "Invalid or missing coordinate(s) at line %i." \
                        % global_line_counter)
                coord=numpy.array((x, y, z), 'f')
                # occupancy & B factor
                try:
                    occupancy=float(line[54:60])
                except:
                    self._handle_PDB_exception("Invalid or missing occupancy",
                                               global_line_counter)
                    occupancy = 0.0 #Is one or zero a good default?
                try:
                    bfactor=float(line[60:66])
                except:
                    self._handle_PDB_exception("Invalid or missing B factor",
                                               global_line_counter)
                    bfactor = 0.0 #The PDB use a default of zero if the data is missing
                segid=line[72:76]
                element=line[76:78].strip()
                if current_segid!=segid:
                    current_segid=segid
                    structure_builder.init_seg(current_segid)
                if current_chain_id!=chainid:
                    current_chain_id=chainid
                    structure_builder.init_chain(current_chain_id)
                    current_residue_id=residue_id
                    current_resname=resname
                    try:
                        structure_builder.init_residue(resname, hetero_flag, resseq, icode)
                    except PDBConstructionException as message:
                        self._handle_PDB_exception(message, global_line_counter)
                elif current_residue_id!=residue_id or current_resname!=resname:
                    current_residue_id=residue_id
                    current_resname=resname
                    try:
                        structure_builder.init_residue(resname, hetero_flag, resseq, icode)
                    except PDBConstructionException as message:
                        self._handle_PDB_exception(message, global_line_counter)
                # init atom
                try:
                    structure_builder.init_atom(name, coord, bfactor, occupancy, altloc,
                                                fullname, serial_number, element)
                except PDBConstructionException as message:
                    self._handle_PDB_exception(message, global_line_counter)
            elif(record_type=='ANISOU'):
                anisou=map(float, (line[28:35], line[35:42], line[43:49], line[49:56], line[56:63], line[63:70]))
                # U's are scaled by 10^4
                anisou_array=(numpy.array(anisou, 'f')/10000.0).astype('f')
                structure_builder.set_anisou(anisou_array)
            elif(record_type=='MODEL '):
                try:
                    serial_num=int(line[10:14])
                except:
                    self._handle_PDB_exception("Invalid or missing model serial number",
                                               global_line_counter)
                    serial_num=0
                structure_builder.init_model(current_model_id,serial_num)
                current_model_id+=1
                model_open=1
                current_chain_id=None
                current_residue_id=None
            elif(record_type=='END   ' or record_type=='CONECT'):
                # End of atomic data, return the trailer
                self.line_counter=self.line_counter+local_line_counter
                return coords_trailer[local_line_counter:]
            elif(record_type=='ENDMDL'):
                model_open=0
                current_chain_id=None
                current_residue_id=None
            elif(record_type=='SIGUIJ'):
                # standard deviation of anisotropic B factor
                siguij=map(float, (line[28:35], line[35:42], line[42:49], line[49:56], line[56:63], line[63:70]))
                # U sigma's are scaled by 10^4
                siguij_array=(numpy.array(siguij, 'f')/10000.0).astype('f')
                structure_builder.set_siguij(siguij_array)
            elif(record_type=='SIGATM'):
                # standard deviation of atomic positions
                sigatm=map(float, (line[30:38], line[38:45], line[46:54], line[54:60], line[60:66]))
                sigatm_array=numpy.array(sigatm, 'f')
                structure_builder.set_sigatm(sigatm_array)
            local_line_counter=local_line_counter+1
        # EOF (does not end in END or CONECT)
        self.line_counter=self.line_counter+local_line_counter
        return []
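
The fixed-column slicing used for ATOM and HETATM records can be tried in isolation. The sketch below reuses the slice positions and the hetero-flag rule from the listing, but format_atom_line and parse_atom_line are hypothetical helpers, the result is a plain dictionary rather than a StructureBuilder call, and no error handling is attempted.

```python
def format_atom_line(record, serial, fullname, resname, chain, resseq,
                     x, y, z, occupancy, bfactor, element):
    """Hypothetical helper: lay out one coordinate record in PDB columns.

    The altloc, insertion code and segid fields are left blank.
    """
    return ("%-6s%5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f      %-4s%2s"
            % (record, serial, fullname, " ", resname, chain, resseq, " ",
               x, y, z, occupancy, bfactor, "", element))


def parse_atom_line(line):
    """Extract the main fields of an ATOM/HETATM line by column position."""
    record_type = line[0:6]
    fullname = line[12:16]
    parts = fullname.split()
    # Names with internal spaces keep their padding; others are stripped.
    name = parts[0] if len(parts) == 1 else fullname
    resname = line[17:20]
    if record_type == "HETATM":
        # Waters get "W", every other hetero residue gets "H".
        hetero_flag = "W" if resname in ("HOH", "WAT") else "H"
    else:
        hetero_flag = " "
    return {
        "name": name,
        "altloc": line[16:17],
        "resname": resname,
        "chain_id": line[21:22],
        "residue_id": (hetero_flag, int(line[22:26]), line[26:27]),
        "coord": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "occupancy": float(line[54:60]),
        "bfactor": float(line[60:66]),
        "element": line[76:78].strip(),
    }


atom_line = format_atom_line("ATOM", 1, " CA ", "ALA", "A", 1,
                             11.104, 6.134, -6.504, 1.0, 20.5, "C")
water_line = format_atom_line("HETATM", 2, " O  ", "HOH", "A", 101,
                              1.0, 2.0, 3.0, 1.0, 30.0, "O")
print(parse_atom_line(atom_line))
print(parse_atom_line(water_line)["residue_id"])
```

The residue_id tuple (hetero flag, sequence number, insertion code) is the key that the listing compares between consecutive lines to decide when a new residue starts.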


def Bio.PDB.PDBParser.PDBParser.get_header (   self )
Return the header.

Definition at line 93 of file PDBParser.py.

    def get_header(self):
        "Return the header."
        return self.header

def Bio.PDB.PDBParser.PDBParser.get_structure (   self,
  id,
  file 
)
Return the structure.

Arguments:
o id - string, the id that will be used for the structure
o file - name of the PDB file OR an open filehandle

Definition at line 64 of file PDBParser.py.

    def get_structure(self, id, file):
        """Return the structure.

        Arguments:
        o id - string, the id that will be used for the structure
        o file - name of the PDB file OR an open filehandle
        """

        if self.QUIET:
            warning_list = warnings.filters[:]
            warnings.filterwarnings('ignore', category=PDBConstructionWarning)

        self.header=None
        self.trailer=None
        # Make a StructureBuilder instance (pass id of structure as parameter)
        self.structure_builder.init_structure(id)

        with as_handle(file) as handle:
            self._parse(handle.readlines())

        self.structure_builder.set_header(self.header)
        # Return the Structure instance
        structure = self.structure_builder.get_structure()

        if self.QUIET:
            warnings.filters = warning_list

        return structure
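
The QUIET handling above saves warnings.filters, installs an 'ignore' filter, and restores the saved list at the end. A possible alternative, shown here only as a sketch and not as what the library does, is the standard warnings.catch_warnings context manager, which also restores the filters if parsing raises. ConstructionWarning, noisy_parse and parse are hypothetical names.

```python
import warnings


class ConstructionWarning(Warning):
    """Hypothetical stand-in for PDBConstructionWarning."""


def noisy_parse():
    """Pretend parser that always reports one construction problem."""
    warnings.warn("Some atoms or residues may be missing.", ConstructionWarning)
    return "structure"


def parse(quiet=False):
    """Run noisy_parse, optionally hiding construction warnings."""
    # catch_warnings restores the previous filter list on exit,
    # including when an exception propagates out of the block.
    with warnings.catch_warnings():
        if quiet:
            warnings.filterwarnings("ignore", category=ConstructionWarning)
        return noisy_parse()


with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    parse(quiet=True)
    quiet_count = len(seen)
    parse(quiet=False)
    loud_count = len(seen) - quiet_count
print(quiet_count, loud_count)
```

Only the construction warning category is filtered, so unrelated warnings raised during parsing would still be shown.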


def Bio.PDB.PDBParser.PDBParser.get_trailer (   self )
Return the trailer.

Definition at line 97 of file PDBParser.py.

    def get_trailer(self):
        "Return the trailer."
        return self.trailer


Member Data Documentation

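Bio.PDB.PDBParser.PDBParser.header

Definition at line 56 of file PDBParser.py.

Bio.PDB.PDBParser.PDBParser.line_counter

Definition at line 58 of file PDBParser.py.

Bio.PDB.PDBParser.PDBParser.PERMISSIVE

Definition at line 59 of file PDBParser.py.

Bio.PDB.PDBParser.PDBParser.QUIET

Definition at line 60 of file PDBParser.py.

Bio.PDB.PDBParser.PDBParser.structure_builder

Definition at line 53 of file PDBParser.py.

Bio.PDB.PDBParser.PDBParser.trailer

Definition at line 57 of file PDBParser.py.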


The documentation for this class was generated from the following file: