Back to index

python-biopython  1.60
Public Member Functions | Public Attributes | Private Attributes
Bio.SeqIO.QualityIO.FastqPhredWriter Class Reference
Inheritance diagram for Bio.SeqIO.QualityIO.FastqPhredWriter:
Inheritance graph
[legend]
Collaboration diagram for Bio.SeqIO.QualityIO.FastqPhredWriter:
Collaboration graph
[legend]

List of all members.

Public Member Functions

def write_record
def write_header
def write_footer
def write_records
def write_file
def clean

Public Attributes

 handle

Private Attributes

 _record_written

Detailed Description

Class to write standard FASTQ format files (using PHRED quality scores).

Although you can use this class directly, you are strongly encouraged
to use the Bio.SeqIO.write() function instead via the format name "fastq"
or the alias "fastq-sanger".  For example, this code reads in a standard
Sanger style FASTQ file (using PHRED scores) and re-saves it as another
Sanger style FASTQ file:

>>> from Bio import SeqIO
>>> record_iterator = SeqIO.parse(open("Quality/example.fastq"), "fastq")
>>> out_handle = open("Quality/temp.fastq", "w")
>>> SeqIO.write(record_iterator, out_handle, "fastq")
3
>>> out_handle.close()

You might want to do this if the original file included extra line breaks,
which while valid may not be supported by all tools.  The output file from
Biopython will have each sequence on a single line, and each quality
string on a single line (which is considered desirable for maximum
compatibility).

In this next example, an old style Solexa/Illumina FASTQ file (using Solexa
quality scores) is converted into a standard Sanger style FASTQ file using
PHRED qualities:

>>> from Bio import SeqIO
>>> record_iterator = SeqIO.parse(open("Quality/solexa_example.fastq"), "fastq-solexa")
>>> out_handle = open("Quality/temp.fastq", "w")
>>> SeqIO.write(record_iterator, out_handle, "fastq")
5
>>> out_handle.close()

This code is also called if you use the .format("fastq") method of a
SeqRecord, or .format("fastq-sanger") if you prefer that alias.

Note that Sanger FASTQ files have an upper limit of PHRED quality 93, which is
encoded as ASCII 126, the tilde. If your quality scores are truncated to fit, a
warning is issued.

P.S. To avoid cluttering up your working directory, you can delete this
temporary file now:

>>> import os
>>> os.remove("Quality/temp.fastq")

Definition at line 1367 of file QualityIO.py.


Member Function Documentation

def Bio.SeqIO.Interfaces.SequenceWriter.clean (   self,
  text 
) [inherited]
Use this to avoid getting newlines in the output.

Definition at line 166 of file Interfaces.py.

00166 
00167     def clean(self, text):
00168         """Use this to avoid getting newlines in the output."""
00169         return text.replace("\n", " ").replace("\r", " ").replace("  ", " ")
    

Here is the caller graph for this function:

def Bio.SeqIO.Interfaces.SequentialSequenceWriter.write_file (   self,
  records 
) [inherited]
Use this to write an entire file containing the given records.

records - A list or iterator returning SeqRecord objects

This method can only be called once.  Returns the number of records
written.

Reimplemented from Bio.SeqIO.Interfaces.SequenceWriter.

Definition at line 262 of file Interfaces.py.

00262 
00263     def write_file(self, records):
00264         """Use this to write an entire file containing the given records.
00265 
00266         records - A list or iterator returning SeqRecord objects
00267 
00268         This method can only be called once.  Returns the number of records
00269         written.
00270         """
00271         self.write_header()
00272         count = self.write_records(records)
00273         self.write_footer()
00274         return count

Here is the call graph for this function:

Definition at line 218 of file Interfaces.py.

00218 
00219     def write_footer(self):
00220         assert self._header_written, "You must call write_header() first"
00221         assert self._record_written, "You have not called write_record() or write_records() yet"
00222         assert not self._footer_written, "You have aleady called write_footer()"
00223         self._footer_written = True

Here is the caller graph for this function:

Definition at line 212 of file Interfaces.py.

00212 
00213     def write_header(self):
00214         assert not self._header_written, "You have aleady called write_header()"
00215         assert not self._record_written, "You have aleady called write_record() or write_records()"
00216         assert not self._footer_written, "You have aleady called write_footer()"
00217         self._header_written = True
        

Here is the caller graph for this function:

Write a single FASTQ record to the file.

Reimplemented from Bio.SeqIO.Interfaces.SequentialSequenceWriter.

Definition at line 1415 of file QualityIO.py.

01415 
01416     def write_record(self, record):
01417         """Write a single FASTQ record to the file."""
01418         assert self._header_written
01419         assert not self._footer_written
01420         self._record_written = True
01421         #TODO - Is an empty sequence allowed in FASTQ format?
01422         if record.seq is None:
01423             raise ValueError("No sequence for record %s" % record.id)
01424         seq_str = str(record.seq)
01425         qualities_str = _get_sanger_quality_str(record)
01426         if len(qualities_str) != len(seq_str):
01427             raise ValueError("Record %s has sequence length %i but %i quality scores" \
01428                              % (record.id, len(seq_str), len(qualities_str)))
01429 
01430         #FASTQ files can include a description, just like FASTA files
01431         #(at least, this is what the NCBI Short Read Archive does)
01432         id = self.clean(record.id)
01433         description = self.clean(record.description)
01434         if description and description.split(None, 1)[0]==id:
01435             #The description includes the id at the start
01436             title = description
01437         elif description:
01438             title = "%s %s" % (id, description)
01439         else:
01440             title = id
01441         
01442         self.handle.write("@%s\n%s\n+\n%s\n" % (title, seq_str, qualities_str))

def Bio.SeqIO.Interfaces.SequentialSequenceWriter.write_records (   self,
  records 
) [inherited]
Write multiple record to the output file.

records - A list or iterator returning SeqRecord objects

Once you have called write_header() you can call write_record()
and/or write_records() as many times as needed.  Then call
write_footer() and close().

Returns the number of records written.

Definition at line 240 of file Interfaces.py.

00240 
00241     def write_records(self, records):
00242         """Write multiple record to the output file.
00243 
00244         records - A list or iterator returning SeqRecord objects
00245 
00246         Once you have called write_header() you can call write_record()
00247         and/or write_records() as many times as needed.  Then call
00248         write_footer() and close().
00249 
00250         Returns the number of records written.
00251         """
00252         #Default implementation:
00253         assert self._header_written, "You must call write_header() first"
00254         assert not self._footer_written, "You have already called write_footer()"
00255         count = 0
00256         for record in records:
00257             self.write_record(record)
00258             count += 1
00259         #Mark as true, even if there where no records
00260         self._record_written = True
00261         return count

Here is the call graph for this function:

Here is the caller graph for this function:


Member Data Documentation

Reimplemented from Bio.SeqIO.Interfaces.SequentialSequenceWriter.

Definition at line 1419 of file QualityIO.py.

Reimplemented from Bio.SeqIO.Interfaces.SequenceWriter.

Definition at line 207 of file Interfaces.py.


The documentation for this class was generated from the following file: