python-biopython  1.60
Bio.Phylo.Applications._Phyml.PhymlCommandline Class Reference

Public Member Functions

def __init__
def __str__
def __repr__
def set_parameter
def __setattr__
def __call__

Public Attributes

 parameters
 program_name

Detailed Description

Command-line wrapper for the tree inference program PhyML.

Homepage: http://www.atgc-montpellier.fr/phyml

Citations:

Guindon S, Gascuel O.
A simple, fast, and accurate algorithm to estimate large phylogenies by maximum
likelihood.
Systematic Biology, 2003 Oct;52(5):696-704.
PubMed PMID: 14530136.

Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O.
New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing
the Performance of PhyML 3.0.
Systematic Biology, 2010;59(3):307-21.

Definition at line 11 of file _Phyml.py.
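
A minimal usage sketch, assuming a Biopython release that still ships Bio.Phylo.Applications and a hypothetical PHYLIP alignment named example.phy. The keyword names (input, datatype, model, search) are the last entries in each option's name list in the constructor source; the helper build_phyml_command is illustrative only and returns None when the wrapper cannot be imported.

```python
def build_phyml_command(alignment_path):
    """Return the PhyML command line as a string, or None if unavailable."""
    try:
        from Bio.Phylo.Applications import PhymlCommandline
    except ImportError:
        # Biopython is missing, or this release does not include the wrapper.
        return None
    cmd = PhymlCommandline(
        input=alignment_path,  # hypothetical file name supplied by the caller
        datatype="nt",
        model="HKY85",
        search="NNI",
    )
    # str() only builds the command line; it does not run PhyML.
    return str(cmd)


command = build_phyml_command("example.phy")
if command is None:
    print("PhymlCommandline is not available in this environment")
else:
    print(command)
```

Building the string does not require the phyml executable; only calling the wrapper object would try to run it.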


Constructor & Destructor Documentation

def Bio.Phylo.Applications._Phyml.PhymlCommandline.__init__(self, cmd='phyml', **kwargs)
Create a new instance of a command line wrapper object.

Reimplemented from Bio.Application.AbstractCommandline.

Definition at line 31 of file _Phyml.py.

    def __init__(self, cmd='phyml', **kwargs):
        self.parameters = [
            _Option(['-i', '--input', 'input'],
                """Name of the nucleotide or amino-acid sequence file in PHYLIP
                format.""",
                filename=True,
                is_required=True,
                equate=False,
                ),

            _Option(['-d', '--datatype', 'datatype'],
                """Data type is 'nt' for nucleotide (default) and 'aa' for
                amino-acid sequences.""",
                checker_function=lambda x: x in ('nt', 'aa'),
                equate=False,
                ),

            _Switch(['-q', '--sequential', 'sequential'],
                "Changes interleaved format (default) to sequential format."
                ),

            _Option(['-n', '--multiple', 'multiple'],
                "Number of data sets to analyse (integer).",
                checker_function=(lambda x:
                    isinstance(x, int) or x.isdigit()),
                equate=False,
                ),

            _Switch(['-p', '--pars', 'pars'],
                """Use a minimum parsimony starting tree.

                This option is taken into account when the '-u' option is absent
                and when tree topology modifications are to be done.
                """
                ),

            _Option(['-b', '--bootstrap', 'bootstrap'],
                """Number of bootstrap replicates, if value is > 0.

                Otherwise:

                 0: neither approximate likelihood ratio test nor bootstrap
                    values are computed.
                -1: approximate likelihood ratio test returning aLRT statistics.
                -2: approximate likelihood ratio test returning Chi2-based
                    parametric branch supports.
                -4: SH-like branch supports alone.
                """,
                equate=False,
                ),

            _Option(['-m', '--model', 'model'],
                """Substitution model name.

                Nucleotide-based models:

                HKY85 (default) | JC69 | K80 | F81 | F84 | TN93 | GTR | custom

                For the custom option, a string of six digits identifies the
                model. For instance, 000000 corresponds to F81 (or JC69,
                provided the distribution of nucleotide frequencies is uniform).
                012345 corresponds to GTR. This option can be used for encoding
                any model that is nested within GTR.

                Amino-acid based models:

                LG (default) | WAG | JTT | MtREV | Dayhoff | DCMut | RtREV |
                CpREV | VT | Blosum62 | MtMam | MtArt | HIVw | HIVb | custom
                """,
                checker_function=(lambda x: x in (
                    # Nucleotide models:
                    'HKY85', 'JC69', 'K80', 'F81', 'F84', 'TN93', 'GTR',
                    # Amino acid models:
                    'LG', 'WAG', 'JTT', 'MtREV', 'Dayhoff', 'DCMut',
                    'RtREV', 'CpREV', 'VT', 'Blosum62', 'MtMam', 'MtArt',
                    'HIVw', 'HIVb')
                    or isinstance(x, int)),
                equate=False,
                ),

            _Option(['-f', 'frequencies'],
                """Character frequencies.

                -f e, m, or "fA fC fG fT"

                e : Empirical frequencies, determined as follows:

                    - Nucleotide sequences: (Empirical) the equilibrium base
                      frequencies are estimated by counting the occurrence of the
                      different bases in the alignment.
                    - Amino-acid sequences: (Empirical) the equilibrium
                      amino-acid frequencies are estimated by counting the
                      occurrence of the different amino-acids in the alignment.

                m : ML/model-based frequencies, determined as follows:

                    - Nucleotide sequences: (ML) the equilibrium base
                      frequencies are estimated using maximum likelihood.
                    - Amino-acid sequences: (Model) the equilibrium amino-acid
                      frequencies are estimated using the frequencies defined by
                      the substitution model.

                "fA fC fG fT" : only valid for nucleotide-based models.
                    fA, fC, fG and fT are floating-point numbers that correspond
                    to the frequencies of A, C, G and T, respectively.
                """,
                filename=True, # ensure ".25 .25 .25 .25" stays quoted
                equate=False,
                ),

            _Option(['-t', '--ts/tv', 'ts_tv_ratio'],
                """Transition/transversion ratio. (DNA sequences only.)

                Can be a fixed positive value (e.g. 4.0) or 'e' to get the
                maximum-likelihood estimate.
                """,
                equate=False,
                ),

            _Option(['-v', '--pinv', 'prop_invar'],
                """Proportion of invariable sites.

                Can be a fixed value in the range [0,1], or 'e' to get the
                maximum-likelihood estimate.
                """,
                equate=False,
                ),

            _Option(['-c', '--nclasses', 'nclasses'],
                """Number of relative substitution rate categories.

                Default 1. Must be a positive integer.
                """,
                equate=False,
                ),

            _Option(['-a', '--alpha', 'alpha'],
                """Shape parameter of the gamma distribution.

                Can be a fixed positive value, or 'e' to get the
                maximum-likelihood estimate.
                """,
                equate=False,
                ),

            _Option(['-s', '--search', 'search'],
                """Tree topology search operation option.

                Can be one of:

                    NNI : default, fast
                    SPR : a bit slower than NNI
                    BEST : best of NNI and SPR search
                """,
                checker_function=lambda x: x in ('NNI', 'SPR', 'BEST'),
                equate=False,
                ),

            # alt name: user_tree_file
            _Option(['-u', '--inputtree', 'input_tree'],
                "Starting tree filename. The tree must be in Newick format.",
                filename=True,
                equate=False,
                ),

            _Option(['-o', 'optimize'],
                """Specific parameter optimisation.

                tlr : tree topology (t), branch lengths (l) and
                      rate parameters (r) are optimised.
                tl  : tree topology and branch lengths are optimised.
                lr  : branch lengths and rate parameters are optimised.
                l   : branch lengths are optimised.
                r   : rate parameters are optimised.
                n   : no parameter is optimised.
                """,
                equate=False,
                ),

            _Switch(['--rand_start', 'rand_start'],
                """Sets the initial tree to random.

                Only valid if SPR searches are to be performed.
                """,
                ),

            _Option(['--n_rand_starts', 'n_rand_starts'],
                """Number of initial random trees to be used.

                Only valid if SPR searches are to be performed.
                """,
                equate=False,
                ),

            _Option(['--r_seed', 'r_seed'],
                """Seed used to initiate the random number generator.

                Must be an integer.
                """,
                equate=False,
                ),

            _Switch(['--print_site_lnl', 'print_site_lnl'],
                "Print the likelihood for each site in file *_phyml_lk.txt."
                ),

            _Switch(['--print_trace', 'print_trace'],
                """Print each phylogeny explored during the tree search process
                in file *_phyml_trace.txt."""
                ),

            _Option(['--run_id', 'run_id'],
                """Append the given string at the end of each PhyML output file.

                This option may be useful when running simulations involving
                PhyML.
                """,
                checker_function=lambda x: isinstance(x, basestring),
                equate=False,
                ),

            # XXX should this always be set to True?
            _Switch(['--quiet', 'quiet'],
                "No interactive questions (for running in batch mode)."
                ),
                ]
        AbstractCommandline.__init__(self, cmd, **kwargs)
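
The checker functions above can be tried in isolation. The sketch below copies three of the lambdas from the listing into plain functions (the names check_datatype, check_multiple and check_model are hypothetical, chosen for illustration) to show which values they accept. As the source reads, a custom model given as a six-digit string is neither in the model tuple nor an int, so the model checker appears to reject it.

```python
NUCLEOTIDE_MODELS = ("HKY85", "JC69", "K80", "F81", "F84", "TN93", "GTR")
AMINO_ACID_MODELS = (
    "LG", "WAG", "JTT", "MtREV", "Dayhoff", "DCMut", "RtREV",
    "CpREV", "VT", "Blosum62", "MtMam", "MtArt", "HIVw", "HIVb",
)


def check_datatype(x):
    # Same test as the 'datatype' checker in the listing.
    return x in ("nt", "aa")


def check_multiple(x):
    # Same test as the 'multiple' checker: an int, or a string of digits.
    return isinstance(x, int) or x.isdigit()


def check_model(x):
    # Same test as the 'model' checker: a known model name, or any int.
    return x in NUCLEOTIDE_MODELS + AMINO_ACID_MODELS or isinstance(x, int)


print(check_datatype("nt"))     # True
print(check_multiple(3))        # True
print(check_multiple("3"))      # True
print(check_multiple("three"))  # False
print(check_model("GTR"))       # True
print(check_model("012345"))    # False: a digit string is not an int
print(check_model(12345))       # True: any int passes
```

Note that check_multiple would raise AttributeError for a float such as 3.0, because floats have no isdigit method.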

Member Function Documentation

def Bio.Application.AbstractCommandline.__call__(self, stdin=None, stdout=True, stderr=True, cwd=None, env=None) [inherited]
Executes the command, waits for it to finish, and returns output.

Runs the command line tool and waits for it to finish. If it returns
a non-zero error level, an exception is raised. Otherwise two strings
are returned containing stdout and stderr.

The optional stdin argument should be a string of data which will be
passed to the tool as standard input.

The optional stdout and stderr arguments are treated as booleans, and
control whether the output should be captured (True, default) or ignored
by sending it to /dev/null to avoid wasting memory (False). In the
latter case empty string(s) are returned.

The optional cwd argument is a string giving the working directory
to run the command from. See Python's subprocess module documentation
for more details.

The optional env argument is a dictionary setting the environment
variables to be used in the new process. By default the current
process' environment variables are used. See Python's subprocess
module documentation for more details.

Default example usage:

from Bio.Emboss.Applications import WaterCommandline
water_cmd = WaterCommandline(gapopen=10, gapextend=0.5,
                     stdout=True, auto=True,
                     asequence="a.fasta", bsequence="b.fasta")
print "About to run:\n%s" % water_cmd
std_output, err_output = water_cmd()

This functionality is similar to subprocess.check_output() added in
Python 2.7. In general if you require more control over running the
command, use subprocess directly.

As of Biopython 1.56, when the program called returns a non-zero error
level, a custom ApplicationError exception is raised. This includes
any stdout and stderr strings captured as attributes of the exception
object, since they may be useful for diagnosing what went wrong.

Definition at line 368 of file __init__.py.

    def __call__(self, stdin=None, stdout=True, stderr=True,
                 cwd=None, env=None):
        """Executes the command, waits for it to finish, and returns output.

        Runs the command line tool and waits for it to finish. If it returns
        a non-zero error level, an exception is raised. Otherwise two strings
        are returned containing stdout and stderr.

        The optional stdin argument should be a string of data which will be
        passed to the tool as standard input.

        The optional stdout and stderr arguments are treated as booleans, and
        control whether the output should be captured (True, default) or ignored
        by sending it to /dev/null to avoid wasting memory (False). In the
        latter case empty string(s) are returned.

        The optional cwd argument is a string giving the working directory
        to run the command from. See Python's subprocess module documentation
        for more details.

        The optional env argument is a dictionary setting the environment
        variables to be used in the new process. By default the current
        process' environment variables are used. See Python's subprocess
        module documentation for more details.

        Default example usage:

        from Bio.Emboss.Applications import WaterCommandline
        water_cmd = WaterCommandline(gapopen=10, gapextend=0.5,
                                     stdout=True, auto=True,
                                     asequence="a.fasta", bsequence="b.fasta")
        print "About to run:\n%s" % water_cmd
        std_output, err_output = water_cmd()

        This functionality is similar to subprocess.check_output() added in
        Python 2.7. In general if you require more control over running the
        command, use subprocess directly.

        As of Biopython 1.56, when the program called returns a non-zero error
        level, a custom ApplicationError exception is raised. This includes
        any stdout and stderr strings captured as attributes of the exception
        object, since they may be useful for diagnosing what went wrong.
        """
        if stdout:
            stdout_arg = subprocess.PIPE
        else:
            stdout_arg = open(os.devnull)
        if stderr:
            stderr_arg = subprocess.PIPE
        else:
            stderr_arg = open(os.devnull)
        #We may not need to supply any piped input, but we setup the
        #standard input pipe anyway as a work around for a python
        #bug if this is called from a Windows GUI program.  For
        #details, see http://bugs.python.org/issue1124861
        #
        #Using universal newlines is important on Python 3, this
        #gives unicode handles rather than bytes handles.
        child_process = subprocess.Popen(str(self), stdin=subprocess.PIPE,
                                         stdout=stdout_arg, stderr=stderr_arg,
                                         universal_newlines=True,
                                         cwd=cwd, env=env,
                                         shell=(sys.platform!="win32"))
        #Use .communicate as can get deadlocks with .wait(), see Bug 2804
        stdout_str, stderr_str = child_process.communicate(stdin)
        if not stdout: assert not stdout_str
        if not stderr: assert not stderr_str
        return_code = child_process.returncode
        if return_code:
            raise ApplicationError(return_code, str(self),
                                   stdout_str, stderr_str)
        return stdout_str, stderr_str
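
The run-and-capture pattern described above can be sketched with the standard library alone. This is a Python 3 illustration, not the Biopython implementation: run_command and CommandError are hypothetical names standing in for __call__ and ApplicationError, the command is passed as an argument list rather than a shell string, and subprocess.DEVNULL (Python 3.3+) is used for discarded output.

```python
import subprocess
import sys


class CommandError(Exception):
    """Hypothetical stand-in for ApplicationError; keeps the captured output."""

    def __init__(self, returncode, cmd, stdout="", stderr=""):
        super().__init__(
            "Command %r returned non-zero exit status %d" % (cmd, returncode)
        )
        self.returncode = returncode
        self.cmd = cmd
        self.stdout = stdout
        self.stderr = stderr


def run_command(args, stdin=None, stdout=True, stderr=True, cwd=None, env=None):
    """Run args, wait for completion, and return (stdout, stderr) strings."""
    child = subprocess.Popen(
        args,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE if stdout else subprocess.DEVNULL,
        stderr=subprocess.PIPE if stderr else subprocess.DEVNULL,
        universal_newlines=True,  # text handles rather than bytes
        cwd=cwd,
        env=env,
    )
    # communicate() avoids the deadlocks that wait() can cause with pipes.
    out, err = child.communicate(stdin)
    out, err = out or "", err or ""  # discarded streams come back as None
    if child.returncode:
        raise CommandError(child.returncode, args, out, err)
    return out, err


out, err = run_command([sys.executable, "-c", "print('hello')"])
print(out.strip())

try:
    run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
except CommandError as exc:
    print("failed with exit status", exc.returncode)
```

Keeping stdout and stderr on the exception mirrors the behaviour described for ApplicationError, so a failed run can still be diagnosed from the captured text.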

def Bio.Application.AbstractCommandline.__repr__(self) [inherited]
Return a representation of the command line object for debugging.

e.g.
>>> from Bio.Emboss.Applications import WaterCommandline
>>> cline = WaterCommandline(gapopen=10, gapextend=0.5)
>>> cline.asequence = "asis:ACCCGGGCGCGGT"
>>> cline.bsequence = "asis:ACCCGAGCGCGGT"
>>> cline.outfile = "temp_water.txt"
>>> print cline
water -outfile=temp_water.txt -asequence=asis:ACCCGGGCGCGGT -bsequence=asis:ACCCGAGCGCGGT -gapopen=10 -gapextend=0.5
>>> cline
WaterCommandline(cmd='water', outfile='temp_water.txt', asequence='asis:ACCCGGGCGCGGT', bsequence='asis:ACCCGAGCGCGGT', gapopen=10, gapextend=0.5)

Definition at line 251 of file __init__.py.

    def __repr__(self):
        """Return a representation of the command line object for debugging.

        e.g.
        >>> from Bio.Emboss.Applications import WaterCommandline
        >>> cline = WaterCommandline(gapopen=10, gapextend=0.5)
        >>> cline.asequence = "asis:ACCCGGGCGCGGT"
        >>> cline.bsequence = "asis:ACCCGAGCGCGGT"
        >>> cline.outfile = "temp_water.txt"
        >>> print cline
        water -outfile=temp_water.txt -asequence=asis:ACCCGGGCGCGGT -bsequence=asis:ACCCGAGCGCGGT -gapopen=10 -gapextend=0.5
        >>> cline
        WaterCommandline(cmd='water', outfile='temp_water.txt', asequence='asis:ACCCGGGCGCGGT', bsequence='asis:ACCCGAGCGCGGT', gapopen=10, gapextend=0.5)
        """
        answer = "%s(cmd=%s" % (self.__class__.__name__, repr(self.program_name))
        for parameter in self.parameters:
            if parameter.is_set:
                if isinstance(parameter, _Switch):
                    answer += ", %s=True" % parameter.names[-1]
                else:
                    answer += ", %s=%s" \
                              % (parameter.names[-1], repr(parameter.value))
        answer += ")"
        return answer

def Bio.Application.AbstractCommandline.__setattr__(self, name, value) [inherited]
Set attribute name to value (PRIVATE).

This code implements a workaround for a user interface issue.
Without this __setattr__, attribute-based assignment of parameters
would silently accept invalid parameter names, leading to known cases
of users assuming that parameters for the application are set
when they are not.

>>> from Bio.Emboss.Applications import WaterCommandline
>>> cline = WaterCommandline(gapopen=10, gapextend=0.5, stdout=True)
>>> cline.asequence = "a.fasta"
>>> cline.bsequence = "b.fasta"
>>> cline.csequence = "c.fasta"
Traceback (most recent call last):
...
ValueError: Option name csequence was not found.
>>> print cline
water -stdout -asequence=a.fasta -bsequence=b.fasta -gapopen=10 -gapextend=0.5

This workaround uses a whitelist of object attributes, which are set
as normal. All other attribute names are assumed to be parameters and
are passed to the self.set_parameter method for validation and
assignment.

Definition at line 337 of file __init__.py.

    def __setattr__(self, name, value):
        """Set attribute name to value (PRIVATE).

        This code implements a workaround for a user interface issue.
        Without this __setattr__, attribute-based assignment of parameters
        would silently accept invalid parameter names, leading to known cases
        of users assuming that parameters for the application are set
        when they are not.

        >>> from Bio.Emboss.Applications import WaterCommandline
        >>> cline = WaterCommandline(gapopen=10, gapextend=0.5, stdout=True)
        >>> cline.asequence = "a.fasta"
        >>> cline.bsequence = "b.fasta"
        >>> cline.csequence = "c.fasta"
        Traceback (most recent call last):
        ...
        ValueError: Option name csequence was not found.
        >>> print cline
        water -stdout -asequence=a.fasta -bsequence=b.fasta -gapopen=10 -gapextend=0.5

        This workaround uses a whitelist of object attributes, which are set
        as normal. All other attribute names are assumed to be parameters and
        are passed to the self.set_parameter method for validation and
        assignment.
        """
        if name in ['parameters', 'program_name']: # Allowed attributes
            self.__dict__[name] = value
        else:
            self.set_parameter(name, value)  # treat as a parameter
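
The whitelist technique can be reproduced with a toy class that has no Biopython dependency. ToyCommandline below is a hypothetical stand-in, not the library class: the two whitelisted names are stored as ordinary attributes, and every other assignment is routed through set_parameter, which rejects unknown option names.

```python
class ToyCommandline:
    """Toy stand-in for a command-line wrapper (not the Biopython class)."""

    def __init__(self, program_name, option_names):
        # Both names are on the whitelist, so they are stored normally.
        self.program_name = program_name
        self.parameters = dict.fromkeys(option_names)

    def set_parameter(self, name, value=None):
        if name not in self.parameters:
            raise ValueError("Option name %s was not found." % name)
        self.parameters[name] = value

    def __setattr__(self, name, value):
        if name in ("parameters", "program_name"):  # allowed attributes
            self.__dict__[name] = value
        else:
            self.set_parameter(name, value)  # treat as a parameter


cline = ToyCommandline("water", ["asequence", "bsequence"])
cline.asequence = "a.fasta"
try:
    cline.csequence = "c.fasta"  # misspelled option name
except ValueError as err:
    print(err)  # Option name csequence was not found.
```

Without the __setattr__ override, the misspelled assignment would simply create a new attribute and the mistake would go unnoticed.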
    

def Bio.Application.AbstractCommandline.__str__(self) [inherited]
Make the commandline string with the currently set options.

e.g.
>>> from Bio.Emboss.Applications import WaterCommandline
>>> cline = WaterCommandline(gapopen=10, gapextend=0.5)
>>> cline.asequence = "asis:ACCCGGGCGCGGT"
>>> cline.bsequence = "asis:ACCCGAGCGCGGT"
>>> cline.outfile = "temp_water.txt"
>>> print cline
water -outfile=temp_water.txt -asequence=asis:ACCCGGGCGCGGT -bsequence=asis:ACCCGAGCGCGGT -gapopen=10 -gapextend=0.5
>>> str(cline)
'water -outfile=temp_water.txt -asequence=asis:ACCCGGGCGCGGT -bsequence=asis:ACCCGAGCGCGGT -gapopen=10 -gapextend=0.5'

Definition at line 229 of file __init__.py.

    def __str__(self):
        """Make the commandline string with the currently set options.

        e.g.
        >>> from Bio.Emboss.Applications import WaterCommandline
        >>> cline = WaterCommandline(gapopen=10, gapextend=0.5)
        >>> cline.asequence = "asis:ACCCGGGCGCGGT"
        >>> cline.bsequence = "asis:ACCCGAGCGCGGT"
        >>> cline.outfile = "temp_water.txt"
        >>> print cline
        water -outfile=temp_water.txt -asequence=asis:ACCCGGGCGCGGT -bsequence=asis:ACCCGAGCGCGGT -gapopen=10 -gapextend=0.5
        >>> str(cline)
        'water -outfile=temp_water.txt -asequence=asis:ACCCGGGCGCGGT -bsequence=asis:ACCCGAGCGCGGT -gapopen=10 -gapextend=0.5'
        """
        self._validate()
        commandline = "%s " % self.program_name
        for parameter in self.parameters:
            if parameter.is_set:
                #This will include a trailing space:
                commandline += str(parameter)
        return commandline.strip() # remove trailing space
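
The assembly loop in __str__ can be illustrated with a small stand-alone sketch. ToyOption and build_commandline are hypothetical names; the "flag value " rendering with a space separator and a trailing space is an assumption made for this example, not a statement about how the library's option classes format themselves.

```python
class ToyOption:
    """Toy option that renders as 'flag value ' (assumed format)."""

    def __init__(self, flag, value=None):
        self.flag = flag
        self.value = value
        self.is_set = value is not None

    def __str__(self):
        # Trailing space so that options can simply be concatenated.
        return "%s %s " % (self.flag, self.value)


def build_commandline(program_name, parameters):
    """Concatenate the program name and every option that has been set."""
    commandline = "%s " % program_name
    for parameter in parameters:
        if parameter.is_set:
            commandline += str(parameter)
    return commandline.strip()  # remove trailing space


options = [
    ToyOption("-i", "example.phy"),  # hypothetical input file
    ToyOption("-d"),                 # never set, so it is skipped
    ToyOption("-m", "GTR"),
]
print(build_commandline("phyml", options))  # phyml -i example.phy -m GTR
```

Options appear in the order they are declared in the parameter list, not the order in which they were assigned.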

def Bio.Application.AbstractCommandline.set_parameter(self, name, value=None) [inherited]
Set a commandline option for a program.

Definition at line 297 of file __init__.py.

    def set_parameter(self, name, value = None):
        """Set a commandline option for a program.
        """
        set_option = False
        for parameter in self.parameters:
            if name in parameter.names:
                if isinstance(parameter, _Switch):
                    if value is None:
                        import warnings
                        warnings.warn("For a switch type argument like %s, "
                                      "we expect a boolean.  None is treated "
                                      "as FALSE!" % parameter.names[-1])
                    parameter.is_set = bool(value)
                    set_option = True
                else:
                    if value is not None:
                        self._check_value(value, name, parameter.checker_function)
                        parameter.value = value
                    parameter.is_set = True
                    set_option = True
        if not set_option:
            raise ValueError("Option name %s was not found." % name)
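
The different handling of switches and valued options can be seen in a stand-alone sketch. ToySwitch, ToyOption and the free function set_parameter are hypothetical simplifications of the method above; in particular, the error raised for a value that fails its checker is an assumption, since _check_value is not shown in this listing.

```python
import warnings


class ToySwitch:
    def __init__(self, names):
        self.names = names
        self.is_set = False


class ToyOption:
    def __init__(self, names, checker_function=None):
        self.names = names
        self.checker_function = checker_function
        self.is_set = False
        self.value = None


def set_parameter(parameters, name, value=None):
    """Set the parameter called name, following the logic of the listing."""
    for parameter in parameters:
        if name not in parameter.names:
            continue
        if isinstance(parameter, ToySwitch):
            if value is None:
                warnings.warn("For a switch type argument like %s, we expect "
                              "a boolean. None is treated as FALSE!"
                              % parameter.names[-1])
            parameter.is_set = bool(value)
        else:
            if value is not None:
                check = parameter.checker_function
                # Assumed behaviour: a failed check raises ValueError.
                if check is not None and not check(value):
                    raise ValueError("Invalid value %r for %s" % (value, name))
                parameter.value = value
            parameter.is_set = True
        return
    raise ValueError("Option name %s was not found." % name)


params = [
    ToySwitch(["-q", "--sequential", "sequential"]),
    ToyOption(["-s", "--search", "search"],
              lambda x: x in ("NNI", "SPR", "BEST")),
]
set_parameter(params, "sequential", True)
set_parameter(params, "search", "SPR")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    set_parameter(params, "sequential", None)  # warns, then unsets the switch
print(params[0].is_set, params[1].value, len(caught))  # False SPR 1
```

Unlike the original loop, this sketch returns after the first match, which is equivalent as long as option names are unique.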


Member Data Documentation

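Bio.Phylo.Applications._Phyml.PhymlCommandline.parameters

Definition at line 32 of file _Phyml.py.

Bio.Application.AbstractCommandline.program_name [inherited]

Reimplemented in Bio.Align.Applications._Dialign.DialignCommandline.

Definition at line 167 of file __init__.py.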


The documentation for this class was generated from the following file: