python-biopython  1.60
Bio.SeqFeature.FeatureLocation Class Reference

Public Member Functions

def __init__
def __str__
def __repr__
def __nonzero__
def __len__
def __contains__
def __iter__
def start
def end
def nofuzzy_start
def nofuzzy_end
def extract

Public Attributes

 ref
 ref_db

Properties

 strand

Private Member Functions

def _get_strand
def _set_strand
def _shift
def _flip

Private Attributes

 _start
 _end
 _strand

Detailed Description

Specify the location of a feature along a sequence.

This attempts to deal with fuzziness of position ends, while also
making it easy to get the start and end in the 'normal' case (no
fuzziness).

You should access the start and end attributes with
your_location.start and your_location.end. If the start and
end are exact, these return the positions; if not, they return
the appropriate fuzzy position class with information about the
position and its fuzziness.

Note that the start and end location numbering follow Python's scheme,
thus a GenBank entry of 123..150 (one based counting) becomes a location
of [122:150] (zero based counting).
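
The conversion between the two numbering schemes can be sketched in plain Python. The helper name genbank_to_python is hypothetical (it is not part of Biopython) and only handles exact, non-fuzzy coordinates:

```python
def genbank_to_python(gb_start, gb_end):
    """Convert a one-based, inclusive GenBank range to a zero-based,
    half-open Python slice (hypothetical helper, not a Biopython API)."""
    # Only the start moves: a one-based inclusive end equals a
    # zero-based exclusive end.
    return gb_start - 1, gb_end


start, end = genbank_to_python(123, 150)
print("[%d:%d]" % (start, end))  # prints [122:150]
# Both schemes describe the same number of residues.
assert end - start == 150 - 123 + 1
```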

Definition at line 485 of file SeqFeature.py.


Constructor & Destructor Documentation

def Bio.SeqFeature.FeatureLocation.__init__ (   self,
  start,
  end,
  strand = None,
  ref = None,
  ref_db = None 
)
Specify the start, end, strand etc of a sequence feature.

The start and end arguments specify where the feature begins
and ends. These can either be any of the *Position objects that
inherit from AbstractPosition, or plain integers specifying the
position. Integers are assumed to be exact and are converted
into ExactPosition objects, which makes it easy to deal with
non-fuzzy ends. Any other type raises TypeError.

For example, the short form:

>>> from Bio.SeqFeature import FeatureLocation
>>> loc = FeatureLocation(5, 10, strand=-1)
>>> print loc
[5:10](-)

Explicit form:

>>> from Bio.SeqFeature import FeatureLocation, ExactPosition
>>> loc = FeatureLocation(ExactPosition(5), ExactPosition(10), strand=-1)
>>> print loc
[5:10](-)

Fuzzy positions are used similarly:

>>> from Bio.SeqFeature import FeatureLocation
>>> from Bio.SeqFeature import BeforePosition, AfterPosition
>>> loc2 = FeatureLocation(BeforePosition(5), AfterPosition(10), strand=-1)
>>> print loc2
[<5:>10](-)

For nucleotide features you will also want to specify the strand:
use 1 for the forward (plus) strand, -1 for the reverse (minus)
strand, 0 for stranded but strand unknown (? in GFF3), or None
when the strand does not apply (dot in GFF3), e.g. features on
proteins. Any other value raises ValueError.

>>> loc = FeatureLocation(5, 10, strand=+1)
>>> print loc
[5:10](+)
>>> print loc.strand
1
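
The four strand values and their GFF3 symbols can be summarised as a lookup table. This is a minimal sketch in plain Python; the names STRAND_TO_GFF3 and strand_to_gff3 are hypothetical and are not provided by Biopython:

```python
# Hypothetical lookup table, not part of Biopython: the four strand
# values accepted by FeatureLocation and their GFF3 strand symbols.
STRAND_TO_GFF3 = {+1: "+", -1: "-", 0: "?", None: "."}


def strand_to_gff3(strand):
    """Return the GFF3 strand symbol for a FeatureLocation strand value."""
    try:
        return STRAND_TO_GFF3[strand]
    except KeyError:
        # Same message as the ValueError raised by the strand setter.
        raise ValueError("Strand should be +1, -1, 0 or None, not %r" % strand)


print(strand_to_gff3(-1))    # prints -
print(strand_to_gff3(None))  # prints .
```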

Normally feature locations are given relative to the parent
sequence you are working with, but an explicit accession can
be given with the optional ref and ref_db strings:

>>> loc = FeatureLocation(105172, 108462, ref="AL391218.9", strand=1)
>>> print loc
AL391218.9[105172:108462](+)
>>> print loc.ref
AL391218.9

Definition at line 501 of file SeqFeature.py.

00501 
00502     def __init__(self, start, end, strand=None, ref=None, ref_db=None):
00503         """Specify the start, end, strand etc of a sequence feature.
00504 
00505         start and end arguments specify the values where the feature begins
00506         and ends. These can either be any of the *Position objects that
00507         inherit from AbstractPosition, or can just be integers specifying the
00508         position. In the case of integers, the values are assumed to be
00509         exact and are converted into ExactPosition objects. This is meant
00510         to make it easy to deal with non-fuzzy ends.
00511 
00512         i.e. Short form:
00513         
00514         >>> from Bio.SeqFeature import FeatureLocation
00515         >>> loc = FeatureLocation(5, 10, strand=-1)
00516         >>> print loc
00517         [5:10](-)
00518         
00519         Explicit form:
00520 
00521         >>> from Bio.SeqFeature import FeatureLocation, ExactPosition
00522         >>> loc = FeatureLocation(ExactPosition(5), ExactPosition(10), strand=-1)
00523         >>> print loc
00524         [5:10](-)
00525 
00526         Other fuzzy positions are used similarly,
00527 
00528         >>> from Bio.SeqFeature import FeatureLocation
00529         >>> from Bio.SeqFeature import BeforePosition, AfterPosition
00530         >>> loc2 = FeatureLocation(BeforePosition(5), AfterPosition(10), strand=-1)
00531         >>> print loc2
00532         [<5:>10](-)
00533 
00534         For nucleotide features you will also want to specify the strand,
00535         use 1 for the forward (plus) strand, -1 for the reverse (negative)
00536         strand, 0 for stranded but strand unknown (? in GFF3), or None for
00537         when the strand does not apply (dot in GFF3), e.g. features on
00538         proteins.
00539 
00540         >>> loc = FeatureLocation(5, 10, strand=+1)
00541         >>> print loc
00542         [5:10](+)
00543         >>> print loc.strand
00544         1
00545 
00546         Normally feature locations are given relative to the parent
00547         sequence you are working with, but an explicit accession can
00548         be given with the optional ref and ref_db strings:
00549 
00550         >>> loc = FeatureLocation(105172, 108462, ref="AL391218.9", strand=1)
00551         >>> print loc
00552         AL391218.9[105172:108462](+)
00553         >>> print loc.ref
00554         AL391218.9
00555 
00556         """
00557         if isinstance(start, AbstractPosition):
00558             self._start = start
00559         elif isinstance(start, int):
00560             self._start = ExactPosition(start)
00561         else:
00562             raise TypeError(start)
00563         if isinstance(end, AbstractPosition):
00564             self._end = end
00565         elif isinstance(end, int):
00566             self._end = ExactPosition(end)
00567         else:
00568             raise TypeError(end)
00569         self.strand = strand
00570         self.ref = ref
00571         self.ref_db = ref_db


Member Function Documentation

def Bio.SeqFeature.FeatureLocation.__contains__ (   self,
  value 
)
Check if an integer position is within the FeatureLocation.

Note that extra care may be needed for fuzzy locations, e.g.

>>> from Bio.SeqFeature import FeatureLocation
>>> from Bio.SeqFeature import BeforePosition, AfterPosition
>>> loc = FeatureLocation(BeforePosition(5),AfterPosition(10))
>>> len(loc)
5
>>> [i for i in range(15) if i in loc]
[5, 6, 7, 8, 9]
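
The membership test treats the location as a half-open interval and, as the example shows, ignores fuzziness: BeforePosition(5) and AfterPosition(10) are compared as if they were 5 and 10. A minimal sketch with plain integers (the function name in_half_open is hypothetical):

```python
def in_half_open(value, start, end):
    """True if start <= value < end, the rule applied by __contains__."""
    if not isinstance(value, int):
        raise ValueError("Only integer positions are supported.")
    return start <= value < end


inside = [i for i in range(15) if in_half_open(i, 5, 10)]
print(inside)  # prints [5, 6, 7, 8, 9]
```

Position 4 is reported as outside even though "<5" says the feature may really begin earlier, which is the extra care the note refers to.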

Definition at line 644 of file SeqFeature.py.

00644 
00645     def __contains__(self, value):
00646         """Check if an integer position is within the FeatureLocation.
00647 
00648         Note that extra care may be needed for fuzzy locations, e.g.
00649 
00650         >>> from Bio.SeqFeature import FeatureLocation
00651         >>> from Bio.SeqFeature import BeforePosition, AfterPosition
00652         >>> loc = FeatureLocation(BeforePosition(5),AfterPosition(10))
00653         >>> len(loc)
00654         5
00655         >>> [i for i in range(15) if i in loc]
00656         [5, 6, 7, 8, 9]
00657         """
00658         if not isinstance(value, int):
00659             raise ValueError("Currently we only support checking for integer "
00660                              "positions being within a FeatureLocation.")
00661         if value < self._start or value >= self._end:
00662             return False
00663         else:
00664             return True

def Bio.SeqFeature.FeatureLocation.__iter__ (   self )
Iterate over the parent positions within the FeatureLocation.

>>> from Bio.SeqFeature import FeatureLocation
>>> from Bio.SeqFeature import BeforePosition, AfterPosition
>>> loc = FeatureLocation(BeforePosition(5),AfterPosition(10))
>>> len(loc)
5
>>> for i in loc: print i
5
6
7
8
9
>>> list(loc)
[5, 6, 7, 8, 9]
>>> [i for i in range(15) if i in loc]
[5, 6, 7, 8, 9]

Note this is strand aware:

>>> loc = FeatureLocation(BeforePosition(5), AfterPosition(10), strand = -1)
>>> list(loc)
[9, 8, 7, 6, 5]
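
The strand-aware ordering can be reproduced with plain integers. This sketch uses a hypothetical generator name, iter_positions, and follows the same range arithmetic as the listing for this method:

```python
def iter_positions(start, end, strand=None):
    """Yield parent positions for [start:end], reversed on the minus strand."""
    if strand == -1:
        # Walk from end - 1 down to start, inclusive.
        for i in range(end - 1, start - 1, -1):
            yield i
    else:
        # Strands +1, 0 and None all iterate in ascending order.
        for i in range(start, end):
            yield i


print(list(iter_positions(5, 10)))      # prints [5, 6, 7, 8, 9]
print(list(iter_positions(5, 10, -1)))  # prints [9, 8, 7, 6, 5]
```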

Definition at line 665 of file SeqFeature.py.

00665 
00666     def __iter__(self):
00667         """Iterate over the parent positions within the FeatureLocation.
00668 
00669         >>> from Bio.SeqFeature import FeatureLocation
00670         >>> from Bio.SeqFeature import BeforePosition, AfterPosition
00671         >>> loc = FeatureLocation(BeforePosition(5),AfterPosition(10))
00672         >>> len(loc)
00673         5
00674         >>> for i in loc: print i
00675         5
00676         6
00677         7
00678         8
00679         9
00680         >>> list(loc)
00681         [5, 6, 7, 8, 9]
00682         >>> [i for i in range(15) if i in loc]
00683         [5, 6, 7, 8, 9]
00684 
00685         Note this is strand aware:
00686 
00687         >>> loc = FeatureLocation(BeforePosition(5), AfterPosition(10), strand = -1)
00688         >>> list(loc)
00689         [9, 8, 7, 6, 5]
00690         """
00691         if self.strand == -1:
00692             for i in range(self._end - 1, self._start - 1, -1):
00693                 yield i
00694         else:
00695             for i in range(self._start, self._end):
00696                 yield i

def Bio.SeqFeature.FeatureLocation.__len__ (   self )
Returns the length of the region described by the FeatureLocation.

Note that extra care may be needed for fuzzy locations, e.g.

>>> from Bio.SeqFeature import FeatureLocation
>>> from Bio.SeqFeature import BeforePosition, AfterPosition
>>> loc = FeatureLocation(BeforePosition(5),AfterPosition(10))
>>> len(loc)
5

Definition at line 631 of file SeqFeature.py.

00631 
00632     def __len__(self):
00633         """Returns the length of the region described by the FeatureLocation.
00634         
00635         Note that extra care may be needed for fuzzy locations, e.g.
00636 
00637         >>> from Bio.SeqFeature import FeatureLocation
00638         >>> from Bio.SeqFeature import BeforePosition, AfterPosition
00639         >>> loc = FeatureLocation(BeforePosition(5),AfterPosition(10))
00640         >>> len(loc)
00641         5
00642         """
00643         return int(self._end) - int(self._start)

def Bio.SeqFeature.FeatureLocation.__nonzero__ (   self )
Returns True regardless of the length of the feature.

This behaviour is for backwards compatibility, since until the
__len__ method was added, a FeatureLocation always evaluated as True.

Note that in comparison, Seq objects, strings, lists, etc, will all
evaluate to False if they have length zero.

WARNING: The FeatureLocation may in future evaluate to False when its
length is zero (in order to better match normal python behaviour)!
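
The effect of pairing __len__ with an always-true truth hook can be seen in a small stand-in. The class AlwaysTrueRegion is hypothetical and only mimics the behaviour described here; it is not the Biopython implementation:

```python
class AlwaysTrueRegion(object):
    """Hypothetical stand-in mimicking FeatureLocation truthiness."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __len__(self):
        return self.end - self.start

    def __nonzero__(self):  # Python 2 truth hook
        return True

    __bool__ = __nonzero__  # Python 3 spelling of the same hook


empty = AlwaysTrueRegion(7, 7)
print("%s %d" % (bool(empty), len(empty)))  # prints True 0
```

Given the warning above, testing len(loc) == 0 explicitly is the safer way to detect a zero-length location, since it keeps working whichever way the truth value is defined.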

Definition at line 617 of file SeqFeature.py.

00617 
00618     def __nonzero__(self):
00619         """Returns True regardless of the length of the feature.
00620 
00621         This behaviour is for backwards compatibility, since until the
00622         __len__ method was added, a FeatureLocation always evaluated as True.
00623 
00624         Note that in comparison, Seq objects, strings, lists, etc, will all
00625         evaluate to False if they have length zero.
00626 
00627         WARNING: The FeatureLocation may in future evaluate to False when its
00628         length is zero (in order to better match normal python behaviour)!
00629         """
00630         return True

def Bio.SeqFeature.FeatureLocation.__repr__ (   self )
A string representation of the location for debugging.

Definition at line 605 of file SeqFeature.py.

00605 
00606     def __repr__(self):
00607         """A string representation of the location for debugging."""
00608         optional = ""
00609         if self.strand is not None:
00610             optional += ", strand=%r" % self.strand
00611         if self.ref is not None:
00612             optional += ", ref=%r" % self.ref
00613         if self.ref_db is not None:
00614             optional += ", ref_db=%r" % self.ref_db
00615         return "%s(%r, %r%s)" \
00616                    % (self.__class__.__name__, self.start, self.end, optional)

def Bio.SeqFeature.FeatureLocation.__str__ (   self )
Returns a representation of the location (with Python counting).

For the simple case this uses the Python slicing syntax, [122:150]
(zero based counting) which GenBank would call 123..150 (one based
counting).

Definition at line 582 of file SeqFeature.py.

00582 
00583     def __str__(self):
00584         """Returns a representation of the location (with python counting).
00585 
00586         For the simple case this uses the python slicing syntax, [122:150]
00587         (zero based counting) which GenBank would call 123..150 (one based
00588         counting).
00589         """
00590         answer = "[%s:%s]" % (self._start, self._end)
00591         if self.ref and self.ref_db:
00592             answer = "%s:%s%s" % (self.ref_db, self.ref, answer)
00593         elif self.ref:
00594             answer = self.ref + answer
00595         #Is ref_db without ref meaningful?
00596         if self.strand is None:
00597             return answer
00598         elif self.strand == +1:
00599             return answer + "(+)"
00600         elif self.strand == -1:
00601             return answer + "(-)"
00602         else:
00603             #strand = 0, stranded but strand unknown, ? in GFF3
00604             return answer + "(?)"

def Bio.SeqFeature.FeatureLocation._flip (   self,
  length 
) [private]
Returns a copy of the location after the parent is reversed (PRIVATE).

Definition at line 706 of file SeqFeature.py.

00706 
00707     def _flip(self, length):
00708         """Returns a copy of the location after the parent is reversed (PRIVATE)."""
00709         if self.ref or self.ref_db:
00710             #TODO - Return self?
00711             raise ValueError("Feature references another sequence.")
00712         #Note this will flip the start and end too!
00713         if self.strand == +1:
00714             flip_strand = -1
00715         elif self.strand == -1:
00716             flip_strand = +1
00717         else:
00718             #0 or None
00719             flip_strand = self.strand
00720         return FeatureLocation(start = self._end._flip(length),
00721                                end = self._start._flip(length),
00722                                strand = flip_strand)

def Bio.SeqFeature.FeatureLocation._get_strand (   self ) [private]

Definition at line 572 of file SeqFeature.py.

00572 
00573     def _get_strand(self):
        return self._strand

def Bio.SeqFeature.FeatureLocation._set_strand (   self,
  value 
) [private]

Definition at line 574 of file SeqFeature.py.

00574 
00575     def _set_strand(self, value):
00576         if value not in [+1, -1, 0, None]:
00577             raise ValueError("Strand should be +1, -1, 0 or None, not %r" \
00578                              % value)
        self._strand = value

def Bio.SeqFeature.FeatureLocation._shift (   self,
  offset 
) [private]
Returns a copy of the location shifted by the offset (PRIVATE).

Definition at line 697 of file SeqFeature.py.

00697 
00698     def _shift(self, offset):
00699         """Returns a copy of the location shifted by the offset (PRIVATE)."""
00700         if self.ref or self.ref_db:
00701             #TODO - Return self?
00702             raise ValueError("Feature references another sequence.")
00703         return FeatureLocation(start = self._start._shift(offset),
00704                                end = self._end._shift(offset),
00705                                strand = self.strand)
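
The arithmetic behind _shift and _flip can be sketched for exact positions. The Position-level _shift and _flip methods are not shown on this page, so this sketch assumes that shifting adds the offset and that flipping maps a position p to length - p; the function names are hypothetical:

```python
def shift_exact(start, end, strand, offset):
    """Shift an exact location by offset; the strand is unchanged."""
    return start + offset, end + offset, strand


def flip_exact(start, end, strand, length):
    """Map an exact location onto the reversed parent of the given
    length (assumes each position p becomes length - p)."""
    if strand in (+1, -1):
        strand = -strand  # 0 and None are left as they are
    # The old end becomes the new start, so start <= end still holds.
    return length - end, length - start, strand


print(shift_exact(2, 5, +1, 100))  # prints (102, 105, 1)
print(flip_exact(2, 5, +1, 10))    # prints (5, 8, -1)
```

Under these assumptions, flipping twice with the same length returns the original location.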

def Bio.SeqFeature.FeatureLocation.end (   self )
End location (integer like, possibly a fuzzy position, read only).

Definition at line 729 of file SeqFeature.py.

00729 
00730     def end(self):
00731         """End location (integer like, possibly a fuzzy position, read only)."""
00732         return self._end

def Bio.SeqFeature.FeatureLocation.extract (   self,
  parent_sequence 
)
Extract feature sequence from the supplied parent sequence.

Definition at line 754 of file SeqFeature.py.

00754 
00755     def extract(self, parent_sequence):
00756         """Extract feature sequence from the supplied parent sequence."""
00757         if self.ref or self.ref_db:
00758             #TODO - Take a dictionary as an optional argument?
00759             raise ValueError("Feature references another sequence.")
00760         if isinstance(parent_sequence, MutableSeq):
00761             #This avoids complications with reverse complements
00762             #(the MutableSeq reverse complement acts in situ)
00763             parent_sequence = parent_sequence.toseq()
00764         f_seq = parent_sequence[self.nofuzzy_start:self.nofuzzy_end]
00765         if self.strand == -1:
00766             try:
00767                 f_seq = f_seq.reverse_complement()
00768             except AttributeError:
00769                 assert isinstance(f_seq, str)
00770                 f_seq = reverse_complement(f_seq)
00771         return f_seq
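
The same slice-then-reverse-complement logic can be tried on a plain string without Biopython. The function extract_plain and the complement table below are hypothetical and cover unambiguous DNA letters only:

```python
# Hypothetical complement table for unambiguous DNA only.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extract_plain(parent, start, end, strand=None):
    """Slice parent[start:end]; reverse complement it on the minus strand."""
    fragment = parent[start:end]
    if strand == -1:
        fragment = "".join(_COMPLEMENT[base] for base in reversed(fragment))
    return fragment


print(extract_plain("GATTACA", 0, 4, +1))  # prints GATT
print(extract_plain("GATTACA", 0, 4, -1))  # prints AATC
```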

def Bio.SeqFeature.FeatureLocation.nofuzzy_end (   self )
End position (integer, approximated if fuzzy, read only) (OBSOLETE).

This is now an alias for int(feature.end), which should be
used in preference -- unless you are trying to support old
versions of Biopython.

Definition at line 744 of file SeqFeature.py.

00744 
00745     def nofuzzy_end(self):
00746         """End position (integer, approximated if fuzzy, read only) (OBSOLETE).
00747 
00748         This is now an alias for int(feature.end), which should be
00749         used in preference -- unless you are trying to support old
00750         versions of Biopython.  
00751         """
00752         return int(self._end)
00753 

def Bio.SeqFeature.FeatureLocation.nofuzzy_start (   self )
Start position (integer, approximated if fuzzy, read only) (OBSOLETE).

This is now an alias for int(feature.start), which should be
used in preference -- unless you are trying to support old
versions of Biopython.

Definition at line 734 of file SeqFeature.py.

00734 
00735     def nofuzzy_start(self):
00736         """Start position (integer, approximated if fuzzy, read only) (OBSOLETE).
00737 
00738         This is now an alias for int(feature.start), which should be
00739         used in preference -- unless you are trying to support old
00740         versions of Biopython.
00741         """
00742         return int(self._start)
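
The phrase "integer like" can be illustrated with a small stand-in. The class BeforeSketch below is hypothetical and is not Biopython's BeforePosition; it only assumes that a fuzzy position behaves as an int subclass with a decorated string form, which is consistent with the comparisons and int() calls in the listings on this page:

```python
class BeforeSketch(int):
    """Hypothetical fuzzy position: behaves as an int, prints as '<N'."""

    def __str__(self):
        return "<%d" % int(self)


pos = BeforeSketch(5)
print(str(pos))  # prints <5
print(int(pos))  # prints 5
print(pos + 1)   # prints 6
```

Calling int() drops the fuzziness marker, which is what nofuzzy_start and nofuzzy_end do for the start and end positions.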

def Bio.SeqFeature.FeatureLocation.start (   self )
Start location (integer like, possibly a fuzzy position, read only).

Definition at line 724 of file SeqFeature.py.

00724 
00725     def start(self):
00726         """Start location (integer like, possibly a fuzzy position, read only)."""
00727         return self._start


Member Data Documentation

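Bio.SeqFeature.FeatureLocation._end [private]

Definition at line 563 of file SeqFeature.py.

Bio.SeqFeature.FeatureLocation._start [private]

Definition at line 557 of file SeqFeature.py.

Bio.SeqFeature.FeatureLocation._strand [private]

Definition at line 578 of file SeqFeature.py.

Bio.SeqFeature.FeatureLocation.ref

Definition at line 569 of file SeqFeature.py.

Bio.SeqFeature.FeatureLocation.ref_db

Definition at line 570 of file SeqFeature.py.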


Property Documentation

Bio.SeqFeature.FeatureLocation.strand

Initial value:
property(fget = _get_strand, fset = _set_strand,
                      doc = "Strand of the location (+1, -1, 0 or None).")

Definition at line 579 of file SeqFeature.py.


The documentation for this class was generated from the following file:

SeqFeature.py