Back to index

python3.2  3.2.2
Functions | Variables
heapq Namespace Reference

Functions

def heappush
def heappop
def heapreplace
def heappushpop
def heapify
def nlargest
def nsmallest
def _siftdown
def _siftup
def merge
def nsmallest
def nlargest

Variables

string __about__
list __all__
 _nsmallest = nsmallest
 _nlargest = nlargest
list heap = []
list data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
list sort = []

Detailed Description

Heap queue algorithm (a.k.a. priority queue).

Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
all k, counting elements from 0.  For the sake of comparison,
non-existing elements are considered to be infinite.  The interesting
property of a heap is that a[0] is always its smallest element.

Usage:

heap = []            # creates an empty heap
heappush(heap, item) # pushes a new item on the heap
item = heappop(heap) # pops the smallest item from the heap
item = heap[0]       # smallest item on the heap without popping it
heapify(x)           # transforms list into a heap, in-place, in linear time
item = heapreplace(heap, item) # pops and returns smallest item, and adds
                       # new item; the heap size is unchanged

Our API differs from textbook heap algorithms as follows:

- We use 0-based indexing.  This makes the relationship between the
  index for a node and the indexes for its children slightly less
  obvious, but is more suitable since Python uses 0-based indexing.

- Our heappop() method returns the smallest item, not the largest.

These two make it possible to view the heap as a regular Python list
without surprises: heap[0] is the smallest item, and heap.sort()
maintains the heap invariant!

Function Documentation

def heapq._siftdown (   heap,
  startpos,
  pos 
) [private]

Definition at line 232 of file heapq.py.

00232 
00233 def _siftdown(heap, startpos, pos):
00234     newitem = heap[pos]
00235     # Follow the path to the root, moving parents down until finding a place
00236     # newitem fits.
00237     while pos > startpos:
00238         parentpos = (pos - 1) >> 1
00239         parent = heap[parentpos]
00240         if newitem < parent:
00241             heap[pos] = parent
00242             pos = parentpos
00243             continue
00244         break
00245     heap[pos] = newitem
00246 
00247 # The child indices of heap index pos are already heaps, and we want to make
00248 # a heap at index pos too.  We do this by bubbling the smaller child of
00249 # pos up (and so on with that child's children, etc) until hitting a leaf,
00250 # then using _siftdown to move the oddball originally at index pos into place.
00251 #
00252 # We *could* break out of the loop as soon as we find a pos where newitem <=
00253 # both its children, but turns out that's not a good idea, and despite that
00254 # many books write the algorithm that way.  During a heap pop, the last array
00255 # element is sifted in, and that tends to be large, so that comparing it
00256 # against values starting from the root usually doesn't pay (= usually doesn't
00257 # get us out of the loop early).  See Knuth, Volume 3, where this is
00258 # explained and quantified in an exercise.
00259 #
00260 # Cutting the # of comparisons is important, since these routines have no
00261 # way to extract "the priority" from an array element, so that intelligence
00262 # is likely to be hiding in custom comparison methods, or in array elements
00263 # storing (priority, record) tuples.  Comparisons are thus potentially
00264 # expensive.
00265 #
00266 # On random arrays of length 1000, making this change cut the number of
00267 # comparisons made by heapify() a little, and those made by exhaustive
00268 # heappop() a lot, in accord with theory.  Here are typical results from 3
00269 # runs (3 just to demonstrate how small the variance is):
00270 #
00271 # Compares needed by heapify     Compares needed by 1000 heappops
00272 # --------------------------     --------------------------------
00273 # 1837 cut to 1663               14996 cut to 8680
00274 # 1855 cut to 1659               14966 cut to 8678
00275 # 1847 cut to 1660               15024 cut to 8703
00276 #
00277 # Building the heap by using heappush() 1000 times instead required
00278 # 2198, 2148, and 2219 compares:  heapify() is more efficient, when
00279 # you can use it.
00280 #
00281 # The total compares needed by list.sort() on the same lists were 8627,
00282 # 8627, and 8632 (this should be compared to the sum of heapify() and
00283 # heappop() compares):  list.sort() is (unsurprisingly!) more efficient
00284 # for sorting.

Here is the caller graph for this function:

def heapq._siftup (   heap,
  pos 
) [private]

Definition at line 285 of file heapq.py.

00285 
00286 def _siftup(heap, pos):
00287     endpos = len(heap)
00288     startpos = pos
00289     newitem = heap[pos]
00290     # Bubble up the smaller child until hitting a leaf.
00291     childpos = 2*pos + 1    # leftmost child position
00292     while childpos < endpos:
00293         # Set childpos to index of smaller child.
00294         rightpos = childpos + 1
00295         if rightpos < endpos and not heap[childpos] < heap[rightpos]:
00296             childpos = rightpos
00297         # Move the smaller child up.
00298         heap[pos] = heap[childpos]
00299         pos = childpos
00300         childpos = 2*pos + 1
00301     # The leaf at pos is empty now.  Put newitem there, and bubble it up
00302     # to its final resting place (by sifting its parents down).
00303     heap[pos] = newitem
00304     _siftdown(heap, startpos, pos)
00305 
00306 # If available, use C implementation
00307 try:
    from _heapq import *

Here is the call graph for this function:

Here is the caller graph for this function:

def heapq.heapify (   x)
Transform list into a heap, in-place, in O(len(x)) time.

Definition at line 172 of file heapq.py.

00172 
00173 def heapify(x):
00174     """Transform list into a heap, in-place, in O(len(x)) time."""
00175     n = len(x)
00176     # Transform bottom-up.  The largest index there's any point to looking at
00177     # is the largest with a child index in-range, so must have 2*i + 1 < n,
00178     # or i < (n-1)/2.  If n is even = 2*j, this is (2*j-1)/2 = j-1/2 so
00179     # j-1 is the largest, which is n//2 - 1.  If n is odd = 2*j+1, this is
00180     # (2*j+1-1)/2 = j so j-1 is the largest, and that's again n//2-1.
00181     for i in reversed(range(n//2)):
00182         _siftup(x, i)

Here is the call graph for this function:

Here is the caller graph for this function:

def heapq.heappop (   heap)
Pop the smallest item off the heap, maintaining the heap invariant.

Definition at line 138 of file heapq.py.

00138 
00139 def heappop(heap):
00140     """Pop the smallest item off the heap, maintaining the heap invariant."""
00141     lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
00142     if heap:
00143         returnitem = heap[0]
00144         heap[0] = lastelt
00145         _siftup(heap, 0)
00146     else:
00147         returnitem = lastelt
00148     return returnitem

Here is the call graph for this function:

def heapq.heappush (   heap,
  item 
)
Push item onto heap, maintaining the heap invariant.

Definition at line 133 of file heapq.py.

00133 
00134 def heappush(heap, item):
00135     """Push item onto heap, maintaining the heap invariant."""
00136     heap.append(item)
00137     _siftdown(heap, 0, len(heap)-1)

Here is the call graph for this function:

Here is the caller graph for this function:

def heapq.heappushpop (   heap,
  item 
)
Fast version of a heappush followed by a heappop.

Definition at line 165 of file heapq.py.

00165 
00166 def heappushpop(heap, item):
00167     """Fast version of a heappush followed by a heappop."""
00168     if heap and heap[0] < item:
00169         item, heap[0] = heap[0], item
00170         _siftup(heap, 0)
00171     return item

Here is the call graph for this function:

def heapq.heapreplace (   heap,
  item 
)
Pop and return the current smallest value, and add the new item.

This is more efficient than heappop() followed by heappush(), and can be
more appropriate when using a fixed-size heap.  Note that the value
returned may be larger than item!  That constrains reasonable uses of
this routine unless written as part of a conditional replacement:

    if item > heap[0]:
        item = heapreplace(heap, item)

Definition at line 149 of file heapq.py.

00149 
00150 def heapreplace(heap, item):
00151     """Pop and return the current smallest value, and add the new item.
00152 
00153     This is more efficient than heappop() followed by heappush(), and can be
00154     more appropriate when using a fixed-size heap.  Note that the value
00155     returned may be larger than item!  That constrains reasonable uses of
00156     this routine unless written as part of a conditional replacement:
00157 
00158         if item > heap[0]:
00159             item = heapreplace(heap, item)
00160     """
00161     returnitem = heap[0]    # raises appropriate IndexError if heap is empty
00162     heap[0] = item
00163     _siftup(heap, 0)
00164     return returnitem

Here is the call graph for this function:

def heapq.merge (   iterables)
Merge multiple sorted inputs into a single sorted output.

Similar to sorted(itertools.chain(*iterables)) but returns a generator,
does not pull the data into memory all at once, and assumes that each of
the input streams is already sorted (smallest to largest).

>>> list(merge([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
[0, 1, 2, 3, 4, 5, 5, 7, 8, 10, 15, 20, 25]

Definition at line 311 of file heapq.py.

00311 
00312 def merge(*iterables):
00313     '''Merge multiple sorted inputs into a single sorted output.
00314 
00315     Similar to sorted(itertools.chain(*iterables)) but returns a generator,
00316     does not pull the data into memory all at once, and assumes that each of
00317     the input streams is already sorted (smallest to largest).
00318 
00319     >>> list(merge([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
00320     [0, 1, 2, 3, 4, 5, 5, 7, 8, 10, 15, 20, 25]
00321 
00322     '''
00323     _heappop, _heapreplace, _StopIteration = heappop, heapreplace, StopIteration
00324 
00325     h = []
00326     h_append = h.append
00327     for itnum, it in enumerate(map(iter, iterables)):
00328         try:
00329             next = it.__next__
00330             h_append([next(), itnum, next])
00331         except _StopIteration:
00332             pass
00333     heapify(h)
00334 
00335     while 1:
00336         try:
00337             while 1:
00338                 v, itnum, next = s = h[0]   # raises IndexError when h is empty
00339                 yield v
00340                 s[0] = next()               # raises StopIteration when exhausted
00341                 _heapreplace(h, s)          # restore heap condition
00342         except _StopIteration:
00343             _heappop(h)                     # remove empty iterator
00344         except IndexError:
00345             return
00346 
# Extend the implementations of nsmallest and nlargest to use a key= argument

Here is the call graph for this function:

def heapq.nlargest (   n,
  iterable 
)
Find the n largest elements in a dataset.

Equivalent to:  sorted(iterable, reverse=True)[:n]

Definition at line 183 of file heapq.py.

00183 
00184 def nlargest(n, iterable):
00185     """Find the n largest elements in a dataset.
00186 
00187     Equivalent to:  sorted(iterable, reverse=True)[:n]
00188     """
00189     it = iter(iterable)
00190     result = list(islice(it, n))
00191     if not result:
00192         return result
00193     heapify(result)
00194     _heappushpop = heappushpop
00195     for elem in it:
00196         _heappushpop(result, elem)
00197     result.sort(reverse=True)
00198     return result

Here is the call graph for this function:

Here is the caller graph for this function:

def heapq.nlargest (   n,
  iterable,
  key = None 
)
Find the n largest elements in a dataset.

Equivalent to:  sorted(iterable, key=key, reverse=True)[:n]

Definition at line 385 of file heapq.py.

00385 
00386 def nlargest(n, iterable, key=None):
00387     """Find the n largest elements in a dataset.
00388 
00389     Equivalent to:  sorted(iterable, key=key, reverse=True)[:n]
00390     """
00391 
00392     # Short-cut for n==1 is to use max() when len(iterable)>0
00393     if n == 1:
00394         it = iter(iterable)
00395         head = list(islice(it, 1))
00396         if not head:
00397             return []
00398         if key is None:
00399             return [max(chain(head, it))]
00400         return [max(chain(head, it), key=key)]
00401 
00402     # When n>=size, it's faster to use sorted()
00403     try:
00404         size = len(iterable)
00405     except (TypeError, AttributeError):
00406         pass
00407     else:
00408         if n >= size:
00409             return sorted(iterable, key=key, reverse=True)[:n]
00410 
00411     # When key is none, use simpler decoration
00412     if key is None:
00413         it = zip(iterable, count(0,-1))                     # decorate
00414         result = _nlargest(n, it)
00415         return [r[0] for r in result]                       # undecorate
00416 
00417     # General case, slowest method
00418     in1, in2 = tee(iterable)
00419     it = zip(map(key, in1), count(0,-1), in2)               # decorate
00420     result = _nlargest(n, it)
00421     return [r[2] for r in result]                           # undecorate

Here is the call graph for this function:

Here is the caller graph for this function:

def heapq.nsmallest (   n,
  iterable 
)
Find the n smallest elements in a dataset.

Equivalent to:  sorted(iterable)[:n]

Definition at line 199 of file heapq.py.

00199 
00200 def nsmallest(n, iterable):
00201     """Find the n smallest elements in a dataset.
00202 
00203     Equivalent to:  sorted(iterable)[:n]
00204     """
00205     if hasattr(iterable, '__len__') and n * 10 <= len(iterable):
00206         # For smaller values of n, the bisect method is faster than a minheap.
00207         # It is also memory efficient, consuming only n elements of space.
00208         it = iter(iterable)
00209         result = sorted(islice(it, 0, n))
00210         if not result:
00211             return result
00212         insort = bisect.insort
00213         pop = result.pop
00214         los = result[-1]    # los --> Largest of the nsmallest
00215         for elem in it:
00216             if elem < los:
00217                 insort(result, elem)
00218                 pop()
00219                 los = result[-1]
00220         return result
00221     # An alternative approach manifests the whole iterable in memory but
00222     # saves comparisons by heapifying all at once.  Also, saves time
00223     # over bisect.insort() which has O(n) data movement time for every
00224     # insertion.  Finding the n smallest of an m length iterable requires
00225     #    O(m) + O(n log m) comparisons.
00226     h = list(iterable)
00227     heapify(h)
00228     return list(map(heappop, repeat(h, min(n, len(h)))))
00229 
00230 # 'heap' is a heap at all indices >= startpos, except possibly for pos.  pos
00231 # is the index of a leaf with a possibly out-of-order value.  Restore the
# heap invariant.

Here is the call graph for this function:

Here is the caller graph for this function:

def heapq.nsmallest (   n,
  iterable,
  key = None 
)
Find the n smallest elements in a dataset.

Equivalent to:  sorted(iterable, key=key)[:n]

Definition at line 348 of file heapq.py.

00348 
00349 def nsmallest(n, iterable, key=None):
00350     """Find the n smallest elements in a dataset.
00351 
00352     Equivalent to:  sorted(iterable, key=key)[:n]
00353     """
00354     # Short-cut for n==1 is to use min() when len(iterable)>0
00355     if n == 1:
00356         it = iter(iterable)
00357         head = list(islice(it, 1))
00358         if not head:
00359             return []
00360         if key is None:
00361             return [min(chain(head, it))]
00362         return [min(chain(head, it), key=key)]
00363 
00364     # When n>=size, it's faster to use sorted()
00365     try:
00366         size = len(iterable)
00367     except (TypeError, AttributeError):
00368         pass
00369     else:
00370         if n >= size:
00371             return sorted(iterable, key=key)[:n]
00372 
00373     # When key is none, use simpler decoration
00374     if key is None:
00375         it = zip(iterable, count())                         # decorate
00376         result = _nsmallest(n, it)
00377         return [r[0] for r in result]                       # undecorate
00378 
00379     # General case, slowest method
00380     in1, in2 = tee(iterable)
00381     it = zip(map(key, in1), count(), in2)                   # decorate
00382     result = _nsmallest(n, it)
00383     return [r[2] for r in result]                           # undecorate

Here is the call graph for this function:


Variable Documentation

Definition at line 33 of file heapq.py.

Initial value:
00001 ['heappush', 'heappop', 'heapify', 'heapreplace', 'merge',
00002            'nlargest', 'nsmallest', 'heappushpop']

Definition at line 127 of file heapq.py.

Definition at line 384 of file heapq.py.

Definition at line 347 of file heapq.py.

list heapq.data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]

Definition at line 425 of file heapq.py.

list heapq.heap = []

Definition at line 424 of file heapq.py.

list heapq.sort = []

Definition at line 428 of file heapq.py.