plone3  3.1.7
volatile.py
00001 """A flexible caching decorator.
00002 
00003 This module provides a cache decorator `cache` that you can use to
00004 cache results of your functions or methods.  Let's say we have a class
00005 with an expensive method `pow` that we want to cache:
00006 
00007   >>> class MyClass:
00008   ...     def pow(self, first, second):
00009   ...         print 'Someone or something called me'
00010   ...         return first ** second
00011 
00012 Okay, we know that if the `first` and `second` arguments are the same,
00013 the result is going to be the same, always.  We'll use a cache key
00014 calculator to tell the `cache` decorator about this assertion.  What's
00015 this cache key calculator?  It's a function that takes the original
00016 function plus the same arguments as the original function that we're
00017 caching:
00018 
00019   >>> def cache_key(method, self, first, second):
00020   ...     return (first, second)
00021 
For performance and security reasons, no hash of the key is computed
in this example.  You may consider using a cryptographic hash (MD5 or,
better, SHA-1) if your parameters can hold large amounts of data.
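
For instance, a key calculator that hashes its arguments could look
like the sketch below.  This is only an illustration, not part of this
module: the name `hashed_cache_key` is made up here, and it assumes
the standard library's `hashlib` (Python 2.5 and later) is available.

  >>> import hashlib  # illustration only; requires Python 2.5+
  >>> def hashed_cache_key(method, self, first, second):
  ...     return hashlib.sha1(repr((first, second))).hexdigest()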

The cache decorator is really simple to use.  Let's define our first
class again, this time with a cached `pow` method:

  >>> class MyClass:
  ...     @cache(cache_key)
  ...     def pow(self, first, second):
  ...         print 'Someone or something called me'
  ...         return first ** second

The results:

  >>> obj = MyClass()
  >>> obj.pow(3, 2)
  Someone or something called me
  9
  >>> obj.pow(3, 2)
  9

Did you see that?  The method was called only once.

Now, where is this cache stored?  That's actually configurable.  The
cache decorator takes an optional second argument with which you can
define where the cache is stored.

By default, the cache stores its values on the first argument to the
function.  For our method, this is `self`, which is perfectly fine.
For normal functions, the first argument may not be the best place to
store the cache.

The default cache container function stores a dictionary on the
instance as a *volatile* attribute.  That is, it's prefixed with
``_v_``.  In Zope, this means that the cache is not persisted.

  >>> ATTR
  '_v_memoize_cache'
  >>> cache_container = getattr(obj, ATTR)

This cache container maps our key, including the function's dotted
name, to the return value.

  >>> cache_container # doctest: +ELLIPSIS
  {'plone.memoize.volatile.pow:...': 9}
  >>> len(cache_container)
  1
  >>> k = 'plone.memoize.volatile.pow:%s' % str(cache_key(MyClass.pow, None, 3, 2))
  >>> cache_container[k]
  9

Okay, on to storing the cache somewhere else.  The function we'll have
to provide is really similar to the cache key function we defined
earlier.

Like the cache key function, the storage function takes the same
number of arguments as the original cached function.  We'll use a
global dictionary for caching this time:

  >>> my_cache = {}
  >>> def cache_storage(fun, *args, **kwargs):
  ...     return my_cache

This time, instead of caching a method, we'll cache a normal function.
For this, we'll need to change our cache key function to take the
correct number of arguments:

  >>> def cache_key(fun, first, second):
  ...     return (first, second)

Note how we provide both the cache key generator and the cache storage
as arguments to the `cache` decorator:

  >>> @cache(cache_key, cache_storage)
  ... def pow(first, second):
  ...     print 'Someone or something called me'
  ...     return first ** second

Let's try it out:

  >>> pow(3, 2)
  Someone or something called me
  9
  >>> pow(3, 2)
  9
  >>> pow(3, 2)
  9
  >>> pow(3, 3)
  Someone or something called me
  27
  >>> pow(3, 3)
  27
  >>> my_cache.clear()

It works!

A cache key generator may also raise DontCache to indicate that no
caching should be applied:

  >>> def cache_key(fun, first, second):
  ...     if first == second:
  ...         raise DontCache
  ...     else:
  ...         return (first, second)
  >>> @cache(cache_key, cache_storage)
  ... def pow(first, second):
  ...     print 'Someone or something called me'
  ...     return first ** second

  >>> pow(3, 2)
  Someone or something called me
  9
  >>> pow(3, 2)
  9
  >>> pow(3, 3)
  Someone or something called me
  27
  >>> pow(3, 3)
  Someone or something called me
  27

Caveats
-------

Be careful when you have multiple methods with the same name in a
single module:

  >>> def cache_key(fun, instance, *args):
  ...     return args
  >>> cache_container = {}
  >>> class A:
  ...     @cache(cache_key, lambda *args: cache_container)
  ...     def somemet(self, one, two):
  ...         return one + two
  >>> class B:
  ...     @cache(cache_key, lambda *args: cache_container)
  ...     def somemet(self, one, two):
  ...         return one - two
  >>> a = A()
  >>> a.somemet(1, 2)
  3
  >>> cache_container
  {'plone.memoize.volatile.somemet:(1, 2)': 3}

The following call should really return -1.  But since the cache key
is namespaced only with the module and function name, not the class,
both `somemet` methods end up with the same key and we get 3:

  >>> B().somemet(1, 2)
  3
  >>> len(cache_container)
  1
  >>> cache_container.clear()

Ouch!  The fix for this is to include, for example, your class in the
key when you create it:

  >>> def cache_key(fun, instance, *args):
  ...     return (instance.__class__,) + args
  >>> class A:
  ...     @cache(cache_key, lambda *args: cache_container)
  ...     def somemet(self, one, two):
  ...         return one + two
  >>> class B:
  ...     @cache(cache_key, lambda *args: cache_container)
  ...     def somemet(self, one, two):
  ...         return one - two
  >>> a = A()
  >>> a.somemet(1, 2)
  3
  >>> B().somemet(1, 2)
  -1
  >>> len(cache_container)
  2
"""

import time

class CleanupDict(dict):
    """A dict that automatically cleans up items that haven't been
    accessed in a given timespan on *set*.

    This implementation is a bit naive, since it's not associated with
    any policy that the user can configure, and it doesn't provide
    statistics like RAMCache, but at least it helps make sure our
    volatile attribute doesn't grow stale entries indefinitely.

      >>> d = CleanupDict()
      >>> d['spam'] = 'bar'
      >>> d['spam']
      'bar'

      >>> d = CleanupDict(0)
      >>> d['spam'] = 'bar'
      >>> d['spam'] # doctest: +ELLIPSIS
      Traceback (most recent call last):
      ...
      KeyError: 'spam'
    """
    cleanup_period = 60 * 60 * 24 * 3 # 3 days

    def __init__(self, cleanup_period=None):
        super(CleanupDict, self).__init__()
        self._last_access = {}
        if cleanup_period is not None:
            self.cleanup_period = cleanup_period

    def __getitem__(self, key):
        value = super(CleanupDict, self).__getitem__(key)
        self._last_access[key] = time.time()
        return value

    def __setitem__(self, key, value):
        super(CleanupDict, self).__setitem__(key, value)
        self._last_access[key] = time.time()
        self._cleanup()

    def _cleanup(self):
        # Drop entries that haven't been accessed within cleanup_period.
        # (In Python 2, items() returns a list, so deleting keys while
        # we iterate over it is safe.)
        now = time.time()
        okay = now - self.cleanup_period
        for key, timestamp in self._last_access.items():
            if timestamp < okay:
                del self._last_access[key]
                super(CleanupDict, self).__delitem__(key)

# Name of the volatile attribute used by the storage helpers below.
ATTR = '_v_memoize_cache'
CONTAINER_FACTORY = CleanupDict
# Sentinel used to detect cache misses.
_marker = object()

class DontCache(Exception):
    """Raised by a cache key calculator to indicate that the call
    should not be cached."""

def store_on_self(method, obj, *args, **kwargs):
    """Default storage: keep the cache dict as a volatile attribute
    on the first argument (usually `self`)."""
    return obj.__dict__.setdefault(ATTR, CONTAINER_FACTORY())

def store_on_context(method, obj, *args, **kwargs):
    """Alternative storage for objects that have a `context` attribute
    (a view, for example): keep the cache dict on the context instead."""
    return obj.context.__dict__.setdefault(ATTR, CONTAINER_FACTORY())

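# A hypothetical usage sketch for `store_on_context`, added here for
# illustration only (it is not part of the original documentation).
# The names `view_cache_key` and `MyView` are made up; the point is
# that the cache ends up on the object's `context` rather than on the
# object itself:
#
#   >>> def view_cache_key(fun, view):
#   ...     return ()
#   >>> class MyView(object):
#   ...     def __init__(self, context):
#   ...         self.context = context
#   ...     @cache(view_cache_key, get_cache=store_on_context)
#   ...     def expensive(self):
#   ...         return 42
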
def cache(get_key, get_cache=store_on_self):
    """Cache the decorated function, using `get_key` to calculate the
    cache key and `get_cache` to find the mapping that results are
    stored in."""
    def decorator(fun):
        def replacement(*args, **kwargs):
            try:
                key = get_key(fun, *args, **kwargs)
            except DontCache:
                # The key calculator asked us not to cache this call.
                return fun(*args, **kwargs)
            # Namespace the key with the function's dotted name.
            key = '%s.%s:%s' % (fun.__module__, fun.__name__, key)
            cache = get_cache(fun, *args, **kwargs)
            cached_value = cache.get(key, _marker)
            if cached_value is _marker:
                # Cache miss: compute the value and store it.
                cached_value = cache[key] = fun(*args, **kwargs)
            return cached_value
        return replacement
    return decorator