moin  1.9.0~rc2
MoinMoin.support.xappy.searchconnection.SearchConnection Class Reference

Classes

class  ExpandDecider

Public Member Functions

def __init__
def __del__
def append_close_handler
def reopen
def close
def get_doccount
def query_composite
def query_multweight
def query_filter
def query_adjust
def query_range
def query_facet
def query_parse
def query_field
def query_similar
def significant_terms
def query_all
def query_none
def spell_correct
def can_collapse_on
def can_sort_on
def search
def iterids
def get_document
def iter_synonyms
def get_metadata

Static Public Attributes

 OP_AND = _xapian.Query.OP_AND
 OP_OR = _xapian.Query.OP_OR

Private Member Functions

def _get_sort_type
def _load_config
def _prepare_queryparser
def _query_parse_with_prefix
def _query_parse_with_fallback
def _get_eterms
def _perform_expand
def _get_prefix_from_term
def _facet_query_never

Private Attributes

 _indexpath
 _close_handlers
 _field_actions
 _field_mappings
 _facet_hierarchy
 _facet_query_table

Static Private Attributes

 _qp_flags_base = _xapian.QueryParser.FLAG_LOVEHATE
 _qp_flags_phrase = _xapian.QueryParser.FLAG_PHRASE
tuple _qp_flags_synonym
 _qp_flags_bool = _xapian.QueryParser.FLAG_BOOLEAN
 _index = None

Detailed Description

A connection to the search engine for searching.

The connection will access a view of the database.

Definition at line 717 of file searchconnection.py.


Constructor & Destructor Documentation

Create a new connection to the index for searching.

There may be an arbitrary number of search connections for a
particular database open at a given time (regardless of whether there
is a connection for indexing open as well).

If the database doesn't exist, an exception will be raised.

Definition at line 731 of file searchconnection.py.

00731 
00732     def __init__(self, indexpath):
00733         """Create a new connection to the index for searching.
00734 
00735         There may be an arbitrary number of search connections for a
00736         particular database open at a given time (regardless of whether there
00737         is a connection for indexing open as well).
00738 
00739         If the database doesn't exist, an exception will be raised.
00740 
00741         """
00742         self._index = _log(_xapian.Database, indexpath)
00743         self._indexpath = indexpath
00744 
00745         # Read the actions.
00746         self._load_config()
00747 
00748         self._close_handlers = []

Definition at line 749 of file searchconnection.py.

00749 
00750     def __del__(self):
00751         self.close()


Member Function Documentation

def MoinMoin.support.xappy.searchconnection.SearchConnection._facet_query_never (   self,
  facet,
  query_type 
) [private]
Check if a facet must never be returned by a particular query type.

Returns True if the facet must never be returned.

Returns False if the facet may be returned - either because there is no
entry for the query type, or because the entry is not
FacetQueryType_Never.

Definition at line 1537 of file searchconnection.py.

01537 
01538     def _facet_query_never(self, facet, query_type):
01539         """Check if a facet must never be returned by a particular query type.
01540 
01541         Returns True if the facet must never be returned.
01542 
01543         Returns False if the facet may be returned - either because there is no
01544         entry for the query type, or because the entry is not
01545         FacetQueryType_Never.
01546 
01547         """
01548         if query_type is None:
01549             return False
01550         if query_type not in self._facet_query_table:
01551             return False
01552         if facet not in self._facet_query_table[query_type]:
01553             return False
01554         return self._facet_query_table[query_type][facet] == _indexerconnection.IndexerConnection.FacetQueryType_Never
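
The lookup above can be sketched without xappy as a nested dictionary keyed first by query type and then by facet. In this minimal sketch, `FACET_QUERY_TYPE_NEVER` and `FACET_QUERY_TYPE_PREFERRED` are hypothetical stand-ins for the constants on `IndexerConnection`, and the table contents are invented for illustration.

```python
# Hypothetical stand-ins for the IndexerConnection.FacetQueryType_* constants.
FACET_QUERY_TYPE_NEVER = "never"
FACET_QUERY_TYPE_PREFERRED = "preferred"


def facet_query_never(facet_query_table, facet, query_type):
    """Return True only if the table marks `facet` as 'never' for `query_type`."""
    if query_type is None:
        return False
    per_type = facet_query_table.get(query_type)
    if per_type is None:
        # No entry at all for this query type.
        return False
    # A missing facet yields None, which never equals the 'never' marker.
    return per_type.get(facet) == FACET_QUERY_TYPE_NEVER


# Invented example table: query type -> facet -> association.
table = {
    "shopping": {
        "price": FACET_QUERY_TYPE_NEVER,
        "colour": FACET_QUERY_TYPE_PREFERRED,
    },
}

never_price = facet_query_never(table, "price", "shopping")
never_colour = facet_query_never(table, "colour", "shopping")
never_unknown_type = facet_query_never(table, "price", "news")
```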

def MoinMoin.support.xappy.searchconnection.SearchConnection._get_eterms (   self,
  ids,
  allow,
  deny,
  simterms 
) [private]
Get a set of terms for an expand

Definition at line 1350 of file searchconnection.py.

01350 
01351     def _get_eterms(self, ids, allow, deny, simterms):
01352         """Get a set of terms for an expand
01353 
01354         """
01355         if self._index is None:
01356             raise _errors.SearchError("SearchConnection has been closed")
01357         if allow is not None and deny is not None:
01358             raise _errors.SearchError("Cannot specify both `allow` and `deny`")
01359 
01360         if isinstance(ids, basestring):
01361             ids = (ids, )
01362         if isinstance(allow, basestring):
01363             allow = (allow, )
01364         if isinstance(deny, basestring):
01365             deny = (deny, )
01366 
01367         # Set "allow" to contain a list of all the fields to use.
01368         if allow is None:
01369             allow = [key for key in self._field_actions]
01370         if deny is not None:
01371             allow = [key for key in allow if key not in deny]
01372 
01373         # Set "prefixes" to contain a list of all the prefixes to use.
01374         prefixes = {}
01375         for field in allow:
01376             try:
01377                 actions = self._field_actions[field]._actions
01378             except KeyError:
01379                 actions = {}
01380             for action, kwargslist in actions.iteritems():
01381                 if action == FieldActions.INDEX_FREETEXT:
01382                     prefixes[self._field_mappings.get_prefix(field)] = field
01383 
01384         # Repeat the expand until we don't get a DatabaseModifiedError
01385         while True:
01386             try:
01387                 eterms = self._perform_expand(ids, prefixes, simterms)
01388                 break;
01389             except _xapian.DatabaseModifiedError, e:
01390                 self.reopen()
01391         return eterms, prefixes
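
The way `allow` and `deny` are normalised into a single list of fields can be illustrated in isolation. The sketch below is written for Python 3 and assumes a hypothetical helper name, `resolve_allowed_fields`; it raises `ValueError` where the listing raises `_errors.SearchError`, and it takes the known field names as a plain list rather than reading `self._field_actions`.

```python
def resolve_allowed_fields(known_fields, allow=None, deny=None):
    """Return the list of fields to use, given optional allow/deny filters."""
    if allow is not None and deny is not None:
        # Stand-in for _errors.SearchError in the listing above.
        raise ValueError("Cannot specify both `allow` and `deny`")

    # A bare string is treated as a one-element sequence.
    if isinstance(allow, str):
        allow = (allow,)
    if isinstance(deny, str):
        deny = (deny,)

    # With no allow list, start from every known field.
    if allow is None:
        allow = list(known_fields)
    if deny is not None:
        allow = [field for field in allow if field not in deny]
    return list(allow)


fields = ["title", "text", "tag"]
everything = resolve_allowed_fields(fields)
only_title = resolve_allowed_fields(fields, allow="title")
without_tag = resolve_allowed_fields(fields, deny="tag")
```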

Get the prefix of a term.
   
Prefixes are any initial capital letters, with the exception that R always
ends a prefix, even if followed by capital letters.

Definition at line 1523 of file searchconnection.py.

01523 
01524     def _get_prefix_from_term(self, term):
01525         """Get the prefix of a term.
01526    
01527         Prefixes are any initial capital letters, with the exception that R always
01528         ends a prefix, even if followed by capital letters.
01529         
01530         """
01531         for p in xrange(len(term)):
01532             if term[p].islower():
01533                 return term[:p]
01534             elif term[p] == 'R':
01535                 return term[:p+1]
01536         return term
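
The prefix rule is easy to try out as a standalone function. The sketch below restates the same loop in Python 3 (`range` instead of `xrange`) under the hypothetical name `get_prefix_from_term`; the sample terms are invented. Note that the loop only tests for lowercase letters and for 'R', so a term with no lowercase letter and no 'R' is returned whole.

```python
def get_prefix_from_term(term):
    """Return the prefix of a term: leading capitals, with 'R' ending the prefix."""
    for p in range(len(term)):
        if term[p].islower():
            # First lowercase letter: everything before it is the prefix.
            return term[:p]
        elif term[p] == 'R':
            # 'R' always terminates the prefix and is included in it.
            return term[:p + 1]
    # No lowercase letter and no 'R': the whole term is returned.
    return term


plain_prefix = get_prefix_from_term("XAhello")   # capitals, then lowercase
r_prefix = get_prefix_from_term("XARFOO")        # 'R' ends the prefix early
no_prefix = get_prefix_from_term("hello")        # starts lowercase
```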

Get the sort type that should be used for a given field.

Definition at line 769 of file searchconnection.py.

00769 
00770     def _get_sort_type(self, field):
00771         """Get the sort type that should be used for a given field.
00772 
00773         """
00774         try:
00775             actions = self._field_actions[field]._actions
00776         except KeyError:
00777             actions = {}
00778         for action, kwargslist in actions.iteritems():
00779             if action == FieldActions.SORT_AND_COLLAPSE:
00780                 for kwargs in kwargslist:
00781                     return kwargs['type']

Load the configuration for the database.

Definition at line 782 of file searchconnection.py.

00782 
00783     def _load_config(self):
00784         """Load the configuration for the database.
00785 
00786         """
00787         # Note: this code is basically duplicated in the IndexerConnection
00788         # class.  Move it to a shared location.
00789         assert self._index is not None
00790 
00791         config_str = _log(self._index.get_metadata, '_xappy_config')
00792         if len(config_str) == 0:
00793             self._field_actions = {}
00794             self._field_mappings = _fieldmappings.FieldMappings()
00795             self._facet_hierarchy = {}
00796             self._facet_query_table = {}
00797             return
00798 
00799         try:
00800             (self._field_actions, mappings, self._facet_hierarchy, self._facet_query_table, self._next_docid) = _cPickle.loads(config_str)
00801         except ValueError:
00802             # Backwards compatibility - configuration used to lack _facet_hierarchy and _facet_query_table
00803             (self._field_actions, mappings, self._next_docid) = _cPickle.loads(config_str)
00804             self._facet_hierarchy = {}
00805             self._facet_query_table = {}
00806         self._field_mappings = _fieldmappings.FieldMappings(mappings)
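
The backwards-compatibility branch relies on tuple unpacking raising `ValueError` when the stored tuple has three items instead of five. A minimal sketch of that idea, assuming Python 3 `pickle` in place of `_cPickle` and a hypothetical function `load_config` that returns a dictionary instead of setting attributes, might look like this. Unlike the listing, it unpickles the data only once.

```python
import pickle


def load_config(config_bytes):
    """Decode a pickled config tuple, accepting the old 3-item layout too."""
    if len(config_bytes) == 0:
        # Nothing stored yet: fall back to empty defaults.
        return {"field_actions": {}, "mappings": None,
                "facet_hierarchy": {}, "facet_query_table": {}}

    values = pickle.loads(config_bytes)
    try:
        (field_actions, mappings, facet_hierarchy,
         facet_query_table, _next_docid) = values
    except ValueError:
        # Old layout: no facet hierarchy or facet query table.
        field_actions, mappings, _next_docid = values
        facet_hierarchy = {}
        facet_query_table = {}
    return {"field_actions": field_actions, "mappings": mappings,
            "facet_hierarchy": facet_hierarchy,
            "facet_query_table": facet_query_table}


# Invented sample data in both layouts.
old_style = pickle.dumps(({"title": 1}, {"title": "XA"}, 7))
new_style = pickle.dumps(({"title": 1}, {"title": "XA"},
                          {"child": "parent"}, {"news": {}}, 7))
old_config = load_config(old_style)
new_config = load_config(new_style)
```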

def MoinMoin.support.xappy.searchconnection.SearchConnection._perform_expand (   self,
  ids,
  prefixes,
  simterms 
) [private]
Perform an expand operation to get the terms for a similarity
search, given a set of ids (and a set of prefixes to restrict the
similarity operation to).

Definition at line 1407 of file searchconnection.py.

01407 
01408     def _perform_expand(self, ids, prefixes, simterms):
01409         """Perform an expand operation to get the terms for a similarity
01410         search, given a set of ids (and a set of prefixes to restrict the
01411         similarity operation to).
01412 
01413         """
01414         # Set idquery to be a query which returns the documents listed in
01415         # "ids".
01416         idquery = _log(_xapian.Query, _xapian.Query.OP_OR, ['Q' + id for id in ids])
01417 
01418         enq = _log(_xapian.Enquire, self._index)
01419         enq.set_query(idquery)
01420         rset = _log(_xapian.RSet)
01421         for id in ids:
01422             pl = self._index.postlist('Q' + id)
01423             try:
01424                 xapid = pl.next()
01425                 rset.add_document(xapid.docid)
01426             except StopIteration:
01427                 pass
01428 
01429         expanddecider = _log(self.ExpandDecider, prefixes)
01430         eset = enq.get_eset(simterms, rset, 0, 1.0, expanddecider)
01431         return [term.term for term in eset]

def MoinMoin.support.xappy.searchconnection.SearchConnection._prepare_queryparser (   self,
  allow,
  deny,
  default_op,
  default_allow,
  default_deny 
) [private]
Prepare (and return) a query parser using the specified fields and
operator.

Definition at line 1049 of file searchconnection.py.

01049     def _prepare_queryparser(self, allow, deny, default_op, default_allow,
01050                              default_deny):
01051         """Prepare (and return) a query parser using the specified fields and
01052         operator.
01053 
01054         """
01055         if self._index is None:
01056             raise _errors.SearchError("SearchConnection has been closed")
01057 
01058         if isinstance(allow, basestring):
01059             allow = (allow, )
01060         if isinstance(deny, basestring):
01061             deny = (deny, )
01062         if allow is not None and len(allow) == 0:
01063             allow = None
01064         if deny is not None and len(deny) == 0:
01065             deny = None
01066         if allow is not None and deny is not None:
01067             raise _errors.SearchError("Cannot specify both `allow` and `deny` "
01068                                       "(got %r and %r)" % (allow, deny))
01069 
01070         if isinstance(default_allow, basestring):
01071             default_allow = (default_allow, )
01072         if isinstance(default_deny, basestring):
01073             default_deny = (default_deny, )
01074         if default_allow is not None and len(default_allow) == 0:
01075             default_allow = None
01076         if default_deny is not None and len(default_deny) == 0:
01077             default_deny = None
01078         if default_allow is not None and default_deny is not None:
01079             raise _errors.SearchError("Cannot specify both `default_allow` and `default_deny` "
01080                                       "(got %r and %r)" % (default_allow, default_deny))
01081 
01082         qp = _log(_xapian.QueryParser)
01083         qp.set_database(self._index)
01084         qp.set_default_op(default_op)
01085 
01086         if allow is None:
01087             allow = [key for key in self._field_actions]
01088         if deny is not None:
01089             allow = [key for key in allow if key not in deny]
01090 
01091         for field in allow:
01092             try:
01093                 actions = self._field_actions[field]._actions
01094             except KeyError:
01095                 actions = {}
01096             for action, kwargslist in actions.iteritems():
01097                 if action == FieldActions.INDEX_EXACT:
01098                     # FIXME - need patched version of xapian to add exact prefixes
01099                     #qp.add_exact_prefix(field, self._field_mappings.get_prefix(field))
01100                     qp.add_prefix(field, self._field_mappings.get_prefix(field))
01101                 if action == FieldActions.INDEX_FREETEXT:
01102                     allow_field_specific = True
01103                     for kwargs in kwargslist:
01104                         allow_field_specific = allow_field_specific or kwargs.get('allow_field_specific', True)
01105                     if not allow_field_specific:
01106                         continue
01107                     qp.add_prefix(field, self._field_mappings.get_prefix(field))
01108                     for kwargs in kwargslist:
01109                         try:
01110                             lang = kwargs['language']
01111                             my_stemmer = _log(_xapian.Stem, lang)
01112                             qp.my_stemmer = my_stemmer
01113                             qp.set_stemmer(my_stemmer)
01114                             qp.set_stemming_strategy(qp.STEM_SOME)
01115                         except KeyError:
01116                             pass
01117 
01118         if default_allow is not None or default_deny is not None:
01119             if default_allow is None:
01120                 default_allow = [key for key in self._field_actions]
01121             if default_deny is not None:
01122                 default_allow = [key for key in default_allow if key not in default_deny]
01123             for field in default_allow:
01124                 try:
01125                     actions = self._field_actions[field]._actions
01126                 except KeyError:
01127                     actions = {}
01128                 for action, kwargslist in actions.iteritems():
01129                     if action == FieldActions.INDEX_FREETEXT:
01130                         qp.add_prefix('', self._field_mappings.get_prefix(field))
01131                         # FIXME - set stemming options for the default prefix
01132 
01133         return qp

def MoinMoin.support.xappy.searchconnection.SearchConnection._query_parse_with_fallback (   self,
  qp,
  string,
  prefix = None 
) [private]
Parse a query with various flags.

If the initial boolean pass fails, fall back to not using boolean
operators.

Definition at line 1143 of file searchconnection.py.

01143 
01144     def _query_parse_with_fallback(self, qp, string, prefix=None):
01145         """Parse a query with various flags.
01146         
01147         If the initial boolean pass fails, fall back to not using boolean
01148         operators.
01149 
01150         """
01151         try:
01152             q1 = self._query_parse_with_prefix(qp, string,
01153                                                self._qp_flags_base |
01154                                                self._qp_flags_phrase |
01155                                                self._qp_flags_synonym |
01156                                                self._qp_flags_bool,
01157                                                prefix)
01158         except _xapian.QueryParserError, e:
01159             # If we got a parse error, retry without boolean operators (since
01160             # these are the usual cause of the parse error).
01161             q1 = self._query_parse_with_prefix(qp, string,
01162                                                self._qp_flags_base |
01163                                                self._qp_flags_phrase |
01164                                                self._qp_flags_synonym,
01165                                                prefix)
01166 
01167         qp.set_stemming_strategy(qp.STEM_NONE)
01168         try:
01169             q2 = self._query_parse_with_prefix(qp, string,
01170                                                self._qp_flags_base |
01171                                                self._qp_flags_bool,
01172                                                prefix)
01173         except _xapian.QueryParserError, e:
01174             # If we got a parse error, retry without boolean operators (since
01175             # these are the usual cause of the parse error).
01176             q2 = self._query_parse_with_prefix(qp, string,
01177                                                self._qp_flags_base,
01178                                                prefix)
01179 
01180         return _log(_xapian.Query, _xapian.Query.OP_AND_MAYBE, q1, q2)
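
The fallback used twice in the listing above (try with boolean operators, then retry without them on a parse error) can be shown with a toy parser. Everything in this sketch is hypothetical: `ParseError` stands in for `_xapian.QueryParserError`, and `toy_parse` is an invented parser that only rejects a query ending in a dangling operator. It does not reproduce Xapian's parsing rules.

```python
class ParseError(Exception):
    """Stand-in for _xapian.QueryParserError."""


def toy_parse(query, allow_boolean):
    """Invented parser: with booleans enabled, a trailing operator is an error."""
    words = tuple(query.split())
    if allow_boolean and words and words[-1] in ("AND", "OR"):
        raise ParseError("dangling operator in %r" % query)
    mode = "bool" if allow_boolean else "plain"
    return (mode, words)


def parse_with_fallback(query):
    """Try the boolean-aware parse first; retry without booleans on failure."""
    try:
        return toy_parse(query, allow_boolean=True)
    except ParseError:
        return toy_parse(query, allow_boolean=False)


well_formed = parse_with_fallback("cats AND dogs")
dangling = parse_with_fallback("cats AND")
```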

def MoinMoin.support.xappy.searchconnection.SearchConnection._query_parse_with_prefix (   self,
  qp,
  string,
  flags,
  prefix 
) [private]
Parse a query, with an optional prefix.

Definition at line 1134 of file searchconnection.py.

01134 
01135     def _query_parse_with_prefix(self, qp, string, flags, prefix):
01136         """Parse a query, with an optional prefix.
01137 
01138         """
01139         if prefix is None:
01140             return qp.parse_query(string, flags)
01141         else:
01142             return qp.parse_query(string, flags, prefix)

def MoinMoin.support.xappy.searchconnection.SearchConnection.append_close_handler (   self,
  handler,
  userdata = None 
)
Append a callback to the list of close handlers.

These will be called when the SearchConnection is closed.  This happens
when the close() method is called, or when the SearchConnection object
is deleted.  The callback will be passed two arguments: the path to the
SearchConnection object, and the userdata supplied to this method.

The handlers will be called in the order in which they were added.

The handlers will be called after the connection has been closed, so
cannot prevent it closing: their return value will be ignored.  In
addition, they should not raise any exceptions.

Definition at line 752 of file searchconnection.py.

00752 
00753     def append_close_handler(self, handler, userdata=None):
00754         """Append a callback to the list of close handlers.
00755 
00756         These will be called when the SearchConnection is closed.  This happens
00757         when the close() method is called, or when the SearchConnection object
00758         is deleted.  The callback will be passed two arguments: the path to the
00759         SearchConnection object, and the userdata supplied to this method.
00760 
00761         The handlers will be called in the order in which they were added.
00762 
00763         The handlers will be called after the connection has been closed, so
00764         cannot prevent it closing: their return value will be ignored.  In
00765         addition, they should not raise any exceptions.
00766 
00767         """
00768         self._close_handlers.append((handler, userdata))
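
The handler contract described above (called in registration order, after the connection is closed, with the index path and the user data) can be demonstrated with a small stand-in class. `ClosableResource` below is hypothetical and opens no database; it only mirrors the handler bookkeeping of `append_close_handler` and `close`, in Python 3 syntax.

```python
import sys


class ClosableResource(object):
    """Hypothetical stand-in that mimics the close-handler bookkeeping."""

    def __init__(self, path):
        self._path = path
        self._close_handlers = []

    def append_close_handler(self, handler, userdata=None):
        self._close_handlers.append((handler, userdata))

    def close(self):
        if self._path is None:
            # Already closed: later calls have no effect.
            return
        path = self._path
        self._path = None
        # Handlers run after the resource is marked closed, in order added.
        for handler, userdata in self._close_handlers:
            try:
                handler(path, userdata)
            except Exception as e:
                sys.stderr.write("WARNING: close handler failed: %r\n" % (e,))


calls = []
resource = ClosableResource("/tmp/example-index")  # invented path
resource.append_close_handler(lambda path, data: calls.append((path, data)), "first")
resource.append_close_handler(lambda path, data: calls.append((path, data)), "second")
resource.close()
resource.close()  # second call does nothing
```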

Check if this database supports collapsing on a specified field.

Definition at line 1499 of file searchconnection.py.

01499 
01500     def can_collapse_on(self, field):
01501         """Check if this database supports collapsing on a specified field.
01502 
01503         """
01504         if self._index is None:
01505             raise _errors.SearchError("SearchConnection has been closed")
01506         try:
01507             self._field_mappings.get_slot(field, 'collsort')
01508         except KeyError:
01509             return False
01510         return True

Check if this database supports sorting on a specified field.

Definition at line 1511 of file searchconnection.py.

01511 
01512     def can_sort_on(self, field):
01513         """Check if this database supports sorting on a specified field.
01514 
01515         """
01516         if self._index is None:
01517             raise _errors.SearchError("SearchConnection has been closed")
01518         try:
01519             self._field_mappings.get_slot(field, 'collsort')
01520         except KeyError:
01521             return False
01522         return True
        
Close the connection to the database.

It is important to call this method before allowing the class to be
garbage collected to ensure that the connection is cleaned up promptly.

No other methods may be called on the connection after this has been
called.  (It is permissible to call close() multiple times, but
only the first call will have any effect.)

If an exception occurs, the database will be closed, but changes since
the last call to flush may be lost.

Definition at line 820 of file searchconnection.py.

00820 
00821     def close(self):
00822         """Close the connection to the database.
00823 
00824         It is important to call this method before allowing the class to be
00825         garbage collected to ensure that the connection is cleaned up promptly.
00826 
00827         No other methods may be called on the connection after this has been
00828         called.  (It is permissible to call close() multiple times, but
00829         only the first call will have any effect.)
00830 
00831         If an exception occurs, the database will be closed, but changes since
00832         the last call to flush may be lost.
00833 
00834         """
00835         if self._index is None:
00836             return
00837 
00838         # Remember the index path
00839         indexpath = self._indexpath
00840 
00841         # There is currently no "close()" method for xapian databases, so
00842         # we have to rely on the garbage collector.  Since we never copy
00843         # the _index property out of this class, there should be no cycles,
00844         # so the standard python implementation should garbage collect
00845         # _index straight away.  A close() method is planned to be added to
00846         # xapian at some point - when it is, we should call it here to make
00847         # the code more robust.
00848         self._index = None
00849         self._indexpath = None
00850         self._field_actions = None
00851         self._field_mappings = None
00852 
00853         # Call the close handlers.
00854         for handler, userdata in self._close_handlers:
00855             try:
00856                 handler(indexpath, userdata)
00857             except Exception, e:
00858                 import sys, traceback
00859                 print >>sys.stderr, "WARNING: unhandled exception in handler called by SearchConnection.close(): %s" % traceback.format_exception_only(type(e), e)

Count the number of documents in the database.

This count will include documents which have been added or removed but
not yet flushed().

Definition at line 860 of file searchconnection.py.

00860 
00861     def get_doccount(self):
00862         """Count the number of documents in the database.
00863 
00864         This count will include documents which have been added or removed but
00865         not yet flushed().
00866 
00867         """
00868         if self._index is None:
00869             raise _errors.SearchError("SearchConnection has been closed")
00870         return self._index.get_doccount()

Get the document with the specified unique ID.

Raises a KeyError if there is no such document.  Otherwise, it returns
a ProcessedDocument.

Definition at line 1795 of file searchconnection.py.

01795 
01796     def get_document(self, id):
01797         """Get the document with the specified unique ID.
01798 
01799         Raises a KeyError if there is no such document.  Otherwise, it returns
01800         a ProcessedDocument.
01801 
01802         """
01803         if self._index is None:
01804             raise _errors.SearchError("SearchConnection has been closed")
01805         while True:
01806             try:
01807                 postlist = self._index.postlist('Q' + id)
01808                 try:
01809                     plitem = postlist.next()
01810                 except StopIteration:
01811                     # Unique ID not found
01812                     raise KeyError('Unique ID %r not found' % id)
01813                 try:
01814                     postlist.next()
01815                     raise _errors.IndexerError("Multiple documents " #pragma: no cover
01816                                                "found with same unique ID")
01817                 except StopIteration:
01818                     # Only one instance of the unique ID found, as it should be.
01819                     pass
01820 
01821                 result = ProcessedDocument(self._field_mappings)
01822                 result.id = id
01823                 result._doc = self._index.get_document(plitem.docid)
01824                 return result
01825             except _xapian.DatabaseModifiedError, e:
01826                 self.reopen()
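
The `while True` / `reopen()` loop in this listing is a general retry pattern: repeat the operation until it completes without the database changing underneath it. A minimal sketch, with `DatabaseModifiedError` as a stand-in for `_xapian.DatabaseModifiedError` and a hypothetical helper `retry_after_reopen`, is shown below. Like the listing, it places no limit on the number of retries.

```python
class DatabaseModifiedError(Exception):
    """Stand-in for _xapian.DatabaseModifiedError."""


def retry_after_reopen(operation, reopen):
    """Run `operation` until it succeeds, calling `reopen` after each failure."""
    while True:
        try:
            return operation()
        except DatabaseModifiedError:
            reopen()


# Invented operation that fails twice before succeeding.
attempts = {"count": 0}
reopens = []


def flaky_lookup():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise DatabaseModifiedError()
    return "document"


result = retry_after_reopen(flaky_lookup, lambda: reopens.append("reopened"))
```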

Get an item of metadata stored in the connection.

This returns a value stored by a previous call to
IndexerConnection.set_metadata.

If the value is not found, this will return the empty string.

Definition at line 1856 of file searchconnection.py.

01856 
01857     def get_metadata(self, key):
01858         """Get an item of metadata stored in the connection.
01859 
01860         This returns a value stored by a previous call to
01861         IndexerConnection.set_metadata.
01862 
01863         If the value is not found, this will return the empty string.
01864 
01865         """
01866         if self._index is None:
01867             raise _errors.IndexerError("SearchConnection has been closed")
01868         if not hasattr(self._index, 'get_metadata'):
01869             raise _errors.IndexerError("Version of xapian in use does not support metadata")
01870         return _log(self._index.get_metadata, key)

Get an iterator over the synonyms.

 - `prefix`: if specified, only synonym keys with this prefix will be
   returned.

The iterator returns 2-tuples, in which the first item is the key (ie,
a 2-tuple holding the term or terms which will be synonym expanded,
followed by the fieldname specified (or None if no fieldname)), and the
second item is a tuple of strings holding the synonyms for the first
item.

These return values are suitable for the dict() builtin, so you can
write things like:

 >>> conn = _indexerconnection.IndexerConnection('foo')
 >>> conn.add_synonym('foo', 'bar')
 >>> conn.add_synonym('foo bar', 'baz')
 >>> conn.add_synonym('foo bar', 'foo baz')
 >>> conn.flush()
 >>> conn = SearchConnection('foo')
 >>> dict(conn.iter_synonyms())
 {('foo', None): ('bar',), ('foo bar', None): ('baz', 'foo baz')}

Definition at line 1827 of file searchconnection.py.

01827 
01828     def iter_synonyms(self, prefix=""):
01829         """Get an iterator over the synonyms.
01830 
01831          - `prefix`: if specified, only synonym keys with this prefix will be
01832            returned.
01833 
01834         The iterator returns 2-tuples, in which the first item is the key (ie,
01835         a 2-tuple holding the term or terms which will be synonym expanded,
01836         followed by the fieldname specified (or None if no fieldname)), and the
01837         second item is a tuple of strings holding the synonyms for the first
01838         item.
01839 
01840         These return values are suitable for the dict() builtin, so you can
01841         write things like:
01842 
01843          >>> conn = _indexerconnection.IndexerConnection('foo')
01844          >>> conn.add_synonym('foo', 'bar')
01845          >>> conn.add_synonym('foo bar', 'baz')
01846          >>> conn.add_synonym('foo bar', 'foo baz')
01847          >>> conn.flush()
01848          >>> conn = SearchConnection('foo')
01849          >>> dict(conn.iter_synonyms())
01850          {('foo', None): ('bar',), ('foo bar', None): ('baz', 'foo baz')}
01851 
01852         """
01853         if self._index is None:
01854             raise _errors.SearchError("SearchConnection has been closed")
01855         return _indexerconnection.SynonymIter(self._index, self._field_mappings, prefix)

Get an iterator which returns all the ids in the database.

The unique_ids are currently returned in binary lexicographical sort
order, but this should not be relied on.

Note that the iterator returned by this method may raise a
xapian.DatabaseModifiedError exception if modifications are committed
to the database while the iteration is in progress.  If this happens,
the search connection must be reopened (by calling reopen) and the
iteration restarted.

Definition at line 1778 of file searchconnection.py.

01778 
01779     def iterids(self):
01780         """Get an iterator which returns all the ids in the database.
01781 
01782         The unique_ids are currently returned in binary lexicographical sort
01783         order, but this should not be relied on.
01784 
01785         Note that the iterator returned by this method may raise a
01786         xapian.DatabaseModifiedError exception if modifications are committed
01787         to the database while the iteration is in progress.  If this happens,
01788         the search connection must be reopened (by calling reopen) and the
01789         iteration restarted.
01790 
01791         """
01792         if self._index is None:
01793             raise _errors.SearchError("SearchConnection has been closed")
01794         return _indexerconnection.PrefixedTermIter('Q', self._index.allterms())

def MoinMoin.support.xappy.searchconnection.SearchConnection.query_adjust (   self,
  primary,
  secondary 
)
Adjust the weights of one query with a secondary query.

Documents will be returned from the resulting query if and only if they
match the primary query (specified by the "primary" parameter).
However, the weights (and hence, the relevance rankings) of the
documents will be adjusted by adding weights from the secondary query
(specified by the "secondary" parameter).

Definition at line 930 of file searchconnection.py.

00930 
00931     def query_adjust(self, primary, secondary):
00932         """Adjust the weights of one query with a secondary query.
00933 
00934         Documents will be returned from the resulting query if and only if they
00935         match the primary query (specified by the "primary" parameter).
00936         However, the weights (and hence, the relevance rankings) of the
00937         documents will be adjusted by adding weights from the secondary query
00938         (specified by the "secondary" parameter).
00939 
00940         """
00941         if self._index is None:
00942             raise _errors.SearchError("SearchConnection has been closed")
00943         return _log(_xapian.Query, _xapian.Query.OP_AND_MAYBE, primary, secondary)
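The effect described above can be modelled on plain dictionaries. The following is a toy model, not the Xapian implementation: it assumes each query is represented as a `{doc_id: weight}` mapping, and the function name `adjust_weights` and the sample data are made up for illustration.

```python
def adjust_weights(primary, secondary):
    """Toy model of query_adjust on {doc_id: weight} mappings.

    Only documents in `primary` are kept; a document that also appears in
    `secondary` has the secondary weight added to its primary weight.
    """
    return dict(
        (doc_id, weight + secondary.get(doc_id, 0.0))
        for doc_id, weight in primary.items()
    )


# Hypothetical sample data.
primary = {'doc1': 1.0, 'doc2': 0.5}
secondary = {'doc2': 2.0, 'doc3': 4.0}
adjusted = adjust_weights(primary, secondary)
```

In this model `doc3` never appears in the result, because it does not match the primary query, while `doc2` is boosted by its secondary weight.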

def MoinMoin.support.xappy.searchconnection.SearchConnection.query_all (   self  )
A query which matches all the documents in the database.

Definition at line 1432 of file searchconnection.py.

01432 
01433     def query_all(self):
01434         """A query which matches all the documents in the database.
01435 
01436         """
01437         return _log(_xapian.Query, '')

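def MoinMoin.support.xappy.searchconnection.SearchConnection.query_composite (   self,
  operator,
  queries 
)
Build a composite query from a list of queries.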

The queries are combined with the supplied operator, which is either
SearchConnection.OP_AND or SearchConnection.OP_OR.

Definition at line 873 of file searchconnection.py.

00873 
00874     def query_composite(self, operator, queries):
00875         """Build a composite query from a list of queries.
00876 
00877         The queries are combined with the supplied operator, which is either
00878         SearchConnection.OP_AND or SearchConnection.OP_OR.
00879 
00880         """
00881         if self._index is None:
00882             raise _errors.SearchError("SearchConnection has been closed")
00883         return _log(_xapian.Query, operator, list(queries))
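As a rough mental model, combining with OP_AND corresponds to intersecting the sets of matching documents, and OP_OR to taking their union. The sketch below works on plain sets of document ids and ignores weights entirely; the function name `combine`, the string operators, and the empty-list behaviour are assumptions made for this illustration, not part of xappy.

```python
def combine(operator, id_sets):
    """Toy model of query_composite on sets of matching document ids."""
    sets = [set(ids) for ids in id_sets]
    if not sets:
        # Choice made for this sketch: no subqueries, no matches.
        return set()
    if operator == 'AND':
        return set.intersection(*sets)
    if operator == 'OR':
        return set.union(*sets)
    raise ValueError("operator must be 'AND' or 'OR', got %r" % (operator,))
```

With a real connection, the operator would be `SearchConnection.OP_AND` or `SearchConnection.OP_OR`, and the items would be query objects rather than sets.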

def MoinMoin.support.xappy.searchconnection.SearchConnection.query_facet (   self,
  field,
  val 
)
Create a query for a facet value.

This creates a query which matches only those documents which have a
facet value in the specified range.

For a numeric range facet, val should be a tuple holding the start and
end of the range, or a comma separated string holding two floating
point values.  For other facets, val should be the value to look
for.

The start and end values are both inclusive - any documents with a
value equal to start or end will be returned (unless end is less than
start, in which case no documents will be returned).

Definition at line 992 of file searchconnection.py.

00992 
00993     def query_facet(self, field, val):
00994         """Create a query for a facet value.
00995         
00996         This creates a query which matches only those documents which have a
00997         facet value in the specified range.
00998 
00999         For a numeric range facet, val should be a tuple holding the start and
01000         end of the range, or a comma separated string holding two floating
01001         point values.  For other facets, val should be the value to look
01002         for.
01003 
01004         The start and end values are both inclusive - any documents with a
01005         value equal to start or end will be returned (unless end is less than
01006         start, in which case no documents will be returned).
01007 
01008         """
01009         if self._index is None:
01010             raise _errors.SearchError("SearchConnection has been closed")
01011         if 'facets' in _checkxapian.missing_features:
01012             raise _errors.SearchError("Facets unsupported with this release of xapian")
01013 
01014         try:
01015             actions = self._field_actions[field]._actions
01016         except KeyError:
01017             actions = {}
01018         facettype = None
01019         for action, kwargslist in actions.iteritems():
01020             if action == FieldActions.FACET:
01021                 for kwargs in kwargslist:
01022                     facettype = kwargs.get('type', None)
01023                     if facettype is not None:
01024                         break
01025             if facettype is not None:
01026                 break
01027 
01028         if facettype == 'float':
01029             if isinstance(val, basestring):
01030                 val = [float(v) for v in val.split(',', 2)]
01031             assert(len(val) == 2)
01032             try:
01033                 slot = self._field_mappings.get_slot(field, 'facet')
01034             except KeyError:
01035                 return _log(_xapian.Query)
01036             # FIXME - check that sorttype == self._get_sort_type(field)
01037             sorttype = 'float'
01038             marshaller = SortableMarshaller(False)
01039             fn = marshaller.get_marshall_function(field, sorttype)
01040             begin = fn(field, val[0])
01041             end = fn(field, val[1])
01042             return _log(_xapian.Query, _xapian.Query.OP_VALUE_RANGE, slot, begin, end)
01043         else:
01044             assert(facettype == 'string' or facettype is None)
01045             prefix = self._field_mappings.get_prefix(field)
01046             return _log(_xapian.Query, prefix + val.lower())
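For numeric facets, the listing accepts either a two-element sequence or a comma-separated string. A standalone sketch of that normalisation step might look as follows. The name `parse_float_range` is hypothetical, the sketch raises `ValueError` where the listing uses an `assert`, and it tests against `str` where the Python 2 listing tests against `basestring`.

```python
def parse_float_range(val):
    """Normalise a numeric facet value to a (start, end) tuple of floats.

    Accepts a two-element sequence such as (1.5, 3) or a comma-separated
    string such as "1.5,3".
    """
    if isinstance(val, str):
        val = [float(part) for part in val.split(',')]
    if len(val) != 2:
        raise ValueError("expected exactly two values, got %r" % (val,))
    return float(val[0]), float(val[1])
```

Both ends of the resulting range are inclusive, as the description above states.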
01047 

def MoinMoin.support.xappy.searchconnection.SearchConnection.query_field (   self,
  field,
  value,
  default_op = OP_AND 
)
A query for a single field.

Definition at line 1236 of file searchconnection.py.

01236 
01237     def query_field(self, field, value, default_op=OP_AND):
01238         """A query for a single field.
01239 
01240         """
01241         if self._index is None:
01242             raise _errors.SearchError("SearchConnection has been closed")
01243         try:
01244             actions = self._field_actions[field]._actions
01245         except KeyError:
01246             actions = {}
01247 
01248         # need to check on field type, and stem / split as appropriate
01249         for action, kwargslist in actions.iteritems():
01250             if action in (FieldActions.INDEX_EXACT,
01251                           FieldActions.TAG,
01252                           FieldActions.FACET,):
01253                 prefix = self._field_mappings.get_prefix(field)
01254                 if len(value) > 0:
01255                     chval = ord(value[0])
01256                     if chval >= ord('A') and chval <= ord('Z'):
01257                         prefix = prefix + ':'
01258                 return _log(_xapian.Query, prefix + value)
01259             if action == FieldActions.INDEX_FREETEXT:
01260                 qp = _log(_xapian.QueryParser)
01261                 qp.set_default_op(default_op)
01262                 prefix = self._field_mappings.get_prefix(field)
01263                 for kwargs in kwargslist:
01264                     try:
01265                         lang = kwargs['language']
01266                         qp.set_stemmer(_log(_xapian.Stem, lang))
01267                         qp.set_stemming_strategy(qp.STEM_SOME)
01268                     except KeyError:
01269                         pass
01270                 return self._query_parse_with_fallback(qp, value, prefix)
01271 
01272         return _log(_xapian.Query)
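In the listing above, terms for INDEX_EXACT, TAG and FACET fields are built by concatenating the field prefix and the value, with a colon inserted when the value starts with an upper-case ASCII letter. Presumably this keeps the boundary between prefix and value unambiguous, since Xapian term prefixes are conventionally made of capital letters. The sketch below isolates that rule; the function name `exact_term` and the prefix `XA` are made up for illustration.

```python
def exact_term(prefix, value):
    """Build the term for an exact-match field, as the listing does.

    A ':' separator is added when the value begins with 'A'..'Z'.
    """
    if value and 'A' <= value[0] <= 'Z':
        prefix = prefix + ':'
    return prefix + value
```

For example, with the hypothetical prefix `XA`, the value `richard` gives the term `XArichard`, while `Richard` gives `XA:Richard`.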

def MoinMoin.support.xappy.searchconnection.SearchConnection.query_filter (   self,
  query,
  filter,
  exclude = False 
)
Filter a query with another query.

If exclude is False (or not specified), documents will only match the
resulting query if they match both the first and second query: the
results of the first query are "filtered" to only include those which
also match the second query.

If exclude is True, documents will only match the resulting query if
they match the first query, but not the second query: the results of
the first query are "filtered" to only include those which do not match
the second query.

Documents will always be weighted according to only the first query.

- `query`: The query to filter.
- `filter`: The filter to apply to the query.
- `exclude`: If True, the sense of the filter is reversed - only
  documents which do not match the second query will be returned. 

Definition at line 900 of file searchconnection.py.

00900 
00901     def query_filter(self, query, filter, exclude=False):
00902         """Filter a query with another query.
00903 
00904         If exclude is False (or not specified), documents will only match the
00905         resulting query if they match both the first and second query: the
00906         results of the first query are "filtered" to only include those which
00907         also match the second query.
00908 
00909         If exclude is True, documents will only match the resulting query if
00910         they match the first query, but not the second query: the results of
00911         the first query are "filtered" to only include those which do not match
00912         the second query.
00913         
00914         Documents will always be weighted according to only the first query.
00915 
00916         - `query`: The query to filter.
00917         - `filter`: The filter to apply to the query.
00918         - `exclude`: If True, the sense of the filter is reversed - only
00919           documents which do not match the second query will be returned. 
00920 
00921         """
00922         if self._index is None:
00923             raise _errors.SearchError("SearchConnection has been closed")
00924         if not isinstance(filter, _xapian.Query):
00925             raise _errors.SearchError("Filter must be a Xapian Query object")
00926         if exclude:
00927             return _log(_xapian.Query, _xapian.Query.OP_AND_NOT, query, filter)
00928         else:
00929             return _log(_xapian.Query, _xapian.Query.OP_FILTER, query, filter)
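The two modes of filtering can be modelled on plain Python data. This is a toy model under stated assumptions: the first query is represented as a `{doc_id: weight}` mapping and the filter as a set of matching ids. The function name `filter_results` is hypothetical.

```python
def filter_results(results, filter_ids, exclude=False):
    """Toy model of query_filter.

    `results` maps doc ids to weights for the first query; `filter_ids` is
    the set of ids matched by the filter query.
    """
    if exclude:
        keep = lambda doc_id: doc_id not in filter_ids
    else:
        keep = lambda doc_id: doc_id in filter_ids
    # Weights come from the first query only; the filter never changes them.
    return dict((d, w) for d, w in results.items() if keep(d))
```

Note that in both modes the surviving documents keep exactly the weight the first query gave them.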

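def MoinMoin.support.xappy.searchconnection.SearchConnection.query_multweight (   self,
  query,
  multiplier 
)
Build a query which modifies the weights of a subquery.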

This produces a query which returns the same documents as the subquery,
and in the same order, but with the weights assigned to each document
multiplied by the value of "multiplier".  "multiplier" may be any floating
point value, but negative values will be clipped to 0, since Xapian
doesn't support negative weights.

This can be useful when producing queries to be combined with
query_composite, because it allows the relative importance of parts of
the query to be adjusted.

Definition at line 884 of file searchconnection.py.

00884 
00885     def query_multweight(self, query, multiplier):
00886         """Build a query which modifies the weights of a subquery.
00887 
00888         This produces a query which returns the same documents as the subquery,
00889         and in the same order, but with the weights assigned to each document
00890         multiplied by the value of "multiplier".  "multiplier" may be any floating
00891         point value, but negative values will be clipped to 0, since Xapian
00892         doesn't support negative weights.
00893 
00894         This can be useful when producing queries to be combined with
00895         query_composite, because it allows the relative importance of parts of
00896         the query to be adjusted.
00897 
00898         """
00899         return _log(_xapian.Query, _xapian.Query.OP_SCALE_WEIGHT, query, multiplier)
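The weight scaling, including the clipping of negative multipliers described in the docstring, can be sketched on a `{doc_id: weight}` mapping. This is a model for illustration only; the function name `scale_weights` is hypothetical.

```python
def scale_weights(results, multiplier):
    """Toy model of query_multweight on a {doc_id: weight} mapping.

    Negative multipliers are clipped to 0, mirroring the documented
    behaviour that negative weights are not supported.
    """
    factor = max(float(multiplier), 0.0)
    return dict((doc_id, weight * factor) for doc_id, weight in results.items())
```

Scaling one subquery by 2 before passing it to `query_composite` would, in this model, make its matches count twice as much as those of an unscaled sibling.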

def MoinMoin.support.xappy.searchconnection.SearchConnection.query_none (   self  )
A query which matches no documents in the database.

This may be useful as a placeholder in various situations.

Definition at line 1438 of file searchconnection.py.

01438 
01439     def query_none(self):
01440         """A query which matches no documents in the database.
01441 
01442         This may be useful as a placeholder in various situations.
01443 
01444         """
01445         return _log(_xapian.Query)

def MoinMoin.support.xappy.searchconnection.SearchConnection.query_parse (   self,
  string,
  allow = None,
  deny = None,
  default_op = OP_AND,
  default_allow = None,
  default_deny = None 
)
Parse a query string.

This is intended for parsing queries entered by a user.  If you wish to
combine structured queries, it is generally better to use the other
query building methods, such as `query_composite` (though you may wish
to create parts of the query to combine with such methods with this
method).

The string passed to this method can have various operators in it.  In
particular, it may contain field specifiers (ie, field names, followed
by a colon, followed by some text to search for in that field).  For
example, if "author" is a field in the database, the search string
could contain "author:richard", and this would be interpreted as
"search for richard in the author field".  By default, any fields in
the database which are indexed with INDEX_EXACT or INDEX_FREETEXT will
be available for field specific searching in this way - however, this
can be modified using the "allow" or "deny" parameters, and also by the
allow_field_specific tag on INDEX_FREETEXT fields.

Any text which isn't prefixed by a field specifier is used to search
the "default set" of fields.  By default, this is the full set of
fields in the database which are indexed with INDEX_FREETEXT and for
which the search_by_default flag is set (ie, if the text is found in any
of those fields, the query will match).  However, this may be modified
with the "default_allow" and "default_deny" parameters.  (Note that
fields which are indexed with INDEX_EXACT aren't allowed to be used in
the default list of fields.)

- `string`: The string to parse.
- `allow`: A list of fields to allow in the query.
- `deny`: A list of fields not to allow in the query.
- `default_op`: The default operator to combine query terms with.
- `default_allow`: A list of fields to search for by default.
- `default_deny`: A list of fields not to search for by default.

Only one of `allow` and `deny` may be specified.

Only one of `default_allow` and `default_deny` may be specified.

If any of the entries in `allow` are not present in the configuration
for the database, or are not specified for indexing (either as
INDEX_EXACT or INDEX_FREETEXT), they will be ignored.  If any of the
entries in `deny` are not present in the configuration for the
database, they will be ignored.

Returns a Query object, which may be passed to the search() method, or
combined with other queries.

Definition at line 1182 of file searchconnection.py.

01182 
01183                     default_allow=None, default_deny=None):
01184         """Parse a query string.
01185 
01186         This is intended for parsing queries entered by a user.  If you wish to
01187         combine structured queries, it is generally better to use the other
01188         query building methods, such as `query_composite` (though you may wish
01189         to create parts of the query to combine with such methods with this
01190         method).
01191 
01192         The string passed to this method can have various operators in it.  In
01193         particular, it may contain field specifiers (ie, field names, followed
01194         by a colon, followed by some text to search for in that field).  For
01195         example, if "author" is a field in the database, the search string
01196         could contain "author:richard", and this would be interpreted as
01197         "search for richard in the author field".  By default, any fields in
01198         the database which are indexed with INDEX_EXACT or INDEX_FREETEXT will
01199         be available for field specific searching in this way - however, this
01200         can be modified using the "allow" or "deny" parameters, and also by the
01201         allow_field_specific tag on INDEX_FREETEXT fields.
01202 
01203         Any text which isn't prefixed by a field specifier is used to search
01204         the "default set" of fields.  By default, this is the full set of
01205         fields in the database which are indexed with INDEX_FREETEXT and for
01206         which the search_by_default flag is set (ie, if the text is found in any
01207         of those fields, the query will match).  However, this may be modified
01208         with the "default_allow" and "default_deny" parameters.  (Note that
01209         fields which are indexed with INDEX_EXACT aren't allowed to be used in
01210         the default list of fields.)
01211 
01212         - `string`: The string to parse.
01213         - `allow`: A list of fields to allow in the query.
01214         - `deny`: A list of fields not to allow in the query.
01215         - `default_op`: The default operator to combine query terms with.
01216         - `default_allow`: A list of fields to search for by default.
01217         - `default_deny`: A list of fields not to search for by default.
01218 
01219         Only one of `allow` and `deny` may be specified.
01220 
01221         Only one of `default_allow` and `default_deny` may be specified.
01222 
01223         If any of the entries in `allow` are not present in the configuration
01224         for the database, or are not specified for indexing (either as
01225         INDEX_EXACT or INDEX_FREETEXT), they will be ignored.  If any of the
01226         entries in `deny` are not present in the configuration for the
01227         database, they will be ignored.
01228 
01229         Returns a Query object, which may be passed to the search() method, or
01230         combined with other queries.
01231 
01232         """
01233         qp = self._prepare_queryparser(allow, deny, default_op, default_allow,
01234                                        default_deny)
01235         return self._query_parse_with_fallback(qp, string)

def MoinMoin.support.xappy.searchconnection.SearchConnection.query_range (   self,
  field,
  begin,
  end 
)
Create a query for a range search.

This creates a query which matches only those documents which have a
field value in the specified range.

Begin and end must be appropriate values for the field, according to
the 'type' parameter supplied to the SORTABLE action for the field.

The begin and end values are both inclusive - any documents with a
value equal to begin or end will be returned (unless end is less than
begin, in which case no documents will be returned).

Begin or end may be set to None in order to create an open-ended
range.  (They may also both be set to None, which will generate a query
which matches all documents containing any value for the field.)

Definition at line 944 of file searchconnection.py.

00944 
00945     def query_range(self, field, begin, end):
00946         """Create a query for a range search.
00947         
00948         This creates a query which matches only those documents which have a
00949         field value in the specified range.
00950 
00951         Begin and end must be appropriate values for the field, according to
00952         the 'type' parameter supplied to the SORTABLE action for the field.
00953 
00954         The begin and end values are both inclusive - any documents with a
00955         value equal to begin or end will be returned (unless end is less than
00956         begin, in which case no documents will be returned).
00957 
00958         Begin or end may be set to None in order to create an open-ended
00959         range.  (They may also both be set to None, which will generate a query
00960         which matches all documents containing any value for the field.)
00961 
00962         """
00963         if self._index is None:
00964             raise _errors.SearchError("SearchConnection has been closed")
00965 
00966         if begin is None and end is None:
00967             # Return a "match everything" query
00968             return _log(_xapian.Query, '')
00969 
00970         try:
00971             slot = self._field_mappings.get_slot(field, 'collsort')
00972         except KeyError:
00973             # Return a "match nothing" query
00974             return _log(_xapian.Query)
00975 
00976         sorttype = self._get_sort_type(field)
00977         marshaller = SortableMarshaller(False)
00978         fn = marshaller.get_marshall_function(field, sorttype)
00979 
00980         if begin is not None:
00981             begin = fn(field, begin)
00982         if end is not None:
00983             end = fn(field, end)
00984 
00985         if begin is None:
00986             return _log(_xapian.Query, _xapian.Query.OP_VALUE_LE, slot, end)
00987 
00988         if end is None:
00989             return _log(_xapian.Query, _xapian.Query.OP_VALUE_GE, slot, begin)
00990 
00991         return _log(_xapian.Query, _xapian.Query.OP_VALUE_RANGE, slot, begin, end)
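The inclusive and open-ended range rules can be captured in a small predicate. This models the documented matching rule for a single field value and does not touch Xapian; the function name `in_range` is hypothetical.

```python
def in_range(value, begin=None, end=None):
    """Return True if `value` falls in the inclusive range [begin, end].

    Either bound may be None to leave that side of the range open.
    """
    if begin is not None and value < begin:
        return False
    if end is not None and value > end:
        return False
    return True
```

When `end` is less than `begin`, no value can satisfy both comparisons, which matches the documented "no documents will be returned" case.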

def MoinMoin.support.xappy.searchconnection.SearchConnection.query_similar (   self,
  ids,
  allow = None,
  deny = None,
  simterms = 10 
)
Get a query which returns documents which are similar to others.

The list of document IDs to base the similarity search on is given in
`ids`.  This should be an iterable, holding a list of strings.  If
any of the supplied IDs cannot be found in the database, they will be
ignored.  (If no IDs can be found in the database, the resulting query
will not match any documents.)

By default, all fields which have been indexed for freetext searching
will be used for the similarity calculation.  The list of fields used
for this can be customised using the `allow` and `deny` parameters
(only one of which may be specified):

- `allow`: A list of fields to base the similarity calculation on.
- `deny`: A list of fields not to base the similarity calculation on.
- `simterms`: Number of terms to use for the similarity calculation.

For convenience, any of `ids`, `allow`, or `deny` may be strings, which
will be treated the same as a list of length 1.

Regardless of the setting of `allow` and `deny`, only fields which have
been indexed for freetext searching will be used for the similarity
measure - all other fields will always be ignored for this purpose.

Definition at line 1273 of file searchconnection.py.

01273 
01274     def query_similar(self, ids, allow=None, deny=None, simterms=10):
01275         """Get a query which returns documents which are similar to others.
01276 
01277         The list of document IDs to base the similarity search on is given in
01278         `ids`.  This should be an iterable, holding a list of strings.  If
01279         any of the supplied IDs cannot be found in the database, they will be
01280         ignored.  (If no IDs can be found in the database, the resulting query
01281         will not match any documents.)
01282 
01283         By default, all fields which have been indexed for freetext searching
01284         will be used for the similarity calculation.  The list of fields used
01285         for this can be customised using the `allow` and `deny` parameters
01286         (only one of which may be specified):
01287 
01288         - `allow`: A list of fields to base the similarity calculation on.
01289         - `deny`: A list of fields not to base the similarity calculation on.
01290         - `simterms`: Number of terms to use for the similarity calculation.
01291 
01292         For convenience, any of `ids`, `allow`, or `deny` may be strings, which
01293         will be treated the same as a list of length 1.
01294 
01295         Regardless of the setting of `allow` and `deny`, only fields which have
01296         been indexed for freetext searching will be used for the similarity
01297         measure - all other fields will always be ignored for this purpose.
01298 
01299         """
01300         eterms, prefixes = self._get_eterms(ids, allow, deny, simterms)
01301 
01302         # Use the "elite set" operator, which chooses the terms with the
01303         # highest query weight to use.
01304         q = _log(_xapian.Query, _xapian.Query.OP_ELITE_SET, eterms, simterms)
01305         return q

def MoinMoin.support.xappy.searchconnection.SearchConnection.reopen (   self  )
Reopen the connection.

This updates the revision of the index which the connection references
to the latest flushed revision.

Definition at line 807 of file searchconnection.py.

00807 
00808     def reopen(self):
00809         """Reopen the connection.
00810 
00811         This updates the revision of the index which the connection references
00812         to the latest flushed revision.
00813 
00814         """
00815         if self._index is None:
00816             raise _errors.SearchError("SearchConnection has been closed")
00817         self._index.reopen()
00818         # Re-read the actions.
00819         self._load_config()
        

def MoinMoin.support.xappy.searchconnection.SearchConnection.search (   self,
  query,
  startrank,
  endrank,
  checkatleast = 0,
  sortby = None,
  collapse = None,
  gettags = None,
  getfacets = None,
  allowfacets = None,
  denyfacets = None,
  usesubfacets = None,
  percentcutoff = None,
  weightcutoff = None,
  query_type = None 
)
Perform a search, for documents matching a query.

- `query` is the query to perform.
- `startrank` is the rank of the start of the range of matching
  documents to return (ie, the result with this rank will be returned).
  Ranks start at 0, which represents the "best" matching document.
- `endrank` is the rank at the end of the range of matching documents
  to return.  This is exclusive, so the result with this rank will not
  be returned.
- `checkatleast` is the minimum number of results to check for: the
  estimate of the total number of matches will always be exact if
  the number of matches is less than `checkatleast`.  A value of ``-1``
  can be specified for the checkatleast parameter - this has the
  special meaning of "check all matches", and is equivalent to passing
  the result of get_doccount().
- `sortby` is the name of a field to sort by.  It may be preceded by a
  '+' or a '-' to indicate ascending or descending order
  (respectively).  If the first character is neither '+' nor '-', the
  sort will be in ascending order.
- `collapse` is the name of a field to collapse the result documents
  on.  If this is specified, there will be at most one result in the
  result set for each value of the field.
- `gettags` is the name of a field to count tag occurrences in, or a
  list of fields to do so.
- `getfacets` is a boolean - if True, the matching documents will be
  examined to build up a list of the facet values contained in them.
- `allowfacets` is a list of the fieldnames of facets to consider.
- `denyfacets` is a list of fieldnames of facets which will not be
  considered.
- `usesubfacets` is a boolean - if True, only top-level facets and
  subfacets of facets appearing in the query are considered (taking
  precedence over `allowfacets` and `denyfacets`).
- `percentcutoff` is the minimum percentage a result must have to be
  returned.
- `weightcutoff` is the minimum weight a result must have to be
  returned.
- `query_type` is a value indicating the type of query being
  performed. If not None, the value is used to influence which facets
  are returned by the get_suggested_facets() function. If the
  value of `getfacets` is False, it has no effect.

If neither 'allowfacets' nor 'denyfacets' is specified, all fields
holding facets will be considered (but see 'usesubfacets').
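Two of the parameter rules above lend themselves to a short illustration: how a `sortby` string splits into a field name and a direction, and how `startrank`, `endrank` and `checkatleast` turn into the number of items to fetch and the number of matches to check. The sketch mirrors the arithmetic in the source listing for this method, but the function names `parse_sortby` and `rank_window` and the `doccount` parameter (standing in for the result of `get_doccount()`) are hypothetical.

```python
def parse_sortby(sortby):
    """Split a sortby spec into (field_name, ascending)."""
    if sortby[:1] == '-':
        return sortby[1:], False
    if sortby[:1] == '+':
        return sortby[1:], True
    # No sign given: ascending order is the default.
    return sortby, True


def rank_window(startrank, endrank, checkatleast=0, doccount=0):
    """Return (maxitems, checkatleast) for a search call.

    `endrank` is exclusive, and -1 for `checkatleast` means "check all
    matches", i.e. the document count.
    """
    if checkatleast == -1:
        checkatleast = doccount
    maxitems = max(endrank - startrank, 0)
    # Always check one result past the window, so that it is possible to
    # report whether more matches exist.
    checkatleast = max(checkatleast, endrank + 1)
    return maxitems, checkatleast
```

For instance, a first page of ten results (`startrank=0`, `endrank=10`) fetches ten items and checks at least eleven matches.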

Definition at line 1560 of file searchconnection.py.

01560 
01561                query_type=None):
01562         """Perform a search, for documents matching a query.
01563 
01564         - `query` is the query to perform.
01565         - `startrank` is the rank of the start of the range of matching
01566           documents to return (ie, the result with this rank will be returned).
01567           Ranks start at 0, which represents the "best" matching document.
01568         - `endrank` is the rank at the end of the range of matching documents
01569           to return.  This is exclusive, so the result with this rank will not
01570           be returned.
01571         - `checkatleast` is the minimum number of results to check for: the
01572           estimate of the total number of matches will always be exact if
01573           the number of matches is less than `checkatleast`.  A value of ``-1``
01574           can be specified for the checkatleast parameter - this has the
01575           special meaning of "check all matches", and is equivalent to passing
01576           the result of get_doccount().
01577         - `sortby` is the name of a field to sort by.  It may be preceded by a
01578           '+' or a '-' to indicate ascending or descending order
01579           (respectively).  If the first character is neither '+' nor '-', the
01580           sort will be in ascending order.
01581         - `collapse` is the name of a field to collapse the result documents
01582           on.  If this is specified, there will be at most one result in the
01583           result set for each value of the field.
01584         - `gettags` is the name of a field to count tag occurrences in, or a
01585           list of fields to do so.
01586         - `getfacets` is a boolean - if True, the matching documents will be
01587           examined to build up a list of the facet values contained in them.
01588         - `allowfacets` is a list of the fieldnames of facets to consider.
01589         - `denyfacets` is a list of fieldnames of facets which will not be
01590           considered.
01591         - `usesubfacets` is a boolean - if True, only top-level facets and
01592           subfacets of facets appearing in the query are considered (taking
01593           precedence over `allowfacets` and `denyfacets`).
01594         - `percentcutoff` is the minimum percentage a result must have to be
01595           returned.
01596         - `weightcutoff` is the minimum weight a result must have to be
01597           returned.
01598         - `query_type` is a value indicating the type of query being
01599           performed. If not None, the value is used to influence which facets
01600           are returned by the get_suggested_facets() function. If the
01601           value of `getfacets` is False, it has no effect.
01602 
01603         If neither 'allowfacets' nor 'denyfacets' is specified, all fields
01604         holding facets will be considered (but see 'usesubfacets').
01605 
01606         """
01607         if self._index is None:
01608             raise _errors.SearchError("SearchConnection has been closed")
01609         if 'facets' in _checkxapian.missing_features:
01610             if getfacets is not None or \
01611                allowfacets is not None or \
01612                denyfacets is not None or \
01613                usesubfacets is not None or \
01614                query_type is not None:
01615                 raise _errors.SearchError("Facets unsupported with this release of xapian")
01616         if 'tags' in _checkxapian.missing_features:
01617             if gettags is not None:
01618                 raise _errors.SearchError("Tags unsupported with this release of xapian")
01619         if checkatleast == -1:
01620             checkatleast = self._index.get_doccount()
01621 
01622         enq = _log(_xapian.Enquire, self._index)
01623         enq.set_query(query)
01624 
01625         if sortby is not None:
01626             asc = True
01627             if sortby[0] == '-':
01628                 asc = False
01629                 sortby = sortby[1:]
01630             elif sortby[0] == '+':
01631                 sortby = sortby[1:]
01632 
01633             try:
01634                 slotnum = self._field_mappings.get_slot(sortby, 'collsort')
01635             except KeyError:
01636                 raise _errors.SearchError("Field %r was not indexed for sorting" % sortby)
01637 
01638             # Note: we invert the "asc" parameter, because xapian treats
01639             # "ascending" as meaning "higher values are better"; in other
01640             # words, it considers "ascending" to mean return results in
01641             # descending order.
01642             enq.set_sort_by_value_then_relevance(slotnum, not asc)
01643 
01644         if collapse is not None:
01645             try:
01646                 slotnum = self._field_mappings.get_slot(collapse, 'collsort')
01647             except KeyError:
01648                 raise _errors.SearchError("Field %r was not indexed for collapsing" % collapse)
01649             enq.set_collapse_key(slotnum)
01650 
01651         maxitems = max(endrank - startrank, 0)
01652         # Always check for at least one more result, so we can report whether
01653         # there are more matches.
01654         checkatleast = max(checkatleast, endrank + 1)
01655 
01656         # Build the matchspy.
01657         matchspies = []
01658 
01659         # First, add a matchspy for any gettags fields
01660         if isinstance(gettags, basestring):
01661             if len(gettags) != 0:
01662                 gettags = [gettags]
01663         tagspy = None
01664         if gettags is not None and len(gettags) != 0:
01665             tagspy = _log(_xapian.TermCountMatchSpy)
01666             for field in gettags:
01667                 try:
01668                     prefix = self._field_mappings.get_prefix(field)
01669                     tagspy.add_prefix(prefix)
01670                 except KeyError:
01671                     raise _errors.SearchError("Field %r was not indexed for tagging" % field)
01672             matchspies.append(tagspy)
01673 
01674 
01675         # add a matchspy for facet selection here.
01676         facetspy = None
01677         facetfields = []
01678         if getfacets:
01679             if allowfacets is not None and denyfacets is not None:
01680                 raise _errors.SearchError("Cannot specify both `allowfacets` and `denyfacets`")
01681             if allowfacets is None:
01682                 allowfacets = [key for key in self._field_actions]
01683             if denyfacets is not None:
01684                 allowfacets = [key for key in allowfacets if key not in denyfacets]
01685 
01686             # include None in queryfacets so a top-level facet will
01687             # satisfy self._facet_hierarchy.get(field) in queryfacets
01688             # (i.e. always include top-level facets)
01689             queryfacets = set([None])
01690             if usesubfacets:
01691                 # add facets used in the query to queryfacets
01692                 termsiter = query.get_terms_begin()
01693                 termsend = query.get_terms_end()
01694                 while termsiter != termsend:
01695                     prefix = self._get_prefix_from_term(termsiter.get_term())
01696                     field = self._field_mappings.get_fieldname_from_prefix(prefix)
01697                     if field and FieldActions.FACET in self._field_actions[field]._actions:
01698                         queryfacets.add(field)
01699                     termsiter.next()
01700 
01701             for field in allowfacets:
01702                 try:
01703                     actions = self._field_actions[field]._actions
01704                 except KeyError:
01705                     actions = {}
01706                 for action, kwargslist in actions.iteritems():
01707                     if action == FieldActions.FACET:
01708                         # filter out non-top-level facets that aren't subfacets
01709                         # of a facet in the query
01710                         if usesubfacets and self._facet_hierarchy.get(field) not in queryfacets:
01711                             continue
01712                         # filter out facets that should never be returned for the query type
01713                         if self._facet_query_never(field, query_type):
01714                             continue
01715                         slot = self._field_mappings.get_slot(field, 'facet')
01716                         if facetspy is None:
01717                             facetspy = _log(_xapian.CategorySelectMatchSpy)
01718                         facettype = None
01719                         for kwargs in kwargslist:
01720                             facettype = kwargs.get('type', None)
01721                             if facettype is not None:
01722                                 break
01723                         if facettype is None or facettype == 'string':
01724                             facetspy.add_slot(slot, True)
01725                         else:
01726                             facetspy.add_slot(slot)
01727                         facetfields.append((field, slot, kwargslist))
01728 
01729             if facetspy is None:
01730                 # Set facetspy to False, to distinguish from no facet
01731                 # calculation being performed.  (This will prevent an
01732                 # error being thrown when the list of suggested facets is
01733                 # requested - instead, an empty list will be returned.)
01734                 facetspy = False
01735             else:
01736                 matchspies.append(facetspy)
01737 
01738 
01739         # Finally, build a single matchspy to pass to get_mset().
01740         if len(matchspies) == 0:
01741             matchspy = None
01742         elif len(matchspies) == 1:
01743             matchspy = matchspies[0]
01744         else:
01745             matchspy = _log(_xapian.MultipleMatchDecider)
01746             for spy in matchspies:
01747                 matchspy.append(spy)
01748 
01749         enq.set_docid_order(enq.DONT_CARE)
01750 
01751         # Set percentage and weight cutoffs
01752         if percentcutoff is not None or weightcutoff is not None:
01753             if percentcutoff is None:
01754                 percentcutoff = 0
01755             if weightcutoff is None:
01756                 weightcutoff = 0
01757             enq.set_cutoff(percentcutoff, weightcutoff)
01758 
01759         # Repeat the search until we don't get a DatabaseModifiedError
01760         while True:
01761             try:
01762                 if matchspy is None:
01763                     mset = enq.get_mset(startrank, maxitems, checkatleast)
01764                 else:
01765                     mset = enq.get_mset(startrank, maxitems, checkatleast,
01766                                         None, None, matchspy)
01767                 break
01768             except _xapian.DatabaseModifiedError, e:
01769                 self.reopen()
01770         facet_hierarchy = None
01771         if usesubfacets:
01772             facet_hierarchy = self._facet_hierarchy
01773             
01774         return SearchResults(self, enq, query, mset, self._field_mappings,
01775                              tagspy, gettags, facetspy, facetfields,
01776                              facet_hierarchy,
01777                              self._facet_query_table.get(query_type))
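
The sort-spec handling and result-window arithmetic in the listing above can be tried in isolation. The following is a minimal sketch, not part of xappy: `parse_sortby` and `result_window` are hypothetical helper names that mirror the corresponding lines of `search()`, and the sketch uses no Xapian calls.

```python
def parse_sortby(sortby):
    """Split a sort spec such as '-date' into (fieldname, ascending)."""
    ascending = True
    if sortby[0] == '-':
        ascending = False
        sortby = sortby[1:]
    elif sortby[0] == '+':
        sortby = sortby[1:]
    return sortby, ascending


def result_window(startrank, endrank, checkatleast=0):
    """Return (maxitems, checkatleast) as computed before get_mset()."""
    maxitems = max(endrank - startrank, 0)
    # Look at one result past the window so the caller can tell whether
    # further matches exist.
    checkatleast = max(checkatleast, endrank + 1)
    return maxitems, checkatleast


field, ascending = parse_sortby('-date')
# The listing passes the inverse of "ascending" to
# set_sort_by_value_then_relevance(), as its own comment explains.
xapian_flag = not ascending
print(field, ascending, xapian_flag)
print(result_window(10, 20))
```

A leading `+` is optional, so `'date'` and `'+date'` request the same ascending order.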

def MoinMoin.support.xappy.searchconnection.SearchConnection.significant_terms (   self,
  ids,
  maxterms = 10,
  allow = None,
  deny = None 
)
Get a set of "significant" terms for a document, or documents.

This has a similar interface to query_similar(): it takes a list of
ids, and an optional specification of a set of fields to consider.
Instead of returning a query, it returns a list of terms from the
document (or documents), which appear "significant".  Roughly,
in this situation significant means that the terms occur more
frequently in the specified document than in the rest of the corpus.

The list is in decreasing order of "significance".

By default, all terms related to fields which have been indexed for
freetext searching will be considered for the list of significant
terms.  The list of fields used for this can be customised using the
`allow` and `deny` parameters (only one of which may be specified):

- `allow`: A list of fields to consider.
- `deny`: A list of fields not to consider.

For convenience, any of `ids`, `allow`, or `deny` may be strings, which
will be treated the same as a list of length 1.

Regardless of the setting of `allow` and `deny`, only fields which have
been indexed for freetext searching will be considered - all other
fields will always be ignored for this purpose.
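
The field selection rules described above can be sketched as follows. This is an illustration under stated assumptions, not xappy's implementation: `as_list` and `select_fields` are hypothetical names, `candidates` stands for the fields indexed for freetext searching, and the string check uses Python 3's `str` where the Python 2 original would use `basestring`.

```python
def as_list(value):
    """Treat a single string as a list of length 1; None stays None."""
    if value is None:
        return None
    if isinstance(value, str):
        return [value]
    return list(value)


def select_fields(candidates, allow=None, deny=None):
    """Apply `allow` or `deny` (never both) to the candidate fields."""
    allow, deny = as_list(allow), as_list(deny)
    if allow is not None and deny is not None:
        raise ValueError("Cannot specify both `allow` and `deny`")
    if allow is not None:
        # Fields outside `candidates` are ignored even if allowed.
        return [f for f in candidates if f in allow]
    if deny is not None:
        return [f for f in candidates if f not in deny]
    return list(candidates)


freetext_fields = ['title', 'text']
print(select_fields(freetext_fields, allow='title'))
print(select_fields(freetext_fields, deny=['title']))
print(select_fields(freetext_fields, allow='author'))
```

Filtering against `candidates` in every branch is what keeps a non-freetext field such as `author` out of the result, whatever `allow` says.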

The maximum number of terms to return may be specified by the maxterms
parameter.

Definition at line 1306 of file searchconnection.py.

01306 
01307     def significant_terms(self, ids, maxterms=10, allow=None, deny=None):
01308         """Get a set of "significant" terms for a document, or documents.
01309 
01310         This has a similar interface to query_similar(): it takes a list of
01311         ids, and an optional specification of a set of fields to consider.
01312         Instead of returning a query, it returns a list of terms from the
01313         document (or documents), which appear "significant".  Roughly,
01314         in this situation significant means that the terms occur more
01315         frequently in the specified document than in the rest of the corpus.
01316 
01317         The list is in decreasing order of "significance".
01318 
01319         By default, all terms related to fields which have been indexed for
01320         freetext searching will be considered for the list of significant
01321         terms.  The list of fields used for this can be customised using the
01322         `allow` and `deny` parameters (only one of which may be specified):
01323 
01324         - `allow`: A list of fields to consider.
01325         - `deny`: A list of fields not to consider.
01326 
01327         For convenience, any of `ids`, `allow`, or `deny` may be strings, which
01328         will be treated the same as a list of length 1.
01329 
01330         Regardless of the setting of `allow` and `deny`, only fields which have
01331         been indexed for freetext searching will be considered - all other
01332         fields will always be ignored for this purpose.
01333 
01334         The maximum number of terms to return may be specified by the maxterms
01335         parameter.
01336 
01337         """
01338         eterms, prefixes = self._get_eterms(ids, allow, deny, maxterms)
01339         terms = []
01340         for term in eterms:
01341             pos = 0
01342             for char in term:
01343                 if not char.isupper():
01344                     break
01345                 pos += 1
01346             field = prefixes[term[:pos]]
01347             value = term[pos:]
01348             terms.append((field, value))
01349         return terms
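
The loop in the listing above separates each term into a leading run of uppercase characters (the prefix) and the remainder (the value). A minimal standalone sketch of that step follows; `split_prefixed_term` is a hypothetical name, and the prefix table is made-up sample data rather than anything read from a real index.

```python
def split_prefixed_term(term, prefixes):
    """Return (fieldname, value) for a prefixed term such as 'XAhello'."""
    pos = 0
    for char in term:
        if not char.isupper():
            break
        pos += 1
    # Everything before `pos` is the prefix; the rest is the term value.
    return prefixes[term[:pos]], term[pos:]


# Sample mapping from prefix to field name (illustrative only).
prefixes = {'XA': 'title', 'XB': 'text'}
print(split_prefixed_term('XAhello', prefixes))
print(split_prefixed_term('XB42', prefixes))
```

This loop would absorb a value that itself begins with an uppercase letter into the prefix, so the sketch assumes values start with a non-uppercase character.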

def MoinMoin.support.xappy.searchconnection.SearchConnection.spell_correct (   self,
  querystr,
  allow = None,
  deny = None,
  default_op = OP_AND,
  default_allow = None,
  default_deny = None 
)
Correct a query spelling.

This returns a version of the query string with any misspelt words
corrected.

- `allow`: A list of fields to allow in the query.
- `deny`: A list of fields not to allow in the query.
- `default_op`: The default operator to combine query terms with.
- `default_allow`: A list of fields to search for by default.
- `default_deny`: A list of fields not to search for by default.

Only one of `allow` and `deny` may be specified.

Only one of `default_allow` and `default_deny` may be specified.

If any of the entries in `allow` are not present in the configuration
for the database, or are not specified for indexing (either as
INDEX_EXACT or INDEX_FREETEXT), they will be ignored.  If any of the
entries in `deny` are not present in the configuration for the
database, they will be ignored.

Note that it is possible that the resulting spell-corrected query will
still match no documents - the caller should usually check that the
corrected query matches some documents before suggesting it to
users.
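
One way to apply the advice above is to run the corrected query before offering it. The sketch below is an assumption-laden illustration: `suggest_correction` and `FakeConnection` are hypothetical, the stand-in exists only so the example runs without Xapian, and using `len()` on the search result is assumed to work for a real `SearchResults` object as well.

```python
def suggest_correction(conn, querystr):
    """Return a corrected query string worth showing, or None."""
    corrected = conn.spell_correct(querystr)
    if corrected == querystr:
        return None  # nothing was changed
    query = conn.query_parse(corrected)
    # One hit is enough to know the suggestion is useful.
    results = conn.search(query, 0, 1)
    if len(results) == 0:
        return None
    return corrected


class FakeConnection(object):
    """Stand-in for SearchConnection, for illustration only."""

    def __init__(self, corrections, matching):
        self._corrections = corrections
        self._matching = matching

    def spell_correct(self, querystr):
        return self._corrections.get(querystr, querystr)

    def query_parse(self, querystr):
        return querystr

    def search(self, query, startrank, endrank):
        return [query] if query in self._matching else []


conn = FakeConnection({'helo': 'hello', 'wrld': 'world'}, {'hello'})
print(suggest_correction(conn, 'helo'))
print(suggest_correction(conn, 'wrld'))
```

With a real connection, note that `spell_correct()` may return a UTF-8 encoded byte string for unicode input, so the equality test may need adjusting.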

Definition at line 1447 of file searchconnection.py.

01447     def spell_correct(self, querystr, allow=None, deny=None, default_op=OP_AND,
01448                       default_allow=None, default_deny=None):
01449         """Correct a query spelling.
01450 
01451         This returns a version of the query string with any misspelt words
01452         corrected.
01453 
01454         - `allow`: A list of fields to allow in the query.
01455         - `deny`: A list of fields not to allow in the query.
01456         - `default_op`: The default operator to combine query terms with.
01457         - `default_allow`: A list of fields to search for by default.
01458         - `default_deny`: A list of fields not to search for by default.
01459 
01460         Only one of `allow` and `deny` may be specified.
01461 
01462         Only one of `default_allow` and `default_deny` may be specified.
01463 
01464         If any of the entries in `allow` are not present in the configuration
01465         for the database, or are not specified for indexing (either as
01466         INDEX_EXACT or INDEX_FREETEXT), they will be ignored.  If any of the
01467         entries in `deny` are not present in the configuration for the
01468         database, they will be ignored.
01469 
01470         Note that it is possible that the resulting spell-corrected query will
01471         still match no documents - the caller should usually check that the
01472         corrected query matches some documents before suggesting it to
01473         users.
01474 
01475         """
01476         qp = self._prepare_queryparser(allow, deny, default_op, default_allow,
01477                                        default_deny)
01478         try:
01479             qp.parse_query(querystr,
01480                            self._qp_flags_base |
01481                            self._qp_flags_phrase |
01482                            self._qp_flags_synonym |
01483                            self._qp_flags_bool |
01484                            qp.FLAG_SPELLING_CORRECTION)
01485         except _xapian.QueryParserError:
01486             qp.parse_query(querystr,
01487                            self._qp_flags_base |
01488                            self._qp_flags_phrase |
01489                            self._qp_flags_synonym |
01490                            qp.FLAG_SPELLING_CORRECTION)
01491         corrected = qp.get_corrected_query_string()
01492         if len(corrected) == 0:
01493             if isinstance(querystr, unicode):
01494                 # Encode as UTF-8 for consistency - this happens automatically
01495                 # to values passed to Xapian.
01496                 return querystr.encode('utf-8')
01497             return querystr
01498         return corrected
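
The listing above first parses with boolean operators enabled and, if that raises a parse error, retries without them. The pattern can be sketched without Xapian as follows; the flag constants, `ParseError`, and `toy_parse` are placeholders invented for this example and do not represent the Xapian API.

```python
# Placeholder flag values for illustration; combined with bitwise OR.
FLAG_BOOLEAN = 1
FLAG_PHRASE = 2
FLAG_LOVEHATE = 4


class ParseError(Exception):
    """Stand-in for the parser's error type."""


def toy_parse(querystr, flags):
    """Toy parser: with FLAG_BOOLEAN set, unbalanced parentheses fail."""
    if flags & FLAG_BOOLEAN and querystr.count('(') != querystr.count(')'):
        raise ParseError("unbalanced parentheses")
    return (querystr, flags)


def parse_with_fallback(querystr, parse=toy_parse):
    """Try a strict parse first, then retry without boolean syntax."""
    base = FLAG_LOVEHATE | FLAG_PHRASE
    try:
        return parse(querystr, base | FLAG_BOOLEAN)
    except ParseError:
        # Input that is invalid as a boolean expression is still parsed.
        return parse(querystr, base)


print(parse_with_fallback('a AND b'))
print(parse_with_fallback('(a AND b'))
```

The retry means that malformed user input still yields a result instead of an exception.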


Member Data Documentation

Definition at line 747 of file searchconnection.py.

Definition at line 794 of file searchconnection.py.

Definition at line 795 of file searchconnection.py.

Definition at line 792 of file searchconnection.py.

Definition at line 793 of file searchconnection.py.

Definition at line 729 of file searchconnection.py.

Definition at line 742 of file searchconnection.py.

MoinMoin.support.xappy.searchconnection.SearchConnection._qp_flags_base = _xapian.QueryParser.FLAG_LOVEHATE [static, private]

Definition at line 723 of file searchconnection.py.

MoinMoin.support.xappy.searchconnection.SearchConnection._qp_flags_bool = _xapian.QueryParser.FLAG_BOOLEAN [static, private]

Definition at line 727 of file searchconnection.py.

MoinMoin.support.xappy.searchconnection.SearchConnection._qp_flags_phrase = _xapian.QueryParser.FLAG_PHRASE [static, private]

Definition at line 724 of file searchconnection.py.

tuple MoinMoin.support.xappy.searchconnection.SearchConnection._qp_flags_synonym [static, private]

Initial value:
(_xapian.QueryParser.FLAG_AUTO_SYNONYMS |
                         _xapian.QueryParser.FLAG_AUTO_MULTIWORD_SYNONYMS)

Definition at line 725 of file searchconnection.py.

Definition at line 871 of file searchconnection.py.

Definition at line 872 of file searchconnection.py.


The documentation for this class was generated from the following file: