lib/python/Products/ZCTextIndex/Lexicon.py · b82b274679009de6194a1b8abfed9306f47fb361 · Kirill Smelkov / Zope

Refactor the query parser to rely on the lexicon for parsing terms.

Guido van Rossum authored May 20, 2002

ILexicon.py:

  - Added parseTerms() and isGlob().

  - Added get_word() and get_wid() to the interface (get_word()
    already existed in the implementation; get_wid() is new, added
    for symmetry).

  - Reflowed some text.

IQueryParser.py:

  - Expanded docs for parseQuery().

  - Added getIgnored() and parseQueryEx().

IPipelineElement.py:

  - Added processGlob().

Lexicon.py:

  - Added parseTerms() and isGlob().

  - Added get_wid().

  - Some pipeline elements now support processGlob().
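
As a rough illustration of how parseTerms(), isGlob() and processGlob()
could fit together, here is a minimal sketch. It is not the code in
Lexicon.py: the ToyLexicon, Splitter and CaseNormalizer classes, the
method signatures, and the fallback to process() when an element has no
processGlob() are all assumptions made for this example.

```python
import re


class Splitter:
    """Hypothetical pipeline element that splits text into words."""

    def process(self, texts):
        # Plain indexing: glob characters are treated as separators.
        return [w for t in texts for w in re.findall(r"\w+", t)]

    def processGlob(self, texts):
        # Query parsing: keep '*' and '?' attached to the word.
        return [w for t in texts for w in re.findall(r"[\w*?]+", t)]


class CaseNormalizer:
    """Hypothetical element without processGlob(); process() is reused."""

    def process(self, words):
        return [w.lower() for w in words]


class ToyLexicon:
    def __init__(self, *pipeline):
        self._pipeline = pipeline

    def parseTerms(self, text):
        # Assumed behaviour: prefer processGlob() where an element has it.
        words = [text]
        for element in self._pipeline:
            step = getattr(element, "processGlob", element.process)
            words = step(words)
        return words

    def isGlob(self, word):
        # Assumed definition: a term is a glob if it has a wildcard.
        return "*" in word or "?" in word


lexicon = ToyLexicon(Splitter(), CaseNormalizer())
terms = lexicon.parseTerms("Foo* bar?")
```

In this sketch the query parser would only need to call parseTerms() on
the raw text and isGlob() on each resulting term, so all knowledge about
word splitting stays in the lexicon.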

ParseTree.py:

  - Clarified the error message for calling executeQuery() on a
    NotNode.

QueryParser.py (lots of changes):

  - Change private names __tokens etc. into protected _tokens etc.

  - Add getIgnored() and parseQueryEx() methods.

  - The atom parser now uses the lexicon's parseTerms() and isGlob()
    methods.

  - Query parts that consist only of stopwords (as determined by the
    lexicon), or of stopwords and negated terms, yield None instead of
    a parse tree node; the ignored term is added to self._ignored.
    None is ignored when combining terms for AND/OR/NOT operators, and
    when an operator has no non-None operands, the operator itself
    returns None.  When this None percolates all the way to the top,
    the parser raises a ParseError exception.
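
The None propagation described above can be sketched roughly as
follows. This is a simplified illustration, not the QueryParser code:
the STOPWORDS set, the tuple-based tree nodes, the "-" prefix for
negated terms, and the function names are all assumptions for the
example.

```python
class ParseError(Exception):
    pass


# Hypothetical stopword list; in the real code the lexicon decides.
STOPWORDS = {"a", "and", "the"}


def parse_atom(term, ignored):
    """Return a tree node, or None if the term is a stopword."""
    if term.lower() in STOPWORDS:
        ignored.append(term)
        return None
    return ("ATOM", term)


def parse_and(terms, ignored):
    """Combine terms, skipping None operands.

    Returns None when no positive (non-negated) operand survives,
    even if negated terms remain.
    """
    positives, negatives = [], []
    for term in terms:
        negated = term.startswith("-")
        node = parse_atom(term.lstrip("-"), ignored)
        if node is None:
            continue
        (negatives if negated else positives).append(node)
    if not positives:
        return None
    nodes = positives + [("NOT", n) for n in negatives]
    return nodes[0] if len(nodes) == 1 else ("AND", nodes)


def parse_query(terms):
    """Top level: a None result becomes a ParseError."""
    ignored = []
    tree = parse_and(terms, ignored)
    if tree is None:
        raise ParseError("query has no usable positive terms")
    return tree, ignored


tree, ignored = parse_query(["the", "aa", "-bb"])
```

With these assumptions, a query such as ["the", "-bb"] leaves only a
negated term after the stopword is dropped, so parse_and() returns None
and parse_query() raises ParseError.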

tests/testQueryParser.py:

  - Changed test expressions of the form "a AND b AND c" to "aa AND bb
    AND cc" so that the terms won't be considered stopwords.

  - The test for "and/" can only work for the base class.

tests/testZCTextIndex.py:

  - Added copyright notice.

  - Refactor testStopWords() to have two helpers, one for success, one
    for failures.

  - Change testStopWords() to require parser failure for those queries
    that have only stopwords or stopwords plus negated terms.

  - Improve compareSet() to sort the sets of keys, and use a more
    direct way of extracting the keys.  This wasn't strictly needed
    (nothing fails without it), but the old approach of copying the
    keys into a dict in a loop relied on dict hashing always
    returning the keys in the same order.
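
The idea behind the compareSet() change can be shown with a small
sketch. The helper below is hypothetical and is not the code in
testZCTextIndex.py; it only assumes that both arguments expose keys()
and item access like a mapping.

```python
def same_keys_and_values(result, expected):
    """Compare two mappings without relying on key iteration order."""
    # Sorting both key lists makes the comparison order-independent.
    result_keys = sorted(result.keys())
    expected_keys = sorted(expected.keys())
    if result_keys != expected_keys:
        return False
    return all(result[k] == expected[k] for k in result_keys)


ok = same_keys_and_values({3: 10, 1: 20}, {1: 20, 3: 10})
```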

Lexicon.py 4.71 KB