- 15 May, 2002 11 commits
-
-
Casey Duncan authored
Removed QueryParser as a persistent attribute of the ZCTextIndex so that it doesn't need to be persistent (it stores no state). Updated tests. Functionally tested in Zope.
-
Guido van Rossum authored
-
Guido van Rossum authored
indexed (where the bytes are counted before entry into the pipeline, and the words are counted after the pipeline is done). To get the numbers, use the _nbytes and _nwords instance variables directly.
-
Tim Peters authored
(unless I erred ...).
-
Tim Peters authored
-
Tim Peters authored
-
Tim Peters authored
-
Tim Peters authored
-
Tim Peters authored
-
Tim Peters authored
-
Tim Peters authored
intersection gimmicks into their own functions with their own test suite. This turned up two bugs: 1. The mass weighted union gimmick was incorrect when passed a list with a single mapping. In that case, it neglected to multiply the mapping by the given weight. 2. The underlying weighted{Intersection, Union} code does something crazy if you pass it weights less than 0. I had vaguely hoped to be able to subtract scores by passing 1 and -1 as weights, but this doesn't work. It's hard to say exactly what it does then. The line weightedUnion(IIBTree(), mapping, 1, -2) seems to return a mapping with the same keys, but *all* of whose values are -2, regardless of the original mapping's values.
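A minimal sketch of the first bug described above, assuming plain dicts stand in for IIBTree mappings. The name `weighted_union` is hypothetical and is not the BTrees API; the point is only that a list holding a single mapping must still be scaled by its weight. Rejecting negative weights is an assumption made here because the message reports that they misbehave.

```python
def weighted_union(pairs):
    """Return {key: sum(weight * score)} over all (mapping, weight) pairs.

    `pairs` is a list of (dict, weight) tuples. Plain dicts are an
    assumption standing in for IIBTree objects.
    """
    result = {}
    for mapping, weight in pairs:
        if weight < 0:
            # Assumption: refuse negative weights instead of returning
            # surprising scores.
            raise ValueError("weights must be non-negative")
        for key, score in mapping.items():
            result[key] = result.get(key, 0) + weight * score
    return result


# The single-mapping case goes through the same loop, so it is scaled too.
scaled = weighted_union([({1: 2, 2: 3}, 5)])
```

Because every mapping passes through the same loop, there is no special case for a one-element list that could skip the multiplication.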
-
- 14 May, 2002 21 commits
-
-
Jeremy Hylton authored
www.python.org. Next step is to add queries using ZCTextIndex.
-
Jeremy Hylton authored
A little overzealous in the last checkin.
-
Jeremy Hylton authored
ZCTextIndex has grown a new argument with a default value that can be used to specify an Index class to use. The default is OkapiIndex.Index. There is a little kludge to make the test succeed. testZCTestIndex.IndexTests uses the Index.Index tests instead of OkapiIndex.Index. Tim will probably fix this.
-
Fred Drake authored
-
Guido van Rossum authored
names.
-
Jeremy Hylton authored
-
Jeremy Hylton authored
Re-order imports so that all Zope imports go together and are separate from all the ZCTextIndex imports. Reformat the _apply_index() docstring to use standard Python style, which is a one-line summary followed by paragraphs of text that start at the same offset as the function name. Compare with None using is instead of ==.
-
Fred Drake authored
-
Casey Duncan authored
-
Casey Duncan authored
-
Casey Duncan authored
-
Casey Duncan authored
Some additional plug-in index APIs were added to ZCTextIndex and support APIs added to Index and Lexicon. _apply_index does not use NBest since ZCatalog has an incompatible strategy for finding the top results. NBest might be abstracted from this product for general consumption in application code.
-
Tim Peters authored
-
Fred Drake authored
-
Fred Drake authored
-
Guido van Rossum authored
inside the while loop either.
-
Fred Drake authored
-
matt@zope.com authored
-
Guido van Rossum authored
first byte -- we always find the end of a particular encoded number by searching for the next byte with the high bit set. This simplifies the encoding and gives us more space for small encodings: 128 values can now be encoded in 1 byte, and 16K in 2 bytes.
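The message above is truncated, so the following is only a sketch of one encoding consistent with what it says: each number is split into 7-bit groups, and the high bit is set on the first byte only, so the next byte with the high bit set marks where the following number begins. The function names and the choice to store the most significant group first are assumptions, not taken from the commit.

```python
def encode(n):
    """Encode a non-negative int; only the first byte has the high bit set."""
    if n < 0:
        raise ValueError("n must be non-negative")
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append(n & 0x7F)
        n >>= 7
    groups.reverse()       # assumption: most significant 7-bit group first
    groups[0] |= 0x80      # mark the start of this number
    return bytes(groups)


def decode(data):
    """Decode a concatenation of encoded numbers into a list of ints."""
    numbers = []
    current = None
    for byte in data:
        if byte & 0x80:    # high bit set: a new number starts here
            if current is not None:
                numbers.append(current)
            current = byte & 0x7F
        else:
            if current is None:
                raise ValueError("data does not start with a marker byte")
            current = (current << 7) | byte
    if current is not None:
        numbers.append(current)
    return numbers
```

With 7 payload bits per byte, values 0 through 127 fit in one byte and values up to 16383 fit in two, which matches the 128 and 16K figures in the message.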
-
Guido van Rossum authored
-
Jeremy Hylton authored
_indexedSearch(): Simplify logic that called _apply_index() for each index in the catalog. The if statement under the comment "Optimization" had identical code on either branch. Perhaps the odd indentation made this confusing. Regardless, remove the conditional. Change computation of normalized scores to multiply first, then divide. Use literal 100. to make sure mult and div are floating point ops. searchResults(): Simplify logic at beginning of searchResults(). The first two conditionals depended on kw, so organize the logic to make that clearer. Write helper method to find "sort-on" and "sort-index" instead of duplicating code in searchResults(). For the case where results are sorted, simplify construction of the final LazyCat and make it more efficient to boot. Instead of using a list comprehension and a reduce + lambda to construct the list and the length of the contained lists, do it with one explicit for loop that constructs both values. Note: I did detailed timing stats on three ways to compute the length of a sequence of sequences. reduce + lambda was the slowest. For short lists, an explicit for loop is fastest. For long lists, reduce(operator.add, map(len, list)) is fastest. The explicit for loop is a big win here, because we've got to walk over the elements anyway to undo the Schwartzian transform. Sundry: Use getattr() with default value of None in preference to hasattr() followed by getattr(). This gets the same result with half the work. Changes for consistent and frequent use of whitespace. Use types.StringType and isinstance() to test for strings.
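A rough illustration of the single-loop idea from the message above, written for modern Python (where reduce lives in functools) rather than the 2002 code. The helper names and the (key, sequence) pair layout are assumptions; the actual catalog code is not shown in this log.

```python
from functools import reduce
import operator


def undecorate(decorated):
    """Strip sort keys from (key, sequence) pairs and total the lengths.

    One pass builds both the list of sequences and their combined length.
    """
    sequences = []
    total = 0
    for _key, seq in decorated:
        sequences.append(seq)
        total += len(seq)
    return sequences, total


def total_length_reduce(sequences):
    """The reduce-based alternative mentioned in the message."""
    return reduce(operator.add, map(len, sequences), 0)


# Hypothetical data: sort keys paired with result sequences.
decorated = sorted([(2, [10, 11]), (1, [20]), (3, [])])
sequences, total = undecorate(decorated)
```

The loop version has to visit each pair anyway to drop the sort key, so counting lengths along the way adds no extra pass.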
-
- 13 May, 2002 1 commit
-
-
Jeremy Hylton authored
-
- 10 May, 2002 3 commits
-
-
Fred Drake authored
content, but does not match the expected end tag, treat it as character data. This is mostly useful when a script includes string literals that contain end tags.
-
Fred Drake authored
-
matt@zope.com authored
from 2.5 branch.
-
- 09 May, 2002 2 commits
-
-
Andreas Jung authored
- Collector 386: workaround for hanging FTP connections with NcFTP
-
Toby Dickenson authored
-
- 07 May, 2002 2 commits
-
-
Guido van Rossum authored
don't have a corresponding .py file, to prevent tests that import deleted modules from running using the stale bytecode files. This has bitten enough people enough times that it's time it became a standard part of every test suite runner. (Zope3 already has it.) Somebody merge this into the Zope2 trunk please.
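The start of the message above is cut off, so this is only a sketch of the kind of cleanup it appears to describe: deleting bytecode files that have no matching source file before tests run. It assumes the legacy layout where .pyc and .pyo files sit next to their .py files (not the later `__pycache__` layout), and the function name is hypothetical.

```python
import os


def remove_stale_bytecode(root):
    """Delete .pyc/.pyo files under root that have no sibling .py file.

    Returns the list of removed paths.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        names = set(filenames)
        for name in filenames:
            base, ext = os.path.splitext(name)
            if ext in (".pyc", ".pyo") and base + ".py" not in names:
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

A test runner could call this once on the source tree before collecting tests, so that a deleted module cannot be imported through leftover bytecode.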
-
Shane Hathaway authored
-