- 29 May, 2002 2 commits
-
-
Guido van Rossum authored
-
Fred Drake authored
right. Added a comment explaining what it represents in the function.
-
- 28 May, 2002 5 commits
-
-
Tim Peters authored
CosineIndex.query_weight(): rewrote to squash code duplication. No change in what it returns (it's always returned an upper bound on possible doc scores, although people probably haven't thought of it that way before).
Elsewhere: consequent changes.
Problems:
  + mhindex.py needs repair, but I can't run it. Note that its current use of query_weight isn't legitimate (the usage doesn't conform to the IIndex interface -- passing a string is passing "a sequence", but not the intended sequence <wink>).
  + ZCTextIndex doesn't pass query_weight() on.
  + We've defined no methods to help clients compute what needs to be passed to query_weight (a sequence of only the positive terms). I changed mailtest.py to cheat, but it's doing a wrong thing for negative terms.
  + I expect it will be impossible to shake people from the belief that 100.0 * score / query_weight is some kind of "relevance score". It isn't. So perhaps better not to expose this in ZCTextIndex.
-
Andreas Jung authored
-
Andreas Jung authored
-
Andreas Jung authored
-
Andreas Jung authored
-
- 27 May, 2002 5 commits
-
-
Chris Withers authored
-
Tim Peters authored
-
Tim Peters authored
len(IIBTree) in these guys, so don't.
-
Tim Peters authored
expensive.
-
Tim Peters authored
to use both after the test to ensure they were the same.
-
- 24 May, 2002 3 commits
-
-
Guido van Rossum authored
guidelines. Added some conformance tests.
-
Casey Duncan authored
Fixed problem where external methods did not set up the func_defaults properly on first load. This resulted in ZPublisher being unable to properly pass arguments to External Methods on the first try.
-
Fred Drake authored
-
- 23 May, 2002 13 commits
-
-
Tim Peters authored
calling _add_wordinfo in a loop. This is a simple way to save oodles of function calls. In a brief but non-trivial test, this boosted overall indexing rate by 12% (so huge bang for the buck).
-
Tim Peters authored
-
Guido van Rossum authored
-
Tim Peters authored
-
Guido van Rossum authored
-
Guido van Rossum authored
Persistent class.
-
Guido van Rossum authored
-
Chris Withers authored
-
Guido van Rossum authored
-
Guido van Rossum authored
Add -w and -W options to dump the word list (by word and by wid, respectively). Exclude KeyboardInterrupt from unqualified except clauses.
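The second half of this message describes a common pattern: a catch-all handler that keeps a long batch job going must still let Ctrl-C through. A minimal sketch of that pattern follows; `index_all` and `index_one` are hypothetical names used for illustration, not functions from the commit.

```python
def index_all(paths, index_one):
    """Run index_one on every path, collecting the paths that failed.

    Hypothetical helper: any error on one item is recorded and skipped,
    but a KeyboardInterrupt is re-raised so the user can stop the run.
    """
    failures = []
    for path in paths:
        try:
            index_one(path)
        except KeyboardInterrupt:
            # Listed first so the catch-all below never swallows Ctrl-C.
            raise
        except:  # deliberately unqualified, as in the commit message
            failures.append(path)
    return failures
```

In current Python the same effect usually comes from writing `except Exception:` instead of a bare `except:`, since KeyboardInterrupt no longer derives from Exception.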
-
Guido van Rossum authored
  - Use slightly more portable values for the Data.fs and Zope/lib/python.
  - Add -t NNN option to specify how often to commit a transaction; default 20,000.
  - Change -p into -p NNN to specify how often (counted in commits) to pack (default 0 -- never pack).
  - Reworked the commit and pack logic to maintain the various counters across folders.
  - Store relative paths (e.g. "inbox/1").
  - Store the mtime of indexed messages in doctimes[docid].
  - Store the mtime of indexed folders in watchfolders[folder] (unused).
  - Refactor updatefolder() to:
    (a) Avoid indexing messages it's already indexed and whose mtime hasn't changed. (This probably needs an override just in case.)
    (b) Unindex messages that no longer exist in the folder.
  - Include the folder name and the message header fields from, to, cc, bcc, and subject in the text to be indexed.
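The updatefolder() refactoring described in this message amounts to an mtime-based incremental update. The sketch below shows the idea under stated assumptions: plain dicts stand in for the real text index and for doctimes, docids are the relative paths the message mentions, and the `update_folder` signature is hypothetical rather than taken from the commit.

```python
def update_folder(index, doctimes, folder, messages):
    """Bring the index in line with one folder's current contents.

    index:    dict docid -> indexed text (stand-in for the real index)
    doctimes: dict docid -> mtime recorded when the message was indexed
    folder:   folder name, e.g. "inbox"
    messages: dict docid -> (mtime, text) for messages now in the folder,
              keyed by relative path such as "inbox/1"
    Returns (number indexed, number unindexed).
    """
    indexed = unindexed = 0

    # (a) Skip messages already indexed whose mtime has not changed.
    for docid, (mtime, text) in messages.items():
        if doctimes.get(docid) == mtime:
            continue
        index[docid] = text
        doctimes[docid] = mtime
        indexed += 1

    # (b) Unindex messages that no longer exist in this folder.
    #     Only docids under this folder are considered, so entries
    #     belonging to other folders are left alone.
    prefix = folder + "/"
    for docid in list(doctimes):
        if docid.startswith(prefix) and docid not in messages:
            index.pop(docid, None)
            del doctimes[docid]
            unindexed += 1

    return indexed, unindexed
```

A second call with an unchanged folder does no work, which is the point of storing the mtimes. The message also notes that an override is probably needed; a `force` flag that bypasses the mtime comparison would be one way to add it.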
-
Tim Peters authored
-
Guido van Rossum authored
valid value is input, or the empty string, and interpret the empty string as the default. Indicate the default in the prompt.
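The prompting behaviour described here (show the default, accept an empty reply as the default, repeat until the input is valid) can be sketched as below. `ask_int` and its `read` parameter are hypothetical, and the sketch assumes an integer setting; the commit itself does not say what kind of value is being read.

```python
def ask_int(label, default, read=input):
    """Prompt until the reply is a whole number or empty.

    The default is shown in the prompt, and an empty reply selects it.
    `read` is injectable so the function can be exercised without a
    terminal.
    """
    prompt = "%s [%d]: " % (label, default)
    while True:
        reply = read(prompt).strip()
        if not reply:
            return default
        try:
            return int(reply)
        except ValueError:
            print("Please enter a whole number, or press Enter for the default.")
```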
-
- 22 May, 2002 12 commits
-
-
Guido van Rossum authored
Add glob support to the HTMLWordSplitter class.
-
Casey Duncan authored
selected in a mutually exclusive manner (such as splitters). Existing pipeline elements have been grouped appropriately. Added a stop word remover that does not remove single char words. Modified ZMI lexicon add form to use pipeline element groups to render form. Groups with multiple elements are rendered as selects, singletons are rendered as checkboxes.
-
Guido van Rossum authored
-
Guido van Rossum authored
but the pattern may not begin with a glob character (else someone specifying "*" as the pattern can tie up the CPU for a long time).
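The restriction on leading glob characters can be sketched as follows. This is an illustration only: it assumes the glob characters are "*" and "?", uses a plain word list in place of a lexicon, and the names `glob_to_regex` and `expand_glob` are hypothetical.

```python
import re

GLOB_CHARS = "*?"  # assumed set of glob characters


def glob_to_regex(pattern):
    """Translate a glob pattern to a compiled regex.

    Patterns that are empty or start with a glob character are
    rejected, since they would force a scan of every known word.
    """
    if not pattern or pattern[0] in GLOB_CHARS:
        raise ValueError("pattern may not begin with a glob character")
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def expand_glob(pattern, words):
    """Return the words matching the glob pattern, sorted."""
    rx = glob_to_regex(pattern)
    return sorted(w for w in words if rx.fullmatch(w))
```

Requiring a literal first character also means that a lexicon with sorted keys could limit its search to words sharing the literal prefix, although this sketch scans the whole list for simplicity.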
-
Andreas Jung authored
class
-
Andreas Jung authored
and recognizes the header attribute
-
Casey Duncan authored
  * A pipeline factory registry now allows registration of possible pipeline elements for use by Zope lexicons.
  * ZMI constructor form for lexicon uses pipeline registry to generate form fields
  * ZMI constructor form for ZCTextIndex allows you to choose between Okapi and Cosine relevance algorithms
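A registry of this kind can be pictured as a mapping from display names to factories, which a form can enumerate and a lexicon can instantiate. The sketch below is a toy under that assumption; `PipelineRegistry`, its methods, and the two element classes are hypothetical and do not reproduce the ZCTextIndex API.

```python
class PipelineRegistry:
    """Toy registry mapping element names to zero-argument factories."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        if name in self._factories:
            raise ValueError("already registered: %r" % name)
        self._factories[name] = factory

    def names(self):
        # A constructor form could iterate over this to build its fields.
        return sorted(self._factories)

    def build(self, selected):
        # Instantiate the chosen elements in the order given.
        return [self._factories[name]() for name in selected]


class WhitespaceSplitter:
    def process(self, items):
        words = []
        for text in items:
            words.extend(text.split())
        return words


class CaseNormalizer:
    def process(self, items):
        return [w.lower() for w in items]


def run_pipeline(elements, text):
    """Feed text through each element in turn and return the words."""
    items = [text]
    for element in elements:
        items = element.process(items)
    return items
```

For example, after registering both classes, `run_pipeline(registry.build(["Whitespace splitter", "Case normalizer"]), "Hello World")` yields the lowercased words in order.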
-
Guido van Rossum authored
-
Fred Drake authored
instead of an extension type, and let StopWordRemover be a Python class that uses the helper if available.
-
Andreas Jung authored
-
Shane Hathaway authored
-
Tim Peters authored
-