1. 13 May, 2001 1 commit
    • Tim Peters's avatar
      Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask". · 2f228e75
      Tim Peters authored
      The comment following used to say:
      	/* We use ~hash instead of hash, as degenerate hash functions, such
      	   as for ints <sigh>, can have lots of leading zeros. It's not
      	   really a performance risk, but better safe than sorry.
      	   12-Dec-00 tim:  so ~hash produces lots of leading ones instead --
      	   what's the gain? */
      That is, there was never a good reason for doing it.  And to the contrary,
      as explained on Python-Dev last December, it tended to make the *sum*
      (i + incr) & mask (which is the first table index examined in case of
      collision) the same "too often" across distinct hashes.
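
A minimal sketch of why the complement can make that sum collide. It assumes a 16-slot table and a simplified, hypothetical increment `incr = hash & mask`; the increment actually used in dictobject.c is computed differently, so this only illustrates the effect and does not reproduce the real code.

```python
MASK = 15  # assumed: a table with 16 slots


def probe_pair(h, complement):
    """Return (i, second): the initial slot and the slot tried after a collision."""
    i = (~h if complement else h) & MASK
    incr = h & MASK  # hypothetical simplified increment, NOT the real formula
    return i, (i + incr) & MASK


# Collect the second slot examined for every nonzero hash in 1..15.
old_second = {probe_pair(h, complement=True)[1] for h in range(1, 16)}
new_second = {probe_pair(h, complement=False)[1] for h in range(1, 16)}

print(sorted(old_second))  # with ~hash, every hash lands on the same second slot
print(sorted(new_second))  # with plain hash, the second slots are spread out
```

Under this assumption `(~h & MASK) + (h & MASK)` always equals `MASK`, so the complemented scheme sends every colliding key to the same second slot, while the plain scheme does not.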
      
      Changing to the simpler "i = hash & mask" reduced the number of string-dict
      collisions (== number of times we go around the lookup for-loop) from about
      6 million to 5 million during a full run of the test suite (these are
      approximate because the test suite does some random stuff from run to run).
      The number of collisions in non-string dicts also decreased, but not as
      dramatically.
      
      Note that this may, for a given dict, change the order (wrt previous
      releases) of entries exposed by .keys(), .values() and .items().  A number
      of std tests suffered bogus failures as a result.  For dicts keyed by
      small ints, or (less so) by characters, the order is much more likely to be
      in increasing order of key now; e.g.,
      
      >>> d = {}
      >>> for i in range(10):
      ...    d[i] = i
      ...
      >>> d
      {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
      >>>
      
      Unfortunately, people may latch on to that in small examples and draw a
      bogus conclusion.
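
The ordering effect can be illustrated with a small model. It assumes a 16-slot table, that `hash(k) == k` for small ints (true in CPython), that no collisions occur, and that iteration walks the slots in index order as dicts of that era did. Modern dicts preserve insertion order instead, so this is a model of the old layout, not of current behavior.

```python
TABLE_MASK = 15  # assumed: a table with 16 slots
small_keys = list(range(10))

# Initial slot for each key under the new and the old scheme.
slot_new = {k: hash(k) & TABLE_MASK for k in small_keys}
slot_old = {k: ~hash(k) & TABLE_MASK for k in small_keys}

# Iterating slot by slot yields the keys sorted by their slot index.
order_new = sorted(small_keys, key=slot_new.get)
order_old = sorted(small_keys, key=slot_old.get)

print(order_new)  # increasing key order
print(order_old)  # decreasing key order
```

In this model the apparent sortedness is an accident of the table layout for small ints, which is exactly why it should not be relied on.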
      
      test_support.py
          Moved test_extcall's sortdict() into test_support, made it stronger,
          and imported sortdict into other std tests that needed it.
      test_unicode.py
          Excluded cp875 from the "roundtrip over range(128)" test, because
          cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
          See Python-Dev for excruciating details.
      Cookie.py
          Changed various output functions to sort dicts before building
          strings from them.
      test_extcall
          Fiddled the expected-result file.  This remains sensitive to native
          dict ordering, because, e.g., if there are multiple errors in a
          keyword-arg dict (and test_extcall sets up many cases like that), the
          specific error Python complains about first depends on native dict
          ordering.
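
The idea behind sortdict(), and behind the Cookie.py change, is to build output from sorted items so that it does not depend on native dict ordering. The following is a sketch in modern Python, not a copy of the code in test_support, and it assumes the keys can be ordered against each other.

```python
def sortdict(d):
    """Like repr(d), but with the items listed in sorted key order."""
    pairs = ["%r: %r" % item for item in sorted(d.items())]
    return "{%s}" % ", ".join(pairs)


print(sortdict({"b": 1, "a": 2}))
```

A test that compares `sortdict(d)` against an expected string keeps passing even if the dict's internal order changes between releases.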
  2. 12 May, 2001 7 commits
  3. 11 May, 2001 27 commits
  4. 10 May, 2001 5 commits
    • Fred Drake's avatar
      Write a better synopsis for the Scrap module, and provide a link to · 986badae
      Fred Drake authored
      useful documentation on the Scrap Manager.
    • Tim Peters's avatar
      Restore dicts' tp_compare slot, and change dict_richcompare to say it · 4fa58bfa
      Tim Peters authored
      doesn't know how to do LE, LT, GE, GT.  dict_richcompare can't do the
      latter any faster than dict_compare can.  More importantly, for
      cmp(dict1, dict2), Python *first* tries rich compares with EQ, LT, and
      GT one at a time, even if the tp_compare slot is defined, and
      dict_richcompare called dict_compare for the latter two because
      it couldn't do them itself.  The result was a lot of wasted calls to
      dict_compare.  Now dict_richcompare gives up at once whenever Python
      calls it with LT or GT from try_rich_to_3way_compare(), and dict_compare
      is called only once (when Python gets around to trying the tp_compare
      slot).
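
The call pattern described above can be modeled in a few lines. This is a hypothetical simulation, not CPython source: `three_way` stands in for dict_compare, the two `richcompare_*` functions stand in for dict_richcompare before and after the change, and `cmp_dicts` mimics trying EQ, LT, and GT before falling back to the three-way slot.

```python
class CallCounter:
    """Counts how often the stand-in for dict_compare runs."""
    def __init__(self):
        self.calls = 0


def three_way(a, b, counter):
    # Stand-in for dict_compare: returns -1, 0 or 1 (ordering rule is illustrative).
    counter.calls += 1
    x, y = sorted(a.items()), sorted(b.items())
    return (x > y) - (x < y)


def richcompare_old(a, b, op, counter):
    # Old behavior: answers LT/GT by delegating to the three-way compare.
    if op == "EQ":
        return a == b
    c = three_way(a, b, counter)
    return c < 0 if op == "LT" else c > 0


def richcompare_new(a, b, op, counter):
    # New behavior: only EQ is handled; LT/GT are declined immediately.
    return a == b if op == "EQ" else NotImplemented


def cmp_dicts(a, b, richcompare, counter):
    # Try rich compares with EQ, LT, GT one at a time, then the three-way slot.
    for op, result in (("EQ", 0), ("LT", -1), ("GT", 1)):
        r = richcompare(a, b, op, counter)
        if r is not NotImplemented and r:
            return result
    return three_way(a, b, counter)


old_counter, new_counter = CallCounter(), CallCounter()
old_result = cmp_dicts({1: 2}, {1: 1}, richcompare_old, old_counter)
new_result = cmp_dicts({1: 2}, {1: 1}, richcompare_new, new_counter)
print(old_result, old_counter.calls)
print(new_result, new_counter.calls)
```

In this model a "greater than" comparison costs two three-way calls with the old rich compare and one with the new, which matches the roughly halved call count reported for test_mutants.py.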
      Continued mystery:  despite that this cut the number of calls to
      dict_compare approximately in half in test_mutants.py, the latter still
      runs amazingly slowly.  Running under the debugger doesn't show excessive
      activity in the dict comparison code anymore, so I'm guessing the culprit
      is somewhere else -- but where?  Perhaps in the element (key/value)
      comparison code?  We clearly spend a lot of time figuring out how to
      compare things.
    • Tim Peters's avatar
      Make test_mutants stronger by also adding random keys during comparisons. · 4c02fecf
      Tim Peters authored
      A Mystery:  test_mutants ran amazingly slowly even before dictobject.c
      "got fixed".  I don't have a clue as to why.  dict comparison was and
      remains linear-time in the size of the dicts, and test_mutants only tries
      100 dict pairs, of size averaging just 50.  So "it should" run in less than
      an eyeblink; but it takes at least a second on this 800MHz box.
    • Tim Peters's avatar
      Change test_mmap.py to use test_support.TESTFN instead of hardcoded "foo", · fd69208b
      Tim Peters authored
      and wrap the body in try/finally to ensure TESTFN gets cleaned up no
      matter what.
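
The cleanup pattern described here might look roughly like the sketch below. `SKETCH_TESTFN` is a hypothetical stand-in for test_support.TESTFN, and the body is an invented minimal mmap check rather than the contents of test_mmap.py.

```python
import mmap
import os
import tempfile

# Hypothetical stand-in for test_support.TESTFN: a per-process scratch file name.
SKETCH_TESTFN = os.path.join(tempfile.gettempdir(), "mmap_sketch_%d" % os.getpid())


def run_mmap_check(path=SKETCH_TESTFN):
    """Write through a memory map, then remove the scratch file no matter what."""
    try:
        with open(path, "wb+") as f:
            f.write(b"\0" * mmap.PAGESIZE)
            f.flush()
            m = mmap.mmap(f.fileno(), mmap.PAGESIZE)
            try:
                m[0:3] = b"foo"
                return bytes(m[0:3])
            finally:
                m.close()  # release the mapping before the file is closed
    finally:
        # Runs on success and on failure, so the scratch file never lingers.
        if os.path.exists(path):
            os.unlink(path)


print(run_mmap_check())
```

Using a dedicated scratch name instead of a hardcoded "foo" avoids clobbering an unrelated file in the working directory, and the outer `finally` removes it even if the body raises.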