1. 02 Jul, 2002 8 commits
  2. 01 Jul, 2002 2 commits
    • Fred Drake's avatar
      7c75bf20
    • Tim Peters's avatar
      OK, I couldn't stand it <0.5 wink>: removed all uncertainty about what's · 19b74c78
      Tim Peters authored
      in gc_refs, even at the cost of putting back a test+branch in
      visit_decref.
      
      The good news:  since gc_refs became utterly tame then, it became
      clear that another special value could be useful.  The move_roots() and
      move_root_reachable() passes have now been replaced by a single
      move_unreachable() pass.  Besides saving a pass over the generation, this
      has a better effect:  most of the time everything turns out to be
      reachable, so we were breaking the generation list apart and moving it
      into the reachable list, one element at a time.  Now the reachable
      stuff stays in the generation list, and the unreachable stuff is moved
      instead.  This isn't quite as good as it sounds, since sometimes we
      guess wrongly that a thing is unreachable, and have to move it back again.
      
      Still, overall, it yields a significant (but not dramatic) boost in
      collection speed.
      19b74c78
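The single-pass idea described in this commit can be modeled in a few lines of Python. This is only a sketch under stated assumptions, not the C code from the commit: the `Obj` class, the `refcount` and `referents` fields, and the two special `gc_refs` values are hypothetical stand-ins chosen for illustration. It shows reachable objects staying in the generation list, unreachable candidates being moved out, and a wrongly guessed object being moved back.

```python
from dataclasses import dataclass, field

# Hypothetical special values for gc_refs; the names are assumptions for this model.
REACHABLE = -1
TENTATIVELY_UNREACHABLE = -2

@dataclass(eq=False)
class Obj:
    name: str
    refcount: int                                   # all references, internal and external
    referents: list = field(default_factory=list)   # objects this one points to
    gc_refs: int = 0

def move_unreachable(young):
    """Move objects that look unreachable out of `young`; return them as a list."""
    members = {id(o) for o in young}
    # Copy refcounts, then subtract references that come from inside the generation.
    for o in young:
        o.gc_refs = o.refcount
    for o in young:
        for r in o.referents:
            if id(r) in members:
                r.gc_refs -= 1
    unreachable = []
    i = 0
    while i < len(young):
        o = young[i]
        if o.gc_refs > 0:
            # Referenced from outside (or from a reachable object): it stays put.
            o.gc_refs = REACHABLE
            for r in o.referents:
                if id(r) not in members:
                    continue
                if r.gc_refs == 0:
                    r.gc_refs = 1            # not scanned yet, now known reachable
                elif r.gc_refs == TENTATIVELY_UNREACHABLE:
                    unreachable.remove(r)    # the earlier guess was wrong: move it back
                    young.append(r)
                    r.gc_refs = 1
            i += 1
        else:
            # No outside references seen so far: guess that it is unreachable.
            o.gc_refs = TENTATIVELY_UNREACHABLE
            unreachable.append(o)
            young.pop(i)
    return unreachable

# a is referenced from outside and points to b; c and d form an isolated cycle.
b = Obj("b", refcount=1)
a = Obj("a", refcount=1, referents=[b])
c = Obj("c", refcount=1)
d = Obj("d", refcount=1, referents=[c])
c.referents.append(d)

young = [b, c, d, a]          # b is scanned before a, so it is first guessed unreachable
garbage = move_unreachable(young)
print([o.name for o in young], [o.name for o in garbage])
```

In this ordering `b` is moved to the unreachable list and then moved back once `a` is scanned, which is the "guess wrongly" case the message mentions.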
  3. 30 Jun, 2002 14 commits
    • Tim Peters's avatar
      visit_decref(): Two optimizations. · 93cd83e4
      Tim Peters authored
      1. You're not supposed to call this with a NULL argument, although the
         docs could be clearer about that.  The other visit_XYZ() functions
         don't bother to check.  This doesn't either now, although it does
         assert non-NULL-ness now.
      
      2. It doesn't matter whether the object is currently tracked, so don't
         bother checking that either (if it isn't currently tracked, it may
         have some nonsense value in gc_refs, but it doesn't hurt to
         decrement gibberish, and it's cheaper to do so than to make everyone
         test for trackedness).
      
      It would be nice to get rid of the other tests on IS_TRACKED.  Perhaps
      trackedness should not be a matter of not being in any gc list, but
      should be a matter of being in a new "untracked" gc list.  This list
      simply wouldn't be involved in the collection mechanism.  A newly
      created object would be put in the untracked list.  Tracking would
      simply unlink it and move it into the gen0 list.  Untracking would do
      the reverse.  No test+branch needed then.  visit_move() may be vulnerable
      then, though, and I don't know how this would work with the trashcan.
      93cd83e4
    • Tim Peters's avatar
      SF bug #574132: Major GC related performance regression · 8839617c
      Tim Peters authored
      "The regression" is actually due to 2.2.1 having had a bug that prevented
      the regression (which isn't a regression at all) from showing up.  "The
      regression" is actually a glitch in cyclic gc that's been there forever.
      
      As the generation being collected is analyzed, objects that can't be
      collected (because, e.g., we find they're externally referenced, or
      are in an unreachable cycle but have a __del__ method) are moved out
      of the list of candidates.  A tricksy scheme uses negative values of
      gc_refs to mark such objects as being moved.  However, the exact
      negative value set at the start may become "more negative" over time
      for objects not in the generation being collected, and the scheme was
      checking for an exact match on the negative value originally assigned.
      As a result, objects in generations older than the one being collected
      could get scanned too, and yanked back into a younger generation.  Doing
      so doesn't lead to an error, but doesn't do any good, and can burn an
      unbounded amount of time doing useless work.
      
      A test case is simple (thanks to Kevin Jacobs for finding it!):
      
      x = []
      for i in xrange(200000):
          x.append((1,))
      
      Without the patch, every gen0 collection scans all of x and yanks x back
      into gen1, and every gen1 collection scans all of x twice.  With the patch,
      once x gets to gen2, it's never scanned again until another gen2 collection,
      and it stays in gen2.
      
      Bugfix candidate, although the code has changed enough that I think I'll
      need to port it by hand.  2.2.1 also has a different bug that causes
      bound method objects not to get tracked at all (so the test case doesn't
      burn absurd amounts of time in 2.2.1, but *should* <wink>).
      8839617c
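For readers who want to try the test case on a current interpreter, here is a rough adaptation. It is a sketch, not the original benchmark: `range` replaces `xrange`, and `(i,)` replaces `(1,)` on the assumption that a modern compiler treats `(1,)` as a shared constant and so would not allocate a new tuple per iteration. The helper names `build` and `timed_build` are made up for illustration, and no particular timing is claimed.

```python
import gc
import time

def build(n=200_000):
    """Append n small tuples to one long-lived list."""
    x = []
    for i in range(n):        # xrange in the Python 2 original
        x.append((i,))        # (i,) instead of (1,) so each tuple is freshly allocated
    return x

def timed_build(gc_enabled):
    """Time build() with the cyclic collector switched on or off."""
    was_enabled = gc.isenabled()
    if gc_enabled:
        gc.enable()
    else:
        gc.disable()
    try:
        start = time.perf_counter()
        x = build()
        return time.perf_counter() - start, len(x)
    finally:
        # Restore whatever collector state the caller had.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()

with_gc, n_with = timed_build(True)
without_gc, n_without = timed_build(False)
print(f"gc on: {with_gc:.3f}s, gc off: {without_gc:.3f}s")
```

A large gap between the two timings would suggest the collector is rescanning the long list repeatedly, which is the symptom the bug report describes.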
    • Martin v. Löwis's avatar
      Patch #569753: Remove support for WIN16. · 6238d2b0
      Martin v. Löwis authored
      Rename all occurrences of MS_WIN32 to MS_WINDOWS.
      6238d2b0
    • Martin v. Löwis's avatar
      Bump required PyXML version to 0.6.5. · adfa7409
      Martin v. Löwis authored
      adfa7409
    • Martin v. Löwis's avatar
      Implement the encoding argument for toxml and toprettyxml. · 7d650ca8
      Martin v. Löwis authored
      Document toprettyxml.
      7d650ca8
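A short illustration of the `encoding` argument, using `xml.dom.minidom` as it behaves in Python 3 (an assumption, since this commit predates it): without an encoding, `toxml()` returns text, and with one it returns bytes and names the encoding in the XML declaration. The sample document is made up.

```python
from xml.dom.minidom import parseString

# A made-up document with non-ASCII content.
doc = parseString("<greeting lang='de'>Gr\u00fc\u00dfe</greeting>")

as_text = doc.toxml()                                    # str in Python 3
as_bytes = doc.toxml(encoding="utf-8")                   # bytes, declaration names utf-8
pretty = doc.toprettyxml(indent="  ", encoding="utf-8")  # indented, also bytes

print(as_bytes)
print(pretty.decode("utf-8"))
```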
    • Martin v. Löwis's avatar
      Merge from PyXML: · 2ebfd09e
      Martin v. Löwis authored
      [1.3] Added documentation of the namespace URI for elements with no namespace.
      [1.4] New property http://www.python.org/sax/properties/encoding.
      [1.5] Support optional string interning in pyexpat.
      2ebfd09e
    • Martin v. Löwis's avatar
      0e2d8814
    • Martin v. Löwis's avatar
      Fix spacing. · d1b516c2
      Martin v. Löwis authored
      d1b516c2
    • Martin v. Löwis's avatar
      Merge changes from PyXML: · 18476a37
      Martin v. Löwis authored
      [1.15]
      Added understanding of the feature_validation, feature_external_pes,
      and feature_string_interning features.
      Added support for the feature_external_ges feature.
      Added support for the property_xml_string property.
      [1.16]
      Made it recognize the namespace prefixes feature.
      [1.17]
      Removed erroneous first line.
      [1.19]
      Support optional string interning in pyexpat.
      [1.21]
      Restore compatibility with versions of Python that did not support weak
      references.  These do not get the cyclic reference fix, but they will
      continue to work as they did before.
      [1.22]
      Activate entity processing unless standalone.
      18476a37
    • Martin v. Löwis's avatar
      Define PyDoc_STRVAR if it is not available (PyXML 1.54). · b4fcf4d1
      Martin v. Löwis authored
      Remove support for Python 1.5 (PyXML 1.55).
      b4fcf4d1
    • Martin v. Löwis's avatar
      6b2cf0e5
    • Raymond Hettinger's avatar
    • Raymond Hettinger's avatar
    • Raymond Hettinger's avatar
  4. 29 Jun, 2002 9 commits
  5. 28 Jun, 2002 7 commits
    • Jeremy Hylton's avatar
      Track change of begin() to _begin(). · 566fe9ef
      Jeremy Hylton authored
      566fe9ef
    • Barry Warsaw's avatar
      Lots of new and updated tests to check for proper ascii header · b6a92139
      Barry Warsaw authored
      folding.  Note that some of the Japanese tests have changed, but I
      don't really know if they are correct or not. :(
      
      Someone with Japanese and RFC 2047 expertise, please take a look!
      b6a92139
    • Barry Warsaw's avatar
      _max_append(): When adding the string `s' to its own line, it should · ba2577b7
      Barry Warsaw authored
      be lstrip'd so that old continuation whitespace is replaced by that
      specified in Header's continuation_ws parameter.
      ba2577b7
    • Barry Warsaw's avatar
      Teach this class about "highest-level syntactic breaks" but only for · 76612508
      Barry Warsaw authored
      headers with no charset or 'us-ascii' charsets.  Actually this is only
      partially true: we know about semicolons (but not true parameters) and
      we know about whitespace (but not technically folding whitespace).
      Still it should be good enough for all practical purposes.
      
      Other changes include:
      
      __init__(): Add a continuation_ws argument, which defaults to a single
      space.  Set this to change the whitespace used for continuation lines
      when a header must be split.  Also, changed the way header line
      lengths are calculated, so that they take into account continuation_ws
      (when tabs-expanded) and any provided header_name parameter.  This
      should do much better on returning split headers for which the first
      and subsequent lines must fit into a specified width.
      
      guess_maxlinelen(): Removed.  I don't think we need this method as
      part of the public API.
      
      encode_chunks() -> _encode_chunks(): I don't think we need this one as
      part of the public API either.
      76612508
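The folding behaviour described above can be tried with `email.header.Header` from the current standard library. This is a sketch with a made-up header name and value; it assumes the Python 3 signature, and the exact whitespace that starts each continuation line may differ between versions (newer releases may keep the original space at the fold point rather than substitute `continuation_ws`).

```python
from email.header import Header

# A made-up ASCII header value with semicolon-separated parts.
value = "; ".join(f"part{i}=value{i}" for i in range(12))

h = Header(value, maxlinelen=40, header_name="X-Demo", continuation_ws="\t")
folded = h.encode()           # splits at semicolons and whitespace by default
lines = folded.split("\n")

for line in lines:
    print(repr(line))
```

Passing `header_name` lets the first line leave room for the `X-Demo: ` prefix, which matches the line-length accounting the message describes.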
    • Barry Warsaw's avatar
      _split_header(): The code here was terminally broken because it didn't · 062749ac
      Barry Warsaw authored
      know anything about RFC 2047 encoded headers.  Fortunately we have a
      perfectly good header splitter in Header.encode().  So we just call
      that to give us a properly formatted and split header.
      Header.encode() didn't know about "highest-level syntactic breaks" but
      that's been fixed now too.
      062749ac
    • Jeremy Hylton's avatar
      Simplify HTTPSConnection constructor. · 7c75c99a
      Jeremy Hylton authored
      See discussion in SF bug 458463.
      7c75c99a
    • Jeremy Hylton's avatar
      Close SF patch 523944: importing modules with foreign newlines. · 13f99d70
      Jeremy Hylton authored
      Didn't use the patch, because universal newlines support made it easy.
      It might be worth fixing the actual problem in the 2.2 maintenance
      branch, in which case the patch is still needed.
      13f99d70
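As a quick check of the behaviour this commit refers to, the following sketch writes a module with Windows-style CRLF line endings and imports it on whatever platform it runs on. The module name `crlf_demo` and its contents are hypothetical, and the loading code uses the Python 3 `importlib` API rather than anything from the 2002 code base.

```python
import importlib.util
import os
import tempfile

# Module source with CRLF ("foreign" on Unix) line endings, written as raw bytes.
source = b"VALUE = 1\r\ndef double(n):\r\n    return n * 2\r\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "crlf_demo.py")
    with open(path, "wb") as f:
        f.write(source)
    # Load the file as a module without touching sys.path.
    spec = importlib.util.spec_from_file_location("crlf_demo", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

print(module.VALUE, module.double(21))
```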