Commits · 6fc13d9595ef4972551c0dfa7920f8d87935388a · Kirill Smelkov / cpython

02 Jul, 2002 8 commits

Finished transitioning to using gc_refs to track gc objects' states. · 6fc13d95

Tim Peters authored Jul 02, 2002

This was mostly a matter of adding comments and light code rearrangement.
Upon untracking, gc_next is still set to NULL.  It's a cheap way to
provoke memory faults if calling code is insane.  It's also used in some
way by the trashcan mechanism.

6fc13d95

Remove bogus assignment to self.length in NamedNodeMap.__delitem__(). · 8e8dc419
Fred Drake authored Jul 02, 2002

8e8dc419
Minor markup adjustments, consistency changes, and shorten a long · abe7c1a4
Fred Drake authored Jul 02, 2002
```
line.
```
abe7c1a4
Add refcount info for PyErr_SetFromWindowsErr() and · 7c1bb9c5
Fred Drake authored Jul 02, 2002
```
PyErr_SetFromWindowsErrWithFilename().
```
7c1bb9c5
Docs for PyErr_SetFromWindowsErrWithFilename() and · 4f2722ac
Thomas Heller authored Jul 02, 2002
```
PyErr_SetFromWindowsErr().
Fixes SF# 576016, with additional markup.
```
4f2722ac
Do not depend on pymemcompat.h (was only used for PyXML); Martin likes · b28467b7
Fred Drake authored Jul 02, 2002
```
it all inline.
```
b28467b7
Mac OS X Jaguar (developer preview) seems to have a working getaddrinfo(). · 84262fb1
Jack Jansen authored Jul 02, 2002

84262fb1

Reserved another gc_refs value for untracked objects. Every live gc · ea405639

Tim Peters authored Jul 02, 2002

object should now have a well-defined gc_refs value, with clear transitions
among gc_refs states.  As a result, none of the visit_XYZ traversal
callbacks need to check IS_TRACKED() anymore, and those tests were removed.
(They were already looking for objects with specific gc_refs states, and
the gc_refs state of an untracked object can no longer match any other
gc_refs state by accident.)
Added more asserts.
I expect that the gc_next == NULL indicator for an untracked object is
now redundant and can also be removed, but I ran out of time for this.

ea405639

01 Jul, 2002 2 commits

Bring this back into sync with PyXML revision 1.58. · 7c75bf20
Fred Drake authored Jul 01, 2002

7c75bf20

OK, I couldn't stand it <0.5 wink>: removed all uncertainty about what's · 19b74c78

Tim Peters authored Jul 01, 2002

in gc_refs, even at the cost of putting back a test+branch in
visit_decref.

The good news:  since gc_refs became utterly tame then, it became
clear that another special value could be useful.  The move_roots() and
move_root_reachable() passes have now been replaced by a single
move_unreachable() pass.  Besides saving a pass over the generation, this
has a better effect:  most of the time everything turns out to be
reachable, so we were breaking the generation list apart and moving it
into the reachable list, one element at a time.  Now the reachable
stuff stays in the generation list, and the unreachable stuff is moved
instead.  This isn't quite as good as it sounds, since sometimes we
guess wrongly that a thing is unreachable, and have to move it back again.

Still, overall, it yields a significant (but not dramatic) boost in
collection speed.

19b74c78
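
The single-pass idea described in the commit above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the C code in gcmodule.c: the `Node` class, the marker constants, and the use of Python lists in place of the doubly-linked gc lists are all hypothetical, and every node is assumed to live in the generation being collected. `external_refs` stands in for what is left in gc_refs after references from inside the generation have been subtracted.

```python
# Toy model of a single "move unreachable" pass. All names are illustrative.
REACHABLE = -1                 # hypothetical marker: proven reachable
TENTATIVELY_UNREACHABLE = -2   # hypothetical marker: parked in `unreachable`


class Node:
    def __init__(self, name, external_refs=0):
        self.name = name
        self.external_refs = external_refs  # refs from outside the generation
        self.referents = []                 # nodes this node points at
        self.gc_refs = 0


def move_unreachable(young):
    """Leave reachable nodes in `young`; return the nodes that were moved out."""
    for node in young:
        node.gc_refs = node.external_refs
    unreachable = []
    i = 0
    while i < len(young):
        node = young[i]
        if node.gc_refs > 0:
            # Reachable: it stays where it is, and so must everything it reaches.
            node.gc_refs = REACHABLE
            for child in node.referents:
                if child.gc_refs == 0:
                    # Not visited yet (still ahead of us): make it look reachable.
                    child.gc_refs = 1
                elif child.gc_refs == TENTATIVELY_UNREACHABLE:
                    # We guessed wrongly earlier: move it back and revisit it.
                    unreachable.remove(child)
                    young.append(child)
                    child.gc_refs = 1
            i += 1
        else:
            # No known outside reference so far: move it out, tentatively.
            node.gc_refs = TENTATIVELY_UNREACHABLE
            unreachable.append(node)
            del young[i]
    return unreachable


a, b, c, d = Node("a", external_refs=1), Node("b"), Node("c"), Node("d")
a.referents = [b]      # b is reachable only through a
c.referents = [d]      # c and d form an isolated cycle
d.referents = [c]
young = [b, c, d, a]   # b is seen before a, so it is moved out and back again
unreachable = move_unreachable(young)
print([n.name for n in young], [n.name for n in unreachable])
```

In this example `b` is first guessed to be unreachable and then moved back once `a` is visited, which is the "guess wrongly" case the commit message mentions; the cycle `c`/`d` ends up in the unreachable list.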

30 Jun, 2002 14 commits

visit_decref(): Two optimizations. · 93cd83e4

Tim Peters authored Jun 30, 2002

1. You're not supposed to call this with a NULL argument, although the
docs could be clearer about that. The other visit_XYZ() functions
don't bother to check. This doesn't either now, although it does
assert non-NULL-ness now.

2. It doesn't matter whether the object is currently tracked, so don't
bother checking that either (if it isn't currently tracked, it may
have some nonsense value in gc_refs, but it doesn't hurt to
decrement gibberish, and it's cheaper to do so than to make everyone
test for trackedness).

It would be nice to get rid of the other tests on IS_TRACKED. Perhaps
trackedness should not be a matter of not being in any gc list, but
should be a matter of being in a new "untracked" gc list. This list
simply wouldn't be involved in the collection mechanism. A newly
created object would be put in the untracked list. Tracking would
simply unlink it and move it into the gen0 list. Untracking would do
the reverse. No test+branch needed then. visit_move() may be vulnerable
then, though, and I don't know how this would work with the trashcan.

93cd83e4

SF bug #574132: Major GC related performance regression · 8839617c

Tim Peters authored Jun 30, 2002

"The regression" is actually due to the fact that 2.2.1 had a bug that prevented
the regression (which isn't a regression at all) from showing up. "The
regression" is actually a glitch in cyclic gc that's been there forever.

As the generation being collected is analyzed, objects that can't be
collected (because, e.g., we find they're externally referenced, or
are in an unreachable cycle but have a __del__ method) are moved out
of the list of candidates. A tricksy scheme uses negative values of
gc_refs to mark such objects as being moved. However, the exact
negative value set at the start may become "more negative" over time
for objects not in the generation being collected, and the scheme was
checking for an exact match on the negative value originally assigned.
As a result, objects in generations older than the one being collected
could get scanned too, and yanked back into a younger generation. Doing
so doesn't lead to an error, but doesn't do any good, and can burn an
unbounded amount of time doing useless work.

A test case is simple (thanks to Kevin Jacobs for finding it!):

x = []
for i in xrange(200000):
    x.append((1,))

Without the patch, this ends up scanning all of x on every gen0 collection,
scans all of x twice on every gen1 collection, and x gets yanked back into
gen1 on every gen0 collection. With the patch, once x gets to gen2, it's
never scanned again until another gen2 collection, and stays in gen2.

Bugfix candidate, although the code has changed enough that I think I'll
need to port it by hand. 2.2.1 also has a different bug that causes
bound method objects not to get tracked at all (so the test case doesn't
burn absurd amounts of time in 2.2.1, but *should* <wink>).

8839617c
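
A rough way to watch collections happen while running a loop like the test case above on a current Python 3 interpreter is sketched below. It is an assumption-laden adaptation, not the original test: `range` replaces `xrange`, `(i,)` replaces `(1,)` because modern interpreters treat `(1,)` as a constant and allocate nothing per iteration, and `count_collections` is a hypothetical helper name. It only counts collections per generation through `gc.callbacks`; it does not measure how much of `x` each collection scans.

```python
import gc


def count_collections(n=200_000):
    """Build a list of n small tuples; count gc runs per generation meanwhile."""
    counts = {}

    def on_gc(phase, info):
        # The gc module calls this with phase "start" or "stop".
        if phase == "start":
            gen = info["generation"]
            counts[gen] = counts.get(gen, 0) + 1

    gc.callbacks.append(on_gc)
    try:
        x = []
        for i in range(n):
            x.append((i,))   # one new container object per iteration
    finally:
        gc.callbacks.remove(on_gc)
    return len(x), counts


size, counts = count_collections(50_000)
print(size, counts)
```

The counts depend on the interpreter version, the configured thresholds, and whether the collector is enabled, so no particular numbers should be expected.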

Patch #569753: Remove support for WIN16. · 6238d2b0
Martin v. Löwis authored Jun 30, 2002
```
Rename all occurrences of MS_WIN32 to MS_WINDOWS.
```
6238d2b0
Bump required PyXML version to 0.6.5. · adfa7409
Martin v. Löwis authored Jun 30, 2002

adfa7409
Implement the encoding argument for toxml and toprettyxml. · 7d650ca8
Martin v. Löwis authored Jun 30, 2002
```
Document toprettyxml.
```
7d650ca8

Merge from PyXML: · 2ebfd09e

Martin v. Löwis authored Jun 30, 2002

[1.3] Added documentation of the namespace URI for elements with no namespace.
[1.4] New property http://www.python.org/sax/properties/encoding.
[1.5] Support optional string interning in pyexpat.

2ebfd09e

Add xml namespace initially (PyXML 1.19). · 0e2d8814
Martin v. Löwis authored Jun 30, 2002

0e2d8814
Fix spacing. · d1b516c2
Martin v. Löwis authored Jun 30, 2002

d1b516c2

Merge changes from PyXML: · 18476a37

Martin v. Löwis authored Jun 30, 2002

[1.15]
Added understanding of the feature_validation, feature_external_pes,
and feature_string_interning features.
Added support for the feature_external_ges feature.
Added support for the property_xml_string property.
[1.16]
Made it recognize the namespace prefixes feature.
[1.17]
removed erroneous first line
[1.19]
Support optional string interning in pyexpat.
[1.21]
Restore compatibility with versions of Python that did not support weak
references.  These do not get the cyclic reference fix, but they will
continue to work as they did before.
[1.22]
Activate entity processing unless standalone.

18476a37

Define PyDoc_STRVAR if it is not available (PyXML 1.54). · b4fcf4d1
Martin v. Löwis authored Jun 30, 2002
```
Remove support for Python 1.5 (PyXML 1.55).
```
b4fcf4d1
Undo usage of PyOS_snprintf (rev. 1.51 of PyXML). · 6b2cf0e5
Martin v. Löwis authored Jun 30, 2002

6b2cf0e5
Fixed bug 574978 shutil example out of sync with source code · 550fd5d7
Raymond Hettinger authored Jun 30, 2002

550fd5d7
Fix bug 575221 referred to dictionary type instead of dict. · 8a9e8b6d
Raymond Hettinger authored Jun 30, 2002

8a9e8b6d
Code modernization. Replace v=s[i]; del s[i] with single lookup v=s.pop(i) · 46ac8eb3
Raymond Hettinger authored Jun 30, 2002

46ac8eb3
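
For illustration, the two spellings named in the commit title above behave the same on a list; the variable names here are made up for the example.

```python
# Old spelling: look the item up, then delete it (two index operations).
s = ["a", "b", "c"]
i = 1
v = s[i]
del s[i]

# Modernized spelling: a single pop() returns the item and removes it.
t = ["a", "b", "c"]
w = t.pop(i)

print(v, s, w, t)
```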

29 Jun, 2002 9 commits
- Clarify the version information for the unicode() built-in. · 78e057a3
  Fred Drake authored Jun 29, 2002
```
Closes SF bug #575272.
```
  78e057a3
- Another test of long headers. · 19698174
  Barry Warsaw authored Jun 29, 2002
  
  19698174
- Oleg Broytmann's support for RFC 2231 encoded parameters, SF patch #549133 · 9546e797
  Barry Warsaw authored Jun 29, 2002
```
New test cases.
```
  9546e797
- Oleg Broytmann's support for RFC 2231 encoded parameters, SF patch #549133 · 12566a88
  Barry Warsaw authored Jun 29, 2002
```
Specifically,

decode_rfc2231(), encode_rfc2231(): Functions to encode and decode RFC
2231 style parameters.

decode_params(): Function to decode a list of parameters.
```
  12566a88
- Oleg Broytmann's support for RFC 2231 encoded parameters, SF patch #549133 · 908dc4be
  Barry Warsaw authored Jun 29, 2002
```
Specifically,

_formatparam(): Teach this about encoded `param' arguments, which are
a 3-tuple of items (charset, language, value).  language is ignored.

_unquotevalue(): Handle both 3-tuple RFC 2231 values and unencoded
values.

_get_params_preserve(): Decode the parameters before returning them.

get_params(), get_param(): Use _unquotevalue().

get_filename(), get_boundary(): Teach these about encoded (3-tuple)
parameters.
```
  908dc4be
- test_multilingual(): Test for Header.__unicode__(). · 3fdc889e
  Barry Warsaw authored Jun 29, 2002
  
  3fdc889e
- __unicode__(): Patch # 541263 by Mikhail Zabaluev, implementation · 8e69bdac
  Barry Warsaw authored Jun 29, 2002
```
modified by Barry.
```
  8e69bdac
- Add documentation for new textwrap module. · ae64f3ad
  Greg Ward authored Jun 29, 2002
  
  ae64f3ad
- Typo fix. · 8b46c71d
  Greg Ward authored Jun 29, 2002
  
  8b46c71d
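
The RFC 2231 support added on 29 Jun (SF patch #549133) can be exercised roughly as follows. This sketch uses the `email` package as shipped with current Python 3, which may differ in detail from the 2002 code; the file name and the message text are invented for the example.

```python
import email
from email.utils import decode_rfc2231, encode_rfc2231

# Encode a non-ASCII parameter value in RFC 2231 style: charset'language'value
encoded = encode_rfc2231("na\u00efve.txt", charset="utf-8")

# decode_rfc2231() only splits the three parts; the value stays percent-encoded.
charset, language, quoted_value = decode_rfc2231(encoded)

# A hypothetical message carrying the encoded parameter (note the trailing '*').
raw = f"Content-Disposition: attachment; filename*={encoded}\n\nbody\n"
msg = email.message_from_string(raw)

# get_filename() understands the encoded (3-tuple) parameter and collapses it.
filename = msg.get_filename()
print(encoded, charset, repr(language), filename)
```
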

28 Jun, 2002 7 commits

Track change of begin() to _begin(). · 566fe9ef
Jeremy Hylton authored Jun 28, 2002

566fe9ef

Lots of new and updated tests to check for proper ascii header · b6a92139

Barry Warsaw authored Jun 28, 2002

folding.  Note that some of the Japanese tests have changed, but I
don't really know if they are correct or not. :(

Someone with Japanese and RFC 2047 expertise, please take a look!

b6a92139

_max_append(): When adding the string `s' to its own line, it should · ba2577b7
Barry Warsaw authored Jun 28, 2002
```
be lstrip'd so that old continuation whitespace is replaced by that
specified in Header's continuation_ws parameter.
```
ba2577b7

Teach this class about "highest-level syntactic breaks" but only for · 76612508

Barry Warsaw authored Jun 28, 2002

headers with no charset or 'us-ascii' charsets.  Actually this is only
partially true: we know about semicolons (but not true parameters) and
we know about whitespace (but not technically folding whitespace).
Still it should be good enough for all practical purposes.

Other changes include:

__init__(): Add a continuation_ws argument, which defaults to a single
space.  Set this to change the whitespace used for continuation lines
when a header must be split.  Also, changed the way header line
lengths are calculated, so that they take into account continuation_ws
(when tabs-expanded) and any provided header_name parameter.  This
should do much better on returning split headers for which the first
and subsequent lines must fit into a specified width.

guess_maxlinelen(): Removed.  I don't think we need this method as
part of the public API.

encode_chunks() -> _encode_chunks(): I don't think we need this one as
part of the public API either.

76612508
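
A small sketch of the `Header` arguments discussed in the commit above, using `email.header.Header` from current Python 3. The folding rules have been rewritten since 2002, so this shows only the general shape of the API (`header_name`, `maxlinelen`, `continuation_ws`), not the behavior of the code in that commit; the sample text is invented.

```python
from email.header import Header

# Twenty short ASCII words, far too long for one 40-column header line.
text = " ".join("word%02d" % i for i in range(20))

# header_name is taken into account when sizing the first line;
# continuation_ws is the whitespace argument described above (default: one space).
header = Header(text, header_name="Subject", maxlinelen=40, continuation_ws=" ")

folded = header.encode()
lines = folded.split("\n")
for line in lines:
    print(repr(line))
```

Each continuation line starts with whitespace, so the folded value can be unfolded back to the original words.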

_split_header(): The code here was terminally broken because it didn't · 062749ac

Barry Warsaw authored Jun 28, 2002

know anything about RFC 2047 encoded headers.  Fortunately we have a
perfectly good header splitter in Header.encode().  So we just call
that to give us a properly formatted and split header.
Header.encode() didn't know about "highest-level syntactic breaks" but
that's been fixed now too.

062749ac

Simplify HTTPSConnection constructor. · 7c75c99a
Jeremy Hylton authored Jun 28, 2002
```
See discussion in SF bug 458463.
```
7c75c99a

Close SF patch 523944: importing modules with foreign newlines. · 13f99d70

Jeremy Hylton authored Jun 28, 2002

Didn't use the patch, because universal newlines support made it easy.
It might be worth fixing the actual problem in the 2.2 maintenance
branch, in which case the patch is still needed.

13f99d70