Commits · 263c2ec1264339ddb61bdd3f24d4716011f8cac7 · Kirill Smelkov / cpython

18 May, 2001 1 commit

A much improved HTML parser -- a replacement for sgmllib. The API is · 263c2ec1

Guido van Rossum authored May 18, 2001

derived from but not quite compatible with that of sgmllib, so it's a
new file.  I suppose it needs documentation, and htmllib needs to be
changed to use this instead of sgmllib, and sgmllib needs to be
declared obsolete.  But that can all be done later.

This code was first published as part of TAL (part of Zope Page
Templates), but that was strongly based on sgmllib anyway.  Authors
are Fred Drake and Guido van Rossum.
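
For readers unfamiliar with the handler-method style this parser uses, here is a minimal sketch written against `html.parser.HTMLParser`, the Python 3 descendant of this file. The `LinkCollector` class, the `collect_links` helper, and the sample markup are assumptions made for illustration, not part of the commit.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical subclass: collect href values from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling us;
        # attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def collect_links(markup):
    """Feed markup to a fresh parser and return the hrefs it saw."""
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.links
```

Subclasses override only the handlers they care about; everything else is ignored by the default no-op methods.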

263c2ec1

17 May, 2001 9 commits

Speed dictresize by collapsing its two passes into one; the reason given · 4a5685e9

Tim Peters authored May 17, 2001

in the comments for using two passes was bogus, as the only object that
can get decref'ed due to the copy is the dummy key, and decref'ing dummy
can't have side effects (for one thing, dummy is immortal! for another,
it's a string object, not a potentially dangerous user-defined object).

4a5685e9

Added pymactoolboxglue.c and changed the exported symbols having to do with this. · ea37681c
Jack Jansen authored May 17, 2001

ea37681c

Dynamically loaded toolbox modules don't need to link against each other... · c3aac0e4

Jack Jansen authored May 17, 2001

Dynamically loaded toolbox modules don't need to link against each other anymore, due to the new glue code that ties them together.

c3aac0e4

Glue code to connect obj_New and obj_Convert routines (the PyArg_Parse and... · 97049779

Jack Jansen authored May 17, 2001

Glue code to connect obj_New and obj_Convert routines (the PyArg_Parse and Py_BuildValue helpers) from one dynamically imported module to another.

97049779

First step in porting MacPython modules to OSX/unix: break all references... · 6785df78

Jack Jansen authored May 17, 2001

First step in porting MacPython modules to OSX/unix: break all references between modules except for the obj_New() and obj_Convert() routines, the PyArg_Parse and Py_BuildValue helpers.

And these can now be vectored through glue routines (by defining USE_TOOLBOX_OBJECT_GLUE) which will do the necessary imports, whereupon the module's init routine will tell the glue routine about the real conversion routine address and everything is fine again.

6785df78

Fixed botched indent in _init_mac() code. (It may never be executed, · 25a209cf

Guido van Rossum authored May 17, 2001

but it still can't have any syntax errors.  Went a little too fast
there, Jack? :-)

25a209cf

Made distutils understand the MacPython Carbon runtime model. Distutils will... · d779b4d0

Jack Jansen authored May 17, 2001

Made distutils understand the MacPython Carbon runtime model. Distutils will build for the runtime model you are currently using for the interpreter.

d779b4d0

Fixed macroman<->latin1 conversion. Some chars don't · 0e2430d1

Jack Jansen authored May 17, 2001

exist in latin1, but at least the roundtrip results in the
same macroman characters.

0e2430d1

Fixed macroman<->latin1 conversion. Some characters don't exist in latin1, but... · e4f88843

Jack Jansen authored May 17, 2001

Fixed macroman<->latin1 conversion. Some characters don't exist in latin1, but at least the roundtrip gives
the correct macroman characters again.

e4f88843

16 May, 2001 1 commit

Moved the encoding map building logic from the individual mapping · a022398f

Marc-André Lemburg authored May 16, 2001

codec files to codecs.py and added logic so that multi mappings
in the decoding maps now result in mappings to None (undefined mapping)
in the encoding maps.
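
The inversion rule described here can be sketched as follows. This is an illustration of the stated behavior (a target that appears more than once in the decoding map becomes `None` in the encoding map), not a copy of the code in codecs.py; the function name `build_encoding_map` and the sample map are assumptions.

```python
def build_encoding_map(decoding_map):
    """Invert a decoding map; targets reached more than once map to None."""
    encoding_map = {}
    for byte, target in decoding_map.items():
        if target in encoding_map:
            # Two or more bytes decode to the same target, so the inverse
            # is ambiguous: record it as an undefined mapping.
            encoding_map[target] = None
        else:
            encoding_map[target] = byte
    return encoding_map
```

For example, `build_encoding_map({0x41: 0x41, 0x3F: 0x1A, 0x40: 0x1A})` keeps the unambiguous entry for `0x41` and maps `0x1A` to `None`.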

a022398f

15 May, 2001 10 commits

Bah, somehow the macroman<->iso-latin-1 translation got lost during the merge.... · 7a13cb48

Jack Jansen authored May 15, 2001

Bah, somehow the macroman<->iso-latin-1 translation got lost during the merge. Checking in one fixed file to make sure MacCVS Pro isn't the problem. If it isn't, a flurry of checkins will follow tomorrow. If it is... well...

7a13cb48

Speed tuple comparisons in two ways: · dd646a0d

Tim Peters authored May 15, 2001

1. Omit the early-out EQ/NE "lengths different?" test.  Was unable to find
   any real code where it triggered, but it always costs.  The same is not
   true of list richcmps, where different-size lists appeared to get
   compared about half the time.
2. Because tuples are immutable, there's no need to refetch the lengths of
   both tuples from memory again on each loop trip.

BUG ALERT:  The tuple (and list) richcmp algorithm is arguably wrong,
because it won't believe there's any difference unless Py_EQ returns false
for some corresponding elements:

>>> class C:
...     def __lt__(x, y): return 1
...     __eq__ = __lt__
...
>>> C() < C()
1
>>> (C(),) < (C(),)
0
>>>

That doesn't make sense -- provided you believe the defn. of C makes sense.
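
To make the complaint concrete, here is a rough Python sketch of a lexicographic "less than" that relies on `==` to locate the first differing pair, which is the behavior the message describes. `tuple_lt` is a hypothetical helper written for illustration; the real algorithm lives in C.

```python
class C:
    # Same shape as the class in the commit message:
    # both comparisons always answer "true".
    def __lt__(self, other):
        return True

    __eq__ = __lt__


def tuple_lt(a, b):
    """Lexicographic a < b that trusts == to find the first difference."""
    for x, y in zip(a, b):
        if not (x == y):
            # First pair that claims to differ decides the result.
            return bool(x < y)
    # No pair admitted to differing, so only the lengths can decide.
    return len(a) < len(b)
```

Because `C.__eq__` always reports equality, the loop never reaches the `x < y` test, and two one-element tuples of equal length compare as "not less than" even though their elements claim otherwise.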

dd646a0d

Add NEWS item for new string methods. · 3c830574

Marc-André Lemburg authored May 15, 2001

3c830574

Just changed "x,y" to "x, y" everywhere (i.e., inserted horizontal space · 3fbd4623

Tim Peters authored May 15, 2001

after commas that didn't have any).

3fbd4623

Add quoted-printable codec · fe0f94da

Guido van Rossum authored May 15, 2001

fe0f94da

Beef up the unicode() description a bit, based on material from AMK's · dbe75521

Fred Drake authored May 15, 2001

"What's New in Python ..." documents.

dbe75521

abspath(): Fix inconsistent indentation. · 80d26df1

Fred Drake authored May 15, 2001

80d26df1

This patch changes the way the string .encode() method works slightly · 164fe558

Marc-André Lemburg authored May 15, 2001

and introduces a new method .decode().

The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.

Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding which is normally
set to ASCII rendering this auto-conversion mechanism useless for
most Unicode encodings.

The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.

As demonstration of the new feature, the patch includes a few new
codecs which allow string to string encoding and decoding (rot13,
hex, zip, uu, base64).
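
A hedged sketch of what these same-type codecs look like in use. It is written for modern Python 3, where such codecs are reached through `codecs.encode()` and `codecs.decode()` rather than through the string methods this patch describes; the `roundtrip` helper and the sample data are assumptions for illustration.

```python
import codecs


def roundtrip(data, codec):
    """Encode then decode with the named codec; return both results."""
    encoded = codecs.encode(data, codec)
    return encoded, codecs.decode(encoded, codec)


# rot13 maps text to text; hex, base64 and zlib map bytes to bytes.
rot13_encoded, rot13_decoded = roundtrip("Hello", "rot13")
hex_encoded, hex_decoded = roundtrip(b"hi", "hex")
b64_encoded, b64_decoded = roundtrip(b"hi", "base64")
zlib_encoded, zlib_decoded = roundtrip(b"hi" * 100, "zlib")
```

Here `rot13_encoded` is `"Uryyb"` and `hex_encoded` is `b"6869"`; in every case decoding returns the original input.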

Written by Marc-Andre Lemburg. Copyright assigned to the PSF.

164fe558

Add warnings to the strop module, for those functions that really · bb9a908a

Guido van Rossum authored May 15, 2001

*are* obsolete; three variables and the maketrans() function are not
(yet) obsolete.

Add a compensating warnings.filterwarnings() call to test_strop.py.

Add this to the NEWS.
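
The pairing of a deprecation warning with a compensating filter in the test can be sketched like this. `obsolete_upper` is a hypothetical stand-in for an obsolete strop function, and `call` is an illustrative helper; neither name comes from the commit.

```python
import warnings


def obsolete_upper(s):
    """Hypothetical stand-in for an obsolete strop-style function."""
    warnings.warn("obsolete_upper() is obsolete; use str.upper()",
                  DeprecationWarning, stacklevel=2)
    return s.upper()


def call(s, suppress):
    """Call obsolete_upper and report how many warnings got through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if suppress:
            # Compensating filter, as a test module might install:
            # ignore only this particular deprecation message.
            warnings.filterwarnings("ignore", message="obsolete_upper",
                                    category=DeprecationWarning)
        result = obsolete_upper(s)
    return result, len(caught)
```

With `suppress=False` one warning is recorded; with `suppress=True` the call still works but the warning is filtered out, which keeps test output clean while the function remains available.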

bb9a908a

Ignore 'build' and 'Makefile.pre'. · 89875bc6
Guido van Rossum authored May 15, 2001

89875bc6

14 May, 2001 13 commits
- Fix new compiler warnings. Also boost "start" from (C) int to long and · 8192a782
  Tim Peters authored May 14, 2001

  return a (C) long:  PyArg_ParseTuple and Py_BuildValue may not let us get
  at the size_t we really want, but C int is clearly too small for a 64-bit
  box, and both the start parameter and the return value should work for
  large mapped files even on 32-bit boxes.  The code really needs to be
  rethought from scratch (not by me, though ...).

  8192a782
- SF patch #418147 Fixes to allow compiling w/ Borland, from Stephen Hansen. · 8e5423c4
  Tim Peters authored May 14, 2001
  
  8e5423c4
- fcntl.ioctl(): Update error message; necessity noted by Michael Hudson. · 6e64fc6a
  Fred Drake authored May 14, 2001
  
  6e64fc6a
- Convert a couple of comments to docstrings -- PyUnit can use these when · 09f054f0
  Fred Drake authored May 14, 2001

  the regression test is run in verbose mode.

  09f054f0
- pprint's workhorse _safe_repr() function took time quadratic in the # of · cf942b8c
  Tim Peters authored May 14, 2001

  elements when crunching a list, dict or tuple.  Now takes linear time
  instead -- huge speedup for even moderately large containers, and the
  code is notably simpler too.
  Added some basic "is the output correct?" tests to test_pprint.

  cf942b8c
- Convert the pprint test to use PyUnit. · 0d0a777d
  Fred Drake authored May 14, 2001
  
  0d0a777d
- Make sure we include all of Python's numeric types in the data model · 83dfedeb
  Fred Drake authored May 14, 2001

  description, so that the introduction of complex is not a surprise.

  This closes SF bug #423429.

  83dfedeb
- Added a WITHOUT_FRAMEWORKS define to all the config files, so that on MacOS<=9... · 071dfee5
  Jack Jansen authored May 14, 2001

  Added a WITHOUT_FRAMEWORKS define to all the config files, so that on MacOS<=9 compiles use Universal Headers, not Carbon/Carbon.h.

  071dfee5
- Fix a typo, consistently spell ASCII in all caps, and insert blank · 59514c50
  Guido van Rossum authored May 14, 2001

  lines between paragraphs in Mark Hammond's news item about the default
  encoding in posixmodule.  Resist the temptation to reflow paragraphs.

  59514c50
- Fix the Py_FileSystemDefaultEncoding checkin - declare the variable in a... · 3651b69f
  Mark Hammond authored May 14, 2001

  Fix the Py_FileSystemDefaultEncoding checkin - declare the variable in fileobject.h, and initialize it in bltinmodule.

  3651b69f
- Fix the .find() method for memory maps. · 3c2e945a
  Greg Stein authored May 14, 2001

  1) it didn't obey the "start" parameter (and when it does, we must validate
     the value)
  2) the return value needs to be an absolute index, rather than relative to
     some arbitrary point in the file

  (checking CVS, it appears this method never worked; these changes bring it
   into line with typical .find() behavior)

  3c2e945a
- SF bug #423781: pprint.isrecursive() broken. · c361b38c
  Tim Peters authored May 14, 2001
  
  c361b38c
- Add mention of the default file system encoding for Windows. · 3ac4559d
  Mark Hammond authored May 14, 2001
  
  3ac4559d
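
The two requirements listed in the memory-map `.find()` fix above (3c2e945a), namely honoring `start` and returning an absolute index, can be checked against the modern `mmap` module with a short sketch. The `find_from` helper and the sample bytes are assumptions for illustration.

```python
import mmap


def find_from(data, needle, start):
    """Search an anonymous memory map for needle, beginning at start."""
    # data must be non-empty: a zero-length anonymous map is not allowed.
    with mmap.mmap(-1, len(data)) as m:
        m.write(data)
        # The result is an absolute offset into the map (or -1 if absent),
        # not an offset relative to start.
        return m.find(needle, start)
```

Searching `b"abcabc"` for `b"abc"` from offset 1 yields 3, the absolute position of the second occurrence.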
13 May, 2001 4 commits

A disgusting "fix" for the test___all__ failure under Windows. · a32a40d7
Tim Peters authored May 13, 2001

a32a40d7

Add support for Windows using "mbcs" as the default Unicode encoding when... · 62673723

Mark Hammond authored May 13, 2001

Add support for Windows using "mbcs" as the default Unicode encoding when dealing with the file system.  As discussed on python-dev and in patch 410465.

62673723

Aggressive reordering of dict comparisons. In case of collision, it stands · 07539686

Tim Peters authored May 13, 2001

to reason that me_key is much more likely to match the key we're looking
for than to match dummy, and if the key is absent me_key is much more
likely to be NULL than dummy: most dicts don't even have a dummy entry.
Running instrumented dict code over the test suite and some apps confirmed
that matching dummy was 200-300x less frequent than matching key in
practice. So this reorders the tests to try the common case first.
It can lose if a large dict with many collisions is mostly deleted, not
resized, and then frequently searched, but that's hardly a case we
should be favoring.

07539686

Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask". · 5770625e

Tim Peters authored May 13, 2001

The comment following used to say:
	/* We use ~hash instead of hash, as degenerate hash functions, such
	   as for ints <sigh>, can have lots of leading zeros. It's not
	   really a performance risk, but better safe than sorry.
	   12-Dec-00 tim:  so ~hash produces lots of leading ones instead --
	   what's the gain? */
That is, there was never a good reason for doing it.  And to the contrary,
as explained on Python-Dev last December, it tended to make the *sum*
(i + incr) & mask (which is the first table index examined in case of
collision) the same "too often" across distinct hashes.

Changing to the simpler "i = hash & mask" reduced the number of string-dict
collisions (== # of times we go around the lookup for-loop) from about
6 million to 5 million during a full run of the test suite (these are
approximate because the test suite does some random stuff from run to run).
The number of collisions in non-string dicts also decreased, but not as
dramatically.

Note that this may, for a given dict, change the order (wrt previous
releases) of entries exposed by .keys(), .values() and .items().  A number
of std tests suffered bogus failures as a result.  For dicts keyed by
small ints, or (less so) by characters, the order is much more likely to be
in increasing order of key now; e.g.,

>>> d = {}
>>> for i in range(10):
...    d[i] = i
...
>>> d
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>

Unfortunately, people may latch on to that in small examples and draw a
bogus conclusion.
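
The effect on the first probed slot can be seen with a small sketch. It assumes CPython, where the hash of a small non-negative int is the int itself, and a table size of 16; the `initial_slots` helper and that size are illustrative assumptions, and collision handling is not modeled at all.

```python
def initial_slots(keys, table_size=16, invert=False):
    """Return the first table index examined for each key.

    table_size is assumed to be a power of two, so mask = table_size - 1.
    invert=True models the old "i = (~hash) & mask" computation.
    """
    mask = table_size - 1
    slots = []
    for key in keys:
        h = hash(key)
        slots.append(((~h) if invert else h) & mask)
    return slots
```

Under these assumptions, `initial_slots(range(10))` gives slots 0 through 9 in order, while `invert=True` gives 15 down to 6. That is why small-int keys now tend to come out in increasing order, and also why code should not rely on it.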

test_support.py
    Moved test_extcall's sortdict() into test_support, made it stronger,
    and imported sortdict into other std tests that needed it.
test_unicode.py
    Excluded cp875 from the "roundtrip over range(128)" test, because
    cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
    See Python-Dev for excruciating details.
Cookie.py
    Changed various output functions to sort dicts before building
    strings from them.
test_extcall
    Fiddled the expected-result file.  This remains sensitive to native
    dict ordering, because, e.g., if there are multiple errors in a
    keyword-arg dict (and test_extcall sets up many cases like that), the
    specific error Python complains about first depends on native dict
    ordering.
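
A helper in the spirit of `sortdict()` might look like the sketch below. It is an illustration of the idea (render a dict with its items in sorted key order so expected-output files do not depend on native ordering), not the code actually added to test_support.

```python
def sortdict(d):
    """Return a repr-like string for d with items in sorted key order."""
    # Assumes the keys are mutually comparable, which holds for the
    # string-keyed and int-keyed dicts typical of test output.
    pairs = ["%r: %r" % (key, d[key]) for key in sorted(d)]
    return "{%s}" % ", ".join(pairs)
```

For example, `sortdict({"b": 2, "a": 1})` returns `"{'a': 1, 'b': 2}"` regardless of insertion or hash order.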

5770625e

12 May, 2001 2 commits

Got the first MacPython module working under MacOSX/MachO (gestalt). Main changes · fc00ce80

Jack Jansen authored May 12, 2001

are including Carbon/Carbon.h instead of the old headers (unless WITHOUT_FRAMEWORKS
is defined, as it will be for classic MacPython) and selectively disabling all the
stuff that is unneeded in a unix-Python (event handling, etc).

fc00ce80

Be more sensible about when to use TARGET_API_MAC_OS8 instead of... · 3c6be47a

Jack Jansen authored May 12, 2001

Be more sensible about when to use TARGET_API_MAC_OS8 instead of !TARGET_API_MAC_CARBON. This should greatly facilitate porting stuff to OSX in its MachO/BSD incarnation.

3c6be47a