Commit 43737641 authored by Andrew M. Kuchling

Removed forgotten text in list comprehensions section (taken from the Haskell description of listcomps and used as inspiration)
Rearranged sections (which accounts for much of the size of the diffs)
Added section on augmented assignment
Mentioned 'print >>file'
Broke up the "Core Changes" section into subsections
parent 699f98c0
......@@ -261,7 +261,6 @@ while the second one is correct:
[ (x,y) for x in seq1 for y in seq2]
\end{verbatim}
The idea of list comprehensions originally comes from the functional
programming language Haskell (\url{http://www.haskell.org}). Greg
Ewing argued most effectively for adding them to Python and wrote the
......@@ -269,95 +268,45 @@ initial list comprehension patch, which was then discussed for a
seemingly endless time on the python-dev mailing list and kept
up-to-date by Skip Montanaro.
A list comprehension has the form [ e | q[1], ..., q[n] ], n>=1, where
the q[i] qualifiers are either
* generators of the form p <- e, where p is a pattern (see Section
3.17) of type t and e is an expression of type [t]
* guards, which are arbitrary expressions of type Bool
* local bindings that provide new definitions for use in the
generated expression e or subsequent guards and generators.
% ======================================================================
\section{Distutils: Making Modules Easy to Install}
Before Python 2.0, installing modules was a tedious affair -- there
was no way to figure out automatically where Python is installed, or
what compiler options to use for extension modules. Software authors
had to go through an arduous ritual of editing Makefiles and
configuration files, which only really work on Unix and leave Windows
and MacOS unsupported. Software users faced wildly differing
installation instructions.
The SIG for distribution utilities, shepherded by Greg Ward, has
created the Distutils, a system to make package installation much
easier. They form the \module{distutils} package, a new part of
Python's standard library. In the best case, installing a Python
module from source will require the same steps: first you simply
unpack the tarball or zip archive, and then run ``\code{python setup.py
install}''. The platform will be automatically detected, the compiler
will be recognized, C extension modules will be compiled, and the
distribution installed into the proper directory. Optional
command-line arguments provide more control over the installation
process; the distutils package offers many places to override defaults
-- separating the build from the install, building or installing in
non-default directories, and more.
In order to use the Distutils, you need to write a \file{setup.py}
script. For the simple case, when the software contains only .py
files, a minimal \file{setup.py} can be just a few lines long:
\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
py_modules = ["module1", "module2"])
\end{verbatim}
The \file{setup.py} file isn't much more complicated if the software
consists of a few packages:
\section{Augmented Assignment}
Augmented assignment operators, another long-requested feature, have
been added to Python 2.0. Augmented assignment operators include
\code{+=}, \code{-=}, \code{*=}, and so forth. For example, the
statement \code{a += 2} increments the value of the variable \code{a}
by 2, equivalent to the slightly lengthier
\code{a = a + 2}.
The full list of supported assignment operators is \code{+=},
\code{-=}, \code{*=}, \code{/=}, \code{\%=}, \code{**=}, \code{\&=},
\code{|=}, \code{^=}, \code{>>=}, and \code{<<=}. Python classes can
override the augmented assignment operators by defining methods named
\method{__iadd__}, \method{__isub__}, etc. For example, the following
\class{Number} class stores a number and supports using += to create a
new instance with an incremented value.
\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
packages = ["package", "package.subpackage"])
class Number:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        return Number( self.value + increment)

n = Number(5)
n += 3
print n.value
\end{verbatim}
A C extension can be the most complicated case; here's an example taken from
the PyXML package:
The \method{__iadd__} special method is called with the value of the
increment, and should return a new instance with an appropriately
modified value; this return value is bound as the new value of the
variable on the left-hand side.
\begin{verbatim}
from distutils.core import setup, Extension
expat_extension = Extension('xml.parsers.pyexpat',
define_macros = [('XML_NS', None)],
include_dirs = [ 'extensions/expat/xmltok',
'extensions/expat/xmlparse' ],
sources = [ 'extensions/pyexpat.c',
'extensions/expat/xmltok/xmltok.c',
'extensions/expat/xmltok/xmlrole.c',
]
)
setup (name = "PyXML", version = "0.5.4",
ext_modules =[ expat_extension ] )
\end{verbatim}
The Distutils can also take care of creating source and binary
distributions. The ``sdist'' command, run by ``\code{python setup.py
sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}.
Adding new commands isn't difficult; ``bdist_rpm'' and
``bdist_wininst'' commands have already been contributed to create an
RPM distribution and a Windows installer for the software,
respectively. Commands to create other distribution formats such as
Debian packages and Solaris \file{.pkg} files are in various stages of
development.
All this is documented in a new manual, \textit{Distributing Python
Modules}, that joins the basic set of Python documentation.
Augmented assignment operators were first introduced in the C
programming language, and most C-derived languages, such as
\program{awk}, C++, Java, Perl, and PHP also support them. The augmented
assignment patch was implemented by Thomas Wouters.
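The rebinding behaviour of \method{__iadd__} described above can be
sketched as follows. This is only an illustration: it reuses the
\class{Number} class from the text, the name \code{original} is
introduced here purely to show the rebinding, and it is written with
\code{print()} calls so that it also runs on later interpreters (under
2.0 you would use the \keyword{print} statement instead).

```python
class Number:
    """Illustrative wrapper, mirroring the Number class in the text."""

    def __init__(self, value):
        self.value = value

    def __iadd__(self, increment):
        # Return a new instance; Python binds the left-hand name to it.
        return Number(self.value + increment)


n = Number(5)
original = n          # hypothetical extra name, kept to observe the rebinding
n += 3                # calls n.__iadd__(3) and rebinds n to the result

print(n.value)        # 8
print(original.value) # 5: the first instance was not modified
```

Because this \method{__iadd__} returns a fresh object, any other name
still bound to the old instance keeps seeing the old value.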
% ======================================================================
\section{String Methods}
......@@ -384,9 +333,10 @@ string manipulation functionality available through methods on both
2
\end{verbatim}
One thing that hasn't changed, April Fools' jokes notwithstanding, is
that Python strings are immutable. Thus, the string methods return new
strings, and do not modify the string on which they operate.
One thing that hasn't changed, a noteworthy April Fools' joke
notwithstanding, is that Python strings are immutable. Thus, the
string methods return new strings, and do not modify the string on
which they operate.
The old \module{string} module is still around for backwards
compatibility, but it mostly acts as a front-end to the new string
......@@ -467,115 +417,23 @@ March 2000 archives of the python-dev mailing list contain most of the
relevant discussion, especially in the threads titled ``Reference
cycle collection for Python'' and ``Finalization again''.
% ======================================================================
%\section{New XML Code}
%XXX write this section...
% ======================================================================
\section{Porting to 2.0}
New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough, often fixing initial design decisions that
turned out to be actively mistaken, that breaking backward compatibility
can't always be avoided. This section lists the changes in Python 2.0
that may cause old Python code to break.
The change which will probably break the most code is tightening up
the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{.append()} and \method{.insert()}.
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'. The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.
The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
2.0 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
2.0 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.
Some of the functions in the \module{socket} module are still
forgiving in this way. For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a tuple representing
an IP address, but \function{socket.connect( 'hostname', 25 )} also
works. \function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 2.0alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people wrote code which would break with the
stricter checking. GvR backed out the changes in the face of public
reaction, so for the \module{socket} module, the documentation was
fixed and the multiple argument form is simply marked as deprecated;
it \emph{will} be tightened up again in a future Python version.
Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2Gb; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 2.0, long integers can be used to multiply
or slice a sequence, and it'll behave as you'd intuitively expect it
to; \code{3L * 'abc'} produces 'abcabcabc', and \code{
(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in
various new places where previously only integers were accepted, such
as in the \method{seek()} method of file objects.
The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 2.0, but code which assumes the 'L' is
there, and does \code{str(longval)[:-1]} will now lose the final
digit.
Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses
\code{\%.17g} format string for C's \function{sprintf()}, while
\function{str()} uses \code{\%.12g} as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for certain numbers.
For example, the number 8.1 can't be represented exactly in binary, so
\code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is
\code{'8.1'}.
The \code{-X} command-line option, which turned all standard
exceptions into strings instead of classes, has been removed; the
standard exceptions will now always be classes. The
\module{exceptions} module containing the standard exceptions was
translated from Python to a built-in C module, written by Barry Warsaw
and Fredrik Lundh.
% Commented out for now -- I don't think anyone will care.
%The pattern and match objects provided by SRE are C types, not Python
%class instances as in 1.5. This means you can no longer inherit from
%\class{RegexObject} or \class{MatchObject}, but that shouldn't be much
%of a problem since no one should have been doing that in the first
%place.
% ======================================================================
\section{Core Changes}
\section{Other Core Changes}
Various minor changes have been made to Python's syntax and built-in
functions. None of the changes are very far-reaching, but they're
handy conveniences.
A change to syntax makes it more convenient to call a given function
\subsection{Minor Language Changes}
A new syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you do this with the \function{apply()}
In Python 1.5 and earlier, you'd use the \function{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}. Thanks to a patch from
Greg Ewing, 2.0 adds \code{f(*\var{args}, **\var{kw})} as a shorter
keyword arguments in the dictionary \var{kw}. \function{apply()}
is the same in 2.0, but thanks to a patch from
Greg Ewing, \code{f(*\var{args}, **\var{kw})} is available as a shorter
and clearer way to achieve the same effect. This syntax is
symmetrical with the syntax for defining functions:
......@@ -586,31 +444,31 @@ def f(*args, **kw):
...
\end{verbatim}
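As a small sketch of the calling side, the following passes a tuple and
a dictionary to a function using the new syntax. The function body and
the sample values are made up for illustration.

```python
def f(*args, **kw):
    # Positional arguments arrive as a tuple, keyword arguments as a dict.
    return args, kw


args = (1, 2)
kw = {'sep': '-'}

# Equivalent in effect to the older apply(f, args, kw).
result = f(*args, **kw)
print(result)
```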
A new format style is available when using the \code{\%} operator.
The \keyword{print} statement can now have its output directed to a
file-like object by following the \keyword{print} with \code{>>
\var{fileobj}}, similar to the redirection operator in Unix shells.
Previously you'd either have to use the \method{write()} method of the
file-like object, which lacks the convenience and simplicity of
\keyword{print}, or you could assign a new value to \code{sys.stdout}
and then restore the old value. For sending output to standard error,
it's much easier to write this:
\begin{verbatim}
print >> sys.stderr, "Warning: action field not supplied"
\end{verbatim}
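For contrast, the older workaround of rebinding \code{sys.stdout} might
look roughly like the sketch below. The \code{StringIO} buffer stands in
for any file-like object and is an assumption of this example, as is the
use of \code{print()} with parentheses so that the code runs on later
interpreters.

```python
import sys
from io import StringIO   # assumption: stand-in for any file-like object

buffer = StringIO()
saved_stdout = sys.stdout
sys.stdout = buffer        # redirect everything print writes
try:
    print("Warning: action field not supplied")
finally:
    sys.stdout = saved_stdout   # always restore the real stdout

captured = buffer.getvalue()
```

The save-and-restore dance is exactly the bookkeeping that the
\code{>>} form of \keyword{print} makes unnecessary.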
Modules can now be renamed on importing them, using the syntax
\code{import \var{module} as \var{name}} or \code{from \var{module}
import \var{name} as \var{othername}}. The patch was submitted by
Thomas Wouters.
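Both forms can be sketched with the standard \module{math} module; the
names \code{m} and \code{root} are arbitrary choices for this example.

```python
import math as m                 # bind the module to the name m
from math import sqrt as root    # bind math.sqrt to the name root

print(root(9))        # 3.0
print(m.sqrt is root) # True: both names refer to the same function
```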
A new format style is available when using the \code{\%} operator;
'\%r' will insert the \function{repr()} of its argument. This was
also added from symmetry considerations, this time for symmetry with
the existing '\%s' format style, which inserts the \function{str()} of
its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
string containing \verb|'abc' abc|.
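The example from the text can be run directly; the variable name
\code{formatted} is only there for illustration.

```python
# %r inserts repr() of the argument, %s inserts str().
formatted = '%r %s' % ('abc', 'abc')
print(formatted)   # 'abc' abc
```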
A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been
added. \function{zip()} returns a list of tuples where each tuple
contains the i-th element from each of the argument sequences. The
difference between \function{zip()} and \code{map(None, \var{seq1},
\var{seq2})} is that \function{map()} pads the shorter sequences with
\code{None} if they aren't all of the same length, while \function{zip()} truncates the
returned list to the length of the shortest argument sequence.
The \function{int()} and \function{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.
Modules can now be renamed on importing them, using the syntax
\code{import \var{module} as \var{name}} or \code{from \var{module}
import \var{name} as \var{othername}}.
Previously there was no way to implement a class that overrode
Python's built-in \keyword{in} operator and implemented a custom
version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is
......@@ -638,17 +496,20 @@ b.append(b)
\end{verbatim}
The comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic.
\footnote{See the thread ``trashcan and PR\#7'' in the April 2000 archives of the python-dev mailing list for the discussion leading up to this implementation, and some useful relevant links.
data structures are isomorphic. \footnote{See the thread ``trashcan
and PR\#7'' in the April 2000 archives of the python-dev mailing list
for the discussion leading up to this implementation, and some useful
relevant links.
%http://www.python.org/pipermail/python-dev/2000-April/004834.html
}
Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState. (Confusingly, \code{sys.platform} is still \code{'win32'} on
Win64 because it seems that for ease of porting, MS Visual C++ treats code
as 32 bit.
) PythonWin also supports Windows CE; see the Python CE page at
\url{http://starship.python.net/crew/mhammond/ce/} for more information.
processor, mostly by Trent Mick of ActiveState. (Confusingly,
\code{sys.platform} is still \code{'win32'} on Win64 because it seems
that for ease of porting, MS Visual C++ treats code as 32 bit on Itanium.)
PythonWin also supports Windows CE; see the Python CE page at
\url{http://starship.python.net/crew/mhammond/ce/} for more
information.
An attempt has been made to alleviate one of Python's warts, the
often-confusing \exception{NameError} exception when code refers to a
......@@ -668,6 +529,22 @@ def f():
f()
\end{verbatim}
\subsection{Changes to Built-in Functions}
A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been
added. \function{zip()} returns a list of tuples where each tuple
contains the i-th element from each of the argument sequences. The
difference between \function{zip()} and \code{map(None, \var{seq1},
\var{seq2})} is that \function{map()} pads the shorter sequences with
\code{None} if they aren't all of the same length, while \function{zip()} truncates the
returned list to the length of the shortest argument sequence.
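A minimal sketch of the truncating behaviour, with made-up sample
sequences. The \code{list()} call is an assumption added so that the
example also works on later interpreters where \function{zip()} returns
an iterator; in 2.0 \function{zip()} returns a list directly.

```python
seq1 = [1, 2, 3]
seq2 = ['a', 'b']

# The result stops at the length of the shortest argument sequence.
pairs = list(zip(seq1, seq2))
print(pairs)   # [(1, 'a'), (2, 'b')]
```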
The \function{int()} and \function{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.
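The three calls above can be checked with a short sketch. The variable
names are illustrative, and the exact wording of the error message may
differ between versions, so only the exception type is tested here.

```python
decimal_value = int('123', 10)   # 123
hex_value = int('123', 16)       # 291, i.e. 1*256 + 2*16 + 3

try:
    int(123, 16)                 # a base is only allowed with a string
except TypeError:
    rejected = True
else:
    rejected = False

print(decimal_value, hex_value, rejected)
```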
A new variable holding more detailed version information has been
added to the \module{sys} module. \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
......@@ -692,6 +569,96 @@ else:
can be reduced to a single \code{return dict.setdefault(key, [])} statement.
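A rough sketch of that one-line form in use; the helper name
\code{get_list} and the sample dictionary are hypothetical.

```python
def get_list(table, key):
    # Return the list stored under key, inserting an empty one if missing.
    return table.setdefault(key, [])


index = {}
get_list(index, 'spam').append(1)   # creates the list, then appends
get_list(index, 'spam').append(2)   # reuses the same list
print(index)   # {'spam': [1, 2]}
```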
% ======================================================================
\section{Porting to 2.0}
New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough, often fixing initial design decisions that
turned out to be actively mistaken, that breaking backward compatibility
can't always be avoided. This section lists the changes in Python 2.0
that may cause old Python code to break.
The change which will probably break the most code is tightening up
the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{.append()} and \method{.insert()}.
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'. The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.
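The stricter behaviour and the fix can be sketched like this; the flag
\code{raised} is just a device for the example, and the exact error
message is not checked because it varies between versions.

```python
L = []

try:
    L.append(1, 2)        # two arguments: rejected by the stricter check
except TypeError:
    raised = True
else:
    raised = False

L.append((1, 2))          # the fix: pass one tuple argument
print(raised, L)          # True [(1, 2)]
```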
The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
2.0 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
2.0 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.
Some of the functions in the \module{socket} module are still
forgiving in this way. For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a tuple representing
an IP address, but \function{socket.connect( 'hostname', 25 )} also
works. \function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 2.0alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people wrote code which would break with the
stricter checking. GvR backed out the changes in the face of public
reaction, so for the \module{socket} module, the documentation was
fixed and the multiple argument form is simply marked as deprecated;
it \emph{will} be tightened up again in a future Python version.
Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2Gb; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 2.0, long integers can be used to multiply
or slice a sequence, and it'll behave as you'd intuitively expect it
to; \code{3L * 'abc'} produces 'abcabcabc', and \code{
(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in
various new places where previously only integers were accepted, such
as in the \method{seek()} method of file objects.
The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 2.0, but code which assumes the 'L' is
there, and does \code{str(longval)[:-1]} will now lose the final
digit.
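One defensive way to write such code, offered only as a sketch, is to
strip the character only when it is actually present. The helper name
\code{digits\_of} is hypothetical.

```python
def digits_of(value):
    # Convert to text and drop a trailing 'L' only if one is there,
    # so the result is the same whether or not str() adds the suffix.
    text = str(value)
    if text.endswith('L'):
        text = text[:-1]
    return text


print(digits_of(12345678901234567890))
```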
Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses
\code{\%.17g} format string for C's \function{sprintf()}, while
\function{str()} uses \code{\%.12g} as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for certain numbers.
For example, the number 8.1 can't be represented exactly in binary, so
\code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is
\code{'8.1'}.
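The two precisions can be compared directly with the \code{\%} operator,
which avoids depending on how a particular interpreter version
implements \function{repr()} and \function{str()}. The variable names
are illustrative.

```python
x = 8.1

repr_style = '%.17g' % x   # the precision 2.0 uses for repr()
str_style = '%.12g' % x    # the precision used for str()

print(repr_style)   # 8.0999999999999996
print(str_style)    # 8.1
```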
The \code{-X} command-line option, which turned all standard
exceptions into strings instead of classes, has been removed; the
standard exceptions will now always be classes. The
\module{exceptions} module containing the standard exceptions was
translated from Python to a built-in C module, written by Barry Warsaw
and Fredrik Lundh.
% Commented out for now -- I don't think anyone will care.
%The pattern and match objects provided by SRE are C types, not Python
%class instances as in 1.5. This means you can no longer inherit from
%\class{RegexObject} or \class{MatchObject}, but that shouldn't be much
%of a problem since no one should have been doing that in the first
%place.
% ======================================================================
\section{Extending/Embedding Changes}
......@@ -754,6 +721,89 @@ Python 2.0's source now uses only ANSI C prototypes, so compiling Python now
requires an ANSI C compiler, and can no longer be done using a compiler that
only supports K\&R C.
% ======================================================================
\section{Distutils: Making Modules Easy to Install}
Before Python 2.0, installing modules was a tedious affair -- there
was no way to figure out automatically where Python is installed, or
what compiler options to use for extension modules. Software authors
had to go through an arduous ritual of editing Makefiles and
configuration files, which only really work on Unix and leave Windows
and MacOS unsupported. Software users faced wildly differing
installation instructions.
The SIG for distribution utilities, shepherded by Greg Ward, has
created the Distutils, a system to make package installation much
easier. They form the \module{distutils} package, a new part of
Python's standard library. In the best case, installing a Python
module from source will require the same steps: first you simply
unpack the tarball or zip archive, and then run ``\code{python setup.py
install}''. The platform will be automatically detected, the compiler
will be recognized, C extension modules will be compiled, and the
distribution installed into the proper directory. Optional
command-line arguments provide more control over the installation
process; the distutils package offers many places to override defaults
-- separating the build from the install, building or installing in
non-default directories, and more.
In order to use the Distutils, you need to write a \file{setup.py}
script. For the simple case, when the software contains only .py
files, a minimal \file{setup.py} can be just a few lines long:
\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
py_modules = ["module1", "module2"])
\end{verbatim}
The \file{setup.py} file isn't much more complicated if the software
consists of a few packages:
\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
packages = ["package", "package.subpackage"])
\end{verbatim}
A C extension can be the most complicated case; here's an example taken from
the PyXML package:
\begin{verbatim}
from distutils.core import setup, Extension
expat_extension = Extension('xml.parsers.pyexpat',
define_macros = [('XML_NS', None)],
include_dirs = [ 'extensions/expat/xmltok',
'extensions/expat/xmlparse' ],
sources = [ 'extensions/pyexpat.c',
'extensions/expat/xmltok/xmltok.c',
'extensions/expat/xmltok/xmlrole.c',
]
)
setup (name = "PyXML", version = "0.5.4",
ext_modules =[ expat_extension ] )
\end{verbatim}
The Distutils can also take care of creating source and binary
distributions. The ``sdist'' command, run by ``\code{python setup.py
sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}.
Adding new commands isn't difficult; ``bdist_rpm'' and
``bdist_wininst'' commands have already been contributed to create an
RPM distribution and a Windows installer for the software,
respectively. Commands to create other distribution formats such as
Debian packages and Solaris \file{.pkg} files are in various stages of
development.
All this is documented in a new manual, \textit{Distributing Python
Modules}, that joins the basic set of Python documentation.
% ======================================================================
%\section{New XML Code}
%XXX write this section...
% ======================================================================
\section{Module changes}
......