Kirill Smelkov / cpython
Commit f6945183, authored 17 years ago by Georg Brandl
Update docs w.r.t. PEP 3100 changes -- patch for GHOP by Dan Finnie.
Parent: f25ef505
Showing 20 changed files with 221 additions and 313 deletions
Doc/extending/extending.rst  +3 -4
Doc/howto/functional.rst  +59 -87
Doc/howto/regex.rst  +1 -1
Doc/howto/unicode.rst  +65 -132
Doc/library/array.rst  +0 -19
Doc/library/collections.rst  +1 -1
Doc/library/configparser.rst  +11 -11
Doc/library/csv.rst  +2 -2
Doc/library/datatypes.rst  +2 -2
Doc/library/easydialogs.rst  +1 -1
Doc/library/email.charset.rst  +2 -2
Doc/library/email.header.rst  +10 -10
Doc/library/email.util.rst  +3 -3
Doc/library/fcntl.rst  +0 -1
Doc/library/fileinput.rst  +0 -6
Doc/library/functions.rst  +28 -12
Doc/library/gettext.rst  +11 -12
Doc/library/imp.rst  +13 -0
Doc/library/itertools.rst  +5 -3
Doc/library/logging.rst  +4 -4
Doc/extending/extending.rst
...
...
@@ -826,10 +826,9 @@ to run the detector (the :func:`collect` function), as well as configuration
interfaces and the ability to disable the detector at runtime. The cycle
detector is considered an optional component; though it is included by default,
it can be disabled at build time using the :option:`--without-cycle-gc` option
to the :program:`configure` script on Unix platforms (including Mac OS X) or by
removing the definition of ``WITH_CYCLE_GC`` in the :file:`pyconfig.h` header on
other platforms. If the cycle detector is disabled in this way, the :mod:`gc`
module will not be available.
to the :program:`configure` script on Unix platforms (including Mac OS X). If
the cycle detector is disabled in this way, the :mod:`gc` module will not be
available.
.. _refcountsinpython:
...
...
Doc/howto/functional.rst
...
...
@@ -314,7 +314,7 @@ this::
Sets can take their contents from an iterable and let you iterate over the set's
elements::
S = set((2, 3, 5, 7, 11, 13))
S = {2, 3, 5, 7, 11, 13}
for i in S:
    print(i)
...
...
@@ -616,29 +616,26 @@ Built-in functions
Let's look in more detail at built-in functions often used with iterators.
Two of Python's built-in functions, :func:`map` and :func:`filter`, are somewhat
obsolete; they duplicate the features of list comprehensions but return actual
lists instead of iterators.
Two of Python's built-in functions, :func:`map` and :func:`filter` duplicate the
features of generator expressions:
``map(f, iterA, iterB, ...)`` returns a list containing ``f(iterA[0], iterB[0]),
f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``.
``map(f, iterA, iterB, ...)`` returns an iterator over the sequence
``f(iterA[0], iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``.
::
def upper(s):
    return s.upper()
map(upper, ['sentence', 'fragment']) =>
list(map(upper, ['sentence', 'fragment'])) =>
['SENTENCE', 'FRAGMENT']
[upper(s) for s in ['sentence', 'fragment']] =>
list(upper(s) for s in ['sentence', 'fragment']) =>
['SENTENCE', 'FRAGMENT']
As shown above, you can achieve the same effect with a list comprehension. The
:func:`itertools.imap` function does the same thing but can handle infinite
iterators; it'll be discussed later, in the section on the :mod:`itertools` module.
You can of course achieve the same effect with a list comprehension.
``filter(predicate, iter)`` returns a list that contains all the sequence
elements that meet a certain condition, and is similarly duplicated by list
``filter(predicate, iter)`` returns an iterator over all the sequence elements
that meet a certain condition, and is similarly duplicated by list
comprehensions. A **predicate** is a function that returns the truth value of
some condition; for use with :func:`filter`, the predicate must take a single
value.
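As a quick check of the behaviour this hunk documents, the sketch below assumes a Python 3 interpreter and reuses the ``upper`` and ``is_even`` helpers from the documentation examples. It illustrates that ``map`` and ``filter`` now return lazy iterators, which must be wrapped in ``list()`` to obtain the lists shown in the examples.

```python
def upper(s):
    return s.upper()

def is_even(x):
    return (x % 2) == 0

# In Python 3, map() and filter() return lazy iterators, not lists.
mapped = map(upper, ['sentence', 'fragment'])
filtered = filter(is_even, range(10))

# An iterator returns itself from iter(); a list would not.
mapped_is_iterator = iter(mapped) is mapped

# Materialise the results with list() when an actual list is needed.
mapped_list = list(mapped)
filtered_list = list(filtered)

# The equivalent generator expressions, as described in the new text.
gen_mapped = list(upper(s) for s in ['sentence', 'fragment'])
gen_filtered = list(x for x in range(10) if is_even(x))

print(mapped_list)
print(filtered_list)
```

Once consumed, ``mapped`` and ``filtered`` are exhausted; calling ``list()`` on them a second time yields an empty list.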
...
...
@@ -648,69 +645,61 @@ value.
def is_even(x):
    return (x % 2) == 0
filter(is_even, range(10)) =>
list(filter(is_even, range(10))) =>
[0, 2, 4, 6, 8]
This can also be written as a list comprehension::
This can also be written as a generator expression::
>>> [x for x in range(10) if is_even(x)]
>>> list(x for x in range(10) if is_even(x))
[0, 2, 4, 6, 8]
:func:`filter` also has a counterpart in the :mod:`itertools` module,
:func:`itertools.ifilter`, that returns an iterator and can therefore handle
infinite sequences just as :func:`itertools.imap` can.
``reduce(func, iter, [initial_value])`` doesn't have a counterpart in the
:mod:`itertools` module because it cumulatively performs an operation on all the
iterable's elements and therefore can't be applied to infinite iterables.
``func`` must be a function that takes two elements and returns a single value.
:func:`reduce` takes the first two elements A and B returned by the iterator and
calculates ``func(A, B)``. It then requests the third element, C, calculates
``func(func(A, B), C)``, combines this result with the fourth element returned,
and continues until the iterable is exhausted. If the iterable returns no
values at all, a :exc:`TypeError` exception is raised. If the initial value is
supplied, it's used as a starting point and ``func(initial_value, A)`` is the
first calculation.
::
import operator
reduce(operator.concat, ['A', 'BB', 'C']) =>
'ABBC'
reduce(operator.concat, []) =>
TypeError: reduce() of empty sequence with no initial value
reduce(operator.mul, [1,2,3], 1) =>
6
reduce(operator.mul, [], 1) =>
1
If you use :func:`operator.add` with :func:`reduce`, you'll add up all the
elements of the iterable. This case is so common that there's a special
``functools.reduce(func, iter, [initial_value])`` cumulatively performs an
operation on all the iterable's elements and, therefore, can't be applied to
infinite iterables. ``func`` must be a function that takes two elements and
returns a single value. :func:`functools.reduce` takes the first two elements A
and B returned by the iterator and calculates ``func(A, B)``. It then requests
the third element, C, calculates ``func(func(A, B), C)``, combines this result
with the fourth element returned, and continues until the iterable is exhausted.
If the iterable returns no values at all, a :exc:`TypeError` exception is
raised. If the initial value is supplied, it's used as a starting point and
``func(initial_value, A)`` is the first calculation. ::
import operator
import functools
functools.reduce(operator.concat, ['A', 'BB', 'C']) =>
'ABBC'
functools.reduce(operator.concat, []) =>
TypeError: reduce() of empty sequence with no initial value
functools.reduce(operator.mul, [1,2,3], 1) =>
6
functools.reduce(operator.mul, [], 1) =>
1
If you use :func:`operator.add` with :func:`functools.reduce`, you'll add up all
the elements of the iterable. This case is so common that there's a special
built-in called :func:`sum` to compute it::
reduce(operator.add, [1,2,3,4], 0) =>
10
sum([1,2,3,4]) =>
10
sum([]) =>
0
functools.reduce(operator.add, [1,2,3,4], 0) =>
10
sum([1,2,3,4]) =>
10
sum([]) =>
0
For many uses of :func:`reduce`, though, it can be clearer to just write the
obvious :keyword:`for` loop::
# Instead of:
product = reduce(operator.mul, [1,2,3], 1)
# Instead of:
product = functools.reduce(operator.mul, [1,2,3], 1)
# You can write:
product = 1
for i in [1,2,3]:
    product *= i
# You can write:
product = 1
for i in [1,2,3]:
    product *= i
``enumerate(iter)`` counts off the elements in the iterable, returning 2-tuples
containing the count and each element.
::
containing the count and each element. ::
enumerate(['subject', 'verb', 'object']) =>
(0, 'subject'), (1, 'verb'), (2, 'object')
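The ``reduce`` changes earlier in this hunk can be exercised with the following sketch, which assumes Python 3, where ``reduce`` is only available as ``functools.reduce``. The variable names are illustrative. The exact wording of the ``TypeError`` message differs between Python versions, so the sketch checks only that the error is raised.

```python
import functools
import operator

# Cumulative concatenation: ('A' + 'BB') + 'C'
concatenated = functools.reduce(operator.concat, ['A', 'BB', 'C'])

# With an initial value, the first calculation is func(initial_value, A).
product = functools.reduce(operator.mul, [1, 2, 3], 1)

# An empty iterable with an initial value just returns that value.
empty_product = functools.reduce(operator.mul, [], 1)

# An empty iterable without an initial value raises TypeError.
raised_type_error = False
try:
    functools.reduce(operator.concat, [])
except TypeError:
    raised_type_error = True

# sum() covers the common operator.add case.
total = sum([1, 2, 3, 4])
same_total = functools.reduce(operator.add, [1, 2, 3, 4], 0)

print(concatenated, product, empty_product, total)
```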
...
...
@@ -723,12 +712,10 @@ indexes at which certain conditions are met::
if line.strip() == '':
    print('Blank line at line #%i' % i)
``sorted(iterable, [cmp=None], [key=None], [reverse=False)`` collects all the
elements of the iterable into a list, sorts the list, and returns the sorted
result. The ``cmp``, ``key``, and ``reverse`` arguments are passed through to
the constructed list's ``.sort()`` method.
::
``sorted(iterable, [key=None], [reverse=False)`` collects all the elements of
the iterable into a list, sorts the list, and returns the sorted result. The
``key``, and ``reverse`` arguments are passed through to the constructed list's
``sort()`` method. ::
import random
# Generate 8 random numbers between [0, 10000)
...
...
@@ -962,14 +949,7 @@ consumed more than the others.
Calling functions on elements
-----------------------------
Two functions are used for calling other functions on the contents of an
iterable.
``itertools.imap(f, iterA, iterB, ...)`` returns a stream containing
``f(iterA[0], iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``::
itertools.imap(operator.add, [5, 6, 5], [1, 2, 3]) =>
6, 8, 8
``itertools.imap(func, iter)`` is the same as built-in :func:`map`.
The ``operator`` module contains a set of functions corresponding to Python's
operators. Some examples are ``operator.add(a, b)`` (adds two values),
...
...
@@ -992,14 +972,7 @@ Selecting elements
Another group of functions chooses a subset of an iterator's elements based on a
predicate.
``itertools.ifilter(predicate, iter)`` returns all the elements for which the
predicate returns true::
def is_even(x):
    return (x % 2) == 0
itertools.ifilter(is_even, itertools.count()) =>
0, 2, 4, 6, 8, 10, 12, 14, ...
``itertools.ifilter(predicate, iter)`` is the same as built-in :func:`filter`.
``itertools.ifilterfalse(predicate, iter)`` is the opposite, returning all
elements for which the predicate returns false::
...
...
@@ -1117,8 +1090,7 @@ that perform a single operation.
Some of the functions in this module are:
* Math operations: ``add()``, ``sub()``, ``mul()``, ``div()``, ``floordiv()``,
``abs()``, ...
* Math operations: ``add()``, ``sub()``, ``mul()``, ``floordiv()``, ``abs()``, ...
* Logical operations: ``not_()``, ``truth()``.
* Bitwise operations: ``and_()``, ``or_()``, ``invert()``.
* Comparisons: ``eq()``, ``ne()``, ``lt()``, ``le()``, ``gt()``, and ``ge()``.
...
...
@@ -1190,7 +1162,7 @@ is equivalent to::
f(*g(5, 6))
Even though ``compose()`` only accepts two functions, it's trivial to build up a
version that will compose any number of functions. We'll use ``reduce()``,
version that will compose any number of functions. We'll use ``functools.reduce()``,
``compose()`` and ``partial()`` (the last of which is provided by both
``functional`` and ``functools``).
...
...
@@ -1198,7 +1170,7 @@ version that will compose any number of functions. We'll use ``reduce()``,
from functional import compose, partial
multi_compose = partial(reduce, compose)
multi_compose = partial(functools.reduce, compose)
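The ``multi_compose`` line above depends on the third-party ``functional`` package. The sketch below is a self-contained approximation: ``compose`` here is a hypothetical stand-in written for this example (it simply computes ``f(g(...))``), and ``partial`` is taken from ``functools``. It is meant only to show how ``functools.reduce`` folds a list of functions into a single one.

```python
import functools

def compose(f, g):
    """Hypothetical stand-in for functional.compose: returns x -> f(g(x))."""
    def composed(*args, **kwargs):
        return f(g(*args, **kwargs))
    return composed

# reduce(compose, [f1, f2, f3]) builds compose(compose(f1, f2), f3).
multi_compose = functools.partial(functools.reduce, compose)

def add_one(x):
    return x + 1

def double(x):
    return x * 2

# pipeline(x) is add_one(double(abs(x))); the last function runs first.
pipeline = multi_compose([add_one, double, abs])
result = pipeline(-3)
print(result)
```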
We can also use ``map()``, ``compose()`` and ``partial()`` to craft a version of
...
...
Doc/howto/regex.rst
...
...
@@ -497,7 +497,7 @@ more convenient. If a program contains a lot of regular expressions, or re-uses
the same ones in several locations, then it might be worthwhile to collect all
the definitions in one place, in a section of code that compiles all the REs
ahead of time. To take an example from the standard library, here's an extract
from :file:`xmllib.py`::
from the now deprecated :file:`xmllib.py`::
ref = re.compile( ... )
entityref = re.compile( ... )
...
...
Doc/howto/unicode.rst
...
...
@@ -237,129 +237,83 @@ Python's Unicode Support
Now that you've learned the rudiments of Unicode, we can look at Python's
Unicode features.
The String Type
---------------
The Unicode Type
----------------
Unicode strings are expressed as instances of the :class:`unicode` type, one of
Python's repertoire of built-in types. It derives from an abstract type called
:class:`basestring`, which is also an ancestor of the :class:`str` type; you can
therefore check if a value is a string type with ``isinstance(value,
basestring)``. Under the hood, Python represents Unicode strings as either 16-
or 32-bit integers, depending on how the Python interpreter was compiled.
The :func:`unicode` constructor has the signature ``unicode(string[, encoding,
errors])``. All of its arguments should be 8-bit strings. The first argument
is converted to Unicode using the specified encoding; if you leave off the
``encoding`` argument, the ASCII encoding is used for the conversion, so
characters greater than 127 will be treated as errors::
>>> unicode('abcdef')
u'abcdef'
>>> s = unicode('abcdef')
>>> type(s)
<type 'unicode'>
>>> unicode('abcdef' + chr(255))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 6:
ordinal not in range(128)
Since Python 3.0, the language features a ``str`` type that contain Unicode
characters, meaning any string created using ``"unicode rocks!"``, ``'unicode
rocks!``, or the triple-quoted string syntax is stored as Unicode.
To insert a Unicode character that is not part ASCII, e.g., any letters with
accents, one can use escape sequences in their string literals as such::
>>> "\N{GREEK CAPITAL LETTER DELTA}" # Using the character name
'\u0394'
>>> "\u0394" # Using a 16-bit hex value
'\u0394'
>>> "\U00000394" # Using a 32-bit hex value
'\u0394'
The ``errors`` argument specifies the response when the input string can't be
In addition, one can create a string using the :func:`decode` method of
:class:`bytes`. This method takes an encoding, such as UTF-8, and, optionally,
an *errors* argument.
The *errors* argument specifies the response when the input string can't be
converted according to the encoding's rules. Legal values for this argument are
'strict' (raise a ``UnicodeDecodeError`` exception), 'replace' (add U+FFFD,
'strict' (raise a :exc:`UnicodeDecodeError` exception), 'replace' (add U+FFFD,
'REPLACEMENT CHARACTER'), or 'ignore' (just leave the character out of the
Unicode result). The following examples show the differences::
>>> unicode('\x80abc', errors='strict')
>>> b'\x80abc'.decode("utf-8", "strict")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
ordinal not in range(128)
>>> unicode('\x80abc', errors='replace')
u'\ufffdabc'
>>> unicode('\x80abc', errors='ignore')
u'abc'
>>> b'\x80abc'.decode("utf-8", "replace")
'\ufffdabc'
>>> b'\x80abc'.decode("utf-8", "ignore")
'abc'
Encodings are specified as strings containing the encoding's name. Python 2.4
Encodings are specified as strings containing the encoding's name. Python
comes with roughly 100 different encodings; see the Python Library Reference at
<http://docs.python.org/lib/standard-encodings.html> for a list. Some encodings
have multiple names; for example, 'latin-1', 'iso_8859_1' and '8859' are all
synonyms for the same encoding.
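The three *errors* values described above, and the encoding aliases just mentioned, can be tried out with the sketch below. It assumes Python 3, where decoding is a method of :class:`bytes`; the variable names are illustrative only.

```python
raw = b'\x80abc'  # 0x80 cannot start a valid UTF-8 sequence

# 'strict' (the default) raises UnicodeDecodeError.
strict_error = None
try:
    raw.decode('utf-8', 'strict')
except UnicodeDecodeError as exc:
    strict_error = exc

# 'replace' substitutes U+FFFD, 'ignore' drops the offending byte.
replaced = raw.decode('utf-8', 'replace')
ignored = raw.decode('utf-8', 'ignore')

# Several names can refer to the same codec.
aliases = [b'\xe9'.decode(name) for name in ('latin-1', 'iso_8859_1', '8859')]

print(repr(replaced), repr(ignored), aliases)
```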
One-character Unicode strings can also be created with the :func:`unichr`
One-character Unicode strings can also be created with the :func:`chr`
built-in function, which takes integers and returns a Unicode string of length 1
that contains the corresponding code point. The reverse operation is the
built-in :func:`ord` function that takes a one-character Unicode string and
returns the code point value::
>>> unichr(40960)
u'\ua000'
>>> ord(u'\ua000')
>>> chr(40960)
'\ua000'
>>> ord('\ua000')
40960
Instances of the :class:`unicode` type have many of the same methods as the
8-bit string type for operations such as searching and formatting::
>>> s = u'Was ever feather so lightly blown to and fro as this multitude?'
>>> s.count('e')
5
>>> s.find('feather')
9
>>> s.find('bird')
-1
>>> s.replace('feather', 'sand')
u'Was ever sand so lightly blown to and fro as this multitude?'
>>> s.upper()
u'WAS EVER FEATHER SO LIGHTLY BLOWN TO AND FRO AS THIS MULTITUDE?'
Note that the arguments to these methods can be Unicode strings or 8-bit
strings. 8-bit strings will be converted to Unicode before carrying out the
operation; Python's default ASCII encoding will be used, so characters greater
than 127 will cause an exception::
>>> s.find('Was\x9f')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x9f in position 3: ordinal not in range(128)
>>> s.find(u'Was\x9f')
-1
Much Python code that operates on strings will therefore work with Unicode
strings without requiring any changes to the code. (Input and output code needs
more updating for Unicode; more on this later.)
Another important method is ``.encode([encoding], [errors='strict'])``, which
returns an 8-bit string version of the Unicode string, encoded in the requested
encoding. The ``errors`` parameter is the same as the parameter of the
``unicode()`` constructor, with one additional possibility; as well as 'strict',
Converting to Bytes
-------------------
Another important str method is ``.encode([encoding], [errors='strict'])``,
which returns a ``bytes`` representation of the Unicode string, encoded in the
requested encoding. The ``errors`` parameter is the same as the parameter of
the :meth:`decode` method, with one additional possibility; as well as 'strict',
'ignore', and 'replace', you can also pass 'xmlcharrefreplace' which uses XML's
character references. The following example shows the different results::
>>> u = unichr(40960) + u'abcd' + unichr(1972)
>>> u = chr(40960) + 'abcd' + chr(1972)
>>> u.encode('utf-8')
'\xea\x80\x80abcd\xde\xb4'
b'\xea\x80\x80abcd\xde\xb4'
>>> u.encode('ascii')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character '\ua000' in position 0: ordinal not in range(128)
>>> u.encode('ascii', 'ignore')
'abcd'
b'abcd'
>>> u.encode('ascii', 'replace')
'?abcd?'
b'?abcd?'
>>> u.encode('ascii', 'xmlcharrefreplace')
'&#40960;abcd&#1972;'
Python's 8-bit strings have a ``.decode([encoding], [errors])`` method that
interprets the string using the given encoding::
>>> u = unichr(40960) + u'abcd' + unichr(1972) # Assemble a string
>>> utf8_version = u.encode('utf-8') # Encode as UTF-8
>>> type(utf8_version), utf8_version
(<type 'str'>, '\xea\x80\x80abcd\xde\xb4')
>>> u2 = utf8_version.decode('utf-8') # Decode using UTF-8
>>> u == u2 # The two strings match
True
b'&#40960;abcd&#1972;'
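The encoding examples in this hunk can be reproduced with the sketch below, which assumes Python 3, where ``str.encode`` returns :class:`bytes`. The variable names are illustrative, and the final check shows the round trip back through ``bytes.decode``.

```python
u = chr(40960) + 'abcd' + chr(1972)

utf8 = u.encode('utf-8')

# Plain ASCII cannot represent the first and last characters.
ascii_failed = False
try:
    u.encode('ascii')
except UnicodeEncodeError:
    ascii_failed = True

ignored = u.encode('ascii', 'ignore')
replaced = u.encode('ascii', 'replace')
xml_refs = u.encode('ascii', 'xmlcharrefreplace')

# Decoding the UTF-8 bytes recovers the original string.
round_trip = utf8.decode('utf-8')

print(utf8, ignored, replaced, xml_refs)
```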
The low-level routines for registering and accessing the available encodings are
found in the :mod:`codecs` module. However, the encoding and decoding functions
...
...
@@ -377,22 +331,14 @@ output.
Unicode Literals in Python Source Code
--------------------------------------
In Python source code, Unicode literals are written as strings prefixed with the
'u' or 'U' character: ``u'abcdefghijk'``. Specific code points can be written
using the ``\u`` escape sequence, which is followed by four hex digits giving
the code point. The ``\U`` escape sequence is similar, but expects 8 hex
digits, not 4.
Unicode literals can also use the same escape sequences as 8-bit strings,
including ``\x``, but ``\x`` only takes two hex digits so it can't express an
arbitrary code point. Octal escapes can go up to U+01ff, which is octal 777.
In Python source code, specific Unicode code points can be written using the
``\u`` escape sequence, which is followed by four hex digits giving the code
point. The ``\U`` escape sequence is similar, but expects 8 hex digits, not 4::
::
>>> s = u"a\xac\u1234\u20ac\U00008000"
^^^^ two-digit hex escape
^^^^^^ four-digit Unicode escape
^^^^^^^^^^ eight-digit Unicode escape
>>> s = "a\xac\u1234\u20ac\U00008000"
^^^^ two-digit hex escape
^^^^^ four-digit Unicode escape
^^^^^^^^^^ eight-digit Unicode escape
>>> for c in s: print(ord(c), end=" ")
...
97 172 4660 8364 32768
...
...
@@ -400,7 +346,7 @@ arbitrary code point. Octal escapes can go up to U+01ff, which is octal 777.
Using escape sequences for code points greater than 127 is fine in small doses,
but becomes an annoyance if you're using many accented characters, as you would
in a program with messages in French or some other accent-using language. You
can also assemble strings using the :func:`unichr` built-in function, but this is
can also assemble strings using the :func:`chr` built-in function, but this is
even more tedious.
Ideally, you'd want to be able to write literals in your language's natural
...
...
@@ -408,14 +354,15 @@ encoding. You could then edit Python source code with your favorite editor
which would display the accented characters naturally, and have the right
characters used at runtime.
Python supports writing Unicode literals in any encoding, but you have to
declare the encoding being used. This is done by including a special comment as
either the first or second line of the source file::
Python supports writing Unicode literals in UTF-8 by default, but you can use
(almost) any encoding if you declare the encoding being used. This is done by
including a special comment as either the first or second line of the source
file::
#!/usr/bin/env python
# -*- coding: latin-1 -*-
u = u'abcdé'
u = 'abcdé'
print(ord(u[-1]))
The syntax is inspired by Emacs's notation for specifying variables local to a
...
...
@@ -424,22 +371,8 @@ file. Emacs supports many different variables, but Python only supports
them, you must supply the name ``coding`` and the name of your chosen encoding,
separated by ``':'``.
If you don't include such a comment, the default encoding used will be ASCII.
Versions of Python before 2.4 were Euro-centric and assumed Latin-1 as a default
encoding for string literals; in Python 2.4, characters greater than 127 still
work but result in a warning. For example, the following program has no
encoding declaration::
#!/usr/bin/env python
u = u'abcdé'
print(ord(u[-1]))
When you run it with Python 2.4, it will output the following warning::
amk:~$ python p263.py
sys:1: DeprecationWarning: Non-ASCII character '\xe9'
in file p263.py on line 2, but no encoding declared;
see http://www.python.org/peps/pep-0263.html for details
If you don't include such a comment, the default encoding used will be UTF-8 as
already mentioned.
Unicode Properties
...
...
@@ -457,7 +390,7 @@ prints the numeric value of one particular character::
import unicodedata
u = unichr(233) + unichr(0x0bf2) + unichr(3972) + unichr(6000) + unichr(13231)
u = chr(233) + chr(0x0bf2) + chr(3972) + chr(6000) + chr(13231)
for i, c in enumerate(u):
    print(i, '%04x' % ord(c), unicodedata.category(c), end=" ")
...
...
@@ -487,8 +420,8 @@ list of category codes.
References
----------
The Unicode and 8-bit string types are described in the Python library reference
at :ref:`typesseq`.
The ``str`` type is described in the Python library reference at
:ref:`typesseq`.
The documentation for the :mod:`unicodedata` module.
...
...
@@ -557,7 +490,7 @@ It's also possible to open files in update mode, allowing both reading and
writing::
f = codecs.open('test', encoding='utf-8', mode='w+')
f.write(u'\u4500 blah blah blah\n')
f.write('\u4500 blah blah blah\n')
f.seek(0)
print(repr(f.readline()[:1]))
f.close()
...
...
@@ -590,7 +523,7 @@ not much reason to bother. When opening a file for reading or writing, you can
usually just provide the Unicode string as the filename, and it will be
automatically converted to the right encoding for you::
filename = u'filename\u4500abc'
filename = 'filename\u4500abc'
f = open(filename, 'w')
f.write('blah\n')
f.close()
...
...
@@ -607,7 +540,7 @@ encoding and a list of Unicode strings will be returned, while passing an 8-bit
path will return the 8-bit versions of the filenames. For example, assuming the
default filesystem encoding is UTF-8, running the following program::
fn = u'filename\u4500abc'
fn = 'filename\u4500abc'
f = open(fn, 'w')
f.close()
...
...
@@ -619,7 +552,7 @@ will produce the following output::
amk:~$ python t.py
['.svn', 'filename\xe4\x94\x80abc', ...]
[u'.svn', u'filename\u4500abc', ...]
['.svn', 'filename\u4500abc', ...]
The first list contains UTF-8-encoded filenames, and the second list contains
the Unicode versions.
...
...
Doc/library/array.rst
...
...
@@ -183,18 +183,6 @@ The following data items and methods are also supported:
returned.
.. method:: array.read(f, n)
.. deprecated:: 1.5.1
Use the :meth:`fromfile` method.
Read *n* items (as machine values) from the file object *f* and append them to
the end of the array. If less than *n* items are available, :exc:`EOFError` is
raised, but the items that were available are still inserted into the array.
*f* must be a real built-in file object; something else with a :meth:`read`
method won't do.
.. method:: array.remove(x)
Remove the first occurrence of *x* from the array.
...
...
@@ -229,13 +217,6 @@ The following data items and methods are also supported:
obtain a unicode string from an array of some other type.
.. method:: array.write(f)
.. deprecated:: 1.5.1
Use the :meth:`tofile` method.
Write all items (as machine values) to the file object *f*.
When an array object is printed or converted to a string, it is represented as
``array(typecode, initializer)``. The *initializer* is omitted if the array is
empty, otherwise it is a string if the *typecode* is ``'c'``, otherwise it is a
...
...
Doc/library/collections.rst
...
...
@@ -403,7 +403,7 @@ they add the ability to access fields by name instead of position index.
Any valid Python identifier may be used for a fieldname except for names
starting with an underscore. Valid identifiers consist of letters, digits,
and underscores but do not start with a digit or underscore and cannot be
a :mod:`keyword` such as *class*, *for*, *return*, *global*, *pass*, *print*,
a :mod:`keyword` such as *class*, *for*, *return*, *global*, *pass*,
or *raise*.
If *verbose* is true, the class definition is printed just before being built.
...
...
Doc/library/configparser.rst
...
...
@@ -199,7 +199,7 @@ RawConfigParser Objects
.. method:: RawConfigParser.read(filenames)
Attempt to read and parse a list of filenames, returning a list of filenames
which were successfully parsed. If *filenames* is a string or Unicode string,
which were successfully parsed. If *filenames* is a string,
it is treated as a single filename. If a file named in *filenames* cannot be
opened, that file will be ignored. This is designed so that you can specify a
list of potential configuration file locations (for example, the current
...
...
@@ -330,8 +330,8 @@ The :class:`SafeConfigParser` class implements the same extended interface as
.. method:: SafeConfigParser.set(section, option, value)
If the given section exists, set the given option to the specified value;
otherwise raise :exc:`NoSectionError`. *value* must be a string (:class:`str`
or :class:`unicode`); if not, :exc:`TypeError` is raised.
otherwise raise :exc:`NoSectionError`. *value* must be a string; if it is
not, :exc:`TypeError` is raised.
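The string-only rule for ``set()`` can be seen in the sketch below. It assumes a current Python 3 release, where the module is named ``configparser`` and this checking behaviour is provided by ``configparser.ConfigParser`` (the ``SafeConfigParser`` name documented in this hunk is no longer available in recent releases). The section and option names are illustrative.

```python
import configparser

config = configparser.ConfigParser()
config.add_section('Section1')

# String values are accepted and can be converted on the way out.
config.set('Section1', 'int', '15')
value = config.getint('Section1', 'int')

# Non-string values are rejected with TypeError.
rejected = False
try:
    config.set('Section1', 'int', 15)
except TypeError:
    rejected = True

# Setting an option in a missing section raises NoSectionError.
missing_section = False
try:
    config.set('NoSuchSection', 'int', '15')
except configparser.NoSectionError:
    missing_section = True

print(value, rejected, missing_section)
```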
Examples
...
...
@@ -373,12 +373,12 @@ An example of reading the configuration file again::
# getint() and getboolean() also do this for their respective types
float = config.getfloat('Section1', 'float')
int = config.getint('Section1', 'int')
print float + int
print(float + int)
# Notice that the next output does not interpolate '%(bar)s' or '%(baz)s'.
# This is because we are using a RawConfigParser().
if config.getboolean('Section1', 'bool'):
    print config.get('Section1', 'foo')
    print(config.get('Section1', 'foo'))
To get interpolation, you will need to use a :class:`ConfigParser` or
:class:`SafeConfigParser`::
...
...
@@ -389,13 +389,13 @@ To get interpolation, you will need to use a :class:`ConfigParser` or
config.read('example.cfg')
# Set the third, optional argument of get to 1 if you wish to use raw mode.
print config.get('Section1', 'foo', 0) # -> "Python is fun!"
print config.get('Section1', 'foo', 1) # -> "%(bar)s is %(baz)s!"
print(config.get('Section1', 'foo', 0)) # -> "Python is fun!"
print(config.get('Section1', 'foo', 1)) # -> "%(bar)s is %(baz)s!"
# The optional fourth argument is a dict with members that will take
# precedence in interpolation.
print config.get('Section1', 'foo', 0, {'bar': 'Documentation',
                                        'baz': 'evil'})
print(config.get('Section1', 'foo', 0, {'bar': 'Documentation',
                                        'baz': 'evil'}))
Defaults are available in all three types of ConfigParsers. They are used in
interpolation if an option used is not defined elsewhere. ::
...
...
@@ -406,10 +406,10 @@ interpolation if an option used is not defined elsewhere. ::
config = ConfigParser.SafeConfigParser({'bar': 'Life', 'baz': 'hard'})
config.read('example.cfg')
print config.get('Section1', 'foo') # -> "Python is fun!"
print(config.get('Section1', 'foo')) # -> "Python is fun!"
config.remove_option('Section1', 'bar')
config.remove_option('Section1', 'baz')
print config.get('Section1', 'foo') # -> "Life is hard!"
print(config.get('Section1', 'foo')) # -> "Life is hard!"
The function ``opt_move`` below can be used to move options between sections::
...
...
Doc/library/csv.rst
...
...
@@ -86,7 +86,7 @@ The :mod:`csv` module defines the following functions:
>>> import csv
>>> spamReader = csv.reader(open('eggs.csv'), delimiter=' ', quotechar='|')
>>> for row in spamReader:
...     print ', '.join(row)
...     print(', '.join(row))
Spam, Spam, Spam, Spam, Spam, Baked Beans
Spam, Lovely Spam, Wonderful Spam
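A self-contained variant of the reader example above, using the ``print()`` function, might look like the following sketch. It assumes Python 3 and replaces ``eggs.csv`` with an in-memory ``io.StringIO`` buffer; the sample rows are an assumption chosen to match the output shown in the hunk.

```python
import csv
import io

# Assumed stand-in for the contents of eggs.csv.
sample = io.StringIO(
    'Spam Spam Spam Spam Spam |Baked Beans|\n'
    'Spam |Lovely Spam| |Wonderful Spam|\n'
)

spam_reader = csv.reader(sample, delimiter=' ', quotechar='|')

# Fields quoted with '|' may contain the space delimiter.
lines = [', '.join(row) for row in spam_reader]
for line in lines:
    print(line)
```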
...
...
@@ -121,7 +121,7 @@ The :mod:`csv` module defines the following functions:
.. function:: register_dialect(name[, dialect][, fmtparam])
Associate *dialect* with *name*. *name* must be a string or Unicode object. The
Associate *dialect* with *name*. *name* must be a string. The
dialect can be specified either by passing a sub-class of :class:`Dialect`, or
by *fmtparam* keyword arguments, or both, with keyword arguments overriding
parameters of the dialect. For full details about the dialect and formatting
...
...
Doc/library/datatypes.rst
...
...
@@ -11,8 +11,8 @@ queues, and sets.
Python also provides some built-in data types, in particular,
:class:`dict`, :class:`list`, :class:`set` and :class:`frozenset`, and
:class:`tuple`. The :class:`str` class can be used to handle binary data
and 8-bit text, and the :class:`unicode` class to handle Unicode text.
:class:`tuple`. The :class:`str` class can be used to strings, including
Unicode strings, and the :class:`bytes` class to handle binary data.
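The division of labour between :class:`str` and :class:`bytes` described in the new wording can be illustrated with a short sketch. It assumes Python 3, and the sample text is arbitrary.

```python
text = 'abcd\u00e9'            # str: a sequence of Unicode code points
data = text.encode('utf-8')    # bytes: the encoded binary form

# One accented character, but two bytes in UTF-8.
text_length = len(text)
data_length = len(data)

# Indexing str gives a one-character str; indexing bytes gives an int.
last_char = text[-1]
last_byte = data[-1]

print(type(text).__name__, text_length, type(data).__name__, data_length)
```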
The following modules are documented in this chapter:
...
...
Doc/library/easydialogs.rst
...
...
@@ -107,7 +107,7 @@ The :mod:`EasyDialogs` module defines the following functions:
*actionButtonLabel* is a string to show instead of "Open" in the OK button,
*cancelButtonLabel* is a string to show instead of "Cancel" in the cancel
button, *wanted* is the type of value wanted as a return: :class:`str`,
:class:`unicode`, :class:`FSSpec`, :class:`FSRef` and subtypes thereof are
:class:`FSSpec`, :class:`FSRef` and subtypes thereof are
acceptable.
.. index:: single: Navigation Services
...
...
Doc/library/email.charset.rst
View file @
f6945183
...
...
@@ -242,6 +242,6 @@ new entries to the global character set, alias, and codec registries:
Add a codec that map characters in the given character set to and from Unicode.
*charset* is the canonical name of a character set. *codecname* is the name of a
Python codec, as appropriate for the second argument to the :func:`unicode`
built-in, or to the :meth:`encode` method of a Unicode string.
Python codec, as appropriate for the second argument to the :class:`str`'s
:func:`decode` method
Doc/library/email.header.rst
...
...
@@ -53,8 +53,8 @@ Here is the :class:`Header` class description:
Optional *s* is the initial header value. If ``None`` (the default), the
initial header value is not set. You can later append to the header with
:meth:`append` method calls. *s* may be a byte string or a Unicode string, but
see the :meth:`append` documentation for semantics.
:meth:`append` method calls. *s* may be an instance of :class:`bytes` or
:class:`str`, but see the :meth:`append` documentation for semantics.
Optional *charset* serves two purposes: it has the same meaning as the *charset*
argument to the :meth:`append` method. It also sets the default character set
...
...
@@ -86,19 +86,19 @@ Optional *errors* is passed straight through to the :meth:`append` method.
a :class:`Charset` instance. A value of ``None`` (the default) means that the
*charset* given in the constructor is used.
*s* may be a byte string or a Unicode string. If it is a byte string (i.e.
``isinstance(s, str)`` is true), then *charset* is the encoding of that byte
string, and a :exc:`UnicodeError` will be raised if the string cannot be decoded
with that character set.
*s* may be an instance of :class:`bytes` or :class:`str`. If it is an instance
of :class:`bytes`, then *charset* is the encoding of that byte string, and a
:exc:`UnicodeError` will be raised if the string cannot be decoded with that
character set.
If *s* is a Unicode string, then *charset* is a hint specifying the character
set of the characters in the string. In this case, when producing an
If *s* is an instance of :class:`str`, then *charset* is a hint specifying the
character set of the characters in the string. In this case, when producing an
:rfc:`2822`\ -compliant header using :rfc:`2047` rules, the Unicode string will
be encoded using the following charsets in order: ``us-ascii``, the *charset*
hint, ``utf-8``. The first character set to not provoke a :exc:`UnicodeError`
is used.
Optional *errors* is passed through to any :func:`unicode` or
Optional *errors* is passed through to any :func:`encode` or
:func:`ustr.encode` call, and defaults to "strict".
...
...
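The `bytes` versus `str` semantics described in this hunk can be sketched as follows. This assumes the `email.header.Header` behaviour of a current Python 3 release; the helper names are made up for the example.

```python
from email.header import Header


def header_text(value, charset="utf-8"):
    """Build a Header from bytes or str and return its text as a str."""
    return str(Header(value, charset))


def is_decodable(raw, charset):
    """Report whether a byte string can be decoded with the given charset."""
    try:
        Header(raw, charset)
    except UnicodeError:
        return False
    return True


# bytes input: charset names the encoding of the byte string.
print(header_text(b"caf\xc3\xa9", "utf-8"))
# str input: charset is only a hint for later encoding.
print(header_text("caf\u00e9", "utf-8"))
# Undecodable bytes raise UnicodeError, reported here as False.
print(is_decodable(b"\xff", "us-ascii"))
```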
@@ -121,7 +121,7 @@ operators and built-in functions.
.. method:: Header.__unicode__()
A helper for the built-in :func:`unicode` function. Returns the header as a
A helper for :class:`str`'s :func:`encode` method. Returns the header as a
Unicode string.
...
...
Doc/library/email.util.rst
...
...
@@ -130,10 +130,10 @@ There are several useful utilities provided in the :mod:`email.utils` module:
When a header parameter is encoded in :rfc:`2231` format,
:meth:`Message.get_param` may return a 3-tuple containing the character set,
language, and value. :func:`collapse_rfc2231_value` turns this into a unicode
string. Optional *errors* is passed to the *errors* argument of the built-in
:func:`unicode` function; it defaults to ``replace``. Optional
string. Optional *errors* is passed to the *errors* argument of :class:`str`'s
:func:`encode` method; it defaults to ``'replace'``. Optional
*fallback_charset* specifies the character set to use if the one in the
:rfc:`2231` header is not known by Python; it defaults to ``us-ascii``.
:rfc:`2231` header is not known by Python; it defaults to ``'us-ascii'``.
For convenience, if the *value* passed to :func:`collapse_rfc2231_value` is not
a tuple, it should be a string and it is returned unquoted.
...
...
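A small sketch of `collapse_rfc2231_value` as documented in this hunk, run against a current Python 3. The 3-tuple below imitates what `Message.get_param` may return; its contents are invented for illustration.

```python
from email.utils import collapse_rfc2231_value


def param_text(value):
    """Collapse an RFC 2231 (charset, language, value) tuple into a str."""
    return collapse_rfc2231_value(value, errors="replace", fallback_charset="us-ascii")


# Hypothetical parameter: the value part carries the raw UTF-8 bytes as
# Latin-1 characters, so it is decoded with the charset in the tuple.
print(param_text(("utf-8", "en", "caf\xc3\xa9")))

# A plain string is returned unquoted.
print(param_text('"plain"'))
```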
Doc/library/fcntl.rst
...
...
@@ -47,7 +47,6 @@ The module defines the following functions:
.. function:: ioctl(fd, op[, arg[, mutate_flag]])
This function is identical to the :func:`fcntl` function, except that the
operations are typically defined in the library module :mod:`termios` and the
argument handling is even more complicated.
The parameter *arg* can be one of an integer, absent (treated identically to the
...
...
Doc/library/fileinput.rst
...
...
@@ -168,9 +168,3 @@ The two following opening hooks are provided by this module:
Usage example: ``fi =
fileinput.FileInput(openhook=fileinput.hook_encoded("iso-8859-1"))``
.. note::
With this hook, :class:`FileInput` might return Unicode strings depending on the
specified *encoding*.
Doc/library/functions.rst
...
...
@@ -129,7 +129,7 @@ available. They are listed here in alphabetical order.
different ways:
* If it is a *string*, you must also give the *encoding* (and optionally,
*errors*) parameters; :func:`bytearray` then converts the Unicode string to
*errors*) parameters; :func:`bytearray` then converts the string to
bytes using :meth:`str.encode`.
* If it is an *integer*, the array will have that size and will be
...
...
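The two `bytearray` cases shown in this hunk, sketched in Python 3. The sample string and the UTF-8 encoding are assumptions made for the example.

```python
# A string source requires an encoding; the text is converted via str.encode().
from_text = bytearray("caf\u00e9", "utf-8")

# An integer source gives an array of that many bytes.
from_size = bytearray(3)

print(from_text)
print(from_size)

# Omitting the encoding for a string source is an error.
try:
    bytearray("text")
except TypeError as exc:
    missing_encoding_error = exc
    print("TypeError:", exc)
```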
@@ -415,10 +415,9 @@ available. They are listed here in alphabetical order.
.. warning::
The default *locals* act as described for function :func:`locals` below:
modifications to the default *locals* dictionary should not be attempted. Pass
an explicit *locals* dictionary if you need to see effects of the code on
*locals* after function :func:`execfile` returns. :func:`exec` cannot be
used reliably to modify a function's locals.
modifications to the default *locals* dictionary should not be attempted.
Pass an explicit *locals* dictionary if you need to see effects of the
code on *locals* after function :func:`exec` returns.
.. function:: filter(function, iterable)
...
...
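As a sketch of the advice in this warning, the helper below passes an explicit *locals* dictionary to `exec` and reads the results back from it. The function name `run_snippet` is made up for the example.

```python
def run_snippet(source):
    """Execute source with an explicit locals dict and return that dict."""
    namespace = {}
    # Empty globals, explicit locals: names assigned by the code land in namespace.
    exec(source, {}, namespace)
    return namespace


result = run_snippet("x = 1 + 1\ny = x * 10")
print(result["x"], result["y"])
```

Reading from the dictionary you passed in avoids relying on changes to the default *locals*, which the warning says should not be attempted.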
@@ -805,16 +804,17 @@ available. They are listed here in alphabetical order.
:mod:`fileinput`, :mod:`os`, :mod:`os.path`, :mod:`tempfile`, and
:mod:`shutil`.
.. XXX works for bytes too, but should it?
.. function:: ord(c)
Given a string of length one, return an integer representing the Unicode code
point of the character when the argument is a unicode object, or the value of
the byte when the argument is an 8-bit string. For example, ``ord('a')`` returns
the integer ``97``, ``ord(u'\u2020')`` returns ``8224``. This is the inverse of
:func:`chr` for 8-bit strings and of :func:`unichr` for unicode objects. If a
unicode argument is given and Python was built with UCS2 Unicode, then the
character's code point must be in the range [0..65535] inclusive; otherwise the
string length is two, and a :exc:`TypeError` will be raised.
point of the character. For example, ``ord('a')`` returns the integer ``97``
and ``ord('\u2020')`` returns ``8224``. This is the inverse of :func:`chr`.
If the argument length is not one, a :exc:`TypeError` will be raised. (If
Python was built with UCS2 Unicode, then the character's code point must be
in the range [0..65535] inclusive; otherwise the string length is two!)
.. function:: pow(x, y[, z])
...
...
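A quick check of the updated `ord` description, using the same values as the text. The variable names are illustrative only.

```python
dagger = "\u2020"

code_a = ord("a")            # code point of 'a'
code_dagger = ord(dagger)    # code point of U+2020
round_trip = chr(code_dagger) == dagger  # chr() is the inverse of ord()

print(code_a, code_dagger, round_trip)

# An argument whose length is not one raises TypeError.
try:
    ord("ab")
except TypeError as exc:
    length_error = exc
    print("TypeError:", exc)
```

The UCS2 caveat in the hunk applies to builds of that era; since Python 3.3 there is no narrow/wide build distinction.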
@@ -838,6 +838,22 @@ available. They are listed here in alphabetical order.
accidents.)
.. function:: print([object, ...][, sep=' '][, end='\n'][, file=sys.stdout])
Print *object*\(s) to the stream *file*, separated by *sep* and followed by
*end*. *sep*, *end* and *file*, if present, must be given as keyword
arguments.
All non-keyword arguments are converted to strings like :func:`str` does and
written to the stream, separated by *sep* and followed by *end*. Both *sep*
and *end* must be strings; they can also be ``None``, which means to use the
default values. If no *object* is given, :func:`print` will just write
*end*.
The *file* argument must be an object with a ``write(string)`` method; if it
is not present or ``None``, :data:`sys.stdout` will be used.
.. function:: property([fget[, fset[, fdel[, doc]]]])
Return a property attribute.
...
...
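The keyword arguments documented for `print` can be exercised with an in-memory stream. This is a minimal sketch; `render` is a hypothetical helper, and `io.StringIO` simply serves as an object with a `write(string)` method.

```python
import io


def render(*objects, sep=" ", end="\n"):
    """Return what print() would write for the given arguments."""
    buffer = io.StringIO()
    print(*objects, sep=sep, end=end, file=buffer)
    return buffer.getvalue()


print(repr(render("a", 1, sep="-")))     # objects are converted like str() does
print(repr(render()))                    # no objects: only end is written
print(repr(render("a", "b", sep=None)))  # None means "use the default"
```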
Doc/library/gettext.rst
...
...
@@ -136,9 +136,9 @@ The class-based API of the :mod:`gettext` module gives you more flexibility and
greater convenience than the GNU :program:`gettext` API. It is the recommended
way of localizing your Python applications and modules. :mod:`gettext` defines
a "translations" class which implements the parsing of GNU :file:`.mo` format
files, and has methods for returning either standard 8-bit strings or Unicode
strings. Instances of this "translations" class can also install themselves in
the built-in namespace as the function :func:`_`.
files, and has methods for returning strings. Instances of this "translations"
class can also install themselves in the built-in namespace as the function
:func:`_`.
.. function:: find(domain[, localedir[, languages[, all]]])
...
...
@@ -257,8 +257,7 @@ are the methods of :class:`NullTranslations`:
.. method:: NullTranslations.ugettext(message)
If a fallback has been set, forward :meth:`ugettext` to the fallback. Otherwise,
return the translated message as a Unicode string. Overridden in derived
classes.
return the translated message as a string. Overridden in derived classes.
.. method:: NullTranslations.ngettext(singular, plural, n)
...
...
@@ -276,7 +275,7 @@ are the methods of :class:`NullTranslations`:
.. method:: NullTranslations.ungettext(singular, plural, n)
If a fallback has been set, forward :meth:`ungettext` to the fallback.
Otherwise, return the translated message as a Unicode string. Overridden in
Otherwise, return the translated message as a string. Overridden in
derived classes.
...
...
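The fallback forwarding described for `NullTranslations` can be sketched as below. Note the assumption: this uses `gettext()` and `ngettext()` as they exist in a current Python 3, where they return `str`; the `ugettext`/`ungettext` names in this diff are not available there. The `Shouting` class is a made-up fallback.

```python
import gettext


class Shouting(gettext.NullTranslations):
    """Hypothetical fallback that 'translates' by upper-casing the message."""

    def gettext(self, message):
        return message.upper()


def translate(message, with_fallback=False):
    """Look up message, optionally forwarding to the fallback."""
    translations = gettext.NullTranslations()
    if with_fallback:
        translations.add_fallback(Shouting())
    return translations.gettext(message)


print(translate("hello"))                      # no fallback: message returned as-is
print(translate("hello", with_fallback=True))  # forwarded to the fallback
print(gettext.NullTranslations().ngettext("file", "files", 2))
```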
@@ -347,8 +346,8 @@ initialize the "protected" :attr:`_charset` instance variable, defaulting to
``None`` if not found. If the charset encoding is specified, then all message
ids and message strings read from the catalog are converted to Unicode using
this encoding. The :meth:`ugettext` method always returns a Unicode, while the
:meth:`gettext` returns an encoded 8-bit string. For the message id arguments
of both methods, either Unicode strings or 8-bit strings containing only
:meth:`gettext` returns an encoded byte string. For the message id arguments
of both methods, either Unicode strings or byte strings containing only
US-ASCII characters are acceptable. Note that the Unicode version of the
methods (i.e. :meth:`ugettext` and :meth:`ungettext`) are the recommended
interface to use for internationalized Python programs.
...
...
@@ -366,7 +365,7 @@ The following methods are overridden from the base class implementation:
.. method:: GNUTranslations.gettext(message)
Look up the *message* id in the catalog and return the corresponding message
string, as an 8-bit string encoded with the catalog's charset encoding, if
string, as a byte string encoded with the catalog's charset encoding, if
known. If there is no entry in the catalog for the *message* id, and a fallback
has been set, the look up is forwarded to the fallback's :meth:`gettext` method.
Otherwise, the *message* id is returned.
...
...
@@ -382,7 +381,7 @@ The following methods are overridden from the base class implementation:
.. method:: GNUTranslations.ugettext(message)
Look up the *message* id in the catalog and return the corresponding message
string, as a Unicode string. If there is no entry in the catalog for the
string, as a string. If there is no entry in the catalog for the
*message* id, and a fallback has been set, the look up is forwarded to the
fallback's :meth:`ugettext` method. Otherwise, the *message* id is returned.
...
...
@@ -391,7 +390,7 @@ The following methods are overridden from the base class implementation:
Do a plural-forms lookup of a message id. *singular* is used as the message id
for purposes of lookup in the catalog, while *n* is used to determine which
plural form to use. The returned message string is an 8-bit string encoded with
plural form to use. The returned message string is a byte string encoded with
the catalog's charset encoding, if known.
If the message id is not found in the catalog, and a fallback is specified, the
...
...
@@ -410,7 +409,7 @@ The following methods are overridden from the base class implementation:
Do a plural-forms lookup of a message id. *singular* is used as the message id
for purposes of lookup in the catalog, while *n* is used to determine which
plural form to use. The returned message string is a Unicode string.
plural form to use. The returned message string is a string.
If the message id is not found in the catalog, and a fallback is specified, the
request is forwarded to the fallback's :meth:`ungettext` method. Otherwise,
...
...
Doc/library/imp.rst
...
...
@@ -185,6 +185,19 @@ This module provides an interface to the mechanisms used to implement the
continue to use the old class definition. The same is true for derived classes.
.. function:: acquire_lock()
Acquires the interpreter's import lock for the current thread. This lock should
be used by import hooks to ensure thread-safety when importing modules. On
platforms without threads, this function does nothing.
.. function:: release_lock()
Release the interpreter's import lock. On platforms without threads, this
function does nothing.
The following constants with integer values, defined in this module, are used to
indicate the search result of :func:`find_module`.
...
...
Doc/library/itertools.rst
...
...
@@ -177,7 +177,8 @@ loops that truncate the stream.
Make an iterator that filters elements from iterable returning only those for
which the predicate is ``True``. If *predicate* is ``None``, return the items
that are true. Equivalent to::
that are true. This function is the same as the built-in :func:`filter`
function. Equivalent to::
def ifilter(predicate, iterable):
if predicate is None:
...
...
@@ -204,7 +205,8 @@ loops that truncate the stream.
.. function:: imap(function, *iterables)
Make an iterator that computes the function using arguments from each of the
iterables. Equivalent to::
iterables. This function is the same as the built-in :func:`map` function.
Equivalent to::
def imap(function, *iterables):
iterables = [iter(it) for it in iterables)
...
...
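The two hunks above note that `ifilter` and `imap` match the built-in `filter` and `map`. A brief Python 3 sketch of the built-ins, which return lazy iterators; `evens_squared` is an invented helper.

```python
def evens_squared(numbers):
    """Lazily keep the even numbers, then square them."""
    evens = filter(lambda n: n % 2 == 0, numbers)
    return map(lambda n: n * n, evens)


lazy = evens_squared(range(6))
print(iter(lazy) is lazy)   # the result is an iterator, not a list
print(list(lazy))

# A predicate of None keeps the items that are true.
print(list(filter(None, [0, 1, "", 2])))

# map() takes one argument from each iterable.
print(list(map(pow, [2, 3], [3, 2])))
```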
@@ -230,7 +232,7 @@ loops that truncate the stream.
def islice(iterable, *args):
s = slice(*args)
it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
nexti = next(it)
for i, element in enumerate(iterable):
if i == nexti:
...
...
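A runnable sketch of the `islice` equivalent from this hunk. Because `next()` requires an iterator and a `range` object is not one, this version keeps the `iter()` wrapper; it also returns cleanly once the index range is exhausted. The name `islice_sketch` is hypothetical, and like the documented equivalent it does not handle a stop of `0` correctly.

```python
import sys
from itertools import islice


def islice_sketch(iterable, *args):
    """Yield selected elements of iterable, following slice(*args)."""
    s = slice(*args)
    # iter() is needed: next() cannot be called on a bare range object.
    it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
    try:
        nexti = next(it)
    except StopIteration:
        return
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            try:
                nexti = next(it)
            except StopIteration:
                return


print(list(islice_sketch("ABCDEFG", 2, 4)))
print(list(islice("ABCDEFG", 2, 4)))
```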
Doc/library/logging.rst
...
...
@@ -95,7 +95,7 @@ yourself, though, it is simpler to use a :class:`RotatingFileHandler`::
logfiles = glob.glob('%s*' % LOG_FILENAME)
for filename in logfiles:
    print filename
    print(filename)
The result should be 6 separate files, each with part of the log history for the
application::
...
...
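The surrounding example is only partly visible in this hunk, so here is a self-contained sketch of the same idea under assumed settings: a `RotatingFileHandler` with `maxBytes=20` and `backupCount=5`, writing into a temporary directory. The file name, logger name, and message format are made up.

```python
import glob
import logging
import logging.handlers
import os
import tempfile


def rotate_demo(directory):
    """Log 20 short messages through a rotating handler; return the log files."""
    log_filename = os.path.join(directory, "rotating.out")
    logger = logging.getLogger("rotate_demo")
    logger.setLevel(logging.DEBUG)
    handler = logging.handlers.RotatingFileHandler(
        log_filename, maxBytes=20, backupCount=5)
    logger.addHandler(handler)
    try:
        for i in range(20):
            logger.debug("i = %d", i)
    finally:
        # Detach and close so the files can be listed and removed safely.
        logger.removeHandler(handler)
        handler.close()
    return sorted(glob.glob(log_filename + "*"))


with tempfile.TemporaryDirectory() as tmp:
    for filename in rotate_demo(tmp):
        print(os.path.basename(filename))
```

With five backups plus the active file, the listing contains six files, matching the count stated in the text.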
@@ -2428,13 +2428,13 @@ configuration::
HOST = 'localhost'
PORT = 9999
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print "connecting..."
print("connecting...")
s.connect((HOST, PORT))
print "sending config..."
print("sending config...")
s.send(struct.pack(">L", len(data_to_send)))
s.send(data_to_send)
s.close()
print "complete"
print("complete")
More examples
...
...
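The client above sends a 4-byte big-endian length followed by the payload. That framing can be sketched without a network connection; `frame_config` and `unframe_config` are hypothetical helper names, and the payload is invented.

```python
import struct


def frame_config(payload: bytes) -> bytes:
    """Prefix payload with its length as a 4-byte big-endian unsigned integer."""
    return struct.pack(">L", len(payload)) + payload


def unframe_config(message: bytes) -> bytes:
    """Read the length prefix and return exactly that many payload bytes."""
    (length,) = struct.unpack(">L", message[:4])
    return message[4:4 + length]


config = b"[loggers]\nkeys=root\n"
framed = frame_config(config)
print(framed[:4])
print(unframe_config(framed) == config)
```

In Python 3 the data passed to a socket must be `bytes`, and `socket.sendall` is the safer choice when the whole message has to be delivered, since `send` may transmit only part of it.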