Commit dcc56f8b, authored Aug 31, 2007 by Georg Brandl
Add bytes/remove unicode from the data model.
parent 85eb8c10
Showing 1 changed file with 36 additions and 64 deletions
Doc/reference/datamodel.rst
...
...
@@ -289,52 +289,21 @@ Sequences
.. index::
builtin: chr
builtin: ord
object: string
single: character
single: byte
single: ASCII@ASCII
The items of a string are characters. There is no separate character type; a
character is represented by a string of one item. Characters represent (at
least) 8-bit bytes. The built-in functions :func:`chr` and :func:`ord` convert
between characters and nonnegative integers representing the byte values. Bytes
with the values 0-127 usually represent the corresponding ASCII values, but the
interpretation of values is up to the program. The string data type is also
used to represent arrays of bytes, e.g., to hold data read from a file.
.. index::
single: ASCII@ASCII
single: EBCDIC
single: character set
pair: string; comparison
builtin: chr
builtin: ord
(On systems whose native character set is not ASCII, strings may use EBCDIC in
their internal representation, provided the functions :func:`chr` and
:func:`ord` implement a mapping between ASCII and EBCDIC, and string comparison
preserves the ASCII order. Or perhaps someone can propose a better rule?)
Unicode
.. index::
builtin: unichr
builtin: ord
builtin: unicode
object: unicode
builtin: str
single: character
single: integer
single: Unicode
The items of a Unicode object are Unicode code units. A Unicode code unit is
represented by a Unicode object of one item and can hold either a 16-bit or
32-bit value representing a Unicode ordinal (the maximum value for the ordinal
is given in ``sys.maxunicode``, and depends on how Python is configured at
compile time). Surrogate pairs may be present in the Unicode object, and will
be reported as two separate items. The built-in functions :func:`unichr` and
:func:`ord` convert between code units and nonnegative integers representing the
Unicode ordinals as defined in the Unicode Standard 3.0. Conversion from and to
other encodings are possible through the Unicode method :meth:`encode` and the
built-in function :func:`unicode`.
The items of a string object are Unicode code units. A Unicode code
unit is represented by a string object of one item and can hold either
a 16-bit or 32-bit value representing a Unicode ordinal (the maximum
value for the ordinal is given in ``sys.maxunicode``, and depends on
how Python is configured at compile time). Surrogate pairs may be
present in the Unicode object, and will be reported as two separate
items. The built-in functions :func:`chr` and :func:`ord` convert
between code units and nonnegative integers representing the Unicode
ordinals as defined in the Unicode Standard 3.0. Conversion from and to
other encodings are possible through the string method :meth:`encode`.
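As a rough illustration of the string paragraph above, the following sketch exercises :func:`chr`, :func:`ord`, ``sys.maxunicode`` and :meth:`encode`. It assumes a current CPython 3 interpreter, where ``sys.maxunicode`` is always ``0x10FFFF`` and string items are whole code points rather than 16-bit code units, so it does not reproduce the surrogate-pair behaviour described for the 2007 builds.

```python
import sys

# chr() and ord() convert between one-item strings and integer ordinals.
letter = chr(0x41)
ordinal = ord("A")
round_trip = chr(ord("é"))

# Indexing a string yields another string of length one; there is no
# separate character type.
item = "abc"[1]

# sys.maxunicode is the largest ordinal that chr() accepts on this build.
highest = chr(sys.maxunicode)

# str.encode() converts the text to another encoding as a bytes object.
encoded = "é".encode("utf-8")

print(letter, ordinal, round_trip, item, hex(ord(highest)), encoded)
```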
Tuples
.. index::
...
...
@@ -342,11 +311,12 @@ Sequences
pair: singleton; tuple
pair: empty; tuple
The items of a tuple are arbitrary Python objects. Tuples of two or more items
are formed by comma-separated lists of expressions. A tuple of one item (a
'singleton') can be formed by affixing a comma to an expression (an expression
by itself does not create a tuple, since parentheses must be usable for grouping
of expressions). An empty tuple can be formed by an empty pair of parentheses.
The items of a tuple are arbitrary Python objects. Tuples of two or
more items are formed by comma-separated lists of expressions. A tuple
of one item (a 'singleton') can be formed by affixing a comma to an
expression (an expression by itself does not create a tuple, since
parentheses must be usable for grouping of expressions). An empty
tuple can be formed by an empty pair of parentheses.
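The tuple-forming rules in the paragraph above can be sketched as follows; the variable names are only illustrative.

```python
# Two or more items: the commas form the tuple, parentheses are optional.
pair = 1, 2

# One item (a "singleton"): the trailing comma is what makes it a tuple.
singleton = (1,)

# Parentheses alone only group an expression, so this is the int 1.
grouped = (1)

# An empty pair of parentheses forms the empty tuple.
empty = ()

print(type(pair), type(singleton), type(grouped), type(empty))
```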
.. % Immutable sequences
...
...
@@ -369,14 +339,23 @@ Sequences
Lists
.. index:: object: list
The items of a list are arbitrary Python objects. Lists are formed by placing a
comma-separated list of expressions in square brackets. (Note that there are no
special cases needed to form lists of length 0 or 1.)
The items of a list are arbitrary Python objects. Lists are formed by
placing a comma-separated list of expressions in square brackets. (Note
that there are no special cases needed to form lists of length 0 or 1.)
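For contrast with tuples, a short sketch of list displays; unlike the singleton tuple, lists of length 0 or 1 need no special syntax. The names are illustrative only.

```python
empty_list = []                  # length 0, no special form required
single_list = [1]                # length 1, no trailing comma required
mixed_list = [1, "two", (3,)]    # items may be arbitrary Python objects

# Lists are mutable, so items can be replaced in place.
mixed_list[0] = "one"

print(empty_list, single_list, mixed_list)
```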
Bytes
.. index:: bytes, byte
A bytes object is a mutable array. The items are 8-bit bytes,
represented by integers in the range 0 <= x < 256. Bytes literals
(like ``b'abc'``) and the built-in function :func:`bytes` can be used to
construct bytes objects. Also, bytes objects can be decoded to strings
via the :meth:`decode` method.
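A minimal sketch of the bytes paragraph above, run against a current Python 3 interpreter. Note one difference from the text: the paragraph describes bytes as mutable, which matches the development snapshot this commit belongs to, whereas in released Python 3 versions ``bytes`` is immutable and ``bytearray`` is the mutable counterpart. The example therefore uses ``bytearray`` for the in-place update.

```python
# A bytes literal and the bytes() built-in both construct bytes objects.
data = b"abc"
from_ints = bytes([97, 98, 99])

# Items are integers in the range 0 <= x < 256.
first = data[0]

# decode() converts the bytes to a string using the given encoding.
text = data.decode("ascii")

# In released Python 3, in-place mutation needs a bytearray.
buffer = bytearray(data)
buffer[0] = ord("x")

print(first, from_ints, text, buffer)
```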
.. index:: module: array
The extension module :mod:`array` provides an additional example of a mutable
sequence type.
The extension module :mod:`array` provides an additional example of a
mutable sequence type.
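To illustrate the remark about :mod:`array`, here is a small sketch of an ``array`` used as a mutable sequence. The type code ``"i"`` (signed int) is just one possible choice for the example.

```python
from array import array

# An array holds items of a single C type, chosen by the type code.
values = array("i", [1, 2, 3])

# Like a list, it supports in-place modification.
values.append(4)
values[0] = 10

# Unlike a list, it rejects items of the wrong type.
try:
    values.append("five")
    rejected = False
except TypeError:
    rejected = True

print(values.tolist(), rejected)
```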
.. % Mutable sequences
...
...
@@ -1230,12 +1209,14 @@ Basic customization
builtin: str
builtin: print
Called by the :func:`str` built-in function and by the :func:`print`
function to compute the "informal" string representation of an object. This
differs from :meth:`__repr__` in that it does not have to be a valid Python
Called by the :func:`str` built-in function and by the :func:`print` function
to compute the "informal" string representation of an object. This differs
from :meth:`__repr__` in that it does not have to be a valid Python
expression: a more convenient or concise representation may be used instead.
The return value must be a string object.
.. XXX what about subclasses of string?
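The difference between :meth:`__str__` and :meth:`__repr__` described above can be sketched with a small class. ``Point`` is a hypothetical class made up for this example, not something from the documentation being changed.

```python
class Point:
    """Hypothetical example class with both representations defined."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # "Formal" representation: here it is a valid Python expression.
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # "Informal" representation: concise, need not be valid Python.
        return f"({self.x}, {self.y})"


p = Point(1, 2)
informal = str(p)    # also what print(p) displays
formal = repr(p)

print(informal)
print(formal)
```

Both methods return string objects, as the text requires.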
.. method:: object.__format__(self, format_spec)
...
...
@@ -1355,15 +1336,6 @@ Basic customization
:meth:`__bool__`, all its instances are considered true.
.. method:: object.__unicode__(self)
.. index:: builtin: unicode
Called to implement :func:`unicode` builtin; should return a Unicode object.
When this method is not defined, string conversion is attempted, and the result
of string conversion is converted to Unicode using the system default encoding.
.. _attribute-access:
Customizing attribute access
...
...