Commit 1985f7b1 (cpython), authored Oct 27, 2013 by Serhiy Storchaka
Issue #19405: Fixed outdated comments in the _sre module.
parents: b9dcfea0, efa5a39f
Showing 2 changed files with 6 additions and 7 deletions:
Lib/sre_compile.py  +5 -5
Modules/_sre.c      +1 -2
Lib/sre_compile.py @ 1985f7b1
@@ -270,10 +270,10 @@ def _mk_bitmap(bits):
 # set is constructed. Then, this bitmap is sliced into chunks of 256
 # characters, duplicate chunks are eliminated, and each chunk is
 # given a number. In the compiled expression, the charset is
-# represented by a 16-bit word sequence, consisting of one word for
-# the number of different chunks, a sequence of 256 bytes (128 words)
+# represented by a 32-bit word sequence, consisting of one word for
+# the number of different chunks, a sequence of 256 bytes (64 words)
 # of chunk numbers indexed by their original chunk position, and a
-# sequence of chunks (16 words each).
+# sequence of 256-bit chunks (8 words each).
 # Compression is normally good: in a typical charset, large ranges of
 # Unicode will be either completely excluded (e.g. if only cyrillic
...
...
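The layout described in the updated comment can be sketched in a few lines of Python. This is an illustration only, not the code in `sre_compile.py`: the function name `pack_big_charset` is hypothetical, and the little-endian packing of chunk numbers into words and the least-significant-bit-first order inside each chunk are assumptions that the comment does not state.

```python
def pack_big_charset(chars):
    """Pack a set of BMP code points into a list of 32-bit words.

    Layout follows the comment: 1 word (number of distinct chunks),
    64 words (256 one-byte chunk numbers), then 8 words per distinct chunk.
    Byte order and bit order are assumptions made for this sketch.
    """
    bitmap = [0] * 65536
    for ch in chars:
        if ch >= 65536:
            raise ValueError("only the basic multilingual plane is representable")
        bitmap[ch] = 1

    chunk_ids = {}  # chunk contents -> chunk number
    mapping = []    # chunk number for each of the 256 chunk positions
    chunks = []     # distinct chunks in order of first appearance
    for pos in range(256):
        chunk = tuple(bitmap[pos * 256:(pos + 1) * 256])
        if chunk not in chunk_ids:
            chunk_ids[chunk] = len(chunks)
            chunks.append(chunk)
        mapping.append(chunk_ids[chunk])

    words = [len(chunks)]
    # 256 bytes of chunk numbers packed four per word: 64 words.
    for i in range(0, 256, 4):
        words.append(int.from_bytes(bytes(mapping[i:i + 4]), "little"))
    # Each 256-bit chunk becomes 8 words of 32 bits.
    for chunk in chunks:
        for w in range(8):
            word = 0
            for bit in range(32):
                if chunk[w * 32 + bit]:
                    word |= 1 << bit
            words.append(word)
    return words


# 'A' (U+0041) and Cyrillic 'А' (U+0410) produce three distinct chunks:
# the one holding 'A', the empty chunk, and the one holding U+0410.
words = pack_big_charset({0x41, 0x410})
print(words[0], len(words))
```

With three distinct chunks the sequence is 1 + 64 + 3 * 8 = 89 words long, which shows why deduplicating the (mostly empty) chunks keeps the representation small.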
@@ -286,9 +286,9 @@ def _mk_bitmap(bits):
 # less significant byte is a bit index in the chunk (just like the
 # CHARSET matching).
-# In UCS-4 mode, the BIGCHARSET opcode still supports only subsets
+# The BIGCHARSET opcode still supports only subsets
 # of the basic multilingual plane; an efficient representation
-# for all of UTF-16 has not yet been developed. This means,
+# for all of Unicode has not yet been developed. This means,
 # in particular, that negated charsets cannot be represented as
 # bigcharsets.
...
...
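A lookup against such a word sequence might look like the sketch below. It is not the matcher in `Modules/_sre.c`: the name `big_charset_contains` is hypothetical, the use of the high byte to select the chunk is inferred from the comment, and the little-endian byte packing and bit order are assumptions. It also shows the limitation the comment mentions: code points outside the basic multilingual plane can never match, so a negated set cannot be encoded this way.

```python
def big_charset_contains(words, ch):
    """Test membership of code point `ch` in a packed big charset.

    Assumed layout: words[0] is the chunk count, words[1:65] hold 256
    one-byte chunk numbers (four per word, little-endian), and each
    chunk is 8 words of 32 bits starting at words[65].
    """
    if ch >= 65536:
        # Only subsets of the basic multilingual plane are representable.
        return False
    hi, lo = ch >> 8, ch & 0xFF
    # The high byte selects the chunk number from the 256-byte index.
    index_word = words[1 + hi // 4]
    chunk_no = (index_word >> (8 * (hi % 4))) & 0xFF
    # The low byte is a bit index inside the selected 256-bit chunk.
    chunk_start = 1 + 64 + 8 * chunk_no
    return bool((words[chunk_start + lo // 32] >> (lo % 32)) & 1)


# Hand-built example: chunk 0 is empty, chunk 1 contains only U+0410.
mapping = [0] * 256
mapping[0x04] = 1
index_words = [int.from_bytes(bytes(mapping[i:i + 4]), "little")
               for i in range(0, 256, 4)]
empty_chunk = [0] * 8
cyrillic_chunk = [1 << 0x10] + [0] * 7
example = [2] + index_words + empty_chunk + cyrillic_chunk

print(big_charset_contains(example, 0x410))    # in the set
print(big_charset_contains(example, 0x10410))  # outside the BMP
```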
Modules/_sre.c @ 1985f7b1
@@ -1348,8 +1348,7 @@ _compile(PyObject* self_, PyObject* args)
 \_________\_____/ /
 \____________/
-It also helps that SRE_CODE is always an unsigned type, either 2 bytes or 4
-bytes wide (the latter if Python is compiled for "wide" unicode support).
+It also helps that SRE_CODE is always an unsigned type.
 */
 /* Defining this one enables tracing of the validator */
...
...