Kirill Smelkov / cpython
Commit efa5a39f, authored Oct 27, 2013 by Serhiy Storchaka
Issue #19405: Fixed outdated comments in the _sre module.
parent 246eb110
Showing 2 changed files with 6 additions and 7 deletions:
Lib/sre_compile.py (+5 -5)
Modules/_sre.c (+1 -2)
Lib/sre_compile.py
@@ -276,10 +276,10 @@ def _mk_bitmap(bits):
 # set is constructed. Then, this bitmap is sliced into chunks of 256
 # characters, duplicate chunks are eliminated, and each chunk is
 # given a number. In the compiled expression, the charset is
-# represented by a 16-bit word sequence, consisting of one word for
-# the number of different chunks, a sequence of 256 bytes (128 words)
+# represented by a 32-bit word sequence, consisting of one word for
+# the number of different chunks, a sequence of 256 bytes (64 words)
 # of chunk numbers indexed by their original chunk position, and a
-# sequence of chunks (16 words each).
+# sequence of 256-bit chunks (8 words each).
 # Compression is normally good: in a typical charset, large ranges of
 # Unicode will be either completely excluded (e.g. if only cyrillic
...
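The layout described in the updated comment can be illustrated with a small sketch. This is not the code in Lib/sre_compile.py: `compress_bitmap` and `size_in_words` are hypothetical names, and chunks are held as tuples of bits for readability rather than packed into 32-bit words. It only shows how slicing into 256-character chunks and dropping duplicates leads to the word counts the comment now states (1 word for the chunk count, 64 words of chunk numbers, 8 words per chunk).

```python
# Illustrative sketch only; these helpers are hypothetical and are not the
# functions defined in Lib/sre_compile.py.
CHUNK_BITS = 256   # characters covered by one chunk
NUM_CHUNKS = 256   # 65536 BMP code points / 256 characters per chunk
WORD_BITS = 32     # width of one code word, per the updated comment


def compress_bitmap(chars):
    """Return (chunk_index, chunks) for a set of BMP code points."""
    bitmap = [0] * (CHUNK_BITS * NUM_CHUNKS)
    for ch in chars:
        bitmap[ch] = 1

    seen = {}         # chunk contents -> chunk number
    chunk_index = []  # one chunk number per original chunk position
    chunks = []       # distinct chunks, in order of first appearance
    for pos in range(NUM_CHUNKS):
        chunk = tuple(bitmap[pos * CHUNK_BITS:(pos + 1) * CHUNK_BITS])
        if chunk not in seen:
            seen[chunk] = len(chunks)
            chunks.append(chunk)
        chunk_index.append(seen[chunk])
    return chunk_index, chunks


def size_in_words(chunks):
    """Size of the representation in 32-bit words."""
    count_word = 1
    index_words = NUM_CHUNKS * 8 // WORD_BITS   # 256 bytes -> 64 words
    words_per_chunk = CHUNK_BITS // WORD_BITS   # 256 bits -> 8 words
    return count_word + index_words + len(chunks) * words_per_chunk


# A charset holding only the Cyrillic block U+0400..U+04FF needs two
# distinct chunks: one all-zero chunk and one all-one chunk.
chunk_index, chunks = compress_bitmap(range(0x0400, 0x0500))
print(len(chunks), size_in_words(chunks))
```

With only two distinct chunks, the whole charset fits in 1 + 64 + 2 * 8 = 81 words, which is the kind of compression the comment refers to.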
@@ -292,9 +292,9 @@ def _mk_bitmap(bits):
 # less significant byte is a bit index in the chunk (just like the
 # CHARSET matching).
-# In UCS-4 mode, the BIGCHARSET opcode still supports only subsets
+# The BIGCHARSET opcode still supports only subsets
 # of the basic multilingual plane; an efficient representation
-# for all of UTF-16 has not yet been developed. This means,
+# for all of Unicode has not yet been developed. This means,
 # in particular, that negated charsets cannot be represented as
 # bigcharsets.
...
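A minimal sketch of the matching step may help when reading this hunk. It assumes the more significant byte of the character selects the chunk number (the visible context implies this but does not show it), that each chunk is stored as 8 words of 32 bits, and that bit 0 of word 0 is the lowest character in the chunk. `in_bigcharset` is a hypothetical name, and returning False above U+FFFF is an assumption based on the statement that only subsets of the basic multilingual plane are supported.

```python
# Illustrative sketch only; in_bigcharset is a hypothetical helper and the
# bit order within a word is an assumption, not taken from the source.
WORD_BITS = 32


def in_bigcharset(ch, chunk_index, chunks):
    """Test membership of code point ch in a chunked BMP charset."""
    if ch > 0xFFFF:
        # Assumption: characters outside the BMP are never members.
        return False
    hi, lo = ch >> 8, ch & 0xFF       # chunk position, bit index in chunk
    chunk = chunks[chunk_index[hi]]   # 8 words of 32 bits each
    word = chunk[lo // WORD_BITS]
    return bool(word & (1 << (lo % WORD_BITS)))


# Hand-built table for the Cyrillic block U+0400..U+04FF:
# chunk 0 is empty, chunk 1 has every bit set.
chunks = [[0] * 8, [0xFFFFFFFF] * 8]
chunk_index = [0] * 256
chunk_index[0x04] = 1

print(in_bigcharset(0x0416, chunk_index, chunks))    # CYRILLIC CAPITAL ZHE
print(in_bigcharset(ord("A"), chunk_index, chunks))
print(in_bigcharset(0x1F600, chunk_index, chunks))
```

Because the chunk table is indexed by a single byte, there is no position for a character above U+FFFF, which is one way to see why a negated charset (which would have to include all of those characters) cannot be expressed this way.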
Modules/_sre.c
@@ -2749,8 +2749,7 @@ _compile(PyObject* self_, PyObject* args)
 \_________\_____/ /
 \____________/
-It also helps that SRE_CODE is always an unsigned type, either 2 bytes or 4
-bytes wide (the latter if Python is compiled for "wide" unicode support).
+It also helps that SRE_CODE is always an unsigned type.
 */
 /* Defining this one enables tracing of the validator */
...