Kirill Smelkov / cpython
Commit e5162bd9, authored Oct 24, 2013 by Serhiy Storchaka
Issue #19327: Fixed the working of regular expressions with too big charset.
Parents: 775d111a, f2e07046
Showing 4 changed files with 8 additions and 3 deletions:

Lib/sre_compile.py    +1 -1
Lib/test/test_re.py   +3 -0
Misc/NEWS             +2 -0
Modules/_sre.c        +2 -2
Lib/sre_compile.py
@@ -339,7 +339,7 @@ def _optimize_unicode(charset, fixup):
     else:
         code = 'I'
     # Convert block indices to byte array of 256 bytes
-    mapping = array.array('b', mapping).tobytes()
+    mapping = array.array('B', mapping).tobytes()
     # Convert byte array to word array
     mapping = array.array(code, mapping)
     assert mapping.itemsize == _sre.CODESIZE
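
The typecode change matters because `array.array('b', ...)` stores signed bytes and so cannot hold a block index above 127. A minimal sketch of that behaviour follows. The 256-entry `mapping` and the helper `pack_block_indices` are assumptions made for illustration and do not come from `sre_compile.py`.

```python
import array

# Hypothetical block-index table: one entry per high byte of a code point.
# The values run up to 255, as can happen when a charset touches many blocks.
mapping = list(range(256))


def pack_block_indices(values, typecode):
    """Pack block indices into raw bytes using the given array typecode."""
    return array.array(typecode, values).tobytes()


try:
    # 'b' is a signed char (-128..127), so an index such as 200 does not fit.
    pack_block_indices(mapping, 'b')
    signed_ok = True
except OverflowError:
    signed_ok = False

# 'B' is an unsigned char (0..255), so every index fits in one byte.
unsigned_bytes = pack_block_indices(mapping, 'B')
```

With the signed typecode the packing step fails outright, whereas the unsigned typecode keeps each index as a single byte in the 0..255 range.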
Lib/test/test_re.py
@@ -482,6 +482,9 @@ class ReTests(unittest.TestCase):
                                   "\u2222").group(1), "\u2222")
         self.assertEqual(re.match("([\u2222\u2223])",
                                   "\u2222", re.UNICODE).group(1), "\u2222")
+        r = '[%s]' % ''.join(map(chr, range(256, 2**16, 255)))
+        self.assertEqual(re.match(r,
+                                  "\uff01", re.UNICODE).group(), "\uff01")
 
     def test_big_codesize(self):
         # Issue #1160
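
The new test builds a character class whose members are spread across almost every 256-code-point block of the Basic Multilingual Plane. The standalone sketch below rebuilds the same pattern outside `unittest` and counts the distinct high bytes it touches. The names `chars`, `high_bytes`, and `match` are illustrative only, and the snippet assumes an interpreter that already contains this fix.

```python
import re

# Same construction as the test: every 255th code point from U+0100 upward.
chars = [chr(cp) for cp in range(256, 2**16, 255)]
pattern = '[%s]' % ''.join(chars)

# Each distinct high byte corresponds to one 256-code-point block in the set.
high_bytes = {ord(c) >> 8 for c in chars}

# U+FF01 is the last member of the set and lies in the highest block used.
match = re.match(pattern, "\uff01", re.UNICODE)
```

Because the set spans far more than 128 blocks, some block indices necessarily exceed 127, which is the range where a signed byte stops working.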
Misc/NEWS
@@ -19,6 +19,8 @@ Core and Builtins
 Library
 -------
 
+- Issue #19327: Fixed the working of regular expressions with too big charset.
+
 - Issue #17400: New 'is_global' attribute for ipaddress to tell if an address
   is allocated by IANA for global or private networks.
Modules/_sre.c
@@ -447,7 +447,7 @@ SRE_CHARSET(SRE_CODE* set, SRE_CODE ch)
             count = *(set++);
 
             if (sizeof(SRE_CODE) == 2) {
-                block = ((char*)set)[ch >> 8];
+                block = ((unsigned char*)set)[ch >> 8];
                 set += 128;
                 if (set[block*16 + ((ch & 255)>>4)] & (1 << (ch & 15)))
                     return ok;
@@ -457,7 +457,7 @@ SRE_CHARSET(SRE_CODE* set, SRE_CODE ch)
                 /* !(c & ~N) == (c < N+1) for any unsigned c, this avoids
                  * warnings when c's type supports only numbers < N+1 */
                 if (!(ch & ~65535))
-                    block = ((char*)set)[ch >> 8];
+                    block = ((unsigned char*)set)[ch >> 8];
                 else
                     block = -1;
                 set += 64;
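
On the C side, whether plain `char` is signed is implementation-defined. Where it is signed, a stored block index of 128 or more reads back as a negative number. The Python sketch below models that read with `struct`, using the `'b'` and `'B'` formats to stand in for `char` and `unsigned char`. The `table` contents and the helpers `read_block` and `word_index` are hypothetical, and the index expression mirrors the 16-bit branch of `SRE_CHARSET` shown in the diff.

```python
import struct

# Hypothetical 256-byte block-index table, one entry per high byte.
table = bytes(range(256))


def read_block(table, ch, signed):
    """Read the block index for ch as a signed or unsigned byte."""
    fmt = 'b' if signed else 'B'
    return struct.unpack_from(fmt, table, ch >> 8)[0]


def word_index(block, ch):
    """Index of the 16-bit bitmap word, as in the sizeof(SRE_CODE) == 2 branch."""
    return block * 16 + ((ch & 255) >> 4)


ch = 0xFF01

signed_block = read_block(table, ch, signed=True)       # models (char*)
unsigned_block = read_block(table, ch, signed=False)    # models (unsigned char*)

signed_index = word_index(signed_block, ch)
unsigned_index = word_index(unsigned_block, ch)
```

In this model the signed read yields a negative block number, so the computed word index points before the start of the bitmap data, while the unsigned read lands on the intended chunk.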