Commit 1f268285, authored Jul 28, 2009 by Mark Dickinson
Issue #6561: '\d' in a regular expression should match only Unicode
character category [Nd], not [No].
parent 6bd13fbb
Showing 4 changed files with 32 additions and 6 deletions (+32 -6)
Doc/library/re.rst +6 -5
Lib/test/test_re.py +21 -0
Misc/NEWS +4 -0
Modules/_sre.c +1 -1
Doc/library/re.rst
...
...
@@ -338,11 +338,12 @@ the second character. For example, ``\$`` matches the character ``'$'``.
 ``\d``
    For Unicode (str) patterns:
-      Matches any Unicode digit (which includes ``[0-9]``, and also many
-      other digit characters). If the :const:`ASCII` flag is used only
-      ``[0-9]`` is matched (but the flag affects the entire regular
-      expression, so in such cases using an explicit ``[0-9]`` may be a
-      better choice).
+      Matches any Unicode decimal digit (that is, any character in
+      Unicode character category [Nd]). This includes ``[0-9]``, and
+      also many other digit characters. If the :const:`ASCII` flag is
+      used only ``[0-9]`` is matched (but the flag affects the entire
+      regular expression, so in such cases using an explicit ``[0-9]``
+      may be a better choice).
    For 8-bit (bytes) patterns:
       Matches any decimal digit; this is equivalent to ``[0-9]``.
...
...
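The documented behaviour can be checked from Python. This is a minimal sketch, not part of the commit: the helper name `matches_digit` is hypothetical, and it assumes an interpreter that already includes this change and provides `re.fullmatch` (Python 3.4 or later).

```python
import re


def matches_digit(ch, flags=0):
    """Return True if the single character ch is matched by the \\d class."""
    # fullmatch anchors at both ends, so only a one-character digit string passes.
    return re.fullmatch(r"\d", ch, flags) is not None


thai_digit = "\u0e58"  # a Thai digit, Unicode category Nd

print(matches_digit("7"))                   # ASCII digit in a str pattern
print(matches_digit(thai_digit))            # non-ASCII Nd character
print(matches_digit(thai_digit, re.ASCII))  # ASCII flag restricts \d to [0-9]

# For bytes patterns, \d is equivalent to [0-9].
print(re.fullmatch(rb"\d", b"7") is not None)
```

As the documentation text notes, `re.ASCII` changes every class in the pattern, so an explicit `[0-9]` is often the narrower tool when only the digit class should be restricted.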
Lib/test/test_re.py
...
...
@@ -605,6 +605,27 @@ class ReTests(unittest.TestCase):
         self.assertEqual(next(iter).span(), (4, 4))
         self.assertRaises(StopIteration, next, iter)
+    def test_bug_6561(self):
+        # '\d' should match characters in Unicode category 'Nd'
+        # (Number, Decimal Digit), but not those in 'Nl' (Number,
+        # Letter) or 'No' (Number, Other).
+        decimal_digits = [
+            '\u0037', # '\N{DIGIT SEVEN}', category 'Nd'
+            '\u0e58', # '\N{THAI DIGIT SIX}', category 'Nd'
+            '\uff10', # '\N{FULLWIDTH DIGIT ZERO}', category 'Nd'
+            ]
+        for x in decimal_digits:
+            self.assertEqual(re.match('^\d$', x).group(0), x)
+        not_decimal_digits = [
+            '\u2165', # '\N{ROMAN NUMERAL SIX}', category 'Nl'
+            '\u3039', # '\N{HANGZHOU NUMERAL TWENTY}', category 'Nl'
+            '\u2082', # '\N{SUBSCRIPT TWO}', category 'No'
+            '\u32b4', # '\N{CIRCLED NUMBER THIRTY NINE}', category 'No'
+            ]
+        for x in not_decimal_digits:
+            self.assertIsNone(re.match('^\d$', x))
     def test_empty_array(self):
         # SF buf 1647541
         import array
...
...
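The categories quoted in the test comments can be cross-checked with the standard `unicodedata` module. The sketch below is illustrative only: the `classify` helper and the `samples` list are hypothetical names, and the exact results depend on the Unicode database shipped with the interpreter.

```python
import re
import unicodedata

# The seven code points used by test_bug_6561.
samples = ["\u0037", "\u0e58", "\uff10", "\u2165", "\u3039", "\u2082", "\u32b4"]


def classify(ch):
    """Return (general category, whether \\d matches) for one character."""
    return unicodedata.category(ch), re.fullmatch(r"\d", ch) is not None


for ch in samples:
    category, matched = classify(ch)
    # Print the code point, its official name, its category, and the match result.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} {category} matched={matched}")
```

With this change applied, `matched` should be true exactly when the category is `Nd`. One detail the output makes visible: `unicodedata.name` reports U+0E58 as THAI DIGIT EIGHT, so the inline comment in the test appears to mislabel that character, although its category is still `Nd` and the test's intent is unaffected.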
Misc/NEWS
...
...
@@ -108,6 +108,10 @@ Library
 Extension Modules
 -----------------
+- Issue #6561: '\d' in a regex now matches only characters with
+  Unicode category 'Nd' (Number, Decimal Digit). Previously it also
+  matched characters with category 'No'.
 - Issue #4509: Array objects are no longer modified after an operation
   failing due to the resize restriction in-place when the object has exported
   buffers.
...
...
Modules/_sre.c
...
...
@@ -168,7 +168,7 @@ static unsigned int sre_lower_locale(unsigned int ch)
 #if defined(HAVE_UNICODE)
-#define SRE_UNI_IS_DIGIT(ch) Py_UNICODE_ISDIGIT((Py_UNICODE)(ch))
+#define SRE_UNI_IS_DIGIT(ch) Py_UNICODE_ISDECIMAL((Py_UNICODE)(ch))
 #define SRE_UNI_IS_SPACE(ch) Py_UNICODE_ISSPACE((Py_UNICODE)(ch))
 #define SRE_UNI_IS_LINEBREAK(ch) Py_UNICODE_ISLINEBREAK((Py_UNICODE)(ch))
 #define SRE_UNI_IS_ALNUM(ch) Py_UNICODE_ISALNUM((Py_UNICODE)(ch))
...
...
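The effect of swapping the macro can be approximated from Python, on the assumption that `str.isdigit` and `str.isdecimal` apply the same character tests as `Py_UNICODE_ISDIGIT` and `Py_UNICODE_ISDECIMAL` respectively. The function names below are hypothetical and exist only for illustration.

```python
def old_digit_test(ch):
    """Approximate the pre-change test: digit property (broader than Nd)."""
    return ch.isdigit()


def new_digit_test(ch):
    """Approximate the post-change test: decimal property (category Nd)."""
    return ch.isdecimal()


subscript_two = "\u2082"  # SUBSCRIPT TWO, category No
thai_digit = "\u0e58"     # a Thai digit, category Nd

# A character that the old test accepted and the new test rejects.
print(old_digit_test(subscript_two), new_digit_test(subscript_two))

# A decimal digit passes both tests.
print(old_digit_test(thai_digit), new_digit_test(thai_digit))
```

Under that assumption, SUBSCRIPT TWO is the kind of character whose behaviour changes with this commit, while decimal digits in any script are matched both before and after.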