bpo-34636: Use fast path for more chars in SRE category macros. (GH-9170)

When handling \s, \d, or \w (and their inverse) escapes in bytes regexes, this is a small but measurable performance improvement.  https://bugs.python.org/issue34636
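
Why tightening the bound is safe can be checked by brute force. The sketch below is not part of the commit: `is_digit8`, `is_space8`, and `is_alnum8` are hypothetical stand-ins for `Py_ISDIGIT`, `Py_ISSPACE`, and `Py_ISALNUM`, written on the assumption that the real macros only consult the low 8 bits of their argument, and the `OLD_*`/`NEW_*` macros mirror the two versions of the `SRE_IS_*` macros in the diff. It assumes an ASCII execution character set, where no digit sorts above `'9'`, no whitespace above `' '`, and no word character above `'z'`.

```c
#include <assert.h>

/* Hypothetical stand-ins for Py_ISDIGIT, Py_ISSPACE and Py_ISALNUM.
   They look only at the low 8 bits of ch (assumed to match the real
   macros), which is why a range guard is needed in front of them. */
static int is_digit8(unsigned int ch)
{
    ch &= 0xff;
    return ch >= '0' && ch <= '9';
}

static int is_space8(unsigned int ch)
{
    ch &= 0xff;
    return ch == ' ' || (ch >= '\t' && ch <= '\r');
}

static int is_alnum8(unsigned int ch)
{
    ch &= 0xff;
    return (ch >= '0' && ch <= '9') ||
           (ch >= 'A' && ch <= 'Z') ||
           (ch >= 'a' && ch <= 'z');
}

/* Guards as they were before the change: reject only non-ASCII values. */
#define OLD_IS_DIGIT(ch) ((ch) < 128 && is_digit8(ch))
#define OLD_IS_SPACE(ch) ((ch) < 128 && is_space8(ch))
#define OLD_IS_WORD(ch)  ((ch) < 128 && (is_alnum8(ch) || (ch) == '_'))

/* Guards after the change: reject everything above the largest member
   of each class, so more characters short-circuit before the lookup. */
#define NEW_IS_DIGIT(ch) ((ch) <= '9' && is_digit8(ch))
#define NEW_IS_SPACE(ch) ((ch) <= ' ' && is_space8(ch))
#define NEW_IS_WORD(ch)  ((ch) <= 'z' && (is_alnum8(ch) || (ch) == '_'))

/* Count code points on which the old and new guards disagree. */
unsigned long count_mismatches(void)
{
    unsigned long mismatches = 0;
    for (unsigned int ch = 0; ch <= 0x10FFFF; ch++) {
        if (OLD_IS_DIGIT(ch) != NEW_IS_DIGIT(ch)) mismatches++;
        if (OLD_IS_SPACE(ch) != NEW_IS_SPACE(ch)) mismatches++;
        if (OLD_IS_WORD(ch)  != NEW_IS_WORD(ch))  mismatches++;
    }
    return mismatches;
}

/* Aborts if the two sets of guards ever classify a value differently. */
void check_fast_path_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

Calling `check_fast_path_equivalence()` from a `main` function exercises every value from 0 through 0x10FFFF. The result only shows that the classification is unchanged under these assumptions; whether the tighter guard is actually faster depends on the input and has to be measured separately.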

Commit ec014a10 authored Sep 12, 2018 by Sergey Fedoseev Committed by Miss Islington (bot) Sep 11, 2018

Showing 2 changed files with 5 additions and 3 deletions

Misc/NEWS.d/next/Library/2018-09-11-15-04-05.bpo-34636.capCmt.rst +2 -0

Modules/_sre.c +3 -3

--- a/Misc/NEWS.d/next/Library/2018-09-11-15-04-05.bpo-34636.capCmt.rst
+++ b/Misc/NEWS.d/next/Library/2018-09-11-15-04-05.bpo-34636.capCmt.rst
+Speed up re scanning of many non-matching characters for \s \w and \d within
+bytes objects. (microoptimization)
--- a/Modules/_sre.c
+++ b/Modules/_sre.c
@@ -87,13 +87,13 @@ static const char copyright[] =
 /* search engine state */

 #define SRE_IS_DIGIT(ch)\
-    ((ch) < 128 && Py_ISDIGIT(ch))
+    ((ch) <= '9' && Py_ISDIGIT(ch))
 #define SRE_IS_SPACE(ch)\
-    ((ch) < 128 && Py_ISSPACE(ch))
+    ((ch) <= ' ' && Py_ISSPACE(ch))
 #define SRE_IS_LINEBREAK(ch)\
    ((ch) == '\n')
 #define SRE_IS_WORD(ch)\
-    ((ch) < 128 && (Py_ISALNUM(ch) || (ch) == '_'))
+    ((ch) <= 'z' && (Py_ISALNUM(ch) || (ch) == '_'))

 static unsigned int sre_lower_ascii(unsigned int ch)
 {