Commit 66771422 authored by Cheryl Sabella, committed by Terry Jan Reedy

bpo-32614: Modify re examples to use a raw string to prevent warning (GH-5265)

Modify RE examples in documentation to use raw strings to prevent DeprecationWarning.
Add text to REGEX HOWTO to highlight the deprecation.  Approved by Serhiy Storchaka.
parent bbbcf869
@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
 disadvantage which is the topic of the next section.
+.. _the-backslash-plague:
 The Backslash Plague
 --------------------
@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
 while ``"\n"`` is a one-character string containing a newline. Regular
 expressions will often be written in Python code using this raw string notation.
+In addition, special escape sequences that are valid in regular expressions,
+but not valid as Python string literals, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+which means the sequences will be invalid if raw string notation or escaping
+the backslashes isn't used.
 +-------------------+------------------+
 | Regular String | Raw string |
 +===================+==================+
@@ -457,10 +466,16 @@ In actual programs, the most common style is to store the
 Two pattern methods return all of the matches for a pattern.
 :meth:`~re.Pattern.findall` returns a list of matching strings::
->>> p = re.compile('\d+')
+>>> p = re.compile(r'\d+')
 >>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
 ['12', '11', '10']
+The ``r`` prefix, making the literal a raw string literal, is needed in this
+example because escape sequences in a normal "cooked" string literal that are
+not recognized by Python, as opposed to regular expressions, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:ref:`the-backslash-plague`.
 :meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
 result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 :ref:`match object <match-objects>` instances as an :term:`iterator`::
@@ -1096,11 +1111,11 @@ following calls::
 The module-level function :func:`re.split` adds the RE to be used as the first
 argument, but is otherwise the same. ::
->>> re.split('[\W]+', 'Words, words, words.')
+>>> re.split(r'[\W]+', 'Words, words, words.')
 ['Words', 'words', 'words', '']
->>> re.split('([\W]+)', 'Words, words, words.')
+>>> re.split(r'([\W]+)', 'Words, words, words.')
 ['Words', ', ', 'words', ', ', 'words', '.', '']
->>> re.split('[\W]+', 'Words, words, words.', 1)
+>>> re.split(r'[\W]+', 'Words, words, words.', 1)
 ['Words', 'words, words.']
......
@@ -463,7 +463,7 @@ The string in this example has the number 57 written in both Thai and
 Arabic numerals::
 import re
-p = re.compile('\d+')
+p = re.compile(r'\d+')
 s = "Over \u0e55\u0e57 57 flavours"
 m = p.search(s)
......
@@ -345,7 +345,7 @@ The special characters are:
 This example looks for a word following a hyphen:
->>> m = re.search('(?<=-)\w+', 'spam-egg')
+>>> m = re.search(r'(?<=-)\w+', 'spam-egg')
 >>> m.group(0)
 'egg'
......
Modify RE examples in documentation to use raw strings to prevent
:exc:`DeprecationWarning` and add text to REGEX HOWTO to highlight the
deprecation.
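
For anyone who wants to reproduce the warning this commit avoids, here is a minimal sketch rather than part of the change itself. It assumes a Python version where an unrecognized escape such as ``\d`` in a non-raw literal is still only a warning; the helper name ``escape_warnings`` and the ``<example>`` filename are made up for illustration. The warning is raised when the source is compiled, so the sketch compiles small source snippets and records what is emitted. On Python 3.12 and later the category is ``SyntaxWarning`` instead of ``DeprecationWarning``, so the sketch reports whichever one appears.

```python
import re
import warnings

# Source text for two one-line programs. The outer literals are raw so
# that the inner, non-raw literal reaches compile() unchanged.
cooked_source = r"pattern = '\d+'"   # "\d" is not a Python escape sequence
raw_source = r"pattern = r'\d+'"     # raw literal: backslash left alone


def escape_warnings(source):
    """Compile `source` and return the names of any warning categories raised.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category.__name__ for w in caught]


cooked_warnings = escape_warnings(cooked_source)
raw_warnings = escape_warnings(raw_source)

print("non-raw literal:", cooked_warnings)
print("raw literal:    ", raw_warnings)

# The two fixes named in the HOWTO text, a raw string or doubled
# backslashes, produce the same pattern string.
assert '\\d+' == r'\d+'
print(re.findall(r'\d+', '12 drummers drumming, 11 pipers piping'))
```

The raw-string form is the one applied throughout this commit; doubling the backslashes is equivalent but harder to read in longer patterns.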