Commit ddbdc9a6, authored Oct 06, 2013 by Georg Brandl
Closes #15956: improve documentation of named groups and how to reference them.
parent bcc55d69
Showing 1 changed file with 28 additions and 15 deletions
Doc/library/re.rst (+28 -15)
@@ -237,21 +237,32 @@ The special characters are:
 
 ``(?P<name>...)``
    Similar to regular parentheses, but the substring matched by the group is
-   accessible within the rest of the regular expression via the symbolic group
-   name *name*. Group names must be valid Python identifiers, and each group
-   name must be defined only once within a regular expression. A symbolic group
-   is also a numbered group, just as if the group were not named. So the group
-   named ``id`` in the example below can also be referenced as the numbered group
-   ``1``.
-
-   For example, if the pattern is ``(?P<id>[a-zA-Z_]\w*)``, the group can be
-   referenced by its name in arguments to methods of match objects, such as
-   ``m.group('id')`` or ``m.end('id')``, and also by name in the regular
-   expression itself (using ``(?P=id)``) and replacement text given to
-   ``.sub()`` (using ``\g<id>``).
+   accessible via the symbolic group name *name*. Group names must be valid
+   Python identifiers, and each group name must be defined only once within a
+   regular expression. A symbolic group is also a numbered group, just as if
+   the group were not named.
+
+   Named groups can be referenced in three contexts. If the pattern is
+   ``(?P<quote>['"]).*?(?P=quote)`` (i.e. matching a string quoted with either
+   single or double quotes):
+
+   +---------------------------------------+----------------------------------+
+   | Context of reference to group "quote" | Ways to reference it             |
+   +=======================================+==================================+
+   | in the same pattern itself            | * ``(?P=quote)`` (as shown)      |
+   |                                       | * ``\1``                         |
+   +---------------------------------------+----------------------------------+
+   | when processing match object ``m``    | * ``m.group('quote')``           |
+   |                                       | * ``m.end('quote')`` (etc.)      |
+   +---------------------------------------+----------------------------------+
+   | in a string passed to the ``repl``    | * ``\g<quote>``                  |
+   | argument of ``re.sub()``              | * ``\g<1>``                      |
+   |                                       | * ``\1``                         |
+   +---------------------------------------+----------------------------------+
 
 ``(?P=name)``
-   Matches whatever text was matched by the earlier group named *name*.
+   A backreference to a named group; it matches whatever text was matched by the
+   earlier group named *name*.
 
 ``(?#...)``
    A comment; the contents of the parentheses are simply ignored.
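Review note: the three reference contexts listed in the new table can be exercised with a short Python sketch. The pattern is the one quoted in the added documentation; the sample `text` and all variable names are illustrative assumptions, not part of the commit.

```python
import re

# Pattern from the added documentation: a string quoted with matching quotes.
pattern = re.compile(r"""(?P<quote>['"]).*?(?P=quote)""")

# Hypothetical sample input, chosen only for illustration.
text = """say "hello" or 'bye'"""

# Context 1: inside the pattern itself, (?P=quote) and \1 refer to the same group.
numbered = re.compile(r"""(?P<quote>['"]).*?\1""")
by_name = [m.group(0) for m in pattern.finditer(text)]
by_number = [m.group(0) for m in numbered.finditer(text)]

# Context 2: on a match object, the group is reachable by name or by number.
m = pattern.search(text)
quote_char = m.group('quote')
same_char = m.group(1)
quote_end = m.end('quote')

# Context 3: in a string repl, \g<quote>, \g<1> and \1 all name the same group.
wrapped_name = pattern.sub(r"[\g<quote>]", text)
wrapped_g_num = pattern.sub(r"[\g<1>]", text)
wrapped_num = pattern.sub(r"[\1]", text)
```

Both compiled patterns find the same two quoted substrings, and the three replacement spellings produce identical output, which is the equivalence the table is meant to convey.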
@@ -331,7 +342,8 @@ the second character. For example, ``\$`` matches the character ``'$'``.
    depends on the values of the ``UNICODE`` and ``LOCALE`` flags.
    For example, ``r'\bfoo\b'`` matches ``'foo'``, ``'foo.'``, ``'(foo)'``,
    ``'bar foo baz'`` but not ``'foobar'`` or ``'foo3'``.
-   Inside a character range, ``\b`` represents the backspace character, for compatibility with Python's string literals.
+   Inside a character range, ``\b`` represents the backspace character, for
+   compatibility with Python's string literals.
 
 ``\B``
    Matches the empty string, but only when it is *not* at the beginning or end of a
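Review note: this hunk only rewraps the ``\b`` sentence, but the two meanings of ``\b`` it describes are easy to confuse. A minimal sketch follows; the sample strings for the word-boundary case are taken from the documentation text, and the variable names are assumptions made for illustration.

```python
import re

# Outside a character class, \b is a zero-width word boundary.
word = re.compile(r'\bfoo\b')
samples = ['foo', 'foo.', '(foo)', 'bar foo baz', 'foobar', 'foo3']
matched = [s for s in samples if word.search(s)]

# Inside a character class, \b stands for the backspace character (0x08).
backspace = re.compile(r'[\b]')
finds_backspace = backspace.search('a\x08c') is not None
finds_boundary = backspace.search('foo bar') is not None
```

Here `matched` keeps the first four samples and drops `'foobar'` and `'foo3'`, while `[\b]` matches only where a literal backspace is present.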
@@ -642,7 +654,8 @@ form.
    when not adjacent to a previous match, so ``sub('x*', '-', 'abc')`` returns
    ``'-a-b-c-'``.
 
-   In addition to character escapes and backreferences as described above,
+   In string-type *repl* arguments, in addition to the character escapes and
+   backreferences described above,
    ``\g<name>`` will use the substring matched by the group named ``name``, as
    defined by the ``(?P<name>...)`` syntax. ``\g<number>`` uses the corresponding
    group number; ``\g<2>`` is therefore equivalent to ``\2``, but isn't ambiguous
...
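Review note: the ``\g<name>`` and ``\g<number>`` forms described in this hunk can be sketched as follows. The key/value pattern and the input string are hypothetical. The last step reflects my understanding of how CPython reads ``\20`` in a replacement string (as a reference to group 20); the diff context is cut off before the documentation spells that out, so treat it as an assumption.

```python
import re

# Hypothetical pattern with two named groups, used only for illustration.
pattern = re.compile(r'(?P<key>\w+)=(?P<value>\w+)')
source = 'colour=red'

# \g<name> and \g<number> select the same groups in a string repl.
swapped = pattern.sub(r'\g<value>=\g<key>', source)
swapped_num = pattern.sub(r'\g<2>=\g<1>', source)

# \g<2>0 is unambiguous: group 2 followed by a literal '0'.
padded = pattern.sub(r'\g<2>0', source)

# Assumption: \20 is read as a reference to group 20, which this
# pattern does not define, so the substitution is rejected.
try:
    pattern.sub(r'\20', source)
    group20_rejected = False
except re.error:
    group20_rejected = True
```

The ``\g<2>0`` spelling is the reason the bracketed form exists alongside plain ``\2``: it lets a digit follow a group reference without changing which group is meant.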