Commit 8663e346 authored Jan 10, 2011 by Terry Reedy
Issue #10875: Update Regular Expression HOWTO; patch by 'SilentGhost'.
parent c842a17d
Showing 2 changed files with 18 additions and 28 deletions
Doc/howto/regex.rst +16 -28
Misc/NEWS +2 -0
Doc/howto/regex.rst
@@ -5,7 +5,6 @@
****************************
:Author: A.M. Kuchling <amk@amk.ca>
-:Release: 0.05
.. TODO:
Document lookbehind assertions
@@ -24,11 +23,6 @@
Introduction
============
-The :mod:`re` module was added in Python 1.5, and provides Perl-style regular
-expression patterns. Earlier versions of Python came with the :mod:`regex`
-module, which provided Emacs-style patterns. The :mod:`regex` module was
-removed completely in Python 2.5.
Regular expressions (called REs, or regexes, or regex patterns) are essentially
a tiny, highly specialized programming language embedded inside Python and made
available through the :mod:`re` module. Using this little language, you specify
@@ -264,7 +258,7 @@ performing string substitutions. ::
>>> import re
>>> p = re.compile('ab*')
>>> p
-<_sre.SRE_Pattern object at 80b4150>
+<_sre.SRE_Pattern object at 0x...>
:func:`re.compile` also accepts an optional *flags* argument, used to enable
various special features and syntax variations. We'll go over the available
@@ -362,8 +356,8 @@ information about the match: where it starts and ends, the substring it matched,
and more.
You can learn about this by interactively experimenting with the :mod:`re`
-module. If you have Tkinter available, you may also want to look at
-:file:`Tools/scripts/redemo.py`, a demonstration program included with the
+module. If you have :mod:`tkinter` available, you may also want to look at
+:file:`Tools/demo/redemo.py`, a demonstration program included with the
Python distribution. It allows you to enter REs and strings, and displays
whether the RE matches or fails. :file:`redemo.py` can be quite useful when
trying to debug a complicated RE. Phil Schwartz's `Kodos
@@ -373,11 +367,10 @@ testing RE patterns.
This HOWTO uses the standard Python interpreter for its examples. First, run the
Python interpreter, import the :mod:`re` module, and compile a RE::
-Python 2.2.2 (#1, Feb 10 2003, 12:57:01)
>>> import re
>>> p = re.compile('[a-z]+')
>>> p
-<_sre.SRE_Pattern object at 80c3c28>
+<_sre.SRE_Pattern object at 0x...>
Now, you can try matching various strings against the RE ``[a-z]+``. An empty
string shouldn't match at all, since ``+`` means 'one or more repetitions'.
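Review note: the behaviour described in this hunk can be checked without looking at the object repr that this commit edits. The following is a minimal sketch; the variable names are illustrative and not part of the HOWTO.

```python
import re

# Compile the same pattern the HOWTO uses in this section.
p = re.compile('[a-z]+')

# '+' means one or more repetitions, so the empty string cannot match.
empty_result = p.match('')

# A lowercase word matches from the start of the string.
m = p.match('tempo')
matched_text = m.group()
matched_span = m.span()

print(empty_result)   # None
print(matched_text)   # tempo
print(matched_span)   # (0, 5)
```

Checking `group()` and `span()` rather than the printed repr keeps the example stable across interpreter versions, since the repr is an implementation detail.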
@@ -395,7 +388,7 @@ result in a variable for later use. ::
>>> m = p.match('tempo')
>>> m
-<_sre.SRE_Match object at 80c4f68>
+<_sre.SRE_Match object at 0x...>
Now you can query the :class:`MatchObject` for information about the matching
string. :class:`MatchObject` instances also have several methods and
@@ -434,7 +427,7 @@ case. ::
>>> print(p.match('::: message'))
None
>>> m = p.search('::: message') ; print(m)
-<re.MatchObject instance at 80c9650>
+<_sre.SRE_Match object at 0x...>
>>> m.group()
'message'
>>> m.span()
@@ -459,11 +452,11 @@ Two pattern methods return all of the matches for a pattern.
:meth:`findall` has to create the entire list before it can be returned as the
result. The :meth:`finditer` method returns a sequence of :class:`MatchObject`
-instances as an :term:`iterator`. [#]_ ::
+instances as an :term:`iterator`::
>>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
>>> iterator
-<callable-iterator object at 0x401833ac>
+<callable_iterator object at 0x...>
>>> for match in iterator:
... print(match.span())
...
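Review note: a small sketch of the `finditer` versus `findall` contrast this hunk touches. The pattern `r'\d+'` is an assumption, since the line that defines `p` falls outside the hunk; the other names are illustrative.

```python
import re

# Assumed pattern: one or more digits (the definition of p is not in this hunk).
p = re.compile(r'\d+')
text = '12 drummers drumming, 11 ... 10 ...'

# findall() builds the whole list of matched strings up front.
all_numbers = p.findall(text)

# finditer() yields match objects one at a time.
iterator = p.finditer(text)
spans = [match.span() for match in iterator]

# The iterator is exhausted after one pass.
leftover = list(iterator)

print(all_numbers)  # ['12', '11', '10']
print(spans)        # [(0, 2), (22, 24), (29, 31)]
print(leftover)     # []
```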
@@ -485,7 +478,7 @@ the RE string added as the first argument, and still return either ``None`` or a
>>> print(re.match(r'From\s+', 'Fromage amk'))
None
>>> re.match(r'From\s+', 'From amk Thu May 14 19:12:10 1998')
-<re.MatchObject instance at 80c5978>
+<_sre.SRE_Match object at 0x...>
Under the hood, these functions simply create a pattern object for you
and call the appropriate method on it. They also store the compiled object in a
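Review note: the equivalence between the module-level functions and the compiled-pattern methods can be sketched as follows. The input strings come from the hunk; the variable names are illustrative.

```python
import re

line = 'From amk Thu May 14 19:12:10 1998'

# Module-level call: the RE string is passed as the first argument.
m1 = re.match(r'From\s+', line)

# Explicit form: compile the pattern first, then call the method on it.
pattern = re.compile(r'From\s+')
m2 = pattern.match(line)

# 'Fromage' has no whitespace directly after 'From', so there is no match.
no_match = re.match(r'From\s+', 'Fromage amk')

print(m1.span(), m2.span())  # (0, 5) (0, 5)
print(no_match)              # None
```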
@@ -687,7 +680,7 @@ given location, they can obviously be matched an infinite number of times.
line, the RE to use is ``^From``. ::
>>> print(re.search('^From', 'From Here to Eternity'))
-<re.MatchObject instance at 80c1520>
+<_sre.SRE_Match object at 0x...>
>>> print(re.search('^From', 'Reciting From Memory'))
None
@@ -699,11 +692,11 @@ given location, they can obviously be matched an infinite number of times.
or any location followed by a newline character. ::
>>> print(re.search('}$', '{block}'))
-<re.MatchObject instance at 80adfa8>
+<_sre.SRE_Match object at 0x...>
>>> print(re.search('}$', '{block} '))
None
>>> print(re.search('}$', '{block}\n'))
-<re.MatchObject instance at 80adfa8>
+<_sre.SRE_Match object at 0x...>
To match a literal ``'$'``, use ``\$`` or enclose it inside a character class,
as in ``[$]``.
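Review note: a minimal sketch of the `$` behaviour in this hunk, checked through `span()` instead of the repr. The string `'cost: $5'` is a made-up input used only to illustrate matching a literal dollar sign.

```python
import re

# '$' matches at the end of the string...
end_match = re.search('}$', '{block}')
# ...but not when a trailing space follows the brace...
trailing_space = re.search('}$', '{block} ')
# ...and it also matches just before a final newline.
before_newline = re.search('}$', '{block}\n')

# Two ways to match a literal dollar sign (hypothetical input string).
escaped = re.search(r'\$', 'cost: $5')
in_class = re.search('[$]', 'cost: $5')

print(end_match.span())       # (6, 7)
print(trailing_space)         # None
print(before_newline.span())  # (6, 7)
print(escaped.span(), in_class.span())  # (6, 7) (6, 7)
```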
@@ -728,7 +721,7 @@ given location, they can obviously be matched an infinite number of times.
>>> p = re.compile(r'\bclass\b')
>>> print(p.search('no class at all'))
-<re.MatchObject instance at 80c8f28>
+<_sre.SRE_Match object at 0x...>
>>> print(p.search('the declassified algorithm'))
None
>>> print(p.search('one subclass is'))
@@ -746,7 +739,7 @@ given location, they can obviously be matched an infinite number of times.
>>> print(p.search('no class at all'))
None
>>> print(p.search('\b' + 'class' + '\b') )
-<re.MatchObject instance at 80c3ee0>
+<_sre.SRE_Match object at 0x...>
Second, inside a character class, where there's no use for this assertion,
``\b`` represents the backspace character, for compatibility with Python's
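Review note: the raw-string pitfall around `\b` that the surrounding hunks illustrate can be sketched like this. The variable names are illustrative; the inputs mirror the ones in the diff, apart from `'a\bc'`, which is a made-up string.

```python
import re

# Raw string: the regex engine sees backslash + 'b', a word-boundary assertion.
raw = re.compile(r'\bclass\b')
# Plain string: Python converts '\b' to a backspace character (\x08) first.
not_raw = re.compile('\bclass\b')

word_hit = raw.search('no class at all')
inside_word = raw.search('the declassified algorithm')
plain_miss = not_raw.search('no class at all')
backspace_hit = not_raw.search('\b' + 'class' + '\b')

# Inside a character class, \b stands for backspace even in a raw string.
class_hit = re.search(r'[\b]', 'a\bc')

print(word_hit.span())       # (3, 8)
print(inside_word)           # None
print(plain_miss)            # None
print(backspace_hit.span())  # (0, 7)
print(class_hit.span())      # (1, 2)
```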
@@ -1316,8 +1309,8 @@ a regular expression that handles all of the possible cases, the patterns will
be *very* complicated. Use an HTML or XML parser module for such tasks.)
-Not Using re.VERBOSE
---------------------
+Using re.VERBOSE
+----------------
By now you've probably noticed that regular expressions are a very compact
notation, but they're not terribly readable. REs of moderate complexity can
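Review note: for readers of this hunk, a minimal sketch of what the renamed section is about. The `name = number` pattern is a made-up illustration, not the example used in the HOWTO.

```python
import re

# Compact form: correct, but the parts are hard to tell apart.
compact = re.compile(r'\s*(\w+)\s*=\s*(\d+)')

# Same pattern with re.VERBOSE: whitespace in the pattern is ignored
# and '#' starts a comment, so each piece can be documented.
verbose = re.compile(r"""
    \s*          # optional leading whitespace
    (\w+)        # group 1: a name
    \s* = \s*    # equals sign with optional whitespace around it
    (\d+)        # group 2: a decimal value
""", re.VERBOSE)

compact_groups = compact.match(' width = 42').groups()
verbose_groups = verbose.match(' width = 42').groups()

print(compact_groups)  # ('width', '42')
print(verbose_groups)  # ('width', '42')
```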
@@ -1366,8 +1359,3 @@ reference for programming in Python. (The first edition covered Python's
now-removed :mod:`regex` module, which won't help you much.) Consider checking
it out from your library.
-.. rubric:: Footnotes
-.. [#] Introduced in Python 2.2.2.
Misc/NEWS
@@ -31,6 +31,8 @@ Core and Builtins
Library
-------
+- Issue #10875: Update Regular Expression HOWTO; patch by 'SilentGhost'.
- Issue #10869: Fixed bug where ast.increment_lineno modified the root
node twice.