Commit 961c2882 authored Sep 10, 2004 by Fred Drake
document the exceptions raised by sgmllib, htmllib, and HTMLParser
parent a2544ee7
Showing 3 changed files with 34 additions and 6 deletions
Doc/lib/libhtmllib.tex +11 -4
Doc/lib/libhtmlparser.tex +13 -0
Doc/lib/libsgmllib.tex +10 -2
Doc/lib/libhtmllib.tex
@@ -35,8 +35,8 @@ The interface to feed data to an instance is through the \method{feed()}
 method, which takes a string argument. This can be called with as
 little or as much text at a time as desired; \samp{p.feed(a);
 p.feed(b)} has the same effect as \samp{p.feed(a+b)}. When the data
-contains complete HTML tags, these are processed immediately;
-incomplete elements are saved in a buffer. To force processing of all
+contains complete HTML markup constructs, these are processed immediately;
+incomplete constructs are saved in a buffer. To force processing of all
 unprocessed data, call the \method{close()} method.
 For example, to parse the entire contents of a file, use:
@@ -60,7 +60,7 @@ should define the \method{do_\var{tag}()} method.
 \end{itemize}
-The module defines a single class:
+The module defines a parser class and an exception:
 \begin{classdesc}{HTMLParser}{formatter}
 This is the basic HTML parser class. It supports all entity names
@@ -68,6 +68,12 @@ required by the XHTML 1.0 Recommendation (\url{http://www.w3.org/TR/xhtml1}).
 It also defines handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements.
 \end{classdesc}
+\begin{excdesc}{HTMLParseError}
+Exception raised by the \class{HTMLParser} class when it encounters an
+error while parsing.
+\versionadded{2.4}
+\end{excdesc}
 \begin{seealso}
 \seemodule{formatter}{Interface definition for transforming an
@@ -118,7 +124,8 @@ implementation adds a textual footnote marker using an index into the
 list of hyperlinks created by \method{anchor_bgn()}.
 \end{methoddesc}
-\begin{methoddesc}{handle_image}{source, alt\optional{, ismap\optional{, align\optional{, width\optional{, height}}}}}
+\begin{methoddesc}{handle_image}{source, alt\optional{, ismap\optional{,
+align\optional{, width\optional{, height}}}}}
 This method is called to handle images. The default implementation
 simply passes the \var{alt} value to the \method{handle_data()}
 method.
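The buffering behaviour described in the first hunk above (feeding `a` and then `b` has the same effect as feeding `a+b`) can be checked with a short sketch. Note the assumptions: `htmllib` is not available in Python 3, so this uses Python 3's `html.parser.HTMLParser` as a stand-in for the parsers documented in this commit, and `EventCollector` and `collect` are hypothetical helper names introduced only for illustration.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records every parser callback as a tuple."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def collect(*chunks):
    """Feed each chunk in turn, then call close() to flush buffered input."""
    parser = EventCollector()
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return parser.events


# The split point falls inside the <a ...> tag, so the first feed()
# leaves an incomplete construct in the parser's buffer.
a = '<p>Hello</p><a hre'
b = 'f="x">link</a>'

assert collect(a, b) == collect(a + b)
```

The split is deliberately placed inside a tag. Plain text that is cut across two `feed()` calls may be reported to `handle_data()` in more than one piece, so a comparison like this one is only expected to hold when text runs are not split.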
Doc/lib/libhtmlparser.tex
@@ -4,6 +4,8 @@
 \declaremodule{standard}{HTMLParser}
 \modulesynopsis{A simple parser that can handle HTML and XHTML.}
+\versionadded{2.2}
 This module defines a class \class{HTMLParser} which serves as the
 basis for parsing text files formatted in HTML\index{HTML} (HyperText
 Mark-up Language) and XHTML.\index{XHTML} Unlike the parser in
@@ -23,6 +25,17 @@ that end tags match start tags or call the end-tag handler for
 elements which are closed implicitly by closing an outer element.
 \end{classdesc}
+An exception is defined as well:
+\begin{excdesc}{HTMLParseError}
+Exception raised by the \class{HTMLParser} class when it encounters an
+error while parsing. This exception provides three attributes:
+\member{msg} is a brief message explaining the error,
+\member{lineno} is the number of the line on which the broken construct was detected,
+and \member{offset} is the number of characters into the line at which
+the construct starts.
+\end{excdesc}
 \class{HTMLParser} instances have the following methods:
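The three attributes documented for `HTMLParseError` in the hunk above (`msg`, `lineno`, `offset`) can be illustrated with a small sketch. This is not the library's own exception: later Python 3 releases no longer provide `HTMLParseError`, so the code defines a local stand-in, `ParseErrorSketch`, that carries the same three attributes. `StrictParser`, its `ALLOWED` set, and `first_error` are likewise hypothetical names. The position comes from the real `HTMLParser.getpos()` method of Python 3's `html.parser` module.

```python
from html.parser import HTMLParser


class ParseErrorSketch(Exception):
    """Local stand-in (not a library class) with msg, lineno and offset."""

    def __init__(self, msg, position=(None, None)):
        super().__init__(msg)
        self.msg = msg
        self.lineno, self.offset = position


class StrictParser(HTMLParser):
    """Hypothetical parser that rejects any start tag not in ALLOWED."""

    ALLOWED = {"html", "p"}

    def handle_starttag(self, tag, attrs):
        if tag not in self.ALLOWED:
            # getpos() returns the (lineno, offset) of the current construct.
            raise ParseErrorSketch("unexpected tag <%s>" % tag, self.getpos())


def first_error(text):
    """Return the first ParseErrorSketch raised while parsing, or None."""
    parser = StrictParser()
    try:
        parser.feed(text)
        parser.close()
    except ParseErrorSketch as err:
        return err
    return None


err = first_error("<html>\n  <blink>hi</blink>\n</html>")
print("%s at line %s, offset %s" % (err.msg, err.lineno, err.offset))
```

In this sketch the line number is counted from 1 and the offset from 0, following `getpos()`, so a caller that reports columns to users would typically add one to `offset`.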
Doc/lib/libsgmllib.tex
@@ -14,7 +14,6 @@ only exists as a base for the \refmodule{htmllib} module. Another
 HTML parser which supports XHTML and offers a somewhat different
 interface is available in the \refmodule{HTMLParser} module.
-
 \begin{classdesc}{SGMLParser}{}
 The \class{SGMLParser} class is instantiated without arguments.
 The parser is hardcoded to recognize the following
@@ -40,7 +39,16 @@ spaces, tabs, and newlines are allowed between the trailing
 \end{itemize}
 \end{classdesc}
-\class{SGMLParser} instances have the following interface methods:
+A single exception is defined as well:
+\begin{excdesc}{SGMLParseError}
+Exception raised by the \class{SGMLParser} class when it encounters an
+error while parsing.
+\versionadded{2.1}
+\end{excdesc}
+\class{SGMLParser} instances have the following methods:
 \begin{methoddesc}{reset}{}