Commit 961c2882 authored by Fred Drake

document the exceptions raised by sgmllib, htmllib, and HTMLParser

parent a2544ee7
@@ -35,8 +35,8 @@ The interface to feed data to an instance is through the \method{feed()}
 method, which takes a string argument. This can be called with as
 little or as much text at a time as desired; \samp{p.feed(a);
 p.feed(b)} has the same effect as \samp{p.feed(a+b)}. When the data
-contains complete HTML tags, these are processed immediately;
-incomplete elements are saved in a buffer. To force processing of all
+contains complete HTML markup constructs, these are processed immediately;
+incomplete constructs are saved in a buffer. To force processing of all
 unprocessed data, call the \method{close()} method.
 For example, to parse the entire contents of a file, use:
@@ -60,7 +60,7 @@ should define the \method{do_\var{tag}()} method.
 \end{itemize}
-The module defines a single class:
+The module defines a parser class and an exception:
 \begin{classdesc}{HTMLParser}{formatter}
 This is the basic HTML parser class. It supports all entity names
@@ -68,6 +68,12 @@ required by the XHTML 1.0 Recommendation (\url{http://www.w3.org/TR/xhtml1}).
 It also defines handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements.
 \end{classdesc}
+\begin{excdesc}{HTMLParseError}
+Exception raised by the \class{HTMLParser} class when it encounters an
+error while parsing.
+\versionadded{2.4}
+\end{excdesc}
 \begin{seealso}
 \seemodule{formatter}{Interface definition for transforming an
@@ -118,7 +124,8 @@ implementation adds a textual footnote marker using an index into the
 list of hyperlinks created by \method{anchor_bgn()}.
 \end{methoddesc}
-\begin{methoddesc}{handle_image}{source, alt\optional{, ismap\optional{, align\optional{, width\optional{, height}}}}}
+\begin{methoddesc}{handle_image}{source, alt\optional{, ismap\optional{,
+align\optional{, width\optional{, height}}}}}
 This method is called to handle images. The default implementation
 simply passes the \var{alt} value to the \method{handle_data()}
 method.
......
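The chunked-feeding behaviour described in the first hunk above (p.feed(a); p.feed(b) having the same effect as p.feed(a+b), with incomplete constructs held in a buffer) can be sketched as follows. This is an illustration under stated assumptions, not the module's own example: htmllib is not available in current Python releases, so the sketch uses Python 3's html.parser.HTMLParser as a stand-in, and EventRecorder and parse_in_chunks are hypothetical names introduced here. The input is split inside a start tag, which is the buffered case the paragraph describes.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: records every handler call as a tuple."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def parse_in_chunks(chunks):
    """Feed each chunk in turn, then close() to flush buffered input."""
    parser = EventRecorder()
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return parser.events


# One call with the whole document versus two calls split mid-tag.
whole = parse_in_chunks(['<p class="x">hi</p>'])
split = parse_in_chunks(['<p cla', 'ss="x">hi</p>'])
print(whole == split)
```

The comparison is deliberately limited to a split inside a tag; how a parser delivers a run of text that is split across calls is not covered by this sketch.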
@@ -4,6 +4,8 @@
 \declaremodule{standard}{HTMLParser}
 \modulesynopsis{A simple parser that can handle HTML and XHTML.}
+\versionadded{2.2}
 This module defines a class \class{HTMLParser} which serves as the
 basis for parsing text files formatted in HTML\index{HTML} (HyperText
 Mark-up Language) and XHTML.\index{XHTML} Unlike the parser in
@@ -23,6 +25,17 @@ that end tags match start tags or call the end-tag handler for
 elements which are closed implicitly by closing an outer element.
 \end{classdesc}
+An exception is defined as well:
+\begin{excdesc}{HTMLParseError}
+Exception raised by the \class{HTMLParser} class when it encounters an
+error while parsing. This exception provides three attributes:
+\member{msg} is a brief message explaining the error, \member{lineno}
+is the number of the line on which the broken construct was detected,
+and \member{offset} is the number of characters into the line at which
+the construct starts.
+\end{excdesc}
 \class{HTMLParser} instances have the following methods:
......
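The three attributes documented for HTMLParseError in the hunk above (msg, lineno, offset) are enough to build a compact error report. The sketch below is hedged accordingly: the HTMLParser module's exception is not importable in current Python releases, so DemoParseError is a hypothetical stand-in that only mirrors the attribute names from the text, and format_parse_error, the "<input>" label, and the sample message are likewise invented for illustration. The text does not say whether lineno and offset count from 0 or 1, so the sketch prints them as given.

```python
class DemoParseError(Exception):
    """Hypothetical stand-in mirroring the documented attributes."""

    def __init__(self, msg, lineno, offset):
        super().__init__(msg)
        self.msg = msg        # brief message explaining the error
        self.lineno = lineno  # line on which the broken construct was detected
        self.offset = offset  # characters into the line where the construct starts


def format_parse_error(exc, source_name="<input>"):
    """Build a one-line report from the three attributes."""
    return "%s: line %d, offset %d: %s" % (
        source_name, exc.lineno, exc.offset, exc.msg)


# Simulate a parser failure and report it the way a caller might.
try:
    raise DemoParseError("malformed start tag", 3, 7)
except DemoParseError as exc:
    report = format_parse_error(exc)
    print(report)
```

Code written against the real exception would catch it around the feed() and close() calls and read the same three attributes.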
@@ -14,7 +14,6 @@ only exists as a base for the \refmodule{htmllib} module. Another
 HTML parser which supports XHTML and offers a somewhat different
 interface is available in the \refmodule{HTMLParser} module.
 \begin{classdesc}{SGMLParser}{}
 The \class{SGMLParser} class is instantiated without arguments.
 The parser is hardcoded to recognize the following
@@ -40,7 +39,16 @@ spaces, tabs, and newlines are allowed between the trailing
 \end{itemize}
 \end{classdesc}
-\class{SGMLParser} instances have the following interface methods:
+A single exception is defined as well:
+\begin{excdesc}{SGMLParseError}
+Exception raised by the \class{SGMLParser} class when it encounters an
+error while parsing.
+\versionadded{2.1}
+\end{excdesc}
+\class{SGMLParser} instances have the following methods:
 \begin{methoddesc}{reset}{}
......