Commit 25211f57 authored by Fred Drake

Added more information on the differences between the htmllib and HTMLParser modules.
parent 5fe2c139
@@ -70,6 +70,12 @@ handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements.
\begin{seealso}
\seemodule{HTMLParser}{Alternate HTML parser that offers a slightly
lower-level view of the input, but is
designed to work with XHTML, and does not
implement some of the SGML syntax not used in
``HTML as deployed'' and which isn't legal
for XHTML.}
\seemodule{htmlentitydefs}{Definition of replacement text for HTML
2.0 entities.}
\seemodule{sgmllib}{Base class for \class{HTMLParser}.}
......
@@ -6,7 +6,9 @@
This module defines a class \class{HTMLParser} which serves as the
basis for parsing text files formatted in HTML\index{HTML} (HyperText
Mark-up Language) and XHTML.\index{XHTML} Unlike the parser in
\refmodule{htmllib}, this parser is not based on the SGML parser in
\refmodule{sgmllib}.
\begin{classdesc}{HTMLParser}{}
@@ -15,6 +17,10 @@ The \class{HTMLParser} class is instantiated without arguments.
An HTMLParser instance is fed HTML data and calls handler functions
when tags begin and end. The \class{HTMLParser} class is meant to be
overridden by the user to provide a desired behavior.
Unlike the parser in \refmodule{htmllib}, this parser does not check
that end tags match start tags or call the end-tag handler for
elements which are closed implicitly by closing an outer element.
\end{classdesc}
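As a rough illustration of the behavior described in the added paragraph, and not part of the commit itself, the sketch below records handler calls for markup with an implicitly closed element and for a mismatched end tag. It assumes Python 3, where this parser is available as `html.parser.HTMLParser`; `EventRecorder` and `record_events` are hypothetical names introduced for the example.

```python
# Assumption: Python 3, where the HTMLParser module documented here
# lives at html.parser. Names below are illustrative only.
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical subclass that records each handler call in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def record_events(markup):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = EventRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


# The <li> elements are never closed explicitly; </ul> closes them implicitly.
implicit = record_events("<ul><li>one<li>two</ul>")

# The end tag does not match the open element.
mismatched = record_events("<b>text</i>")

print(implicit)
print(mismatched)
```

In this sketch no `("end", "li")` event is recorded for either list item, and the `</i>` end tag is reported even though the open element is `b`. Any check that tags balance would have to be implemented by the subclass.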
......
@@ -10,8 +10,9 @@ This module defines a class \class{SGMLParser} which serves as the
basis for parsing text files formatted in SGML (Standard Generalized
Mark-up Language). In fact, it does not provide a full SGML parser
--- it only parses SGML insofar as it is used by HTML, and the module
only exists as a base for the \refmodule{htmllib} module. Another
HTML parser which supports XHTML and offers a somewhat different
interface is available in the \refmodule{HTMLParser} module.
\begin{classdesc}{SGMLParser}{}
......