Commit f99e4b5d authored by Ezio Melotti

Improve HTMLParser example in the doc and fix a couple minor things.

parent f50ffa94
@@ -101,9 +101,9 @@ An exception is defined as well:
 .. method:: HTMLParser.handle_startendtag(tag, attrs)
 
    Similar to :meth:`handle_starttag`, but called when the parser encounters an
-   XHTML-style empty tag (``<a .../>``). This method may be overridden by
+   XHTML-style empty tag (``<img ... />``). This method may be overridden by
    subclasses which require this particular lexical information; the default
-   implementation simple calls :meth:`handle_starttag` and :meth:`handle_endtag`.
+   implementation simply calls :meth:`handle_starttag` and :meth:`handle_endtag`.
 
 
 .. method:: HTMLParser.handle_endtag(tag)
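The default behaviour described in this hunk can be checked with a minimal sketch, assuming Python 3's `html.parser`. The `EventRecorder` class and the `logo.png` attribute value are hypothetical names chosen for illustration; they are not part of the commit or of the library.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper that records handler calls instead of printing."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    # handle_startendtag is deliberately NOT overridden, so the inherited
    # default runs for the XHTML-style empty tag below.


recorder = EventRecorder()
recorder.feed('<img src="logo.png" />')
recorder.close()

print(recorder.events)
# [('start', 'img', [('src', 'logo.png')]), ('end', 'img')]
```

A subclass that needs to tell `<img />` apart from `<img></img>` would override `handle_startendtag` itself rather than rely on this fallback.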
@@ -178,27 +178,23 @@ An exception is defined as well:
 Example HTML Parser Application
 -------------------------------
 
-As a basic example, below is a very basic HTML parser that uses the
-:class:`HTMLParser` class to print out tags as they are encountered::
+As a basic example, below is a simple HTML parser that uses the
+:class:`HTMLParser` class to print out start tags, end tags, and data
+as they are encountered::
 
-   >>> from html.parser import HTMLParser
-   >>>
-   >>> class MyHTMLParser(HTMLParser):
-   ...     def handle_starttag(self, tag, attrs):
-   ...         print("Encountered a {} start tag".format(tag))
-   ...     def handle_endtag(self, tag):
-   ...         print("Encountered a {} end tag".format(tag))
-   ...
-   >>> page = """<html><h1>Title</h1><p>I'm a paragraph!</p></html>"""
-   >>>
-   >>> myparser = MyHTMLParser()
-   >>> myparser.feed(page)
-   Encountered a html start tag
-   Encountered a h1 start tag
-   Encountered a h1 end tag
-   Encountered a p start tag
-   Encountered a p end tag
-   Encountered a html end tag
+   from html.parser import HTMLParser
+
+   class MyHTMLParser(HTMLParser):
+       def handle_starttag(self, tag, attrs):
+           print("Encountered a start tag:", tag)
+       def handle_endtag(self, tag):
+           print("Encountered an end tag:", tag)
+       def handle_data(self, data):
+           print("Encountered some data:", data)
+
+   parser = MyHTMLParser()
+   parser.feed('<html><head><title>Test</title></head>'
+               '<body><h1>Parse me!</h1></body></html>')
 
 
 .. rubric:: Footnotes
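The revised example no longer shows sample output, unlike the doctest-style version it replaces. The sketch below is one way to see what it produces, assuming Python 3's `html.parser`. It keeps the same three handlers and the same input, but collects the messages in a list (the `CollectingParser` name and its `lines` attribute are hypothetical, introduced only so the result can be inspected) before printing them.

```python
from html.parser import HTMLParser


class CollectingParser(HTMLParser):
    """Hypothetical variant of MyHTMLParser that stores messages in a list."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("Encountered a start tag: " + tag)

    def handle_endtag(self, tag):
        self.lines.append("Encountered an end tag: " + tag)

    def handle_data(self, data):
        self.lines.append("Encountered some data: " + data)


collector = CollectingParser()
# Same input as the revised documentation example.
collector.feed('<html><head><title>Test</title></head>'
               '<body><h1>Parse me!</h1></body></html>')
collector.close()

for line in collector.lines:
    print(line)
```

Running this should report twelve events in document order: five start tags, five end tags, and the two text nodes `Test` and `Parse me!`.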