Commit
4e28c593
authored
Apr 22, 1999
by
Fred Drake
Markup nits.
Make module references hyperlinks.
parent
b7168c3a
Showing 2 changed files with 45 additions and 46 deletions
Doc/lib/libhtmllib.tex +15 -16
Doc/lib/libsgmllib.tex +30 -30
Doc/lib/libhtmllib.tex
\section
{
\module
{
htmllib
}
---
A parser for HTML documents.
}
\declaremodule
{
standard
}{
htmllib
}
A parser for HTML documents
}
\declaremodule
{
standard
}{
htmllib
}
\modulesynopsis
{
A parser for HTML documents.
}
\index
{
HTML
}
...
...
@@ -17,15 +17,13 @@ in string form via a method, and makes calls to methods of a
other classes in order to add functionality, and allows most of its
methods to be extended or overridden. In turn, this class is derived
from and extends the
\class
{
SGMLParser
}
class defined in module
\module
{
sgmllib
}
\refstmodindex
{
sgmllib
}
. The
\class
{
HTMLParser
}
\refmodule
{
sgmllib
}
\refstmodindex
{
sgmllib
}
. The
\class
{
HTMLParser
}
implementation supports the HTML 2.0 language as described in
\rfc
{
1866
}
. Two implementations of formatter objects are provided in
the
\module
{
formatter
}
\refstmodindex
{
formatter
}
module; refer to the
the
\refmodule
{
formatter
}
\refstmodindex
{
formatter
}
module; refer to the
documentation for that module for information on the formatter
interface.
\index
{
SGML
}
\withsubitem
{
(in module sgmllib)
}{
\ttindex
{
SGMLParser
}}
\index
{
formatter
}
The following is a summary of the interface defined by
\class
{
sgmllib.SGMLParser
}
:
...
...
@@ -49,16 +47,16 @@ parser.close()
\item
The interface to define semantics for HTML tags is very simple: derive
a class and define methods called
\code
{
start
_
\var
{
tag
}
()
}
,
\code
{
end
_
\var
{
tag
}
()
}
, or
\code
{
do
_
\var
{
tag
}
()
}
. The parser will
call these at appropriate moments:
\code
{
start
_
\var
{
tag
}}
or
\code
{
do
_
\var
{
tag
}
()
}
is called when an opening tag of the form
\code
{
<
\var
{
tag
}
...>
}
is encountered;
\code
{
end
_
\var
{
tag
}
()
}
is called
a class and define methods called
\method
{
start
_
\var
{
tag
}
()
}
,
\method
{
end
_
\var
{
tag
}
()
}
, or
\method
{
do
_
\var
{
tag
}
()
}
. The parser will
call these at appropriate moments:
\method
{
start
_
\var
{
tag
}}
or
\method
{
do
_
\var
{
tag
}
()
}
is called when an opening tag of the form
\code
{
<
\var
{
tag
}
...>
}
is encountered;
\method
{
end
_
\var
{
tag
}
()
}
is called
when a closing tag of the form
\code
{
</
\var
{
tag
}
>
}
is encountered. If
an opening tag requires a corresponding closing tag, like
\code
{
<H1>
}
...
\code
{
</H1>
}
, the class should define the
\code
{
start
_
\var
{
tag
}
()
}
...
\code
{
</H1>
}
, the class should define the
\method
{
start
_
\var
{
tag
}
()
}
method; if a tag requires no closing tag, like
\code
{
<P>
}
, the class
should define the
\code
{
do
_
\var
{
tag
}
()
}
method.
should define the
\method
{
do
_
\var
{
tag
}
()
}
method.
\end{itemize}
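The naming convention described in this item can be sketched in a few lines of Python. This is a toy dispatcher written for illustration, not the real \class{HTMLParser}: \code{htmllib} and \code{sgmllib} were removed in Python 3, so the sketch avoids importing them. The class names \code{TinyTagDispatcher} and \code{HeadingLogger}, and the simplified tag regular expression, are assumptions of this example.

```python
import re


class TinyTagDispatcher:
    """Toy stand-in for the start_/end_/do_ naming convention (hypothetical)."""

    # Simplified tag pattern: optional slash, tag name, anything up to '>'.
    TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")

    def __init__(self):
        self.events = []

    def feed(self, text):
        for match in self.TAG_RE.finditer(text):
            closing, tag = match.group(1), match.group(2).lower()
            if closing:
                handler = getattr(self, "end_" + tag, None)
            else:
                # start_<tag> is looked up first, then do_<tag>.
                handler = (getattr(self, "start_" + tag, None)
                           or getattr(self, "do_" + tag, None))
            if handler is not None:
                handler()


class HeadingLogger(TinyTagDispatcher):
    def start_h1(self):          # <H1> needs a closing tag: use start_/end_
        self.events.append("start h1")

    def end_h1(self):
        self.events.append("end h1")

    def do_p(self):              # <P> needs no closing tag: use do_
        self.events.append("do p")


logger = HeadingLogger()
logger.feed("<H1>Title</H1><P>Body text")
print(logger.events)
```

Tags without a matching method are silently skipped in this sketch; the real parser routes them to separate \code{unknown_*} handlers.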
...
...
@@ -90,8 +88,9 @@ affects the operation of \method{handle_data()} and \method{save_end()}.
This method is called at the start of an anchor region. The arguments
correspond to the attributes of the
\code
{
<A>
}
tag with the same
names. The default implementation maintains a list of hyperlinks
(defined by the
\code
{
href
}
attribute) within the document. The list
of hyperlinks is available as the data attribute
\code
{
anchorlist
}
.
(defined by the
\code
{
HREF
}
attribute for
\code
{
<A>
}
tags) within the
document. The list of hyperlinks is available as the data attribute
\member
{
anchorlist
}
.
\end{methoddesc}
\begin{methoddesc}
{
anchor
_
end
}{}
...
...
@@ -115,7 +114,7 @@ nested.
\begin{methoddesc}
{
save
_
end
}{}
Ends buffering character data and returns all data saved since the
preceding call to
\method
{
save
_
bgn()
}
. If the
\code
{
nofill
}
flag is
preceding call to
\method
{
save
_
bgn()
}
. If the
\member
{
nofill
}
flag is
false, whitespace is collapsed to single spaces. A call to this
method without a preceding call to
\method
{
save
_
bgn()
}
will raise a
\exception
{
TypeError
}
exception.
...
...
Doc/lib/libsgmllib.tex
\section
{
\module
{
sgmllib
}
---
Simple SGML parser.
}
\declaremodule
{
standard
}{
sgmllib
}
Simple SGML parser
}
\declaremodule
{
standard
}{
sgmllib
}
\modulesynopsis
{
Only as much of an SGML parser as needed to parse HTML.
}
\index
{
SGML
}
...
...
@@ -10,7 +10,7 @@ This module defines a class \class{SGMLParser} which serves as the
basis for parsing text files formatted in SGML (Standard Generalized
Mark-up Language). In fact, it does not provide a full SGML parser
--- it only parses SGML insofar as it is used by HTML, and the module
only exists as a base for the
\module
{
htmllib
}
\refstmodindex
{
htmllib
}
only exists as a base for the
\refmodule
{
htmllib
}
\refstmodindex
{
htmllib
}
module.
...
...
@@ -49,8 +49,8 @@ implicitly at instantiation time.
\begin{methoddesc}
{
setnomoretags
}{}
Stop processing tags. Treat all following input as literal input
(CDATA). (This is only provided so the HTML tag
\code
{
<PLAINTEXT>
}
can be implemented.)
(CDATA). (This is only provided so the HTML tag
\code
{
<PLAINTEXT>
}
can be implemented.)
\end{methoddesc}
\begin{methoddesc}
{
setliteral
}{}
...
...
@@ -72,15 +72,15 @@ redefined version should always call \method{close()}.
\begin{methoddesc}
{
handle
_
starttag
}{
tag, method, attributes
}
This method is called to handle start tags for which either a
\code
{
start
_
\var
{
tag
}
()
}
or
\code
{
do
_
\var
{
tag
}
()
}
method has been
\method
{
start
_
\var
{
tag
}
()
}
or
\method
{
do
_
\var
{
tag
}
()
}
method has been
defined. The
\var
{
tag
}
argument is the name of the tag converted to
lower case, and the
\var
{
method
}
argument is the bound method which
should be used to support semantic interpretation of the start tag.
The
\var
{
attributes
}
argument is a list of
\code
{
(
\var
{
name
}
,
\var
{
value
}
)
}
pairs containing the attributes found inside the tag's
\code
{
<>
}
brackets. The
\var
{
name
}
has been translated to lower case and double
quotes and backslashes in the
\var
{
value
}
have been interpreted. For
instance, for the tag
\code
{
<A HREF="http://www.cwi.nl/">
}
, this
The
\var
{
attributes
}
argument is a list of
\code
{
(
\var
{
name
}
,
\var
{
value
}
)
}
pairs containing the attributes found inside the tag's
\code
{
<>
}
brackets. The
\var
{
name
}
has been translated to lower case
and double quotes and backslashes in the
\var
{
value
}
have been interpreted.
For
instance, for the tag
\code
{
<A HREF="http://www.cwi.nl/">
}
, this
method would be called as
\samp
{
unknown
_
starttag('a', [('href',
'http://www.cwi.nl/')])
}
. The base implementation simply calls
\var
{
method
}
with
\var
{
attributes
}
as the only argument.
...
...
@@ -88,11 +88,11 @@ method would be called as \samp{unknown_starttag('a', [('href',
\begin{methoddesc}
{
handle
_
endtag
}{
tag, method
}
This method is called to handle endtags for which an
\code
{
end
_
\var
{
tag
}
()
}
method has been defined. The
\var
{
tag
}
argument is the name of the tag converted to lower case, and the
\var
{
method
}
argument is the bound method which should be used to
\method
{
end
_
\var
{
tag
}
()
}
method has been defined. The
\var
{
tag
}
argument is the name of the tag converted to lower case, and
the
\var
{
method
}
argument is the bound method which should be used to
support semantic interpretation of the end tag. If no
\code
{
end
_
\var
{
tag
}
()
}
method is defined for the closing element,
\method
{
end
_
\var
{
tag
}
()
}
method is defined for the closing element,
this handler is not called. The base implementation simply calls
\var
{
method
}
.
\end{methoddesc}
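The two hooks described above sit between tag lookup and the tag-specific method, so a subclass can wrap every call in one place. The sketch below models that arrangement in plain Python without importing \code{sgmllib} (which is not available in Python 3). The \code{dispatch_start()} and \code{dispatch_end()} helpers and both class names are hypothetical; only the hook signatures and the "base implementation simply calls \var{method}" behaviour come from the text.

```python
class HookedDispatcher:
    """Toy model of the handle_starttag()/handle_endtag() hooks (hypothetical)."""

    def handle_starttag(self, tag, method, attributes):
        # Base behaviour: call the bound method with the attribute list.
        method(attributes)

    def handle_endtag(self, tag, method):
        # Base behaviour: call the bound method with no arguments.
        method()

    def dispatch_start(self, tag, attributes):
        tag = tag.lower()
        method = (getattr(self, "start_" + tag, None)
                  or getattr(self, "do_" + tag, None))
        if method is not None:
            # Attribute names are lower-cased before the hook sees them.
            attrs = [(name.lower(), value) for name, value in attributes]
            self.handle_starttag(tag, method, attrs)

    def dispatch_end(self, tag):
        tag = tag.lower()
        method = getattr(self, "end_" + tag, None)
        if method is not None:   # no end_<tag>: the hook is not called
            self.handle_endtag(tag, method)


class TracingLinks(HookedDispatcher):
    def __init__(self):
        self.trace = []
        self.hrefs = []

    def handle_starttag(self, tag, method, attributes):
        self.trace.append(("start", tag))
        HookedDispatcher.handle_starttag(self, tag, method, attributes)

    def start_a(self, attributes):
        self.hrefs.extend(value for name, value in attributes if name == "href")

    def end_a(self):
        self.trace.append(("end", "a"))


parser = TracingLinks()
parser.dispatch_start("A", [("HREF", "http://www.cwi.nl/")])
parser.dispatch_end("A")
parser.dispatch_end("B")   # no end_b() defined, so nothing happens
print(parser.trace, parser.hrefs)
```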
...
...
@@ -120,12 +120,12 @@ This method is called to process a general entity reference of the
form
\samp
{
\&\var
{
ref
}
;
}
where
\var
{
ref
}
is a general entity
reference. It looks for
\var
{
ref
}
in the instance (or class)
variable
\member
{
entitydefs
}
which should be a mapping from entity
names to corresponding translations.
If a translation is found, it calls the method
\method
{
handle
_
data()
}
with the translation; otherwise, it calls the method
\code
{
unknown
_
entityref(
\var
{
ref
}
)
}
. The default
\member
{
entitydefs
}
defines translations for
\code
{
\&
amp;
}
,
\code
{
\&
apos;
}
,
\code
{
\&
gt;
}
,
\code
{
\&
lt;
}
, and
\code
{
\&
quot;
}
.
names to corresponding translations.
If a translation is found, it
calls the method
\method
{
handle
_
data()
}
with the translation;
otherwise, it calls the method
\code
{
unknown
_
entityref(
\var
{
ref
}
)
}
.
The default
\member
{
entitydefs
}
defines translations for
\code
{
\&
amp;
}
,
\code
{
\&
apos;
}
,
\code
{
\&
gt;
}
,
\code
{
\&
lt;
}
, and
\code
{
\&
quot;
}
.
\end{methoddesc}
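The lookup described for \method{handle_entityref()} amounts to a dictionary test followed by one of two calls. A minimal sketch follows, assuming the five default names map to their usual characters; the \code{EntityResolver} class and its \code{data} and \code{unknown} lists are hypothetical helpers used only to record what was called.

```python
class EntityResolver:
    """Toy model of the entity lookup; not the real SGMLParser."""

    # The five default names listed in the text, mapped to their characters.
    entitydefs = {"amp": "&", "apos": "'", "gt": ">", "lt": "<", "quot": '"'}

    def __init__(self):
        self.data = []
        self.unknown = []

    def handle_data(self, data):
        self.data.append(data)

    def unknown_entityref(self, ref):
        self.unknown.append(ref)

    def handle_entityref(self, ref):
        # self.entitydefs resolves to an instance attribute if one is set,
        # otherwise to the class attribute.
        table = self.entitydefs
        if ref in table:
            self.handle_data(table[ref])
        else:
            self.unknown_entityref(ref)


resolver = EntityResolver()
for name in ["lt", "amp", "nbsp"]:
    resolver.handle_entityref(name)
print(resolver.data, resolver.unknown)

# Extending the table on one instance leaves the class default untouched.
custom = EntityResolver()
custom.entitydefs = dict(EntityResolver.entitydefs, nbsp=" ")
custom.handle_entityref("nbsp")
```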
\begin{methoddesc}
{
handle
_
comment
}{
comment
}
...
...
@@ -175,8 +175,8 @@ case:
\begin{methoddescni}
{
start
_
\var
{
tag
}}{
attributes
}
This method is called to process an opening tag
\var
{
tag
}
. It has
preference over
\code
{
do
_
\var
{
tag
}
()
}
. The
\var
{
attributes
}
argument has the same meaning as described for
preference over
\method
{
do
_
\var
{
tag
}
()
}
. The
\var
{
attributes
}
argument has the same meaning as described for
\method
{
handle
_
starttag()
}
above.
\end{methoddescni}
...
...
@@ -192,10 +192,10 @@ This method is called to process a closing tag \var{tag}.
Note that the parser maintains a stack of open elements for which no
end tag has been found yet. Only tags processed by
\code
{
start
_
\var
{
tag
}
()
}
are pushed on this stack. Definition of an
\code
{
end
_
\var
{
tag
}
()
}
method is optional for these tags. For tags
processed by
\code
{
do
_
\var
{
tag
}
()
}
or by
\method
{
unknown
_
tag()
}
, no
\code
{
end
_
\var
{
tag
}
()
}
method must be defined; if defined, it will not
be used. If both
\code
{
start
_
\var
{
tag
}
()
}
and
\code
{
do
_
\var
{
tag
}
()
}
methods exist for a tag, the
\code
{
start
_
\var
{
tag
}
()
}
method takes
precedence.
\method
{
start
_
\var
{
tag
}
()
}
are pushed on this stack. Definition of an
\method
{
end
_
\var
{
tag
}
()
}
method is optional for these tags. For tags
processed by
\method
{
do
_
\var
{
tag
}
()
}
or by
\method
{
unknown
_
tag()
}
, no
\method
{
end
_
\var
{
tag
}
()
}
method must be defined; if defined, it will
not be used. If both
\method
{
start
_
\var
{
tag
}
()
}
and
\method
{
do
_
\var
{
tag
}
()
}
methods exist for a tag, the
\method
{
start
_
\var
{
tag
}
()
}
method takes
precedence.
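The stack rule in this note can be illustrated with a small model: only tags handled by a \code{start_} method are pushed, an \code{end_} method is optional for them, and an \code{end_} method paired with a \code{do_} method is never used. The sketch below is a simplification under stated assumptions. The \code{StackTracker} class and its \code{open_tag()} and \code{close_tag()} helpers are hypothetical, and a closing tag that does not match the top of the stack is simply ignored here, which may differ from the real parser.

```python
class StackTracker:
    """Toy model of the open-element stack (hypothetical, simplified)."""

    def __init__(self):
        self.stack = []
        self.log = []

    def open_tag(self, tag):
        start = getattr(self, "start_" + tag, None)
        if start is not None:
            # start_<tag> takes precedence and pushes the element.
            self.stack.append(tag)
            start()
            return
        do = getattr(self, "do_" + tag, None)
        if do is not None:
            do()                     # do_<tag> elements are never pushed

    def close_tag(self, tag):
        # Sketch assumption: only a tag on top of the stack can be closed.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            end = getattr(self, "end_" + tag, None)
            if end is not None:      # end_<tag> is optional
                end()


class Demo(StackTracker):
    def start_ul(self):
        self.log.append("start ul")  # no end_ul(): allowed

    def start_em(self):
        self.log.append("start em")

    def end_em(self):
        self.log.append("end em")

    def do_br(self):
        self.log.append("do br")

    def end_br(self):
        self.log.append("end br")    # never reached: br is not on the stack


demo = Demo()
demo.open_tag("ul")
demo.open_tag("br")
demo.open_tag("em")
stack_while_open = list(demo.stack)
demo.close_tag("em")
demo.close_tag("br")
demo.close_tag("ul")
print(stack_while_open, demo.log, demo.stack)
```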