Commit 6ef871ce authored by Fred Drake

Logical markup.

Lots of nits in both.
parent 7be8fcb4
@@ -5,59 +5,59 @@
\index{World-Wide Web} \index{World-Wide Web}
\index{URL} \index{URL}
\setindexsubitem{(in module urllib)}
This module provides a high-level interface for fetching data across This module provides a high-level interface for fetching data across
the World-Wide Web. In particular, the \code{urlopen()} function is the World-Wide Web. In particular, the \function{urlopen()} function
similar to the built-in function \code{open()}, but accepts URLs is similar to the built-in function \function{open()}, but accepts
(Universal Resource Locators) instead of filenames. Some restrictions Universal Resource Locators (URLs) instead of filenames. Some
apply --- it can only open URLs for reading, and no seek operations restrictions apply --- it can only open URLs for reading, and no seek
are available. operations are available.
It defines the following public functions: It defines the following public functions:
\begin{funcdesc}{urlopen}{url} \begin{funcdesc}{urlopen}{url}
Open a network object denoted by a URL for reading. If the URL does Open a network object denoted by a URL for reading. If the URL does
not have a scheme identifier, or if it has \samp{file:} as its scheme not have a scheme identifier, or if it has \file{file:} as its scheme
identifier, this opens a local file; otherwise it opens a socket to a identifier, this opens a local file; otherwise it opens a socket to a
server somewhere on the network. If the connection cannot be made, or server somewhere on the network. If the connection cannot be made, or
if the server returns an error code, the \code{IOError} exception is if the server returns an error code, the \exception{IOError} exception
raised. If all went well, a file-like object is returned. This is raised. If all went well, a file-like object is returned. This
supports the following methods: \code{read()}, \code{readline()}, supports the following methods: \method{read()}, \method{readline()},
\code{readlines()}, \code{fileno()}, \code{close()} and \code{info()}. \method{readlines()}, \method{fileno()}, \method{close()} and
\method{info()}.
Except for the last one, these methods have the same interface as for Except for the last one, these methods have the same interface as for
file objects --- see the section on File Objects earlier in this file objects --- see section \ref{bltin-file-objects} in this
manual. (It's not a built-in file object, however, so it can't be manual. (It is not a built-in file object, however, so it can't be
used at those few places where a true built-in file object is used at those few places where a true built-in file object is
required.) required.)
The \code{info()} method returns an instance of the class The \method{info()} method returns an instance of the class
\code{mimetools.Message} containing the headers received from the server, \class{mimetools.Message} containing the headers received from the
if the protocol uses such headers (currently the only supported server, if the protocol uses such headers (currently the only
protocol that uses this is HTTP). See the description of the supported protocol that uses this is HTTP). See the description of
\code{mimetools} module. the \module{mimetools}\refstmodindex{mimetools} module.
\refstmodindex{mimetools}
\end{funcdesc} \end{funcdesc}
\begin{funcdesc}{urlretrieve}{url} \begin{funcdesc}{urlretrieve}{url}
Copy a network object denoted by a URL to a local file, if necessary. Copy a network object denoted by a URL to a local file, if necessary.
If the URL points to a local file, or a valid cached copy of the If the URL points to a local file, or a valid cached copy of the
object exists, the object is not copied. Return a tuple (\var{filename}, object exists, the object is not copied. Return a tuple
\var{headers}) where \var{filename} is the local file name under which \code{(\var{filename}, \var{headers})} where \var{filename} is the
the object can be found, and \var{headers} is either \code{None} (for local file name under which the object can be found, and \var{headers}
a local object) or whatever the \code{info()} method of the object is either \code{None} (for a local object) or whatever the
returned by \code{urlopen()} returned (for a remote object, possibly \method{info()} method of the object returned by \function{urlopen()}
cached). Exceptions are the same as for \code{urlopen()}. returned (for a remote object, possibly cached). Exceptions are the
same as for \function{urlopen()}.
\end{funcdesc} \end{funcdesc}
\begin{funcdesc}{urlcleanup}{} \begin{funcdesc}{urlcleanup}{}
Clear the cache that may have been built up by previous calls to Clear the cache that may have been built up by previous calls to
\code{urlretrieve()}. \function{urlretrieve()}.
\end{funcdesc} \end{funcdesc}
\begin{funcdesc}{quote}{string\optional{\, addsafe}} \begin{funcdesc}{quote}{string\optional{\, addsafe}}
Replace special characters in \var{string} using the \code{\%xx} escape. Replace special characters in \var{string} using the \samp{\%xx} escape.
Letters, digits, and the characters ``\code{_,.-}'' are never quoted. Letters, digits, and the characters \character{_,.-} are never quoted.
The optional \var{addsafe} parameter specifies additional characters The optional \var{addsafe} parameter specifies additional characters
that should not be quoted --- its default value is \code{'/'}. that should not be quoted --- its default value is \code{'/'}.
@@ -65,7 +65,7 @@ Example: \code{quote('/\~connolly/')} yields \code{'/\%7econnolly/'}.
\end{funcdesc} \end{funcdesc}
\begin{funcdesc}{quote_plus}{string\optional{\, addsafe}} \begin{funcdesc}{quote_plus}{string\optional{\, addsafe}}
Like \code{quote()}, but also replaces spaces by plus signs, as Like \function{quote()}, but also replaces spaces by plus signs, as
required for quoting HTML form values. required for quoting HTML form values.
\end{funcdesc} \end{funcdesc}
@@ -76,7 +76,7 @@ Example: \code{unquote('/\%7Econnolly/')} yields \code{'/\~connolly/'}.
\end{funcdesc} \end{funcdesc}
\begin{funcdesc}{unquote_plus}{string} \begin{funcdesc}{unquote_plus}{string}
Like \code{unquote()}, but also replaces plus signs by spaces, as Like \function{unquote()}, but also replaces plus signs by spaces, as
required for unquoting HTML form values. required for unquoting HTML form values.
\end{funcdesc} \end{funcdesc}
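The quoting functions documented in the hunks above can be tried out with a short sketch. It assumes a modern Python 3 interpreter, where these functions live in `urllib.parse` rather than in the `urllib` module this commit documents; the helper name `roundtrip` and the sample string are made up for illustration. Defaults for the safe characters may differ from the 1998 text, so the sample avoids `/` and `~`.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus


def roundtrip(text):
    """Quote text both ways and decode each result again (hypothetical helper)."""
    path_form = quote(text)        # spaces become %20
    form_form = quote_plus(text)   # spaces become plus signs
    return path_form, form_form, unquote(path_form), unquote_plus(form_form)


# Sample input chosen for illustration only.
path_form, form_form, back_path, back_form = roundtrip("two words&more")
print(path_form)   # two%20words%26more
print(form_form)   # two+words%26more
```

Both decoded values equal the original string, which is the property the `quote()`/`unquote()` and `quote_plus()`/`unquote_plus()` pairs are meant to provide.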
@@ -87,13 +87,14 @@ Restrictions:
\item \item
Currently, only the following protocols are supported: HTTP, (versions Currently, only the following protocols are supported: HTTP, (versions
0.9 and 1.0), Gopher (but not Gopher-+), FTP, and local files. 0.9 and 1.0), Gopher (but not Gopher-+), FTP, and local files.
\index{HTTP} \indexii{HTTP}{protocol}
\index{Gopher} \indexii{Gopher}{protocol}
\index{FTP} \indexii{FTP}{protocol}
\item \item
The caching feature of \code{urlretrieve()} has been disabled until I The caching feature of \function{urlretrieve()} has been disabled
find the time to hack proper processing of Expiration time headers. until I find the time to hack proper processing of Expiration time
headers.
\item \item
There should be a function to query whether a particular URL is in There should be a function to query whether a particular URL is in
@@ -105,29 +106,27 @@ but the file can't be opened, the URL is re-interpreted using the FTP
protocol. This can sometimes cause confusing error messages. protocol. This can sometimes cause confusing error messages.
\item \item
The \code{urlopen()} and \code{urlretrieve()} functions can cause The \function{urlopen()} and \function{urlretrieve()} functions can
arbitrarily long delays while waiting for a network connection to be cause arbitrarily long delays while waiting for a network connection
set up. This means that it is difficult to build an interactive to be set up. This means that it is difficult to build an interactive
web client using these functions without using threads. web client using these functions without using threads.
\item \item
The data returned by \code{urlopen()} or \code{urlretrieve()} is the The data returned by \function{urlopen()} or \function{urlretrieve()}
raw data returned by the server. This may be binary data (e.g. an is the raw data returned by the server. This may be binary data
image), plain text or (for example) HTML. The HTTP protocol provides (e.g. an image), plain text or (for example) HTML. The HTTP protocol
type information in the reply header, which can be inspected by provides type information in the reply header, which can be inspected
looking at the \code{Content-type} header. For the Gopher protocol, by looking at the \code{content-type} header. For the Gopher protocol,
type information is encoded in the URL; there is currently no easy way type information is encoded in the URL; there is currently no easy way
to extract it. If the returned data is HTML, you can use the module to extract it. If the returned data is HTML, you can use the module
\code{htmllib} to parse it. \module{htmllib}\refstmodindex{htmllib} to parse it.
\index{HTML}% \index{HTML}
\index{HTTP}% \indexii{HTTP}{protocol}
\index{Gopher}% \indexii{Gopher}{protocol}
\refstmodindex{htmllib}
\item \item
Although the \code{urllib} module contains (undocumented) routines to Although the \module{urllib} module contains (undocumented) routines
parse and unparse URL strings, the recommended interface for URL to parse and unparse URL strings, the recommended interface for URL
manipulation is in module \code{urlparse}. manipulation is in module \module{urlparse}\refstmodindex{urlparse}.
\refstmodindex{urlparse}
\end{itemize} \end{itemize}
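As a rough illustration of the `urlopen()` and `info()` behaviour described in this diff, the sketch below reads a resource and inspects its content type. It assumes a modern Python 3 interpreter, where `urlopen()` is found in `urllib.request` and `info()` returns an `email.message.Message` rather than the `mimetools.Message` named in the text. The helper name `fetch` is hypothetical, and a `data:` URL (a scheme not listed among the protocols above) is used only so that the example needs no network access.

```python
from urllib.request import urlopen


def fetch(url):
    """Return (content_type, body_bytes) for a URL (hypothetical helper)."""
    response = urlopen(url)
    try:
        headers = response.info()  # header object, as described for info()
        return headers.get_content_type(), response.read()
    finally:
        response.close()


# Assumption: a data: URL stands in for a real server here.
content_type, body = fetch("data:text/plain,hello")
print(content_type, body)
```

The content type comes from the reply headers, which matches the restriction noted above: the body itself is raw data, and any type information has to be read from the headers.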