Commit 5e634638 authored by Barry Warsaw

The email package documentation, currently organized the way I think
Fred prefers.  I'm not sure I like this organization, so it may change.
parent 26991a7f
\section{\module{email.Encoders} ---
Email message payload encoders}
\declaremodule{standard}{email.Encoders}
\modulesynopsis{Encoders for email message payloads.}
\sectionauthor{Barry A. Warsaw}{barry@zope.com}
\versionadded{2.2}
When creating \class{Message} objects from scratch, you often need to
encode the payloads for transport through compliant mail servers.
This is especially true for \code{image/*} and \code{text/*} type
messages containing binary data.
The \module{email} package provides some convenient encodings in its
\module{Encoders} module. These encoders are actually used by the
\class{MIMEImage} and \class{MIMEText} class constructors to provide default
encodings. All encoder functions take exactly one argument, the
message object to encode. They usually extract the payload, encode
it, and reset the payload to this newly encoded value. They should also
set the \code{Content-Transfer-Encoding:} header as appropriate.
Here are the encoding functions provided:
\begin{funcdesc}{encode_quopri}{msg}
Encodes the payload into \emph{Quoted-Printable} form and sets the
\code{Content-Transfer-Encoding:} header to
\code{quoted-printable}\footnote{Note that encoding with
\method{encode_quopri()} also encodes all tabs and space characters in
the data.}.
This is a good encoding to use when most of your payload is normal
printable data, but contains a few unprintable characters.
\end{funcdesc}
\begin{funcdesc}{encode_base64}{msg}
Encodes the payload into \emph{Base64} form and sets the
\code{Content-Transfer-Encoding:} header to
\code{base64}. This is a good encoding to use when most of your payload
is unprintable data since it is a more compact form than
Quoted-Printable. The drawback of Base64 encoding is that it
renders the text unreadable to humans.
\end{funcdesc}
\begin{funcdesc}{encode_7or8bit}{msg}
This doesn't actually modify the message's payload, but it does set
the \code{Content-Transfer-Encoding:} header to either \code{7bit} or
\code{8bit} as appropriate, based on the payload data.
\end{funcdesc}
\begin{funcdesc}{encode_noop}{msg}
This does nothing; it doesn't even set the
\code{Content-Transfer-Encoding:} header.
\end{funcdesc}
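As a minimal sketch of how these encoders behave, the following assumes a
current Python 3 interpreter, where the module is spelled
\module{email.encoders} (lowercase) rather than \module{email.Encoders}.
The payload bytes and the variable names are made up for illustration.

```python
import base64
from email import encoders          # spelled email.Encoders in the text above
from email.message import Message

# A payload made mostly of unprintable bytes: Base64 is the compact choice.
raw = b"\x89PNG\r\n\x1a\n\x00\xff"
binary_msg = Message()
binary_msg.set_payload(raw)
encoders.encode_base64(binary_msg)   # re-encodes the payload, sets the header

# A plain ASCII payload: encode_7or8bit() only labels it, the body is untouched.
text_msg = Message()
text_msg.set_payload("Hello, world\n")
encoders.encode_7or8bit(text_msg)

print(binary_msg["Content-Transfer-Encoding"])
print(text_msg["Content-Transfer-Encoding"])
```

Both calls take the message object as their only argument and modify it
in place, as described above.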
\section{\module{email.Errors} ---
email package exception classes}
\declaremodule{standard}{email.Errors}
\modulesynopsis{The exception classes used by the email package.}
\sectionauthor{Barry A. Warsaw}{barry@zope.com}
\versionadded{2.2}
The following exception classes are defined in the
\module{email.Errors} module:
\begin{excclassdesc}{MessageError}{}
This is the base class for all exceptions that the \module{email}
package can raise. It is derived from the standard
\exception{Exception} class and defines no additional methods.
\end{excclassdesc}
\begin{excclassdesc}{MessageParseError}{}
This is the base class for exceptions thrown by the \class{Parser}
class. It is derived from \exception{MessageError}.
\end{excclassdesc}
\begin{excclassdesc}{HeaderParseError}{}
Raised under some error conditions when parsing the \rfc{2822} headers of
a message, this class is derived from \exception{MessageParseError}.
It can be raised from the \method{Parser.parse()} or
\method{Parser.parsestr()} methods.
Situations where it can be raised include finding a \emph{Unix-From}
header after the first \rfc{2822} header of the message, finding a
continuation line before the first \rfc{2822} header is found, or finding
a line in the headers which is neither a header nor a continuation
line.
\end{excclassdesc}
\begin{excclassdesc}{BoundaryError}{}
Raised under some error conditions when parsing the \rfc{2822} headers of
a message, this class is derived from \exception{MessageParseError}.
It can be raised from the \method{Parser.parse()} or
\method{Parser.parsestr()} methods.
Situations where it can be raised include not being able to find the
starting or terminating boundary in a \code{multipart/*} message.
\end{excclassdesc}
\begin{excclassdesc}{MultipartConversionError}{}
Raised when a payload is added to a \class{Message} object using
\method{add_payload()}, but the payload is already a scalar and the
message's \code{Content-Type:} main type is neither \code{multipart}
nor missing. \exception{MultipartConversionError} multiply inherits
from \exception{MessageError} and the built-in \exception{TypeError}.
\end{excclassdesc}
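The inheritance relationships described above mean that a single
\keyword{except} clause on a base class catches the more specific
exceptions. The sketch below assumes a current Python 3 interpreter, where
the module is spelled \module{email.errors}; the exception is raised by hand
purely to illustrate the hierarchy, and the error message is made up.

```python
from email import errors   # spelled email.Errors in the text above

# Catching the parser base class also catches BoundaryError.
try:
    raise errors.BoundaryError("no terminating boundary found")
except errors.MessageParseError as exc:
    caught = type(exc).__name__

# MultipartConversionError can be handled as an email error or a TypeError.
is_email_error = issubclass(errors.MultipartConversionError, errors.MessageError)
is_type_error = issubclass(errors.MultipartConversionError, TypeError)

print(caught, is_email_error, is_type_error)
```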
\section{\module{email.Generator} ---
Generating flat text from an email message object tree}
\declaremodule{standard}{email.Generator}
\modulesynopsis{Generate flat text email messages from a message
object tree.}
\sectionauthor{Barry A. Warsaw}{barry@zope.com}
\versionadded{2.2}
The \class{Generator} class is used to render a message object model
into its flat text representation, including MIME encoding any
sub-messages, generating the correct \rfc{2822} headers, etc. Here
are the public methods of the \class{Generator} class.
\begin{classdesc}{Generator}{outfp\optional{, mangle_from_\optional{,
maxheaderlen}}}
The constructor for the \class{Generator} class takes a file-like
object called \var{outfp} for an argument. \var{outfp} must support
the \method{write()} method and be usable as the output file in a
Python 2.0 extended print statement.
Optional \var{mangle_from_} is a flag that, when true, puts a ``>''
character in front of any line in the body that starts exactly as
\samp{From } (i.e. \code{From} followed by a space at the front of the
line). This is the only guaranteed portable way to avoid having such
lines be mistaken for \emph{Unix-From} headers (see
\url{http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html}
for details).
Optional \var{maxheaderlen} specifies the longest length for a
non-continued header. When a header line is longer than
\var{maxheaderlen} (in characters, with tabs expanded to 8 spaces),
the header will be broken on semicolons and continued as per
\rfc{2822}. If no semicolon is found, then the header is left alone.
Set to zero to disable wrapping headers. Default is 78, as
recommended (but not required) by \rfc{2822}.
\end{classdesc}
The other public \class{Generator} methods are:
\begin{methoddesc}[Generator]{__call__}{msg\optional{, unixfrom}}
Print the textual representation of the message object tree rooted at
\var{msg} to the output file specified when the \class{Generator}
instance was created. Sub-objects are visited depth-first and the
resulting text will be properly MIME encoded.
Optional \var{unixfrom} is a flag that forces the printing of the
\emph{Unix-From} (a.k.a. envelope header or \code{From_} header)
delimiter before the first \rfc{2822} header of the root message
object. If the root object has no \emph{Unix-From} header, a standard
one is crafted. By default, this is set to 0 to inhibit the printing
of the \emph{Unix-From} delimiter.
Note that for sub-objects, no \emph{Unix-From} header is ever printed.
\end{methoddesc}
\begin{methoddesc}[Generator]{write}{s}
Write the string \var{s} to the underlying file object,
i.e. \var{outfp} passed to \class{Generator}'s constructor. This
provides just enough file-like API for \class{Generator} instances to
be used in extended print statements.
\end{methoddesc}
As a convenience, see the methods \method{Message.as_string()} and
\code{str(aMessage)}, a.k.a. \method{Message.__str__()}, which
simplify the generation of a formatted string representation of a
message object. For more detail, see \refmodule{email.Message}.
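The following sketch shows the effect of \var{mangle_from_}. It assumes a
current Python 3 interpreter, where the module is spelled
\module{email.generator} and the rendering call is the \method{flatten()}
method rather than \method{__call__()}; the message contents are
made up for illustration.

```python
import io
from email.generator import Generator   # email.Generator in the text above
from email.message import Message

msg = Message()
msg["Subject"] = "Mangling demo"
msg.set_payload("Hello\nFrom here on, this line needs escaping\n")

# Any file-like object with a write() method will do as outfp.
out = io.StringIO()
Generator(out, mangle_from_=True).flatten(msg)
text = out.getvalue()
print(text)
```

The body line that begins with \samp{From } is written with a leading
``>'', while the header block is left alone.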
\section{\module{email.Iterators} ---
Message object tree iterators}
\declaremodule{standard}{email.Iterators}
\modulesynopsis{Iterate over a message object tree.}
\sectionauthor{Barry A. Warsaw}{barry@zope.com}
\versionadded{2.2}
Iterating over a message object tree is fairly easy with the
\method{Message.walk()} method. The \module{email.Iterators} module
provides some useful higher level iterations over message object
trees.
\begin{funcdesc}{body_line_iterator}{msg}
This iterates over all the payloads in all the subparts of \var{msg},
returning the string payloads line-by-line. It skips over all the
subpart headers, and it skips over any subpart with a payload that
isn't a Python string. This is somewhat equivalent to reading the
flat text representation of the message from a file using
\method{readline()}, skipping over all the intervening headers.
\end{funcdesc}
\begin{funcdesc}{typed_subpart_iterator}{msg\optional{,
maintype\optional{, subtype}}}
This iterates over all the subparts of \var{msg}, returning only those
subparts that match the MIME type specified by \var{maintype} and
\var{subtype}.
Note that \var{subtype} is optional; if omitted, then subpart MIME
type matching is done only with the main type. \var{maintype} is
optional too; it defaults to \code{text}.
Thus, by default \function{typed_subpart_iterator()} returns each
subpart that has a MIME type of \code{text/*}.
\end{funcdesc}
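A short sketch of the matching rules for
\function{typed_subpart_iterator()} follows. It assumes a current Python 3
interpreter, where the module is spelled \module{email.iterators} and the
MIME classes live under \module{email.mime}; the three attached parts are
made up for illustration.

```python
from email.iterators import typed_subpart_iterator  # email.Iterators above
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

outer = MIMEMultipart()
outer.attach(MIMEText("plain body", "plain"))
outer.attach(MIMEText("<p>html body</p>", "html"))
outer.attach(MIMEApplication(b"\x00\x01"))

# Default arguments: every subpart whose main type is text.
text_types = [p.get_content_type() for p in typed_subpart_iterator(outer)]

# Main type and subtype given: only text/html subparts.
html_types = [p.get_content_type()
              for p in typed_subpart_iterator(outer, "text", "html")]

print(text_types, html_types)
```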
\section{\module{email.Parser} ---
Parsing flat text email messages}
\declaremodule{standard}{email.Parser}
\modulesynopsis{Parse flat text email messages to produce a message
object tree.}
\sectionauthor{Barry A. Warsaw}{barry@zope.com}
\versionadded{2.2}
The \module{Parser} module provides a single class, the \class{Parser}
class, which is used to take a message in flat text form and create
the associated object model. The resulting object tree can then be
manipulated using the \class{Message} class interface as described in
\refmodule{email.Message}, and turned over
to a generator (as described in \refmodule{email.Generator}) to
return the textual representation of the message. It is intended that
the \class{Parser} to \class{Generator} path be idempotent if the
object model isn't modified in between.
\subsection{Parser class API}
\begin{classdesc}{Parser}{\optional{_class}}
The constructor for the \class{Parser} class takes a single optional
argument \var{_class}. This must be a callable factory (i.e. a function
or a class), and it is used whenever a sub-message object needs to be
created. It defaults to \class{Message} (see
\refmodule{email.Message}). \var{_class} will be called with zero
arguments.
\end{classdesc}
The other public \class{Parser} methods are:
\begin{methoddesc}[Parser]{parse}{fp}
Read all the data from the file-like object \var{fp}, parse the
resulting text, and return the root message object. \var{fp} must
support both the \method{readline()} and the \method{read()} methods
on file-like objects.
The text contained in \var{fp} must be formatted as a block of \rfc{2822}
style headers and header continuation lines, optionally preceded by a
\emph{Unix-From} header. The header block is terminated either by the
end of the data or by a blank line. Following the header block is the
body of the message (which may contain MIME-encoded subparts).
\end{methoddesc}
\begin{methoddesc}[Parser]{parsestr}{text}
Similar to the \method{parse()} method, except it takes a string
object instead of a file-like object. Calling this method on a string
is exactly equivalent to wrapping \var{text} in a \class{StringIO}
instance first and calling \method{parse()}.
\end{methoddesc}
Since creating a message object tree from a string or a file object is
such a common task, two functions are provided as a convenience. They
are available in the top-level \module{email} package namespace.
\begin{funcdesc}{message_from_string}{s\optional{, _class}}
Return a message object tree from a string. This is exactly
equivalent to \code{Parser().parsestr(s)}. Optional \var{_class} is
interpreted as with the \class{Parser} class constructor.
\end{funcdesc}
\begin{funcdesc}{message_from_file}{fp\optional{, _class}}
Return a message object tree from an open file object. This is exactly
equivalent to \code{Parser().parse(fp)}. Optional \var{_class} is
interpreted as with the \class{Parser} class constructor.
\end{funcdesc}
Here's an example of how you might use this at an interactive Python
prompt:
\begin{verbatim}
>>> import email
>>> msg = email.message_from_string(myString)
\end{verbatim}
\subsection{Additional notes}
Here are some notes on the parsing semantics:
\begin{itemize}
\item Most non-\code{multipart} type messages are parsed as a single
message object with a string payload. These objects will return
0 for \method{is_multipart()}.
\item One exception is for \code{message/delivery-status} type
messages. Because the body of such messages consists of
blocks of headers, \class{Parser} will create a non-multipart
object containing non-multipart subobjects for each header
block.
\item Another exception is for \code{message/*} types (i.e. more
general than \code{message/delivery-status}). These are
typically \code{message/rfc822} type messages, represented as a
non-multipart object containing a singleton payload, another
non-multipart \class{Message} instance.
\end{itemize}
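The first note can be checked with a small sketch. It assumes a current
Python 3 interpreter and uses only the top-level
\function{message_from_string()} convenience function; the two message
texts and the boundary string \samp{XYZ} are made up for illustration.

```python
import email

# A non-multipart message: one object with a string payload.
simple = email.message_from_string("Subject: hi\n\nbody text\n")

# A multipart message: the root object holds a list of sub-messages.
multi_src = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "first part\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "second part\n"
    "--XYZ--\n"
)
multi = email.message_from_string(multi_src)

print(simple.is_multipart(), multi.is_multipart(), len(multi.get_payload()))
```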
\section{\module{email.Utils} ---
Miscellaneous email package utilities}
\declaremodule{standard}{email.Utils}
\modulesynopsis{Miscellaneous email package utilities.}
\sectionauthor{Barry A. Warsaw}{barry@zope.com}
\versionadded{2.2}
There are several useful utilities provided with the \module{email}
package.
\begin{funcdesc}{quote}{str}
Return a new string with backslashes in \var{str} replaced by two
backslashes and double quotes replaced by backslash-double quote.
\end{funcdesc}
\begin{funcdesc}{unquote}{str}
Return a new string which is an \emph{unquoted} version of \var{str}.
If \var{str} ends and begins with double quotes, they are stripped
off. Likewise if \var{str} ends and begins with angle brackets, they
are stripped off.
\end{funcdesc}
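A brief sketch of \function{quote()} and \function{unquote()}, assuming a
current Python 3 interpreter where the module is spelled
\module{email.utils}; the sample strings are made up for illustration.

```python
from email.utils import quote, unquote   # email.Utils in the text above

# Backslashes are doubled and double quotes gain a leading backslash.
quoted = quote('say "hi" \\ bye')

# Surrounding double quotes or angle brackets are stripped.
from_quotes = unquote('"hello"')
from_angles = unquote("<jane@example.com>")
untouched = unquote("plain")

print(quoted, from_quotes, from_angles, untouched)
```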
\begin{funcdesc}{parseaddr}{address}
Parse address -- which should be the value of some address-containing
field such as \code{To:} or \code{Cc:} -- into its constituent
``realname'' and ``email address'' parts. Returns a tuple of that
information, unless the parse fails, in which case a 2-tuple of
\code{(None, None)} is returned.
\end{funcdesc}
\begin{funcdesc}{dump_address_pair}{pair}
The inverse of \function{parseaddr()}, this takes a 2-tuple of the form
\code{(realname, email_address)} and returns the string value suitable
for a \code{To:} or \code{Cc:} header. If the first element of
\var{pair} is false, then the second element is returned unmodified.
\end{funcdesc}
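The round trip between an address string and its 2-tuple can be sketched as
follows. This assumes a current Python 3 interpreter, where the module is
spelled \module{email.utils} and the inverse of \function{parseaddr()} is
named \function{formataddr()} rather than \function{dump_address_pair()};
the address is made up for illustration.

```python
from email.utils import formataddr, parseaddr   # email.Utils in the text above

# Split a header value into its real name and email address parts.
realname, address = parseaddr("Jane Doe <jane@example.com>")

# Build a header value back from the 2-tuple.
rebuilt = formataddr((realname, address))

# With a false first element, the address comes back unmodified.
bare = formataddr(("", "jane@example.com"))

print(realname, address, rebuilt, bare)
```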
\begin{funcdesc}{getaddresses}{fieldvalues}
This function returns a list of 2-tuples of the form returned by
\function{parseaddr()}. \var{fieldvalues} is a sequence of header field
values as might be returned by \method{Message.get_all()}. Here's a
simple example that gets all the recipients of a message:
\begin{verbatim}
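from email.Utils import getaddresses
# Pass [] as the default so that a missing header yields an empty list
# instead of None, which could not be concatenated below.
tos = msg.get_all('to', [])
ccs = msg.get_all('cc', [])
resent_tos = msg.get_all('resent-to', [])
resent_ccs = msg.get_all('resent-cc', [])
all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)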
\end{verbatim}
\end{funcdesc}
\begin{funcdesc}{decode}{s}
This function decodes a string according to the rules in \rfc{2047}. It
returns the decoded string as a Python unicode string.
\end{funcdesc}
\begin{funcdesc}{encode}{s\optional{, charset\optional{, encoding}}}
This function encodes a string according to the rules in \rfc{2047}. It
is not actually the inverse of \function{decode()} since it doesn't
handle multiple character sets or multiple string parts needing
encoding. In fact, the input string \var{s} must already be encoded
in the \var{charset} character set (Python can't reliably guess what
character set a string might be encoded in). The default
\var{charset} is \samp{iso-8859-1}.
\var{encoding} must be either the letter \samp{q} for
Quoted-Printable or \samp{b} for Base64 encoding. If
neither, a \code{ValueError} is raised. Both the \var{charset} and
the \var{encoding} strings are case-insensitive, and coerced to lower
case in the returned string.
\end{funcdesc}
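To illustrate what an \rfc{2047} encoded word looks like, here is a sketch
that does not use the \function{encode()} and \function{decode()} functions
described above. It assumes a current Python 3 interpreter and swaps in the
\module{email.header} module (\class{Header} and
\function{decode_header()}), which is a different API from the one
documented here; the sample text is made up.

```python
from email.header import Header, decode_header

# Encode one non-ASCII string in a single character set.
encoded = Header("caf\xe9", "iso-8859-1").encode()

# Decoding yields the raw bytes together with the charset name.
[(raw_bytes, charset)] = decode_header(encoded)
decoded = raw_bytes.decode(charset)

print(encoded, charset, decoded)
```

The encoded form carries the character set and the encoding letter
(\samp{q} or \samp{b}) in lower case ahead of the encoded text.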
\begin{funcdesc}{parsedate}{date}
Attempts to parse a date according to the rules in \rfc{2822}.
However, some mailers don't follow that format as specified, so
\function{parsedate()} tries to guess correctly in such cases.
\var{date} is a string containing an \rfc{2822} date, such as
\code{"Mon, 20 Nov 1995 19:12:08 -0500"}. If it succeeds in parsing
the date, \function{parsedate()} returns a 9-tuple that can be passed
directly to \function{time.mktime()}; otherwise \code{None} will be
returned. Note that fields 6, 7, and 8 of the result tuple are not
usable.
\end{funcdesc}
\begin{funcdesc}{parsedate_tz}{date}
Performs the same function as \function{parsedate()}, but returns
either \code{None} or a 10-tuple; the first 9 elements make up a tuple
that can be passed directly to \function{time.mktime()}, and the tenth
is the offset of the date's timezone from UTC (which is the official
term for Greenwich Mean Time)\footnote{Note that the sign of the timezone
offset is the opposite of the sign of the \code{time.timezone}
variable for the same timezone; the latter variable follows the
\POSIX{} standard while this module follows \rfc{2822}.}. If the input
string has no timezone, the last element of the tuple returned is
\code{None}. Note that fields 6, 7, and 8 of the result tuple are not
usable.
\end{funcdesc}
\begin{funcdesc}{mktime_tz}{tuple}
Turn a 10-tuple as returned by \function{parsedate_tz()} into a UTC
timestamp. If the timezone item in the tuple is \code{None}, assume
local time. Minor deficiency: \function{mktime_tz()} interprets the
first 8 elements of \var{tuple} as a local time and then compensates
for the timezone difference. This may yield a slight error around
changes in daylight saving time, though not worth worrying about for
common use.
\end{funcdesc}
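The relationship between \function{parsedate_tz()} and
\function{mktime_tz()} can be sketched with the sample date used above.
This assumes a current Python 3 interpreter, where the module is spelled
\module{email.utils}; \function{calendar.timegm()} is used only to compute
the expected UTC timestamp independently.

```python
import calendar
from email.utils import mktime_tz, parsedate_tz   # email.Utils in the text above

parsed = parsedate_tz("Mon, 20 Nov 1995 19:12:08 -0500")

# The tenth element is the offset from UTC in seconds (five hours west).
offset = parsed[9]

# mktime_tz() folds the offset in and returns a UTC timestamp.
timestamp = mktime_tz(parsed)

# 19:12:08 at -0500 is 00:12:08 UTC on the following day.
expected = calendar.timegm((1995, 11, 21, 0, 12, 8, 0, 0, 0))

print(parsed[:6], offset, timestamp == expected)
```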
\begin{funcdesc}{formatdate}{\optional{timeval}}
Returns the time formatted as per Internet standards \rfc{2822}
and updated by \rfc{1123}. If \var{timeval} is provided, then it
should be a floating point time value as expected by
\function{time.gmtime()}, otherwise the current time is used.
\end{funcdesc}
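A small sketch of \function{formatdate()}, assuming a current Python 3
interpreter where the module is spelled \module{email.utils}. A fixed
\var{timeval} of zero (the epoch) is used so that the result does not depend
on when the code is run.

```python
from email.utils import formatdate, parsedate_tz   # email.Utils in the text above

# Format the epoch; omitting the argument would format the current time.
stamp = formatdate(0)

# The result can be parsed back by parsedate_tz().
fields = parsedate_tz(stamp)[:6]

print(stamp, fields)
```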