Commit a198645f authored by R David Murray

#1753718: clarify RFC compliance and bytes/string argument types.

Patch includes contributions by Isobel Hooper, incorporating suggestions from
Paul Winkler.  Reviewed by Martin Panter.

In addition to accepting the corrections for the RFC compliance wording, I
went through and corrected all the argument and return types, and made the
pattern of how the arguments and return types are documented consistent.
So, this patch also addresses #20782, though I had forgotten about that issue
and its patch.
parent a17ca19d
@@ -21,13 +21,19 @@ safely sent by email, used as parts of URLs, or included as part of an HTTP
 POST request. The encoding algorithm is not the same as the
 :program:`uuencode` program.
-There are two :rfc:`3548` interfaces provided by this module. The modern
-interface supports encoding and decoding ASCII byte string objects using all
-three :rfc:`3548` defined alphabets (normal, URL-safe, and filesystem-safe).
-Additionally, the decoding functions of the modern interface also accept
-Unicode strings containing only ASCII characters. The legacy interface provides
-for encoding and decoding to and from file-like objects as well as byte
-strings, but only using the Base64 standard alphabet.
+There are two interfaces provided by this module. The modern interface
+supports encoding :term:`bytes-like objects <bytes-like object>` to ASCII
+:class:`bytes`, and decoding :term:`bytes-like objects <bytes-like object>` or
+strings containing ASCII to :class:`bytes`. All three :rfc:`3548` defined
+alphabets (normal, URL-safe, and filesystem-safe) are supported.
+
+The legacy interface does not support decoding from strings, but it does
+provide functions for encoding and decoding to and from :term:`file objects
+<file object>`. It only supports the Base64 standard alphabet, and it adds
+newlines every 76 characters as per :rfc:`2045`. Note that if you are looking
+for :rfc:`2045` support you probably want to be looking at the :mod:`email`
+package instead.
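The argument and return types described in the new wording can be illustrated with a minimal sketch. The variable names are illustrative only, and the behaviour shown is what current CPython 3 releases are expected to do:

```python
import base64

payload = b"hello world"

# Encoding takes a bytes-like object and returns ASCII bytes.
encoded = base64.b64encode(payload)

# Decoding accepts either bytes or an ASCII-only str, and returns bytes.
decoded_from_bytes = base64.b64decode(encoded)
decoded_from_str = base64.b64decode(encoded.decode("ascii"))

# Encoding a str is rejected: the encoders require bytes-like input.
try:
    base64.b64encode("hello world")
    str_rejected = False
except TypeError:
    str_rejected = True
```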
 .. versionchanged:: 3.3
 ASCII-only Unicode strings are now accepted by the decoding functions of
@@ -41,26 +47,26 @@ The modern interface provides:
 .. function:: b64encode(s, altchars=None)
-Encode a byte string using Base64.
+Encode the :term:`bytes-like object` *s* using Base64 and return the encoded
+:class:`bytes`.
-*s* is the string to encode. Optional *altchars* must be a string of at least
+Optional *altchars* must be a :term:`bytes-like object` of at least
 length 2 (additional characters are ignored) which specifies an alternative
 alphabet for the ``+`` and ``/`` characters. This allows an application to e.g.
 generate URL or filesystem safe Base64 strings. The default is ``None``, for
 which the standard Base64 alphabet is used.
-The encoded byte string is returned.
 .. function:: b64decode(s, altchars=None, validate=False)
-Decode a Base64 encoded byte string.
+Decode the Base64 encoded :term:`bytes-like object` or ASCII string
+*s* and return the decoded :class:`bytes`.
-*s* is the byte string to decode. Optional *altchars* must be a string of
+Optional *altchars* must be a :term:`bytes-like object` or ASCII string of
 at least length 2 (additional characters are ignored) which specifies the
 alternative alphabet used instead of the ``+`` and ``/`` characters.
-The decoded string is returned. A :exc:`binascii.Error` exception is raised
+A :exc:`binascii.Error` exception is raised
 if *s* is incorrectly padded.
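A small sketch of the *altchars* parameter and the padding error described above. The input bytes were chosen so that the standard encoding contains both ``+`` and ``/``; the two-byte alphabet ``b"-_"`` is just one possible choice:

```python
import base64
import binascii

raw = b"\xfb\xff\xfe"

# Standard alphabet: this input maps onto the characters '+' and '/'.
standard = base64.b64encode(raw)

# altchars replaces '+' and '/' (in that order) with the two bytes given.
custom = base64.b64encode(raw, altchars=b"-_")

# The same altchars must be supplied when decoding.
round_trip = base64.b64decode(custom, altchars=b"-_")

# Input whose length is not a multiple of 4 is incorrectly padded.
try:
    base64.b64decode(b"abc")
    padding_error = False
except binascii.Error:
    padding_error = True
```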
 If *validate* is ``False`` (the default), non-base64-alphabet characters are
@@ -71,38 +77,44 @@ The modern interface provides:
 .. function:: standard_b64encode(s)
-Encode byte string *s* using the standard Base64 alphabet.
+Encode :term:`bytes-like object` *s* using the standard Base64 alphabet
+and return the encoded :class:`bytes`.
 .. function:: standard_b64decode(s)
-Decode byte string *s* using the standard Base64 alphabet.
+Decode :term:`bytes-like object` or ASCII string *s* using the standard
+Base64 alphabet and return the decoded :class:`bytes`.
 .. function:: urlsafe_b64encode(s)
-Encode byte string *s* using a URL-safe alphabet, which substitutes ``-`` instead of
-``+`` and ``_`` instead of ``/`` in the standard Base64 alphabet. The result
+Encode :term:`bytes-like object` *s* using a URL-safe alphabet, which
+substitutes ``-`` instead of ``+`` and ``_`` instead of ``/`` in the
+standard Base64 alphabet, and return the encoded :class:`bytes`. The result
 can still contain ``=``.
 .. function:: urlsafe_b64decode(s)
-Decode byte string *s* using a URL-safe alphabet, which substitutes ``-`` instead of
-``+`` and ``_`` instead of ``/`` in the standard Base64 alphabet.
+Decode :term:`bytes-like object` or ASCII string *s* using a URL-safe
+alphabet, which substitutes ``-`` instead of ``+`` and ``_`` instead of
+``/`` in the standard Base64 alphabet, and return the decoded
+:class:`bytes`.
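As a quick illustration of the URL-safe pair, the following sketch encodes four bytes that would produce ``+`` and ``/`` in the standard alphabet. Note that the ``=`` padding is still present in the result, as the text says:

```python
import base64

raw = b"\xfb\xff\xfe\xff"

# '-' and '_' stand in for '+' and '/'; '=' padding is kept.
token = base64.urlsafe_b64encode(raw)

# Decoding accepts bytes or an ASCII str and returns bytes.
from_bytes = base64.urlsafe_b64decode(token)
from_str = base64.urlsafe_b64decode(token.decode("ascii"))
```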
 .. function:: b32encode(s)
-Encode a byte string using Base32. *s* is the string to encode. The encoded string
-is returned.
+Encode the :term:`bytes-like object` *s* using Base32 and return the
+encoded :class:`bytes`.
 .. function:: b32decode(s, casefold=False, map01=None)
-Decode a Base32 encoded byte string.
+Decode the Base32 encoded :term:`bytes-like object` or ASCII string *s* and
+return the decoded :class:`bytes`.
-*s* is the byte string to decode. Optional *casefold* is a flag specifying
+Optional *casefold* is a flag specifying
 whether a lowercase alphabet is acceptable as input. For security purposes,
 the default is ``False``.
@@ -113,46 +125,45 @@ The modern interface provides:
 digit 0 is always mapped to the letter O). For security purposes the default is
 ``None``, so that 0 and 1 are not allowed in the input.
-The decoded byte string is returned. A :exc:`binascii.Error` is raised if *s* is
+A :exc:`binascii.Error` is raised if *s* is
 incorrectly padded or if there are non-alphabet characters present in the
-string.
+input.
 .. function:: b16encode(s)
-Encode a byte string using Base16.
+Encode the :term:`bytes-like object` *s* using Base16 and return the
+encoded :class:`bytes`.
-*s* is the string to encode. The encoded byte string is returned.
 .. function:: b16decode(s, casefold=False)
-Decode a Base16 encoded byte string.
+Decode the Base16 encoded :term:`bytes-like object` or ASCII string *s* and
+return the decoded :class:`bytes`.
-*s* is the string to decode. Optional *casefold* is a flag specifying whether a
+Optional *casefold* is a flag specifying whether a
 lowercase alphabet is acceptable as input. For security purposes, the default
 is ``False``.
-The decoded byte string is returned. A :exc:`TypeError` is raised if *s* were
+A :exc:`TypeError` is raised if *s* is
 incorrectly padded or if there are non-alphabet characters present in the
-string.
+input.
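The *casefold* flag on the Base32 and Base16 decoders can be sketched as follows. One assumption is worth flagging: the sketch catches :exc:`binascii.Error` for both decoders, which is what CPython 3 releases are understood to raise for non-alphabet input, even though the text above names :exc:`TypeError` for ``b16decode``.

```python
import base64
import binascii

b32 = base64.b32encode(b"hi")        # upper-case Base32 alphabet
b16 = base64.b16encode(b"\xab")      # upper-case hexadecimal digits


def rejects(decoder, data):
    """Return True if decoder(data) fails with binascii.Error."""
    try:
        decoder(data)
    except binascii.Error:
        return True
    return False


# Lower-case input is refused by default...
b32_lower_rejected = rejects(base64.b32decode, b32.lower())
b16_lower_rejected = rejects(base64.b16decode, b16.lower())

# ...and accepted once casefold=True is passed explicitly.
b32_folded = base64.b32decode(b32.lower(), casefold=True)
b16_folded = base64.b16decode(b16.lower(), casefold=True)
```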
 .. function:: a85encode(s, *, foldspaces=False, wrapcol=0, pad=False, adobe=False)
-Encode a byte string using Ascii85.
+Encode the :term:`bytes-like object` *s* using Ascii85 and return the
+encoded :class:`bytes`.
-*s* is the string to encode. The encoded byte string is returned.
 *foldspaces* is an optional flag that uses the special short sequence 'y'
 instead of 4 consecutive spaces (ASCII 0x20) as supported by 'btoa'. This
 feature is not supported by the "standard" Ascii85 encoding.
-*wrapcol* controls whether the output should have newline (``'\n'``)
+*wrapcol* controls whether the output should have newline (``b'\n'``)
 characters added to it. If this is non-zero, each output line will be
 at most this many characters long.
-*pad* controls whether the input string is padded to a multiple of 4
+*pad* controls whether the input is padded to a multiple of 4
 before encoding. Note that the ``btoa`` implementation always pads.
 *adobe* controls whether the encoded byte sequence is framed with ``<~``
@@ -163,9 +174,8 @@ The modern interface provides:
 .. function:: a85decode(s, *, foldspaces=False, adobe=False, ignorechars=b' \\t\\n\\r\\v')
-Decode an Ascii85 encoded byte string.
+Decode the Ascii85 encoded :term:`bytes-like object` or ASCII string *s* and
+return the decoded :class:`bytes`.
-*s* is the byte string to decode.
 *foldspaces* is a flag that specifies whether the 'y' short sequence
 should be accepted as shorthand for 4 consecutive spaces (ASCII 0x20).
@@ -174,7 +184,8 @@ The modern interface provides:
 *adobe* controls whether the input sequence is in Adobe Ascii85 format
 (i.e. is framed with <~ and ~>).
-*ignorechars* should be a byte string containing characters to ignore
+*ignorechars* should be a :term:`bytes-like object` or ASCII string
+containing characters to ignore
 from the input. This should only contain whitespace characters, and by
 default contains all whitespace characters in ASCII.
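The Ascii85 options discussed here (*adobe*, *wrapcol*, *foldspaces*, and the default *ignorechars*) can be exercised with a short sketch. The sample data and the wrap width of 20 are arbitrary choices for illustration:

```python
import base64

data = bytes(range(100))

# adobe=True frames the output with <~ and ~>; decoding needs the same flag.
framed = base64.a85encode(data, adobe=True)
unframed_data = base64.a85decode(framed, adobe=True)

# wrapcol inserts b'\n' so that no output line exceeds the given width.
wrapped = base64.a85encode(data, wrapcol=20)
wrapped_lines = wrapped.split(b"\n")

# The default ignorechars covers ASCII whitespace, so the newlines are skipped.
unwrapped_data = base64.a85decode(wrapped)

# foldspaces=True encodes a run of four spaces as the short sequence b'y'.
folded = base64.a85encode(b"    ", foldspaces=True)
unfolded = base64.a85decode(folded, foldspaces=True)
```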
@@ -183,18 +194,19 @@ The modern interface provides:
 .. function:: b85encode(s, pad=False)
-Encode a byte string using base85, as used in e.g. git-style binary
-diffs.
+Encode the :term:`bytes-like object` *s* using base85 (as used in e.g.
+git-style binary diffs) and return the encoded :class:`bytes`.
-If *pad* is true, the input is padded with "\\0" so its length is a
-multiple of 4 characters before encoding.
+If *pad* is true, the input is padded with ``b'\0'`` so its length is a
+multiple of 4 bytes before encoding.
 .. versionadded:: 3.4
 .. function:: b85decode(b)
-Decode base85-encoded byte string. Padding is implicitly removed, if
+Decode the base85-encoded :term:`bytes-like object` or ASCII string *b* and
+return the decoded :class:`bytes`. Padding is implicitly removed, if
 necessary.
 .. versionadded:: 3.4
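The effect of *pad* is easiest to see on an input whose length is not a multiple of 4. The sketch below assumes the behaviour of current CPython releases, where the null bytes added by ``pad=True`` become part of the encoded data and therefore reappear after decoding:

```python
import base64

short = b"ab"  # two bytes, so two bytes short of a 4-byte group

unpadded = base64.b85encode(short)
padded = base64.b85encode(short, pad=True)

# Without pad, the partial group is encoded in fewer than 5 characters and
# decodes back to the original input.
decoded_unpadded = base64.b85decode(unpadded)

# With pad, a full 5-character group is emitted for b"ab\0\0".
decoded_padded = base64.b85decode(padded)
```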
@@ -214,15 +226,15 @@ The legacy interface:
 Decode the contents of the binary *input* file and write the resulting binary
 data to the *output* file. *input* and *output* must be :term:`file objects
-<file object>`. *input* will be read until ``input.read()`` returns an empty
-bytes object.
+<file object>`. *input* will be read until ``input.readline()`` returns an
+empty bytes object.
 .. function:: decodebytes(s)
 decodestring(s)
-Decode the byte string *s*, which must contain one or more lines of base64
-encoded data, and return a byte string containing the resulting binary data.
+Decode the :term:`bytes-like object` *s*, which must contain one or more
+lines of base64 encoded data, and return the decoded :class:`bytes`.
 ``decodestring`` is a deprecated alias.
 .. versionadded:: 3.1
@@ -233,17 +245,19 @@ The legacy interface:
 Encode the contents of the binary *input* file and write the resulting base64
 encoded data to the *output* file. *input* and *output* must be :term:`file
 objects <file object>`. *input* will be read until ``input.read()`` returns
-an empty bytes object. :func:`encode` returns the encoded data plus a trailing
-newline character (``b'\n'``).
+an empty bytes object. :func:`encode` inserts a newline character (``b'\n'``)
+after every 76 bytes of the output, as well as ensuring that the output
+always ends with a newline, as per :rfc:`2045` (MIME).
 .. function:: encodebytes(s)
 encodestring(s)
-Encode the byte string *s*, which can contain arbitrary binary data, and
-return a byte string containing one or more lines of base64-encoded data.
-:func:`encodebytes` returns a string containing one or more lines of
-base64-encoded data always including an extra trailing newline (``b'\n'``).
+Encode the :term:`bytes-like object` *s*, which can contain arbitrary binary
+data, and return :class:`bytes` containing the base64-encoded data, with newlines
+(``b'\n'``) inserted after every 76 bytes of output, and ensuring that
+there is a trailing newline, as per :rfc:`2045` (MIME).
 ``encodestring`` is a deprecated alias.
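The line-wrapping behaviour of the legacy interface can be checked with a minimal sketch. The 100-byte sample is arbitrary; it is just long enough to force a second output line:

```python
import base64

data = bytes(range(100))

# Legacy interface: output is broken into lines of at most 76 bytes,
# and always ends with a newline.
wrapped = base64.encodebytes(data)
line_lengths = [len(line) for line in wrapped.splitlines()]

# Modern interface: same alphabet, but no newlines are inserted.
flat = base64.b64encode(data)

# decodebytes accepts the multi-line form and restores the input.
restored = base64.decodebytes(wrapped)
```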