Commit d05c9ff8 authored by Alexandre Vassalotti

Issue #6784: Strings from Python 2 can now be unpickled as bytes objects.

Initial patch by Merlijn van Deen.

I've added a few unrelated docstring fixes in the patch while I was at
it, which makes the documentation for pickle a bit more consistent.
parent ee07b947
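A minimal sketch of the behaviour described in the commit message, assuming an interpreter that includes this change. The pickle byte string is the protocol 2 sample from the tests added in this commit; the variable names are illustrative and not part of the patch.

```python
import pickle

# Produced by Python 2 with pickle.dumps('a\x00\xa0', protocol=2)
# (sample taken from the tests added in this commit).
PY2_PICKLE = b'\x80\x02U\x03a\x00\xa0.'

# encoding="bytes" returns the Python 2 8-bit string unchanged as bytes.
as_bytes = pickle.loads(PY2_PICKLE, encoding="bytes")

# The default encoding="ASCII" with errors="strict" cannot decode byte 0xa0.
try:
    pickle.loads(PY2_PICKLE)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

print(as_bytes, default_failed)
```

Passing a real codec such as "latin-1" remains the way to get a str back; "bytes" is a special value that skips decoding.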
...@@ -173,7 +173,7 @@ The :mod:`pickle` module provides the following constants: ...@@ -173,7 +173,7 @@ The :mod:`pickle` module provides the following constants:
An integer, the default :ref:`protocol version <pickle-protocols>` used An integer, the default :ref:`protocol version <pickle-protocols>` used
for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the
default protocol is 3, a new protocol designed for Python 3.0. default protocol is 3, a new protocol designed for Python 3.
The :mod:`pickle` module provides the following functions to make the pickling The :mod:`pickle` module provides the following functions to make the pickling
...@@ -184,9 +184,9 @@ process more convenient: ...@@ -184,9 +184,9 @@ process more convenient:
Write a pickled representation of *obj* to the open :term:`file object` *file*. Write a pickled representation of *obj* to the open :term:`file object` *file*.
This is equivalent to ``Pickler(file, protocol).dump(obj)``. This is equivalent to ``Pickler(file, protocol).dump(obj)``.
The optional *protocol* argument tells the pickler to use the given protocol; The optional *protocol* argument tells the pickler to use the given
supported protocols are 0, 1, 2, 3. The default protocol is 3; a protocol; supported protocols are 0, 1, 2, 3. The default protocol is 3; a
backward-incompatible protocol designed for Python 3.0. backward-incompatible protocol designed for Python 3.
Specifying a negative protocol version selects the highest protocol version Specifying a negative protocol version selects the highest protocol version
supported. The higher the protocol used, the more recent the version of supported. The higher the protocol used, the more recent the version of
...@@ -198,64 +198,66 @@ process more convenient: ...@@ -198,64 +198,66 @@ process more convenient:
interface. interface.
If *fix_imports* is true and *protocol* is less than 3, pickle will try to If *fix_imports* is true and *protocol* is less than 3, pickle will try to
map the new Python 3.x names to the old module names used in Python 2.x, map the new Python 3 names to the old module names used in Python 2, so
so that the pickle data stream is readable with Python 2.x. that the pickle data stream is readable with Python 2.
.. function:: dumps(obj, protocol=None, \*, fix_imports=True) .. function:: dumps(obj, protocol=None, \*, fix_imports=True)
Return the pickled representation of the object as a :class:`bytes` Return the pickled representation of the object as a :class:`bytes` object,
object, instead of writing it to a file. instead of writing it to a file.
The optional *protocol* argument tells the pickler to use the given protocol; The optional *protocol* argument tells the pickler to use the given
supported protocols are 0, 1, 2, 3. The default protocol is 3; a protocol; supported protocols are 0, 1, 2, 3 and 4. The default protocol
backward-incompatible protocol designed for Python 3.0. is 3; a backward-incompatible protocol designed for Python 3.
Specifying a negative protocol version selects the highest protocol version Specifying a negative protocol version selects the highest protocol version
supported. The higher the protocol used, the more recent the version of supported. The higher the protocol used, the more recent the version of
Python needed to read the pickle produced. Python needed to read the pickle produced.
If *fix_imports* is true and *protocol* is less than 3, pickle will try to If *fix_imports* is true and *protocol* is less than 3, pickle will try to
map the new Python 3.x names to the old module names used in Python 2.x, map the new Python 3 names to the old module names used in Python 2, so
so that the pickle data stream is readable with Python 2.x. that the pickle data stream is readable with Python 2.
.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict") .. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict")
Read a pickled object representation from the open :term:`file object` *file* Read a pickled object representation from the open :term:`file object`
and return the reconstituted object hierarchy specified therein. This is *file* and return the reconstituted object hierarchy specified therein.
equivalent to ``Unpickler(file).load()``. This is equivalent to ``Unpickler(file).load()``.
The protocol version of the pickle is detected automatically, so no protocol The protocol version of the pickle is detected automatically, so no
argument is needed. Bytes past the pickled object's representation are protocol argument is needed. Bytes past the pickled object's
ignored. representation are ignored.
The argument *file* must have two methods, a read() method that takes an The argument *file* must have two methods, a read() method that takes an
integer argument, and a readline() method that requires no arguments. Both integer argument, and a readline() method that requires no arguments. Both
methods should return bytes. Thus *file* can be an on-disk file opened methods should return bytes. Thus *file* can be an on-disk file opened for
for binary reading, a :class:`io.BytesIO` object, or any other custom object binary reading, a :class:`io.BytesIO` object, or any other custom object
that meets this interface. that meets this interface.
Optional keyword arguments are *fix_imports*, *encoding* and *errors*, Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
which are used to control compatibility support for pickle stream generated which are used to control compatibility support for pickle stream generated
by Python 2.x. If *fix_imports* is true, pickle will try to map the old by Python 2. If *fix_imports* is true, pickle will try to map the old
Python 2.x names to the new names used in Python 3.x. The *encoding* and Python 2 names to the new names used in Python 3. The *encoding* and
*errors* tell pickle how to decode 8-bit string instances pickled by Python *errors* tell pickle how to decode 8-bit string instances pickled by Python
2.x; these default to 'ASCII' and 'strict', respectively. 2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
be 'bytes' to read these 8-bit string instances as bytes objects.
.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict") .. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict")
Read a pickled object hierarchy from a :class:`bytes` object and return the Read a pickled object hierarchy from a :class:`bytes` object and return the
reconstituted object hierarchy specified therein reconstituted object hierarchy specified therein
The protocol version of the pickle is detected automatically, so no protocol The protocol version of the pickle is detected automatically, so no
argument is needed. Bytes past the pickled object's representation are protocol argument is needed. Bytes past the pickled object's
ignored. representation are ignored.
Optional keyword arguments are *fix_imports*, *encoding* and *errors*, Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
which are used to control compatibility support for pickle stream generated which are used to control compatibility support for pickle stream generated
by Python 2.x. If *fix_imports* is true, pickle will try to map the old by Python 2. If *fix_imports* is true, pickle will try to map the old
Python 2.x names to the new names used in Python 3.x. The *encoding* and Python 2 names to the new names used in Python 3. The *encoding* and
*errors* tell pickle how to decode 8-bit string instances pickled by Python *errors* tell pickle how to decode 8-bit string instances pickled by Python
2.x; these default to 'ASCII' and 'strict', respectively. 2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
be 'bytes' to read these 8-bit string instances as bytes objects.
The :mod:`pickle` module defines three exceptions: The :mod:`pickle` module defines three exceptions:
...@@ -290,9 +292,9 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and ...@@ -290,9 +292,9 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
This takes a binary file for writing a pickle data stream. This takes a binary file for writing a pickle data stream.
The optional *protocol* argument tells the pickler to use the given protocol; The optional *protocol* argument tells the pickler to use the given
supported protocols are 0, 1, 2, 3. The default protocol is 3; a protocol; supported protocols are 0, 1, 2, 3 and 4. The default protocol
backward-incompatible protocol designed for Python 3.0. is 3; a backward-incompatible protocol designed for Python 3.
Specifying a negative protocol version selects the highest protocol version Specifying a negative protocol version selects the highest protocol version
supported. The higher the protocol used, the more recent the version of supported. The higher the protocol used, the more recent the version of
...@@ -300,11 +302,12 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and ...@@ -300,11 +302,12 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
The *file* argument must have a write() method that accepts a single bytes The *file* argument must have a write() method that accepts a single bytes
argument. It can thus be an on-disk file opened for binary writing, a argument. It can thus be an on-disk file opened for binary writing, a
:class:`io.BytesIO` instance, or any other custom object that meets this interface. :class:`io.BytesIO` instance, or any other custom object that meets this
interface.
If *fix_imports* is true and *protocol* is less than 3, pickle will try to If *fix_imports* is true and *protocol* is less than 3, pickle will try to
map the new Python 3.x names to the old module names used in Python 2.x, map the new Python 3 names to the old module names used in Python 2, so
so that the pickle data stream is readable with Python 2.x. that the pickle data stream is readable with Python 2.
.. method:: dump(obj) .. method:: dump(obj)
...@@ -366,16 +369,17 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and ...@@ -366,16 +369,17 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
The argument *file* must have two methods, a read() method that takes an The argument *file* must have two methods, a read() method that takes an
integer argument, and a readline() method that requires no arguments. Both integer argument, and a readline() method that requires no arguments. Both
methods should return bytes. Thus *file* can be an on-disk file object opened methods should return bytes. Thus *file* can be an on-disk file object
for binary reading, a :class:`io.BytesIO` object, or any other custom object opened for binary reading, a :class:`io.BytesIO` object, or any other
that meets this interface. custom object that meets this interface.
Optional keyword arguments are *fix_imports*, *encoding* and *errors*, Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
which are used to control compatibility support for pickle stream generated which are used to control compatibility support for pickle stream generated
by Python 2.x. If *fix_imports* is true, pickle will try to map the old by Python 2. If *fix_imports* is true, pickle will try to map the old
Python 2.x names to the new names used in Python 3.x. The *encoding* and Python 2 names to the new names used in Python 3. The *encoding* and
*errors* tell pickle how to decode 8-bit string instances pickled by Python *errors* tell pickle how to decode 8-bit string instances pickled by Python
2.x; these default to 'ASCII' and 'strict', respectively. 2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
be 'bytes' to read these 8-bit string instances as bytes objects.
.. method:: load() .. method:: load()
......
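The documented *encoding* argument applies to the Unpickler class as well as to load() and loads(). The following sketch, under the assumption that the file argument is an in-memory io.BytesIO object, reads the protocol 1 sample from this commit's tests; the variable names are illustrative.

```python
import io
import pickle

# Produced by Python 2 with pickle.dumps('a\x00\xa0', protocol=1).
stream = io.BytesIO(b'U\x03a\x00\xa0.')

# Any object with read() and readline() methods returning bytes would do here.
unpickler = pickle.Unpickler(stream, encoding="bytes")
value = unpickler.load()

print(type(value).__name__, value)
```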
...@@ -348,24 +348,25 @@ class _Pickler: ...@@ -348,24 +348,25 @@ class _Pickler:
def __init__(self, file, protocol=None, *, fix_imports=True): def __init__(self, file, protocol=None, *, fix_imports=True):
"""This takes a binary file for writing a pickle data stream. """This takes a binary file for writing a pickle data stream.
The optional protocol argument tells the pickler to use the The optional *protocol* argument tells the pickler to use the
given protocol; supported protocols are 0, 1, 2, 3 and 4. The given protocol; supported protocols are 0, 1, 2, 3 and 4. The
default protocol is 3; a backward-incompatible protocol designed for default protocol is 3; a backward-incompatible protocol designed
Python 3. for Python 3.
Specifying a negative protocol version selects the highest Specifying a negative protocol version selects the highest
protocol version supported. The higher the protocol used, the protocol version supported. The higher the protocol used, the
more recent the version of Python needed to read the pickle more recent the version of Python needed to read the pickle
produced. produced.
The file argument must have a write() method that accepts a single The *file* argument must have a write() method that accepts a
bytes argument. It can thus be a file object opened for binary single bytes argument. It can thus be a file object opened for
writing, a io.BytesIO instance, or any other custom object that binary writing, a io.BytesIO instance, or any other custom
meets this interface. object that meets this interface.
If fix_imports is True and protocol is less than 3, pickle will try to If *fix_imports* is True and *protocol* is less than 3, pickle
map the new Python 3 names to the old module names used in Python 2, will try to map the new Python 3 names to the old module names
so that the pickle data stream is readable with Python 2. used in Python 2, so that the pickle data stream is readable
with Python 2.
""" """
if protocol is None: if protocol is None:
protocol = DEFAULT_PROTOCOL protocol = DEFAULT_PROTOCOL
...@@ -389,10 +390,9 @@ class _Pickler: ...@@ -389,10 +390,9 @@ class _Pickler:
"""Clears the pickler's "memo". """Clears the pickler's "memo".
The memo is the data structure that remembers which objects the The memo is the data structure that remembers which objects the
pickler has already seen, so that shared or recursive objects are pickler has already seen, so that shared or recursive objects
pickled by reference and not by value. This method is useful when are pickled by reference and not by value. This method is
re-using picklers. useful when re-using picklers.
""" """
self.memo.clear() self.memo.clear()
...@@ -975,8 +975,14 @@ class _Unpickler: ...@@ -975,8 +975,14 @@ class _Unpickler:
encoding="ASCII", errors="strict"): encoding="ASCII", errors="strict"):
"""This takes a binary file for reading a pickle data stream. """This takes a binary file for reading a pickle data stream.
The protocol version of the pickle is detected automatically, so no The protocol version of the pickle is detected automatically, so
proto argument is needed. no proto argument is needed.
The argument *file* must have two methods, a read() method that
takes an integer argument, and a readline() method that requires
no arguments. Both methods should return bytes. Thus *file*
can be a binary file object opened for reading, a io.BytesIO
object, or any other custom object that meets this interface.
The file-like object must have two methods, a read() method The file-like object must have two methods, a read() method
that takes an integer argument, and a readline() method that that takes an integer argument, and a readline() method that
...@@ -985,13 +991,14 @@ class _Unpickler: ...@@ -985,13 +991,14 @@ class _Unpickler:
reading, a BytesIO object, or any other custom object that reading, a BytesIO object, or any other custom object that
meets this interface. meets this interface.
Optional keyword arguments are *fix_imports*, *encoding* and *errors*, Optional keyword arguments are *fix_imports*, *encoding* and
which are used to control compatiblity support for pickle stream *errors*, which are used to control compatiblity support for
generated by Python 2.x. If *fix_imports* is True, pickle will try to pickle stream generated by Python 2. If *fix_imports* is True,
map the old Python 2.x names to the new names used in Python 3.x. The pickle will try to map the old Python 2 names to the new names
*encoding* and *errors* tell pickle how to decode 8-bit string used in Python 3. The *encoding* and *errors* tell pickle how
instances pickled by Python 2.x; these default to 'ASCII' and to decode 8-bit string instances pickled by Python 2; these
'strict', respectively. default to 'ASCII' and 'strict', respectively. *encoding* can be
'bytes' to read theses 8-bit string instances as bytes objects.
""" """
self._file_readline = file.readline self._file_readline = file.readline
self._file_read = file.read self._file_read = file.read
...@@ -1139,6 +1146,15 @@ class _Unpickler: ...@@ -1139,6 +1146,15 @@ class _Unpickler:
self.append(unpack('>d', self.read(8))[0]) self.append(unpack('>d', self.read(8))[0])
dispatch[BINFLOAT[0]] = load_binfloat dispatch[BINFLOAT[0]] = load_binfloat
def _decode_string(self, value):
# Used to allow strings from Python 2 to be decoded either as
# bytes or Unicode strings. This should be used only with the
# STRING, BINSTRING and SHORT_BINSTRING opcodes.
if self.encoding == "bytes":
return value
else:
return value.decode(self.encoding, self.errors)
def load_string(self): def load_string(self):
data = self.readline()[:-1] data = self.readline()[:-1]
# Strip outermost quotes # Strip outermost quotes
...@@ -1146,8 +1162,7 @@ class _Unpickler: ...@@ -1146,8 +1162,7 @@ class _Unpickler:
data = data[1:-1] data = data[1:-1]
else: else:
raise UnpicklingError("the STRING opcode argument must be quoted") raise UnpicklingError("the STRING opcode argument must be quoted")
self.append(codecs.escape_decode(data)[0] self.append(self._decode_string(codecs.escape_decode(data)[0]))
.decode(self.encoding, self.errors))
dispatch[STRING[0]] = load_string dispatch[STRING[0]] = load_string
def load_binstring(self): def load_binstring(self):
...@@ -1156,8 +1171,7 @@ class _Unpickler: ...@@ -1156,8 +1171,7 @@ class _Unpickler:
if len < 0: if len < 0:
raise UnpicklingError("BINSTRING pickle has negative byte count") raise UnpicklingError("BINSTRING pickle has negative byte count")
data = self.read(len) data = self.read(len)
value = str(data, self.encoding, self.errors) self.append(self._decode_string(data))
self.append(value)
dispatch[BINSTRING[0]] = load_binstring dispatch[BINSTRING[0]] = load_binstring
def load_binbytes(self): def load_binbytes(self):
...@@ -1191,8 +1205,7 @@ class _Unpickler: ...@@ -1191,8 +1205,7 @@ class _Unpickler:
def load_short_binstring(self): def load_short_binstring(self):
len = self.read(1)[0] len = self.read(1)[0]
data = self.read(len) data = self.read(len)
value = str(data, self.encoding, self.errors) self.append(self._decode_string(data))
self.append(value)
dispatch[SHORT_BINSTRING[0]] = load_short_binstring dispatch[SHORT_BINSTRING[0]] = load_short_binstring
def load_short_binbytes(self): def load_short_binbytes(self):
......
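The branching added in _decode_string can be sketched as a standalone function. This is an illustration only: decode_py2_string is a hypothetical name, and the real method reads the encoding and errors values from the unpickler instance rather than taking them as parameters.

```python
def decode_py2_string(value: bytes, encoding: str = "ASCII",
                      errors: str = "strict"):
    """Mirror the logic of _decode_string for a Python 2 8-bit string."""
    if encoding == "bytes":
        # Special value: hand the raw bytes back without decoding.
        return value
    # Any other value is treated as a codec name.
    return value.decode(encoding, errors)


print(decode_py2_string(b'abc'))
print(decode_py2_string(b'a\x00\xa0', encoding="bytes"))
print(decode_py2_string(b'\xe9', encoding="latin-1"))
```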
...@@ -969,36 +969,30 @@ class StackObject(object): ...@@ -969,36 +969,30 @@ class StackObject(object):
return self.name return self.name
pyint = StackObject( pyint = pylong = StackObject(
name='int', name='int',
obtype=int, obtype=int,
doc="A short (as opposed to long) Python integer object.") doc="A Python integer object.")
pylong = StackObject(
name='long',
obtype=int,
doc="A long (as opposed to short) Python integer object.")
pyinteger_or_bool = StackObject( pyinteger_or_bool = StackObject(
name='int_or_bool', name='int_or_bool',
obtype=(int, bool), obtype=(int, bool),
doc="A Python integer object (short or long), or " doc="A Python integer or boolean object.")
"a Python bool.")
pybool = StackObject( pybool = StackObject(
name='bool', name='bool',
obtype=(bool,), obtype=bool,
doc="A Python bool object.") doc="A Python boolean object.")
pyfloat = StackObject( pyfloat = StackObject(
name='float', name='float',
obtype=float, obtype=float,
doc="A Python float object.") doc="A Python float object.")
pystring = StackObject( pybytes_or_str = pystring = StackObject(
name='string', name='bytes_or_str',
obtype=bytes, obtype=(bytes, str),
doc="A Python (8-bit) string object.") doc="A Python bytes or (Unicode) string object.")
pybytes = StackObject( pybytes = StackObject(
name='bytes', name='bytes',
...@@ -1050,32 +1044,32 @@ markobject = StackObject( ...@@ -1050,32 +1044,32 @@ markobject = StackObject(
obtype=StackObject, obtype=StackObject,
doc="""'The mark' is a unique object. doc="""'The mark' is a unique object.
Opcodes that operate on a variable number of objects Opcodes that operate on a variable number of objects
generally don't embed the count of objects in the opcode, generally don't embed the count of objects in the opcode,
or pull it off the stack. Instead the MARK opcode is used or pull it off the stack. Instead the MARK opcode is used
to push a special marker object on the stack, and then to push a special marker object on the stack, and then
some other opcodes grab all the objects from the top of some other opcodes grab all the objects from the top of
the stack down to (but not including) the topmost marker the stack down to (but not including) the topmost marker
object. object.
""") """)
stackslice = StackObject( stackslice = StackObject(
name="stackslice", name="stackslice",
obtype=StackObject, obtype=StackObject,
doc="""An object representing a contiguous slice of the stack. doc="""An object representing a contiguous slice of the stack.
This is used in conjunction with markobject, to represent all This is used in conjunction with markobject, to represent all
of the stack following the topmost markobject. For example, of the stack following the topmost markobject. For example,
the POP_MARK opcode changes the stack from the POP_MARK opcode changes the stack from
[..., markobject, stackslice] [..., markobject, stackslice]
to to
[...] [...]
No matter how many object are on the stack after the topmost No matter how many object are on the stack after the topmost
markobject, POP_MARK gets rid of all of them (including the markobject, POP_MARK gets rid of all of them (including the
topmost markobject too). topmost markobject too).
""") """)
############################################################################## ##############################################################################
# Descriptors for pickle opcodes. # Descriptors for pickle opcodes.
...@@ -1212,7 +1206,7 @@ opcodes = [ ...@@ -1212,7 +1206,7 @@ opcodes = [
code='L', code='L',
arg=decimalnl_long, arg=decimalnl_long,
stack_before=[], stack_before=[],
stack_after=[pylong], stack_after=[pyint],
proto=0, proto=0,
doc="""Push a long integer. doc="""Push a long integer.
...@@ -1230,7 +1224,7 @@ opcodes = [ ...@@ -1230,7 +1224,7 @@ opcodes = [
code='\x8a', code='\x8a',
arg=long1, arg=long1,
stack_before=[], stack_before=[],
stack_after=[pylong], stack_after=[pyint],
proto=2, proto=2,
doc="""Long integer using one-byte length. doc="""Long integer using one-byte length.
...@@ -1241,7 +1235,7 @@ opcodes = [ ...@@ -1241,7 +1235,7 @@ opcodes = [
code='\x8b', code='\x8b',
arg=long4, arg=long4,
stack_before=[], stack_before=[],
stack_after=[pylong], stack_after=[pyint],
proto=2, proto=2,
doc="""Long integer using found-byte length. doc="""Long integer using found-byte length.
...@@ -1254,45 +1248,50 @@ opcodes = [ ...@@ -1254,45 +1248,50 @@ opcodes = [
code='S', code='S',
arg=stringnl, arg=stringnl,
stack_before=[], stack_before=[],
stack_after=[pystring], stack_after=[pybytes_or_str],
proto=0, proto=0,
doc="""Push a Python string object. doc="""Push a Python string object.
The argument is a repr-style string, with bracketing quote characters, The argument is a repr-style string, with bracketing quote characters,
and perhaps embedded escapes. The argument extends until the next and perhaps embedded escapes. The argument extends until the next
newline character. (Actually, they are decoded into a str instance newline character. These are usually decoded into a str instance
using the encoding given to the Unpickler constructor. or the default, using the encoding given to the Unpickler constructor. or the default,
'ASCII'.) 'ASCII'. If the encoding given was 'bytes' however, they will be
decoded as bytes object instead.
"""), """),
I(name='BINSTRING', I(name='BINSTRING',
code='T', code='T',
arg=string4, arg=string4,
stack_before=[], stack_before=[],
stack_after=[pystring], stack_after=[pybytes_or_str],
proto=1, proto=1,
doc="""Push a Python string object. doc="""Push a Python string object.
There are two arguments: the first is a 4-byte little-endian signed int There are two arguments: the first is a 4-byte little-endian
giving the number of bytes in the string, and the second is that many signed int giving the number of bytes in the string, and the
bytes, which are taken literally as the string content. (Actually, second is that many bytes, which are taken literally as the string
they are decoded into a str instance using the encoding given to the content. These are usually decoded into a str instance using the
Unpickler constructor. or the default, 'ASCII'.) encoding given to the Unpickler constructor. or the default,
'ASCII'. If the encoding given was 'bytes' however, they will be
decoded as bytes object instead.
"""), """),
I(name='SHORT_BINSTRING', I(name='SHORT_BINSTRING',
code='U', code='U',
arg=string1, arg=string1,
stack_before=[], stack_before=[],
stack_after=[pystring], stack_after=[pybytes_or_str],
proto=1, proto=1,
doc="""Push a Python string object. doc="""Push a Python string object.
There are two arguments: the first is a 1-byte unsigned int giving There are two arguments: the first is a 1-byte unsigned int giving
the number of bytes in the string, and the second is that many bytes, the number of bytes in the string, and the second is that many
which are taken literally as the string content. (Actually, they bytes, which are taken literally as the string content. These are
are decoded into a str instance using the encoding given to the usually decoded into a str instance using the encoding given to
Unpickler constructor. or the default, 'ASCII'.) the Unpickler constructor. or the default, 'ASCII'. If the
encoding given was 'bytes' however, they will be decoded as bytes
object instead.
"""), """),
# Bytes (protocol 3 only; older protocols don't support bytes at all) # Bytes (protocol 3 only; older protocols don't support bytes at all)
......
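To see which opcode a given Python 2 pickle uses, and therefore whether the *encoding* argument affects it, the pickle can be walked with pickletools.genops. A short sketch using the protocol 2 sample from this commit's tests:

```python
import pickletools

# Produced by Python 2 with pickle.dumps('a\x00\xa0', protocol=2).
PY2_PICKLE = b'\x80\x02U\x03a\x00\xa0.'

# genops yields (opcode_info, argument, position) for each opcode in the stream.
names = [opcode.name for opcode, arg, pos in pickletools.genops(PY2_PICKLE)]

print(names)
```

Only STRING, BINSTRING and SHORT_BINSTRING are routed through the new decoding step.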
...@@ -1305,6 +1305,35 @@ class AbstractPickleTests(unittest.TestCase): ...@@ -1305,6 +1305,35 @@ class AbstractPickleTests(unittest.TestCase):
dumped = self.dumps(set([3]), 2) dumped = self.dumps(set([3]), 2)
self.assertEqual(dumped, DATA6) self.assertEqual(dumped, DATA6)
def test_load_python2_str_as_bytes(self):
# From Python 2: pickle.dumps('a\x00\xa0', protocol=0)
self.assertEqual(self.loads(b"S'a\\x00\\xa0'\n.",
encoding="bytes"), b'a\x00\xa0')
# From Python 2: pickle.dumps('a\x00\xa0', protocol=1)
self.assertEqual(self.loads(b'U\x03a\x00\xa0.',
encoding="bytes"), b'a\x00\xa0')
# From Python 2: pickle.dumps('a\x00\xa0', protocol=2)
self.assertEqual(self.loads(b'\x80\x02U\x03a\x00\xa0.',
encoding="bytes"), b'a\x00\xa0')
def test_load_python2_unicode_as_str(self):
# From Python 2: pickle.dumps(u'π', protocol=0)
self.assertEqual(self.loads(b'V\\u03c0\n.',
encoding='bytes'), 'π')
# From Python 2: pickle.dumps(u'π', protocol=1)
self.assertEqual(self.loads(b'X\x02\x00\x00\x00\xcf\x80.',
encoding="bytes"), 'π')
# From Python 2: pickle.dumps(u'π', protocol=2)
self.assertEqual(self.loads(b'\x80\x02X\x02\x00\x00\x00\xcf\x80.',
encoding="bytes"), 'π')
def test_load_long_python2_str_as_bytes(self):
# From Python 2: pickle.dumps('x' * 300, protocol=1)
self.assertEqual(self.loads(pickle.BINSTRING +
struct.pack("<I", 300) +
b'x' * 300 + pickle.STOP,
encoding='bytes'), b'x' * 300)
def test_large_pickles(self): def test_large_pickles(self):
# Test the correctness of internal buffering routines when handling # Test the correctness of internal buffering routines when handling
# large data. # large data.
...@@ -1566,7 +1595,6 @@ class AbstractPickleTests(unittest.TestCase): ...@@ -1566,7 +1595,6 @@ class AbstractPickleTests(unittest.TestCase):
unpickled = self.loads(self.dumps(method, proto)) unpickled = self.loads(self.dumps(method, proto))
self.assertEqual(method(obj), unpickled(obj)) self.assertEqual(method(obj), unpickled(obj))
def test_c_methods(self): def test_c_methods(self):
global Subclass global Subclass
class Subclass(tuple): class Subclass(tuple):
......
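As test_load_python2_unicode_as_str checks, Python 2 unicode objects are stored with different opcodes and still load as str when encoding="bytes" is given. A small sketch reusing two of the byte strings from that test:

```python
import pickle

# Produced by Python 2 with pickle.dumps(u'\u03c0', protocol=0).
proto0 = pickle.loads(b'V\\u03c0\n.', encoding="bytes")

# Produced by Python 2 with pickle.dumps(u'\u03c0', protocol=1).
proto1 = pickle.loads(b'X\x02\x00\x00\x00\xcf\x80.', encoding="bytes")

# Both are str, not bytes: only 8-bit strings are affected by encoding="bytes".
print(type(proto0).__name__, type(proto1).__name__)
```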
...@@ -83,13 +83,17 @@ class PyPicklerUnpicklerObjectTests(AbstractPicklerUnpicklerObjectTests): ...@@ -83,13 +83,17 @@ class PyPicklerUnpicklerObjectTests(AbstractPicklerUnpicklerObjectTests):
class PyDispatchTableTests(AbstractDispatchTableTests): class PyDispatchTableTests(AbstractDispatchTableTests):
pickler_class = pickle._Pickler pickler_class = pickle._Pickler
def get_dispatch_table(self): def get_dispatch_table(self):
return pickle.dispatch_table.copy() return pickle.dispatch_table.copy()
class PyChainDispatchTableTests(AbstractDispatchTableTests): class PyChainDispatchTableTests(AbstractDispatchTableTests):
pickler_class = pickle._Pickler pickler_class = pickle._Pickler
def get_dispatch_table(self): def get_dispatch_table(self):
return collections.ChainMap({}, pickle.dispatch_table) return collections.ChainMap({}, pickle.dispatch_table)
......
...@@ -293,6 +293,7 @@ Kushal Das ...@@ -293,6 +293,7 @@ Kushal Das
Jonathan Dasteel Jonathan Dasteel
Pierre-Yves David Pierre-Yves David
A. Jesse Jiryu Davis A. Jesse Jiryu Davis
Merlijn van Deen
John DeGood John DeGood
Ned Deily Ned Deily
Vincent Delft Vincent Delft
......
...@@ -23,6 +23,10 @@ Library ...@@ -23,6 +23,10 @@ Library
- Issue #19296: Silence compiler warning in dbm_open - Issue #19296: Silence compiler warning in dbm_open
- Issue #6784: Strings from Python 2 can now be unpickled as bytes
objects by setting the encoding argument of Unpickler to be 'bytes'.
Initial patch by Merlijn van Deen.
- Issue #19839: Fix regression in bz2 module's handling of non-bzip2 data at - Issue #19839: Fix regression in bz2 module's handling of non-bzip2 data at
EOF, and analogous bug in lzma module. EOF, and analogous bug in lzma module.
......
...@@ -4016,48 +4016,44 @@ _pickle.Pickler.__init__ ...@@ -4016,48 +4016,44 @@ _pickle.Pickler.__init__
This takes a binary file for writing a pickle data stream. This takes a binary file for writing a pickle data stream.
The optional protocol argument tells the pickler to use the The optional *protocol* argument tells the pickler to use the given
given protocol; supported protocols are 0, 1, 2, 3 and 4. The protocol; supported protocols are 0, 1, 2, 3 and 4. The default
default protocol is 3; a backward-incompatible protocol designed for protocol is 3; a backward-incompatible protocol designed for Python 3.
Python 3.
Specifying a negative protocol version selects the highest Specifying a negative protocol version selects the highest protocol
protocol version supported. The higher the protocol used, the version supported. The higher the protocol used, the more recent the
more recent the version of Python needed to read the pickle version of Python needed to read the pickle produced.
produced.
The file argument must have a write() method that accepts a single The *file* argument must have a write() method that accepts a single
bytes argument. It can thus be a file object opened for binary bytes argument. It can thus be a file object opened for binary
writing, a io.BytesIO instance, or any other custom object that writing, a io.BytesIO instance, or any other custom object that meets
meets this interface. this interface.
If fix_imports is True and protocol is less than 3, pickle will try to If *fix_imports* is True and protocol is less than 3, pickle will try
map the new Python 3 names to the old module names used in Python 2, to map the new Python 3 names to the old module names used in Python
so that the pickle data stream is readable with Python 2. 2, so that the pickle data stream is readable with Python 2.
[clinic]*/ [clinic]*/
PyDoc_STRVAR(_pickle_Pickler___init____doc__, PyDoc_STRVAR(_pickle_Pickler___init____doc__,
"__init__(file, protocol=None, fix_imports=True)\n" "__init__(file, protocol=None, fix_imports=True)\n"
"This takes a binary file for writing a pickle data stream.\n" "This takes a binary file for writing a pickle data stream.\n"
"\n" "\n"
"The optional protocol argument tells the pickler to use the\n" "The optional *protocol* argument tells the pickler to use the given\n"
"given protocol; supported protocols are 0, 1, 2, 3 and 4. The\n" "protocol; supported protocols are 0, 1, 2, 3 and 4. The default\n"
"default protocol is 3; a backward-incompatible protocol designed for\n" "protocol is 3; a backward-incompatible protocol designed for Python 3.\n"
"Python 3.\n"
"\n" "\n"
"Specifying a negative protocol version selects the highest\n" "Specifying a negative protocol version selects the highest protocol\n"
"protocol version supported. The higher the protocol used, the\n" "version supported. The higher the protocol used, the more recent the\n"
"more recent the version of Python needed to read the pickle\n" "version of Python needed to read the pickle produced.\n"
"produced.\n"
"\n" "\n"
"The file argument must have a write() method that accepts a single\n" "The *file* argument must have a write() method that accepts a single\n"
"bytes argument. It can thus be a file object opened for binary\n" "bytes argument. It can thus be a file object opened for binary\n"
"writing, a io.BytesIO instance, or any other custom object that\n" "writing, a io.BytesIO instance, or any other custom object that meets\n"
"meets this interface.\n" "this interface.\n"
"\n" "\n"
"If fix_imports is True and protocol is less than 3, pickle will try to\n" "If *fix_imports* is True and protocol is less than 3, pickle will try\n"
"map the new Python 3 names to the old module names used in Python 2,\n" "to map the new Python 3 names to the old module names used in Python\n"
"so that the pickle data stream is readable with Python 2."); "2, so that the pickle data stream is readable with Python 2.");
#define _PICKLE_PICKLER___INIT___METHODDEF \ #define _PICKLE_PICKLER___INIT___METHODDEF \
{"__init__", (PyCFunction)_pickle_Pickler___init__, METH_VARARGS|METH_KEYWORDS, _pickle_Pickler___init____doc__}, {"__init__", (PyCFunction)_pickle_Pickler___init__, METH_VARARGS|METH_KEYWORDS, _pickle_Pickler___init____doc__},
...@@ -4086,7 +4082,7 @@ exit: ...@@ -4086,7 +4082,7 @@ exit:
static PyObject * static PyObject *
_pickle_Pickler___init___impl(PicklerObject *self, PyObject *file, PyObject *protocol, int fix_imports) _pickle_Pickler___init___impl(PicklerObject *self, PyObject *file, PyObject *protocol, int fix_imports)
/*[clinic checksum: c99ff417bd703a74affc4b708167e56e135e8969]*/ /*[clinic checksum: 2b5ce6452544600478cf9f4b701ab9d9b5efbab9]*/
{ {
_Py_IDENTIFIER(persistent_id); _Py_IDENTIFIER(persistent_id);
_Py_IDENTIFIER(dispatch_table); _Py_IDENTIFIER(dispatch_table);
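The Pickler docstring above describes two knobs, *protocol* and *fix_imports*. A minimal sketch of both, using only the public `pickle` API; the variable names are illustrative and the exact value of `HIGHEST_PROTOCOL` depends on the interpreter in use.

```python
import pickle

# A negative protocol selects pickle.HIGHEST_PROTOCOL.
newest = pickle.dumps([1, 2, 3], protocol=-1)
# For protocol >= 2 the stream starts with the PROTO opcode (0x80)
# followed by the protocol number.
header = newest[:2]

# With protocol < 3, fix_imports controls whether Python 3 module names
# are mapped back to the names Python 2 expects.
py2_friendly = pickle.dumps(len, protocol=2, fix_imports=True)
py3_names = pickle.dumps(len, protocol=2, fix_imports=False)

print(header, py2_friendly, py3_names)
```

Only the first of the two protocol-2 streams is expected to load under Python 2, since it refers to the `__builtin__` module rather than `builtins`.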
...@@ -4831,7 +4827,7 @@ static int ...@@ -4831,7 +4827,7 @@ static int
load_string(UnpicklerObject *self) load_string(UnpicklerObject *self)
{ {
PyObject *bytes; PyObject *bytes;
PyObject *str = NULL; PyObject *obj;
Py_ssize_t len; Py_ssize_t len;
char *s, *p; char *s, *p;
...@@ -4857,19 +4853,28 @@ load_string(UnpicklerObject *self) ...@@ -4857,19 +4853,28 @@ load_string(UnpicklerObject *self)
bytes = PyBytes_DecodeEscape(p, len, NULL, 0, NULL); bytes = PyBytes_DecodeEscape(p, len, NULL, 0, NULL);
if (bytes == NULL) if (bytes == NULL)
return -1; return -1;
str = PyUnicode_FromEncodedObject(bytes, self->encoding, self->errors);
/* Leave the Python 2.x strings as bytes if the *encoding* given to the
Unpickler was 'bytes'. Otherwise, convert them to unicode. */
if (strcmp(self->encoding, "bytes") == 0) {
obj = bytes;
}
else {
obj = PyUnicode_FromEncodedObject(bytes, self->encoding, self->errors);
Py_DECREF(bytes); Py_DECREF(bytes);
if (str == NULL) if (obj == NULL) {
return -1; return -1;
}
}
PDATA_PUSH(self->stack, str, -1); PDATA_PUSH(self->stack, obj, -1);
return 0; return 0;
} }
static int static int
load_counted_binbytes(UnpicklerObject *self, int nbytes) load_counted_binstring(UnpicklerObject *self, int nbytes)
{ {
PyObject *bytes; PyObject *obj;
Py_ssize_t size; Py_ssize_t size;
char *s; char *s;
...@@ -4878,8 +4883,9 @@ load_counted_binbytes(UnpicklerObject *self, int nbytes) ...@@ -4878,8 +4883,9 @@ load_counted_binbytes(UnpicklerObject *self, int nbytes)
size = calc_binsize(s, nbytes); size = calc_binsize(s, nbytes);
if (size < 0) { if (size < 0) {
PyErr_Format(PyExc_OverflowError, PickleState *st = _Pickle_GetGlobalState();
"BINBYTES exceeds system's maximum size of %zd bytes", PyErr_Format(st->UnpicklingError,
"BINSTRING exceeds system's maximum size of %zd bytes",
PY_SSIZE_T_MAX); PY_SSIZE_T_MAX);
return -1; return -1;
} }
...@@ -4887,18 +4893,26 @@ load_counted_binbytes(UnpicklerObject *self, int nbytes) ...@@ -4887,18 +4893,26 @@ load_counted_binbytes(UnpicklerObject *self, int nbytes)
if (_Unpickler_Read(self, &s, size) < 0) if (_Unpickler_Read(self, &s, size) < 0)
return -1; return -1;
bytes = PyBytes_FromStringAndSize(s, size); /* Convert Python 2.x strings to bytes if the *encoding* given to the
if (bytes == NULL) Unpickler was 'bytes'. Otherwise, convert them to unicode. */
if (strcmp(self->encoding, "bytes") == 0) {
obj = PyBytes_FromStringAndSize(s, size);
}
else {
obj = PyUnicode_Decode(s, size, self->encoding, self->errors);
}
if (obj == NULL) {
return -1; return -1;
}
PDATA_PUSH(self->stack, bytes, -1); PDATA_PUSH(self->stack, obj, -1);
return 0; return 0;
} }
static int static int
load_counted_binstring(UnpicklerObject *self, int nbytes) load_counted_binbytes(UnpicklerObject *self, int nbytes)
{ {
PyObject *str; PyObject *bytes;
Py_ssize_t size; Py_ssize_t size;
char *s; char *s;
...@@ -4907,21 +4921,20 @@ load_counted_binstring(UnpicklerObject *self, int nbytes) ...@@ -4907,21 +4921,20 @@ load_counted_binstring(UnpicklerObject *self, int nbytes)
size = calc_binsize(s, nbytes); size = calc_binsize(s, nbytes);
if (size < 0) { if (size < 0) {
PickleState *st = _Pickle_GetGlobalState(); PyErr_Format(PyExc_OverflowError,
PyErr_Format(st->UnpicklingError, "BINBYTES exceeds system's maximum size of %zd bytes",
"BINSTRING exceeds system's maximum size of %zd bytes",
PY_SSIZE_T_MAX); PY_SSIZE_T_MAX);
return -1; return -1;
} }
if (_Unpickler_Read(self, &s, size) < 0) if (_Unpickler_Read(self, &s, size) < 0)
return -1; return -1;
/* Convert Python 2.x strings to unicode. */
str = PyUnicode_Decode(s, size, self->encoding, self->errors); bytes = PyBytes_FromStringAndSize(s, size);
if (str == NULL) if (bytes == NULL)
return -1; return -1;
PDATA_PUSH(self->stack, str, -1); PDATA_PUSH(self->stack, bytes, -1);
return 0; return 0;
} }
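The hunks above make the *encoding* check apply to the BINSTRING family of opcodes, while BINBYTES continues to produce bytes unconditionally. A rough sketch of the observable difference follows; `py2_str` is a hand-written BINSTRING payload (opcode 'T' plus a 4-byte little-endian length), assumed to match what Python 2 would write for a str.

```python
import pickle

# Python 3 bytes pickled with protocol 3 use a BINBYTES-family opcode.
py3_bytes = pickle.dumps(b"\xc3\xa9", protocol=3)

# Hand-written BINSTRING: 'T', 4-byte little-endian length, payload, STOP.
py2_str = b"T\x02\x00\x00\x00\xc3\xa9."

# BINBYTES ignores the encoding argument and always yields bytes.
bytes_any_encoding = pickle.loads(py3_bytes, encoding="utf-8")

# BINSTRING is decoded to str, unless encoding='bytes' is given.
str_decoded = pickle.loads(py2_str, encoding="utf-8")
str_as_bytes = pickle.loads(py2_str, encoding="bytes")

print(repr(bytes_any_encoding), repr(str_decoded), repr(str_as_bytes))
```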
...@@ -6258,25 +6271,25 @@ _pickle.Unpickler.load ...@@ -6258,25 +6271,25 @@ _pickle.Unpickler.load
Load a pickle. Load a pickle.
Read a pickled object representation from the open file object given in Read a pickled object representation from the open file object given
the constructor, and return the reconstituted object hierarchy specified in the constructor, and return the reconstituted object hierarchy
therein. specified therein.
[clinic]*/ [clinic]*/
PyDoc_STRVAR(_pickle_Unpickler_load__doc__, PyDoc_STRVAR(_pickle_Unpickler_load__doc__,
"load()\n" "load()\n"
"Load a pickle.\n" "Load a pickle.\n"
"\n" "\n"
"Read a pickled object representation from the open file object given in\n" "Read a pickled object representation from the open file object given\n"
"the constructor, and return the reconstituted object hierarchy specified\n" "in the constructor, and return the reconstituted object hierarchy\n"
"therein."); "specified therein.");
#define _PICKLE_UNPICKLER_LOAD_METHODDEF \ #define _PICKLE_UNPICKLER_LOAD_METHODDEF \
{"load", (PyCFunction)_pickle_Unpickler_load, METH_NOARGS, _pickle_Unpickler_load__doc__}, {"load", (PyCFunction)_pickle_Unpickler_load, METH_NOARGS, _pickle_Unpickler_load__doc__},
static PyObject * static PyObject *
_pickle_Unpickler_load(PyObject *self) _pickle_Unpickler_load(PyObject *self)
/*[clinic checksum: 9a30ba4e4d9221d4dcd705e1471ab11b2c9e3ac6]*/ /*[clinic checksum: c2ae1263f0dd000f34ccf0fe59d7c544464babc4]*/
{ {
UnpicklerObject *unpickler = (UnpicklerObject*)self; UnpicklerObject *unpickler = (UnpicklerObject*)self;
...@@ -6310,8 +6323,9 @@ _pickle.Unpickler.find_class ...@@ -6310,8 +6323,9 @@ _pickle.Unpickler.find_class
Return an object from a specified module. Return an object from a specified module.
If necessary, the module will be imported. Subclasses may override this If necessary, the module will be imported. Subclasses may override
method (e.g. to restrict unpickling of arbitrary classes and functions). this method (e.g. to restrict unpickling of arbitrary classes and
functions).
This method is called whenever a class or a function object is This method is called whenever a class or a function object is
needed. Both arguments passed are str objects. needed. Both arguments passed are str objects.
...@@ -6321,8 +6335,9 @@ PyDoc_STRVAR(_pickle_Unpickler_find_class__doc__, ...@@ -6321,8 +6335,9 @@ PyDoc_STRVAR(_pickle_Unpickler_find_class__doc__,
"find_class(module_name, global_name)\n" "find_class(module_name, global_name)\n"
"Return an object from a specified module.\n" "Return an object from a specified module.\n"
"\n" "\n"
"If necessary, the module will be imported. Subclasses may override this\n" "If necessary, the module will be imported. Subclasses may override\n"
"method (e.g. to restrict unpickling of arbitrary classes and functions).\n" "this method (e.g. to restrict unpickling of arbitrary classes and\n"
"functions).\n"
"\n" "\n"
"This method is called whenever a class or a function object is\n" "This method is called whenever a class or a function object is\n"
"needed. Both arguments passed are str objects."); "needed. Both arguments passed are str objects.");
...@@ -6352,7 +6367,7 @@ exit: ...@@ -6352,7 +6367,7 @@ exit:
static PyObject * static PyObject *
_pickle_Unpickler_find_class_impl(UnpicklerObject *self, PyObject *module_name, PyObject *global_name) _pickle_Unpickler_find_class_impl(UnpicklerObject *self, PyObject *module_name, PyObject *global_name)
/*[clinic checksum: b7d05d4dd8adc698e5780c1ac2be0f5062d33915]*/ /*[clinic checksum: 1f353d13a32c9d94feb1466b3c2d0529a7e5650e]*/
{ {
PyObject *global; PyObject *global;
PyObject *modules_dict; PyObject *modules_dict;
...@@ -6515,23 +6530,23 @@ _pickle.Unpickler.__init__ ...@@ -6515,23 +6530,23 @@ _pickle.Unpickler.__init__
This takes a binary file for reading a pickle data stream. This takes a binary file for reading a pickle data stream.
The protocol version of the pickle is detected automatically, so no The protocol version of the pickle is detected automatically, so no
proto argument is needed. protocol argument is needed. Bytes past the pickled object's
representation are ignored.
The file-like object must have two methods, a read() method The argument *file* must have two methods, a read() method that takes
that takes an integer argument, and a readline() method that an integer argument, and a readline() method that requires no
requires no arguments. Both methods should return bytes. arguments. Both methods should return bytes. Thus *file* can be a
Thus file-like object can be a binary file object opened for binary file object opened for reading, a io.BytesIO object, or any
reading, a BytesIO object, or any other custom object that other custom object that meets this interface.
meets this interface.
Optional keyword arguments are *fix_imports*, *encoding* and *errors*, Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
which are used to control compatiblity support for pickle stream which are used to control compatiblity support for pickle stream
generated by Python 2.x. If *fix_imports* is True, pickle will try to generated by Python 2. If *fix_imports* is True, pickle will try to
map the old Python 2.x names to the new names used in Python 3.x. The map the old Python 2 names to the new names used in Python 3. The
*encoding* and *errors* tell pickle how to decode 8-bit string *encoding* and *errors* tell pickle how to decode 8-bit string
instances pickled by Python 2.x; these default to 'ASCII' and instances pickled by Python 2; these default to 'ASCII' and 'strict',
'strict', respectively. respectively. The *encoding* can be 'bytes' to read these 8-bit
string instances as bytes objects.
[clinic]*/ [clinic]*/
PyDoc_STRVAR(_pickle_Unpickler___init____doc__, PyDoc_STRVAR(_pickle_Unpickler___init____doc__,
...@@ -6539,22 +6554,23 @@ PyDoc_STRVAR(_pickle_Unpickler___init____doc__, ...@@ -6539,22 +6554,23 @@ PyDoc_STRVAR(_pickle_Unpickler___init____doc__,
"This takes a binary file for reading a pickle data stream.\n" "This takes a binary file for reading a pickle data stream.\n"
"\n" "\n"
"The protocol version of the pickle is detected automatically, so no\n" "The protocol version of the pickle is detected automatically, so no\n"
"proto argument is needed.\n" "protocol argument is needed. Bytes past the pickled object\'s\n"
"representation are ignored.\n"
"\n" "\n"
"The file-like object must have two methods, a read() method\n" "The argument *file* must have two methods, a read() method that takes\n"
"that takes an integer argument, and a readline() method that\n" "an integer argument, and a readline() method that requires no\n"
"requires no arguments. Both methods should return bytes.\n" "arguments. Both methods should return bytes. Thus *file* can be a\n"
"Thus file-like object can be a binary file object opened for\n" "binary file object opened for reading, a io.BytesIO object, or any\n"
"reading, a BytesIO object, or any other custom object that\n" "other custom object that meets this interface.\n"
"meets this interface.\n"
"\n" "\n"
"Optional keyword arguments are *fix_imports*, *encoding* and *errors*,\n" "Optional keyword arguments are *fix_imports*, *encoding* and *errors*,\n"
"which are used to control compatiblity support for pickle stream\n" "which are used to control compatiblity support for pickle stream\n"
"generated by Python 2.x. If *fix_imports* is True, pickle will try to\n" "generated by Python 2. If *fix_imports* is True, pickle will try to\n"
"map the old Python 2.x names to the new names used in Python 3.x. The\n" "map the old Python 2 names to the new names used in Python 3. The\n"
"*encoding* and *errors* tell pickle how to decode 8-bit string\n" "*encoding* and *errors* tell pickle how to decode 8-bit string\n"
"instances pickled by Python 2.x; these default to \'ASCII\' and\n" "instances pickled by Python 2; these default to \'ASCII\' and \'strict\',\n"
"\'strict\', respectively."); "respectively. The *encoding* can be \'bytes\' to read these 8-bit\n"
"string instances as bytes objects.");
#define _PICKLE_UNPICKLER___INIT___METHODDEF \ #define _PICKLE_UNPICKLER___INIT___METHODDEF \
{"__init__", (PyCFunction)_pickle_Unpickler___init__, METH_VARARGS|METH_KEYWORDS, _pickle_Unpickler___init____doc__}, {"__init__", (PyCFunction)_pickle_Unpickler___init__, METH_VARARGS|METH_KEYWORDS, _pickle_Unpickler___init____doc__},
...@@ -6584,7 +6600,7 @@ exit: ...@@ -6584,7 +6600,7 @@ exit:
static PyObject * static PyObject *
_pickle_Unpickler___init___impl(UnpicklerObject *self, PyObject *file, int fix_imports, const char *encoding, const char *errors) _pickle_Unpickler___init___impl(UnpicklerObject *self, PyObject *file, int fix_imports, const char *encoding, const char *errors)
/*[clinic checksum: bed0d8bbe1c647960ccc6f997b33bf33935fa56f]*/ /*[clinic checksum: 9ce6783224e220573d42a94fe1bb7199d6f1c5a6]*/
{ {
_Py_IDENTIFIER(persistent_load); _Py_IDENTIFIER(persistent_load);
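The Unpickler docstring states that *file* only needs read() and readline() methods returning bytes, and that bytes past the pickled object are ignored. A small sketch of a custom object meeting that interface; `ByteSource` is a hypothetical class invented for this illustration.

```python
import io
import pickle


class ByteSource:
    """Hypothetical minimal reader exposing only read() and readline()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, n):
        # Must accept an integer argument and return bytes.
        return self._buf.read(n)

    def readline(self):
        # Must take no arguments and return bytes.
        return self._buf.readline()


# Extra bytes after the pickle's STOP opcode are not consumed as data.
payload = pickle.dumps({"a": 1}) + b"trailing bytes"
restored = pickle.Unpickler(ByteSource(payload)).load()

print(restored)
```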
...@@ -7033,48 +7049,50 @@ _pickle.dump ...@@ -7033,48 +7049,50 @@ _pickle.dump
Write a pickled representation of obj to the open file object file. Write a pickled representation of obj to the open file object file.
This is equivalent to ``Pickler(file, protocol).dump(obj)``, but may be more This is equivalent to ``Pickler(file, protocol).dump(obj)``, but may
efficient. be more efficient.
The optional protocol argument tells the pickler to use the given protocol The optional *protocol* argument tells the pickler to use the given
supported protocols are 0, 1, 2, 3. The default protocol is 3; a protocol supported protocols are 0, 1, 2, 3 and 4. The default
backward-incompatible protocol designed for Python 3.0. protocol is 3; a backward-incompatible protocol designed for Python 3.
Specifying a negative protocol version selects the highest protocol version Specifying a negative protocol version selects the highest protocol
supported. The higher the protocol used, the more recent the version of version supported. The higher the protocol used, the more recent the
Python needed to read the pickle produced. version of Python needed to read the pickle produced.
The file argument must have a write() method that accepts a single bytes The *file* argument must have a write() method that accepts a single
argument. It can thus be a file object opened for binary writing, a bytes argument. It can thus be a file object opened for binary
io.BytesIO instance, or any other custom object that meets this interface. writing, a io.BytesIO instance, or any other custom object that meets
this interface.
If fix_imports is True and protocol is less than 3, pickle will try to If *fix_imports* is True and protocol is less than 3, pickle will try
map the new Python 3.x names to the old module names used in Python 2.x, to map the new Python 3 names to the old module names used in Python
so that the pickle data stream is readable with Python 2.x. 2, so that the pickle data stream is readable with Python 2.
[clinic]*/ [clinic]*/
PyDoc_STRVAR(_pickle_dump__doc__, PyDoc_STRVAR(_pickle_dump__doc__,
"dump(obj, file, protocol=None, *, fix_imports=True)\n" "dump(obj, file, protocol=None, *, fix_imports=True)\n"
"Write a pickled representation of obj to the open file object file.\n" "Write a pickled representation of obj to the open file object file.\n"
"\n" "\n"
"This is equivalent to ``Pickler(file, protocol).dump(obj)``, but may be more\n" "This is equivalent to ``Pickler(file, protocol).dump(obj)``, but may\n"
"efficient.\n" "be more efficient.\n"
"\n" "\n"
"The optional protocol argument tells the pickler to use the given protocol\n" "The optional *protocol* argument tells the pickler to use the given\n"
"supported protocols are 0, 1, 2, 3. The default protocol is 3; a\n" "protocol supported protocols are 0, 1, 2, 3 and 4. The default\n"
"backward-incompatible protocol designed for Python 3.0.\n" "protocol is 3; a backward-incompatible protocol designed for Python 3.\n"
"\n" "\n"
"Specifying a negative protocol version selects the highest protocol version\n" "Specifying a negative protocol version selects the highest protocol\n"
"supported. The higher the protocol used, the more recent the version of\n" "version supported. The higher the protocol used, the more recent the\n"
"Python needed to read the pickle produced.\n" "version of Python needed to read the pickle produced.\n"
"\n" "\n"
"The file argument must have a write() method that accepts a single bytes\n" "The *file* argument must have a write() method that accepts a single\n"
"argument. It can thus be a file object opened for binary writing, a\n" "bytes argument. It can thus be a file object opened for binary\n"
"io.BytesIO instance, or any other custom object that meets this interface.\n" "writing, a io.BytesIO instance, or any other custom object that meets\n"
"this interface.\n"
"\n" "\n"
"If fix_imports is True and protocol is less than 3, pickle will try to\n" "If *fix_imports* is True and protocol is less than 3, pickle will try\n"
"map the new Python 3.x names to the old module names used in Python 2.x,\n" "to map the new Python 3 names to the old module names used in Python\n"
"so that the pickle data stream is readable with Python 2.x."); "2, so that the pickle data stream is readable with Python 2.");
#define _PICKLE_DUMP_METHODDEF \ #define _PICKLE_DUMP_METHODDEF \
{"dump", (PyCFunction)_pickle_dump, METH_VARARGS|METH_KEYWORDS, _pickle_dump__doc__}, {"dump", (PyCFunction)_pickle_dump, METH_VARARGS|METH_KEYWORDS, _pickle_dump__doc__},
...@@ -7104,7 +7122,7 @@ exit: ...@@ -7104,7 +7122,7 @@ exit:
static PyObject * static PyObject *
_pickle_dump_impl(PyModuleDef *module, PyObject *obj, PyObject *file, PyObject *protocol, int fix_imports) _pickle_dump_impl(PyModuleDef *module, PyObject *obj, PyObject *file, PyObject *protocol, int fix_imports)
/*[clinic checksum: e442721b16052d921b5e3fbd146d0a62e94a459e]*/ /*[clinic checksum: eb5c23e64da34477178230b704d2cc9c6b6650ea]*/
{ {
PicklerObject *pickler = _Pickler_New(); PicklerObject *pickler = _Pickler_New();
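The dump() docstring says the call is equivalent to ``Pickler(file, protocol).dump(obj)``. A quick sketch checking that claim against in-memory files; protocol 2 is chosen arbitrarily and the variable names are illustrative.

```python
import io
import pickle

obj = {"k": [1, 2, 3]}

# Module-level convenience function.
via_function = io.BytesIO()
pickle.dump(obj, via_function, 2)

# Explicit Pickler instance with the same file-like target and protocol.
via_class = io.BytesIO()
pickle.Pickler(via_class, 2).dump(obj)

same_stream = via_function.getvalue() == via_class.getvalue()
round_trip = pickle.loads(via_function.getvalue())

print(same_stream, round_trip)
```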
...@@ -7142,34 +7160,34 @@ _pickle.dumps ...@@ -7142,34 +7160,34 @@ _pickle.dumps
Return the pickled representation of the object as a bytes object. Return the pickled representation of the object as a bytes object.
The optional protocol argument tells the pickler to use the given protocol; The optional *protocol* argument tells the pickler to use the given
supported protocols are 0, 1, 2, 3. The default protocol is 3; a protocol; supported protocols are 0, 1, 2, 3 and 4. The default
backward-incompatible protocol designed for Python 3.0. protocol is 3; a backward-incompatible protocol designed for Python 3.
Specifying a negative protocol version selects the highest protocol version Specifying a negative protocol version selects the highest protocol
supported. The higher the protocol used, the more recent the version of version supported. The higher the protocol used, the more recent the
Python needed to read the pickle produced. version of Python needed to read the pickle produced.
If fix_imports is True and *protocol* is less than 3, pickle will try to If *fix_imports* is True and *protocol* is less than 3, pickle will
map the new Python 3.x names to the old module names used in Python 2.x, try to map the new Python 3 names to the old module names used in
so that the pickle data stream is readable with Python 2.x. Python 2, so that the pickle data stream is readable with Python 2.
[clinic]*/ [clinic]*/
PyDoc_STRVAR(_pickle_dumps__doc__, PyDoc_STRVAR(_pickle_dumps__doc__,
"dumps(obj, protocol=None, *, fix_imports=True)\n" "dumps(obj, protocol=None, *, fix_imports=True)\n"
"Return the pickled representation of the object as a bytes object.\n" "Return the pickled representation of the object as a bytes object.\n"
"\n" "\n"
"The optional protocol argument tells the pickler to use the given protocol;\n" "The optional *protocol* argument tells the pickler to use the given\n"
"supported protocols are 0, 1, 2, 3. The default protocol is 3; a\n" "protocol; supported protocols are 0, 1, 2, 3 and 4. The default\n"
"backward-incompatible protocol designed for Python 3.0.\n" "protocol is 3; a backward-incompatible protocol designed for Python 3.\n"
"\n" "\n"
"Specifying a negative protocol version selects the highest protocol version\n" "Specifying a negative protocol version selects the highest protocol\n"
"supported. The higher the protocol used, the more recent the version of\n" "version supported. The higher the protocol used, the more recent the\n"
"Python needed to read the pickle produced.\n" "version of Python needed to read the pickle produced.\n"
"\n" "\n"
"If fix_imports is True and *protocol* is less than 3, pickle will try to\n" "If *fix_imports* is True and *protocol* is less than 3, pickle will\n"
"map the new Python 3.x names to the old module names used in Python 2.x,\n" "try to map the new Python 3 names to the old module names used in\n"
"so that the pickle data stream is readable with Python 2.x."); "Python 2, so that the pickle data stream is readable with Python 2.");
#define _PICKLE_DUMPS_METHODDEF \ #define _PICKLE_DUMPS_METHODDEF \
{"dumps", (PyCFunction)_pickle_dumps, METH_VARARGS|METH_KEYWORDS, _pickle_dumps__doc__}, {"dumps", (PyCFunction)_pickle_dumps, METH_VARARGS|METH_KEYWORDS, _pickle_dumps__doc__},
...@@ -7198,7 +7216,7 @@ exit: ...@@ -7198,7 +7216,7 @@ exit:
static PyObject * static PyObject *
_pickle_dumps_impl(PyModuleDef *module, PyObject *obj, PyObject *protocol, int fix_imports) _pickle_dumps_impl(PyModuleDef *module, PyObject *obj, PyObject *protocol, int fix_imports)
/*[clinic checksum: df6262c4c487f537f47aec8a1709318204c1e174]*/ /*[clinic checksum: e9b915d61202a9692cb6c6718db74fe54fc9c4d1]*/
{ {
PyObject *result; PyObject *result;
PicklerObject *pickler = _Pickler_New(); PicklerObject *pickler = _Pickler_New();
...@@ -7231,50 +7249,56 @@ _pickle.load ...@@ -7231,50 +7249,56 @@ _pickle.load
encoding: str = 'ASCII' encoding: str = 'ASCII'
errors: str = 'strict' errors: str = 'strict'
Return a reconstituted object from the pickle data stored in a file. Read and return an object from the pickle data stored in a file.
This is equivalent to ``Unpickler(file).load()``, but may be more efficient. This is equivalent to ``Unpickler(file).load()``, but may be more
efficient.
The protocol version of the pickle is detected automatically, so no protocol The protocol version of the pickle is detected automatically, so no
argument is needed. Bytes past the pickled object's representation are protocol argument is needed. Bytes past the pickled object's
ignored. representation are ignored.
The argument file must have two methods, a read() method that takes an The argument *file* must have two methods, a read() method that takes
integer argument, and a readline() method that requires no arguments. Both an integer argument, and a readline() method that requires no
methods should return bytes. Thus *file* can be a binary file object opened arguments. Both methods should return bytes. Thus *file* can be a
for reading, a BytesIO object, or any other custom object that meets this binary file object opened for reading, a io.BytesIO object, or any
interface. other custom object that meets this interface.
Optional keyword arguments are fix_imports, encoding and errors, Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
which are used to control compatiblity support for pickle stream generated which are used to control compatiblity support for pickle stream
by Python 2.x. If fix_imports is True, pickle will try to map the old generated by Python 2. If *fix_imports* is True, pickle will try to
Python 2.x names to the new names used in Python 3.x. The encoding and map the old Python 2 names to the new names used in Python 3. The
errors tell pickle how to decode 8-bit string instances pickled by Python *encoding* and *errors* tell pickle how to decode 8-bit string
2.x; these default to 'ASCII' and 'strict', respectively. instances pickled by Python 2; these default to 'ASCII' and 'strict',
respectively. The *encoding* can be 'bytes' to read these 8-bit
string instances as bytes objects.
[clinic]*/ [clinic]*/
PyDoc_STRVAR(_pickle_load__doc__, PyDoc_STRVAR(_pickle_load__doc__,
"load(file, *, fix_imports=True, encoding=\'ASCII\', errors=\'strict\')\n" "load(file, *, fix_imports=True, encoding=\'ASCII\', errors=\'strict\')\n"
"Return a reconstituted object from the pickle data stored in a file.\n" "Read and return an object from the pickle data stored in a file.\n"
"\n" "\n"
"This is equivalent to ``Unpickler(file).load()``, but may be more efficient.\n" "This is equivalent to ``Unpickler(file).load()``, but may be more\n"
"efficient.\n"
"\n" "\n"
"The protocol version of the pickle is detected automatically, so no protocol\n" "The protocol version of the pickle is detected automatically, so no\n"
"argument is needed. Bytes past the pickled object\'s representation are\n" "protocol argument is needed. Bytes past the pickled object\'s\n"
"ignored.\n" "representation are ignored.\n"
"\n" "\n"
"The argument file must have two methods, a read() method that takes an\n" "The argument *file* must have two methods, a read() method that takes\n"
"integer argument, and a readline() method that requires no arguments. Both\n" "an integer argument, and a readline() method that requires no\n"
"methods should return bytes. Thus *file* can be a binary file object opened\n" "arguments. Both methods should return bytes. Thus *file* can be a\n"
"for reading, a BytesIO object, or any other custom object that meets this\n" "binary file object opened for reading, a io.BytesIO object, or any\n"
"interface.\n" "other custom object that meets this interface.\n"
"\n" "\n"
"Optional keyword arguments are fix_imports, encoding and errors,\n" "Optional keyword arguments are *fix_imports*, *encoding* and *errors*,\n"
"which are used to control compatiblity support for pickle stream generated\n" "which are used to control compatiblity support for pickle stream\n"
"by Python 2.x. If fix_imports is True, pickle will try to map the old\n" "generated by Python 2. If *fix_imports* is True, pickle will try to\n"
"Python 2.x names to the new names used in Python 3.x. The encoding and\n" "map the old Python 2 names to the new names used in Python 3. The\n"
"errors tell pickle how to decode 8-bit string instances pickled by Python\n" "*encoding* and *errors* tell pickle how to decode 8-bit string\n"
"2.x; these default to \'ASCII\' and \'strict\', respectively."); "instances pickled by Python 2; these default to \'ASCII\' and \'strict\',\n"
"respectively. The *encoding* can be \'bytes\' to read these 8-bit\n"
"string instances as bytes objects.");
#define _PICKLE_LOAD_METHODDEF \ #define _PICKLE_LOAD_METHODDEF \
{"load", (PyCFunction)_pickle_load, METH_VARARGS|METH_KEYWORDS, _pickle_load__doc__}, {"load", (PyCFunction)_pickle_load, METH_VARARGS|METH_KEYWORDS, _pickle_load__doc__},
...@@ -7304,7 +7328,7 @@ exit: ...@@ -7304,7 +7328,7 @@ exit:
static PyObject * static PyObject *
_pickle_load_impl(PyModuleDef *module, PyObject *file, int fix_imports, const char *encoding, const char *errors) _pickle_load_impl(PyModuleDef *module, PyObject *file, int fix_imports, const char *encoding, const char *errors)
/*[clinic checksum: e10796f6765b22ce48dca6940f11b3933853ca35]*/ /*[clinic checksum: b41f06970e57acf2fd602e4b7f88e3f3e1e53087]*/
{ {
PyObject *result; PyObject *result;
UnpicklerObject *unpickler = _Unpickler_New(); UnpicklerObject *unpickler = _Unpickler_New();
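The load() docstring notes that *encoding* and *errors* govern how 8-bit strings from Python 2 are decoded. The following sketch compares a few settings on a hand-written SHORT_BINSTRING payload (`PY2_UTF8`, an assumed stand-in for a Python 2 str holding the UTF-8 bytes of 'é').

```python
import pickle

# SHORT_BINSTRING ('U' + 1-byte length) holding the UTF-8 bytes of 'é'.
PY2_UTF8 = b"U\x02\xc3\xa9q\x00."

# The defaults ('ASCII', 'strict') reject non-ASCII payloads.
try:
    pickle.loads(PY2_UTF8)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

as_utf8 = pickle.loads(PY2_UTF8, encoding="utf-8")      # correct text
as_latin1 = pickle.loads(PY2_UTF8, encoding="latin-1")  # loads, but mojibake
replaced = pickle.loads(PY2_UTF8, encoding="ASCII", errors="replace")

print(default_failed, repr(as_utf8), repr(as_latin1), repr(replaced))
```

When the original encoding is unknown, loading with encoding='bytes' and decoding later avoids committing to a guess at unpickling time.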
...@@ -7339,34 +7363,38 @@ _pickle.loads ...@@ -7339,34 +7363,38 @@ _pickle.loads
encoding: str = 'ASCII' encoding: str = 'ASCII'
errors: str = 'strict' errors: str = 'strict'
Return a reconstituted object from the given pickle data. Read and return an object from the given pickle data.
The protocol version of the pickle is detected automatically, so no protocol The protocol version of the pickle is detected automatically, so no
argument is needed. Bytes past the pickled object's representation are protocol argument is needed. Bytes past the pickled object's
ignored. representation are ignored.
Optional keyword arguments are fix_imports, encoding and errors, which Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
are used to control compatiblity support for pickle stream generated which are used to control compatiblity support for pickle stream
by Python 2.x. If fix_imports is True, pickle will try to map the old generated by Python 2. If *fix_imports* is True, pickle will try to
Python 2.x names to the new names used in Python 3.x. The encoding and map the old Python 2 names to the new names used in Python 3. The
errors tell pickle how to decode 8-bit string instances pickled by Python *encoding* and *errors* tell pickle how to decode 8-bit string
2.x; these default to 'ASCII' and 'strict', respectively. instances pickled by Python 2; these default to 'ASCII' and 'strict',
respectively. The *encoding* can be 'bytes' to read these 8-bit
string instances as bytes objects.
[clinic]*/ [clinic]*/
PyDoc_STRVAR(_pickle_loads__doc__, PyDoc_STRVAR(_pickle_loads__doc__,
"loads(data, *, fix_imports=True, encoding=\'ASCII\', errors=\'strict\')\n" "loads(data, *, fix_imports=True, encoding=\'ASCII\', errors=\'strict\')\n"
"Return a reconstituted object from the given pickle data.\n" "Read and return an object from the given pickle data.\n"
"\n" "\n"
"The protocol version of the pickle is detected automatically, so no protocol\n" "The protocol version of the pickle is detected automatically, so no\n"
"argument is needed. Bytes past the pickled object\'s representation are\n" "protocol argument is needed. Bytes past the pickled object\'s\n"
"ignored.\n" "representation are ignored.\n"
"\n" "\n"
"Optional keyword arguments are fix_imports, encoding and errors, which\n" "Optional keyword arguments are *fix_imports*, *encoding* and *errors*,\n"
"are used to control compatiblity support for pickle stream generated\n" "which are used to control compatiblity support for pickle stream\n"
"by Python 2.x. If fix_imports is True, pickle will try to map the old\n" "generated by Python 2. If *fix_imports* is True, pickle will try to\n"
"Python 2.x names to the new names used in Python 3.x. The encoding and\n" "map the old Python 2 names to the new names used in Python 3. The\n"
"errors tell pickle how to decode 8-bit string instances pickled by Python\n" "*encoding* and *errors* tell pickle how to decode 8-bit string\n"
"2.x; these default to \'ASCII\' and \'strict\', respectively."); "instances pickled by Python 2; these default to \'ASCII\' and \'strict\',\n"
"respectively. The *encoding* can be \'bytes\' to read these 8-bit\n"
"string instances as bytes objects.");
#define _PICKLE_LOADS_METHODDEF \ #define _PICKLE_LOADS_METHODDEF \
{"loads", (PyCFunction)_pickle_loads, METH_VARARGS|METH_KEYWORDS, _pickle_loads__doc__}, {"loads", (PyCFunction)_pickle_loads, METH_VARARGS|METH_KEYWORDS, _pickle_loads__doc__},
...@@ -7396,7 +7424,7 @@ exit: ...@@ -7396,7 +7424,7 @@ exit:
static PyObject * static PyObject *
_pickle_loads_impl(PyModuleDef *module, PyObject *data, int fix_imports, const char *encoding, const char *errors) _pickle_loads_impl(PyModuleDef *module, PyObject *data, int fix_imports, const char *encoding, const char *errors)
/*[clinic checksum: 29ee725efcbf51a3533c19cb8261a8e267b7080a]*/ /*[clinic checksum: 0663de43aca6c21508a777e29d98c9c3a6e7f72d]*/
{ {
PyObject *result; PyObject *result;
UnpicklerObject *unpickler = _Unpickler_New(); UnpicklerObject *unpickler = _Unpickler_New();
......
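For readers of this diff, here is a rough sketch of what the new `encoding='bytes'` option described in the `load`/`loads` docstrings looks like from Python code. It is an illustration only, not part of the commit. The byte string `PY2_PICKLE` is hand-written and assumed to match what Python 2 would emit for an 8-bit `str` under protocol 2, and the helper name `load_py2_str` is hypothetical. It needs a Python 3 version that includes this change.

```python
import pickle

# Hand-written stand-in (an assumption, not captured from a real Python 2 run)
# for pickle.dumps('caf\xe9', 2) on Python 2:
# PROTO 2, SHORT_BINSTRING of length 4, BINPUT 0, STOP.
PY2_PICKLE = b"\x80\x02U\x04caf\xe9q\x00."


def load_py2_str(data, encoding="ASCII", errors="strict"):
    """Hypothetical helper: forward the compatibility keywords to pickle.loads."""
    return pickle.loads(data, fix_imports=True, encoding=encoding, errors=errors)


# encoding='bytes' returns the 8-bit string unchanged as a bytes object.
as_bytes = load_py2_str(PY2_PICKLE, encoding="bytes")

# A real codec name decodes the 8-bit string to str instead.
as_text = load_py2_str(PY2_PICKLE, encoding="latin-1")

# The defaults ('ASCII', 'strict') reject the non-ASCII byte 0xE9.
try:
    load_py2_str(PY2_PICKLE)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True
```

The `'bytes'` value is mainly useful when the Python 2 strings held binary data rather than text, since no single codec would decode them meaningfully.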