Commit d4d60554 authored by Antoine Pitrou

Issue #19900: improve generalities at the start of the pickle module doc

parent b1a92a4c
@@ -15,13 +15,14 @@
 .. sectionauthor:: Barry Warsaw <barry@python.org>
-The :mod:`pickle` module implements a fundamental, but powerful algorithm for
-serializing and de-serializing a Python object structure. "Pickling" is the
-process whereby a Python object hierarchy is converted into a byte stream, and
-"unpickling" is the inverse operation, whereby a byte stream is converted back
-into an object hierarchy. Pickling (and unpickling) is alternatively known as
-"serialization", "marshalling," [#]_ or "flattening", however, to avoid
-confusion, the terms used here are "pickling" and "unpickling"..
+The :mod:`pickle` module implements binary protocols for serializing and
+de-serializing a Python object structure. *"Pickling"* is the process
+whereby a Python object hierarchy is converted into a byte stream, and
+*"unpickling"* is the inverse operation, whereby a byte stream
+(from a :term:`binary file` or :term:`bytes-like object`) is converted
+back into an object hierarchy. Pickling (and unpickling) is alternatively
+known as "serialization", "marshalling," [#]_ or "flattening"; however, to
+avoid confusion, the terms used here are "pickling" and "unpickling".
 .. warning::
@@ -33,9 +34,8 @@ confusion, the terms used here are "pickling" and "unpickling"..
 Relationship to other Python modules
 ------------------------------------
-The :mod:`pickle` module has an transparent optimizer (:mod:`_pickle`) written
-in C. It is used whenever available. Otherwise the pure Python implementation is
-used.
+Comparison with ``marshal``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Python has a more primitive serialization module called :mod:`marshal`, but in
 general :mod:`pickle` should always be the preferred way to serialize Python
@@ -69,17 +69,30 @@ The :mod:`pickle` module differs from :mod:`marshal` in several significant ways
 The :mod:`pickle` serialization format is guaranteed to be backwards compatible
 across Python releases.
-Note that serialization is a more primitive notion than persistence; although
-:mod:`pickle` reads and writes file objects, it does not handle the issue of
-naming persistent objects, nor the (even more complicated) issue of concurrent
-access to persistent objects. The :mod:`pickle` module can transform a complex
-object into a byte stream and it can transform the byte stream into an object
-with the same internal structure. Perhaps the most obvious thing to do with
-these byte streams is to write them onto a file, but it is also conceivable to
-send them across a network or store them in a database. The module
-:mod:`shelve` provides a simple interface to pickle and unpickle objects on
-DBM-style database files.
+Comparison with ``json``
+^^^^^^^^^^^^^^^^^^^^^^^^
+There are fundamental differences between the pickle protocols and
+`JSON (JavaScript Object Notation) <http://json.org>`_:
+* JSON is a text serialization format (it outputs unicode text, although
+  most of the time it is then encoded to ``utf-8``), while pickle is
+  a binary serialization format;
+* JSON is human-readable, while pickle is not;
+* JSON is interoperable and widely used outside of the Python ecosystem,
+  while pickle is Python-specific;
+* JSON, by default, can only represent a subset of the Python built-in
+  types, and no custom classes; pickle can represent an extremely large
+  number of Python types (many of them automatically, by clever usage
+  of Python's introspection facilities; complex cases can be tackled by
+  implementing :ref:`specific object APIs <pickle-inst>`).
+.. seealso::
+   The :mod:`json` module: a standard library module allowing JSON
+   serialization and deserialization.
 Data stream format
 ------------------
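The pickle/JSON differences listed in this hunk can be checked with a short sketch. It is only an illustration, not part of the commit: the sample values are invented, and ``datetime.date`` stands in for "a type outside JSON's default subset".

```python
import datetime
import json
import pickle

# Sample data, invented for this illustration.
plain = {"name": "origin", "coords": [0, 0]}
rich = {"when": datetime.date(2013, 12, 6), "tags": {"a", "b"}}

# JSON produces text (str); pickle produces binary data (bytes).
as_json = json.dumps(plain)
as_pickle = pickle.dumps(plain)

# Built-in containers of basic types round-trip through both formats.
json_roundtrip = json.loads(as_json)
pickle_roundtrip = pickle.loads(as_pickle)

# The default JSON encoder rejects types outside its subset (date, set)...
try:
    json.dumps(rich)
    json_accepts_rich = True
except TypeError:
    json_accepts_rich = False

# ...while pickle handles them without any extra code.
rich_restored = pickle.loads(pickle.dumps(rich))
```

Unlike the JSON text, the pickle payload is not meant to be read by people or by non-Python programs, which is the trade-off the list describes.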
@@ -117,6 +130,18 @@ There are currently 4 different protocols which can be used for pickling.
 the default as well as the current recommended protocol; use it whenever
 possible.
+.. note::
+   Serialization is a more primitive notion than persistence; although
+   :mod:`pickle` reads and writes file objects, it does not handle the issue of
+   naming persistent objects, nor the (even more complicated) issue of concurrent
+   access to persistent objects. The :mod:`pickle` module can transform a complex
+   object into a byte stream and it can transform the byte stream into an object
+   with the same internal structure. Perhaps the most obvious thing to do with
+   these byte streams is to write them onto a file, but it is also conceivable to
+   send them across a network or store them in a database. The :mod:`shelve`
+   module provides a simple interface to pickle and unpickle objects on
+   DBM-style database files.
 Module Interface
 ----------------
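The phrase "an object with the same internal structure" in the note can be made concrete with a minimal sketch. The ``record`` and ``shared`` names are invented for illustration; the point is that references shared inside one pickled object stay shared after unpickling.

```python
import pickle

shared = [1, 2, 3]
# Both keys refer to the very same list object.
record = {"a": shared, "b": shared}

# The payload is plain bytes: it could be written to a file,
# sent over a socket, or stored in a database column.
payload = pickle.dumps(record)
clone = pickle.loads(payload)

same_value = clone == record
sharing_kept = clone["a"] is clone["b"]    # structure is preserved
is_new_object = clone["a"] is not shared   # but the objects are fresh copies
```

Naming the stored objects and coordinating concurrent access are left to the caller, which is the gap the note says pickle does not fill.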
@@ -793,6 +818,14 @@ alternatives such as the marshalling API in :mod:`xmlrpc.client` or
 third-party solutions.
+Performance
+-----------
+Recent versions of the pickle protocol (from protocol 2 and upwards) feature
+efficient binary encodings for several common features and built-in types.
+Also, the :mod:`pickle` module has a transparent optimizer written in C.
 .. _pickle-example:
 Examples
......
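The claim in the new Performance section about efficient binary encodings from protocol 2 upwards can be observed by comparing payload sizes. This is a rough sketch with an arbitrary sample list, not a benchmark; exact byte counts vary between Python versions, so none are quoted here.

```python
import pickle

# Arbitrary sample data: a list of small integers.
data = list(range(1000))

# Size of the pickled payload for every protocol this interpreter supports.
sizes = {
    protocol: len(pickle.dumps(data, protocol=protocol))
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1)
}

for protocol, size in sizes.items():
    print(f"protocol {protocol}: {size} bytes")

# Every protocol must still reproduce the original object.
all_roundtrip = all(
    pickle.loads(pickle.dumps(data, protocol=p)) == data for p in sizes
)
```

For this input the text-based protocol 0 payload is larger than the protocol 2 payload, because protocol 2 stores small integers in compact binary opcodes.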