Commit 19ed27ec authored by Victor Stinner

Optimize pickle.load() and pickle.loads()

Issue #27056: Optimize pickle.load() and pickle.loads(), up to 10% faster to
deserialize a lot of small objects.
parent 744c34e2
@@ -467,6 +467,9 @@ Optimizations
   with a short lifetime, and use :c:func:`malloc` for larger memory blocks.
   (Contributed by Victor Stinner in :issue:`26249`).
+* :func:`pickle.load` and :func:`pickle.loads` are now up to 10% faster when
+  deserializing many small objects (Contributed by Victor Stinner in
+  :issue:`27056`).
 Build and C API Changes
 =======================
......
@@ -16,6 +16,9 @@ Core and Builtins
 Library
 -------
+- Issue #27056: Optimize pickle.load() and pickle.loads(), up to 10% faster
+  to deserialize a lot of small objects.
 What's New in Python 3.6.0 alpha 1?
 ===================================
@@ -341,7 +344,7 @@ Library
 - Issue #26977: Removed unnecessary, and ignored, call to sum of squares helper
   in statistics.pvariance.
 - Issue #26002: Use bisect in statistics.median instead of a linear search.
   Patch by Upendra Kuma.
 - Issue #25974: Make use of new Decimal.as_integer_ratio() method in statistics
......
@@ -1197,21 +1197,9 @@ _Unpickler_ReadFromFile(UnpicklerObject *self, Py_ssize_t n)
     return read_size;
 }
-/* Read `n` bytes from the unpickler's data source, storing the result in `*s`.
-   This should be used for all data reads, rather than accessing the unpickler's
-   input buffer directly. This method deals correctly with reading from input
-   streams, which the input buffer doesn't deal with.
-   Note that when reading from a file-like object, self->next_read_idx won't
-   be updated (it should remain at 0 for the entire unpickling process). You
-   should use this function's return value to know how many bytes you can
-   consume.
-   Returns -1 (with an exception set) on failure. On success, return the
-   number of chars read. */
+/* Don't call it directly: use _Unpickler_Read() */
 static Py_ssize_t
-_Unpickler_Read(UnpicklerObject *self, char **s, Py_ssize_t n)
+_Unpickler_ReadImpl(UnpicklerObject *self, char **s, Py_ssize_t n)
 {
     Py_ssize_t num_read;
@@ -1222,11 +1210,10 @@ _Unpickler_Read(UnpicklerObject *self, char **s, Py_ssize_t n)
                         "read would overflow (invalid bytecode)");
         return -1;
     }
-    if (self->next_read_idx + n <= self->input_len) {
-        *s = self->input_buffer + self->next_read_idx;
-        self->next_read_idx += n;
-        return n;
-    }
+    /* This case is handled by the _Unpickler_Read() macro for efficiency */
+    assert(self->next_read_idx + n > self->input_len);
     if (!self->read) {
         PyErr_Format(PyExc_EOFError, "Ran out of input");
         return -1;
@@ -1243,6 +1230,26 @@ _Unpickler_Read(UnpicklerObject *self, char **s, Py_ssize_t n)
     return n;
 }
+/* Read `n` bytes from the unpickler's data source, storing the result in `*s`.
+   This should be used for all data reads, rather than accessing the unpickler's
+   input buffer directly. This method deals correctly with reading from input
+   streams, which the input buffer doesn't deal with.
+   Note that when reading from a file-like object, self->next_read_idx won't
+   be updated (it should remain at 0 for the entire unpickling process). You
+   should use this function's return value to know how many bytes you can
+   consume.
+   Returns -1 (with an exception set) on failure. On success, return the
+   number of chars read. */
+#define _Unpickler_Read(self, s, n) \
+    (((self)->next_read_idx + (n) <= (self)->input_len)      \
+     ? (*(s) = (self)->input_buffer + (self)->next_read_idx, \
+        (self)->next_read_idx += (n),                        \
+        (n))                                                 \
+     : _Unpickler_ReadImpl(self, (s), (n)))
 static Py_ssize_t
 _Unpickler_CopyLine(UnpicklerObject *self, char *line, Py_ssize_t len,
                     char **result)
......
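Note on the pattern used by this change: the in-buffer case becomes a macro that expands at each call site, and the function is called only when the buffer cannot satisfy the read. The following is a minimal standalone sketch of that split, not the code from `_pickle.c`. The `Reader` struct, `reader_read`, `reader_read_impl` and the `slow_calls` counter are hypothetical names introduced for illustration, and the slow path here simply fails instead of refilling from a stream.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for UnpicklerObject: only the fields that the
   fast path touches, plus a counter used purely for illustration. */
typedef struct {
    const char *input_buffer;  /* bytes already available in memory */
    ptrdiff_t input_len;       /* number of valid bytes in input_buffer */
    ptrdiff_t next_read_idx;   /* offset of the next unread byte */
    int slow_calls;            /* how many times the slow path was entered */
} Reader;

/* Slow path, kept out of line. The real function would try to read from a
   file-like object; this sketch only reports that the buffer is exhausted. */
static ptrdiff_t
reader_read_impl(Reader *self, const char **s, ptrdiff_t n)
{
    /* The macro below handles the in-buffer case, so it never gets here. */
    assert(self->next_read_idx + n > self->input_len);
    (void)s;
    self->slow_calls++;
    return -1;
}

/* Fast path: when the request fits in the buffer, set *s, advance the index
   and yield n without a function call. The arguments are expanded more than
   once, so callers should pass only side-effect-free expressions. */
#define reader_read(self, s, n) \
    (((self)->next_read_idx + (n) <= (self)->input_len) \
     ? (*(s) = (self)->input_buffer + (self)->next_read_idx, \
        (self)->next_read_idx += (n), \
        (n)) \
     : reader_read_impl((self), (s), (n)))
```

With a six-byte buffer, `reader_read(&r, &p, 4)` is served entirely by the macro, while a following `reader_read(&r, &p, 3)` falls through to `reader_read_impl`. One trade-off of the macro form is that an argument such as `n++` would be evaluated several times, which is presumably why the patched call sites pass plain variables and constants.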