Commit 28e4533a authored by Christian Theune

- Merged to ZODB trunk again

parent 6e9247ba
......@@ -13,5 +13,5 @@ -
src -
zpkg.conf -
zpkgsetup -
buildsupport -
ZODB 3.6
ZODB 3.7
......@@ -28,15 +28,14 @@ ZoneAlarm. Many particularly slow tests are skipped unless you pass
ZODB 3.6 requires Python 2.3.4 or later. For best results, we recommend
Python 2.3.5. Python 2.4.1 can also be used.
ZODB 3.7 requires Python 2.4.2 or later.
The Zope 2.8 release, and Zope3 releases, should be compatible with this
version of ZODB. Note that Zope 2.7 and higher includes ZEO, so this package
should only be needed to run a ZEO server.
ZEO servers and clients are wholly compatible among 3.3, 3.3.1, 3.4, 3.5, and
3.6; a ZEO client from any of those versions can talk with a ZEO server from
ZEO servers and clients are wholly compatible among 3.3, 3.4, 3.5, 3.6 and
3.7; a ZEO client from any of those versions can talk with a ZEO server from
Trying to mix ZEO clients and servers from 3.3 or later from ZODB releases
......@@ -93,12 +92,14 @@ script::
This should now make all of ZODB accessible to your Python programs.
Testing for Developers
When working from a ZODB checkout, do an in-place build instead::
% python build_ext -i
ZODB comes with a large test suite that can be run from the source
directory before ZODB is installed. The simplest way to run the tests
followed by::
% python -v
\title{ZODB/ZEO Programming Guide}
\author{A.M.\ Kuchling}
......@@ -34,7 +34,7 @@ import zpkgsetup.publication
import zpkgsetup.setup
# Note that must be able to recognize the VERSION line.
VERSION = "3.6.0a3"
VERSION = "3.7.0a0"
context = zpkgsetup.setup.SetupContext(
"ZODB", VERSION, __file__)
Developer Information
This document provides information for developers who maintain or extend
`BTrees` are defined using a "template", roughly akin to a C++ template. To
create a new family of `BTrees`, create a source file that defines macros used
to handle differences in key and value types:
Configuration Macros
A string to hold an RCS/CVS Id key to be included in compiled binaries.
A string (like "IO" or "OO") that provides the prefix used for the module.
This gets used to generate type names and the internal module name string.
An int giving the maximum bucket size (number of key/value pairs). When a
bucket gets larger than this due to an insertion *into a BTREE*, it
splits. Inserting into a bucket directly doesn't split, and functions
that produce a bucket output (e.g., ``union()``) also have no bound on how
large a bucket may get. Someday this will be tunable on `BTree`.
An ``int`` giving the maximum size (number of children) of an internal
btree node. Someday this will be tunable on ``BTree`` instances.
Macros for Keys
The C type declaration for keys (e.g., ``int`` or ``PyObject*``).
Define if ``KEY_TYPE`` is a ``PyObject*``, else ``undef``.
Tests whether the ``PyObject* K`` can be converted to the (``C``) key type
(``KEY_TYPE``). The macro should return a boolean (zero for false,
non-zero for true). When it returns false, its caller should probably set
a ``TypeError`` exception.
Like Python's ``cmp()``. Compares K(ey) to T(arget), where ``K``
and ``T`` are ``C`` values of type `KEY_TYPE`. ``V`` is assigned an `int`
value depending on the outcome::
< 0 if K < T
== 0 if K == T
> 0 if K > T
This macro acts like an ``if``, where the following statement is executed
only if a Python exception has been raised because the values could not be
compared.
``K`` is a value of ``KEY_TYPE``. If ``KEY_TYPE`` is a flavor of
``PyObject*``, write this to do ``Py_DECREF(K)``. Else (e.g.,
``KEY_TYPE`` is ``int``) make it a nop.
``K`` is a value of `KEY_TYPE`. If `KEY_TYPE` is a flavor of
``PyObject*``, write this to do ``Py_INCREF(K)``. Else (e.g., `KEY_TYPE`
is ``int``) make it a nop.
``COPY_KEY(K, E)``
Like ``K=E``. Copy a key from ``E`` to ``K``, both of ``KEY_TYPE``. Note
that this doesn't ``decref K`` or ``incref E`` when ``KEY_TYPE`` is a
``PyObject*``; the caller is responsible for keeping refcounts straight.
Roughly like ``O=K``. ``O`` is a ``PyObject*``, and the macro must build
a Python object form of ``K``, assign it to ``O``, and ensure that ``O``
owns the reference to its new value. It may do this by creating a new
Python object based on ``K`` (e.g., ``PyInt_FromLong(K)`` when
``KEY_TYPE`` is ``int``), or simply by doing ``Py_INCREF(K)`` if
``KEY_TYPE`` is a ``PyObject*``.
Copy an argument to the target without creating a new reference to
``ARG``. ``ARG`` is a ``PyObject*``, and ``TARGET`` is of type
``KEY_TYPE``. If this can't be done (for example, ``KEY_CHECK(ARG)``
returns false), set a Python error and set status to ``0``. If there is
no error, leave status alone.
Macros for Values
The C type declaration for values (e.g., ``int`` or ``PyObject*``).
Define if ``VALUE_TYPE`` is a ``PyObject*``, else ``undef``.
Like Python's ``cmp()``. Compares ``X`` to ``Y``, where ``X`` & ``Y`` are
``C`` values of type ``VALUE_TYPE``. The macro returns an ``int``, with
value::
< 0 if X < Y
== 0 if X == Y
> 0 if X > Y
Bug: There is no provision for determining whether the comparison attempt
failed (set a Python exception).
Like ``DECREF_KEY``, except applied to values of ``VALUE_TYPE``.
Like ``INCREF_KEY``, except applied to values of ``VALUE_TYPE``.
Like ``COPY_KEY``, except applied to values of ``VALUE_TYPE``.
Like ``COPY_KEY_TO_OBJECT``, except applied to values of ``VALUE_TYPE``.
Like ``COPY_KEY_FROM_ARG``, except applied to values of ``VALUE_TYPE``.
Normalize the value, ``V``, using the parameter ``MIN``. This is almost
certainly a YAGNI. It is a no-op for most types. For integers, ``V`` is
replaced by ``V/MIN`` only if ``MIN > 0``.
Macros for Set Operations
A value of ``VALUE_TYPE`` specifying the value to associate with set
elements when sets are merged with mappings via weighted union or weighted
intersection.
``MERGE(O1, w1, O2, w2)``
Performs a weighted merge of two values, ``O1`` and ``O2``, using weights
``w1`` and ``w2``. The result must be of ``VALUE_TYPE``. Note that
weighted unions and weighted intersections are not enabled if this macro
is left undefined.
Computes a weighted value for ``O``. The result must be of
``VALUE_TYPE``. This is used for "filling out" weighted unions, i.e. to
compute a weighted value for keys that appear in only one of the input
mappings. If left undefined, ``MERGE_WEIGHT`` defaults to::
#define MERGE_WEIGHT(O, w) (O)
The value doesn't matter. If defined, `SetOpTemplate.c` compiles code for
a ``multiunion()`` function (compute a union of many input sets at high
speed). This currently makes sense only for structures with integer keys.
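As a rough illustration of how ``MERGE``, ``MERGE_WEIGHT``, and the default value for set elements interact in a weighted union, here is a Python sketch. The arithmetic used in ``merge`` and ``merge_weight`` (a weighted sum and a weighted product) is an assumption chosen for integer values, not something this document specifies, and ``SET_DEFAULT``, ``as_mapping``, and ``weighted_union`` are hypothetical names.

```python
SET_DEFAULT = 1  # hypothetical stand-in for the value given to set elements


def merge(o1, w1, o2, w2):
    # Assumed integer-style merge: a weighted sum of the two values.
    return o1 * w1 + o2 * w2


def merge_weight(o, w):
    # Assumed integer-style weighting for a key found in only one input.
    return o * w


def as_mapping(x):
    # Treat a set as a mapping from each element to SET_DEFAULT.
    if isinstance(x, dict):
        return x
    return {k: SET_DEFAULT for k in x}


def weighted_union(c1, c2, w1=1, w2=1):
    m1, m2 = as_mapping(c1), as_mapping(c2)
    result = {}
    for key in sorted(set(m1) | set(m2)):
        if key in m1 and key in m2:
            # Key present on both sides: combine with both weights.
            result[key] = merge(m1[key], w1, m2[key], w2)
        elif key in m1:
            # "Filling out": key present in only one input.
            result[key] = merge_weight(m1[key], w1)
        else:
            result[key] = merge_weight(m2[key], w2)
    return result
```

With these assumed definitions, a key found in both inputs gets the weighted sum of its two values, while a key found in only one input is scaled by that input's weight.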
BTree Clues
More or less random bits of helpful info.
+ In papers and textbooks, this flavor of BTree is usually called a B+-Tree,
where "+" is a superscript.
+ All keys and all values live in the bucket leaf nodes. Keys in interior
(BTree) nodes merely serve to guide a search efficiently toward the correct
leaf.
+ When a key is deleted, it's physically removed from the bucket it's in, but
this doesn't propagate back up the tree: since keys in interior nodes only
serve to guide searches, it's OK-- and saves time --to leave "stale" keys in
interior nodes.
+ No attempt is made to rebalance the tree after a deletion, unless a bucket
thereby becomes entirely empty. "Classic BTrees" do rebalance, keeping all
buckets at least half full (provided there are enough keys in the entire
tree to fill half a bucket). The tradeoffs are murky. Pathological cases
in the presence of deletion do exist. Pathologies include trees tending
toward only one key per bucket, and buckets at differing depths (all buckets
are at the same depth in a classic BTree).
+ ``DEFAULT_MAX_BUCKET_SIZE`` and ``DEFAULT_MAX_BTREE_SIZE`` are chosen mostly
to "even out" pickle sizes in storage. That's why, e.g., an `IIBTree` has
larger values than an `OOBTree`: pickles store ints more efficiently than
they can store arbitrary Python objects.
+ In a non-empty BTree, every bucket node contains at least one key, and every
BTree node contains at least one child and a non-NULL firstbucket pointer.
However, a BTree node may not contain any keys.
+ An empty BTree consists solely of a BTree node with ``len==0`` and
+ Although a BTree can become unbalanced under a mix of inserts and deletes
(meaning both that there's nothing stronger that can be said about buckets
than that they're not empty, and that buckets can appear at different
depths), a BTree node always has children of the same kind: they're all
buckets, or they're all BTree nodes.
The ``BTREE_SEARCH`` Macro
For notational ease, consider a fixed BTree node ``x``, and let
K(i) mean x->data.key[i]
C(i) mean all the keys reachable from x->data.child[i]
For each ``i`` in ``0`` to ``x->len-1`` inclusive,
K(i) <= C(i) < K(i+1)
is a BTree node invariant, where we pretend that ``K(0)`` holds a key smaller
than any possible key, and ``K(x->len)`` holds a key larger than any possible
key. (Note that ``K(x->len)`` doesn't actually exist, and ``K(0)`` is never
used although space for it exists in non-empty BTree nodes.)
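The invariant can be checked mechanically. The following Python sketch is only an illustration: ``node_invariant_holds`` is a hypothetical helper, the node is modeled as two plain lists rather than the C struct, and keys are assumed to be numbers so that infinities can stand in for the two pretend keys.

```python
import math


def node_invariant_holds(keys, children):
    """Check K(i) <= C(i) < K(i+1) for every child of one node.

    keys[0] is a placeholder that is never read; children[i] lists
    every key reachable from child i.
    """
    n = len(children)
    for i in range(n):
        low = -math.inf if i == 0 else keys[i]          # pretend K(0)
        high = math.inf if i + 1 == n else keys[i + 1]  # pretend K(len)
        if not all(low <= c < high for c in children[i]):
            return False
    return True
```

For example, a node with keys ``[None, 10, 20]`` passes when its children hold ``[1, 5]``, ``[10, 12]``, and ``[20, 99]``, and fails if ``10`` appears under the first child.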
When searching for a key ``k``, then, the child pointer we want to follow is
the one at index ``i`` such that ``K(i) <= k < K(i+1)``. There can be at most
one such ``i``, since the ``K(i)`` are strictly increasing. And there is at
least one such ``i`` provided the tree isn't empty (so that ``0 < len``). For
the moment, assume the tree isn't empty (we'll get back to that later).
The macro's chief loop invariant is
K(lo) < k < K(hi)
This holds trivially at the start, since ``lo`` is set to ``0``, and ``hi`` to
``x->len``, and we pretend ``K(0)`` is minus infinity and ``K(len)`` is plus
infinity. Inside the loop, if ``K(i) < k`` we set ``lo`` to ``i``, and if
``K(i) > k`` we set ``hi`` to ``i``. These obviously preserve the invariant.
If ``K(i) == k``, the loop breaks and sets the result to ``i``, and since
``K(i) == k`` in that case ``i`` is obviously the correct result.
Other cases depend on how ``i = floor((lo + hi)/2)`` works, exactly. Suppose
``lo + d = hi`` for some ``d >= 0``. Then ``i = floor((lo + lo + d)/2) =
floor(lo + d/2) = lo + floor(d/2)``. So:
a. ``[d == 0] (lo == i == hi)`` if and only if ``(lo == hi)``.
b. ``[d == 1] (lo == i < hi)`` if and only if ``(lo+1 == hi)``.
c. ``[d > 1] (lo < i < hi)`` if and only if ``(lo+1 < hi)``.
If the node is empty ``(x->len == 0)``, then ``lo==i==hi==0`` at the start,
and the loop exits immediately (the first ``i > lo`` test fails), without
entering the body.
Else ``lo < hi`` at the start, and the invariant ``K(lo) < k < K(hi)`` holds.
If ``lo+1 < hi``, we're in case (c): ``i`` is strictly between ``lo`` and
``hi``, so the loop body is entered, and regardless of whether the body sets
the new ``lo`` or the new ``hi`` to ``i``, the new ``lo`` is strictly less
than the new ``hi``, and the difference between the new ``lo`` and new ``hi``
is strictly less than the difference between the old ``lo`` and old ``hi``.
So long as the new ``lo + 1`` remains < the new ``hi``, we stay in this case.
We can't stay in this case forever, though: because ``hi-lo`` decreases on
each trip but remains > ``0``, ``lo+1 == hi`` must eventually become true.
(In fact, it becomes true quickly, in about ``log2(x->len)`` trips; the point
is more that ``lo`` doesn't equal ``hi`` when the loop ends, it has to end
with ``lo+1==hi`` and ``i==lo``).
Then we're in case (b): ``i==lo==hi-1`` then, and the loop exits. The
invariant still holds, with ``lo==i`` and ``hi==lo+1==i+1``::
K(i) < k < K(i+1)
so ``i`` is again the correct answer.
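The loop analyzed above can be sketched in Python as follows. This is an illustration under assumptions, not the C macro itself: ``btree_search`` is a hypothetical name, the node's keys are modeled as a plain list whose slot ``0`` is an unused placeholder, and ordinary ``<`` and ``>`` stand in for the key comparison macro.

```python
def btree_search(keys, k):
    """Return the index of the child to follow when searching for k.

    keys[0] is a placeholder that is never read, mirroring the unused
    K(0) slot; len(keys) plays the role of x->len.
    """
    lo, hi = 0, len(keys)
    i = hi >> 1
    while i > lo:
        if keys[i] < k:
            lo = i          # keeps K(lo) < k
        elif keys[i] > k:
            hi = i          # keeps k < K(hi)
        else:
            break           # K(i) == k, so i is the answer
        i = (lo + hi) >> 1
    return i
```

With ``keys = [None, 10, 20, 30]``, searching for ``5`` selects child ``0``, ``15`` selects child ``1``, ``20`` selects child ``2``, and ``35`` selects child ``3``. An empty list returns ``0`` without entering the loop body.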
Optimization points:
+ Division by 2 is done via shift rather than via "/2". These are signed ints, and
almost all C compilers treat signed int division as truncating, and shifting
is not the same as truncation for signed int division. The compiler has no
way to know these values aren't negative, so has to generate longer-winded
code for "/2". But we know these values aren't negative, and exploit it.
+ The order of _cmp comparisons matters. We're in an interior BTree node, and
are looking at only a tiny fraction of all the keys that exist. So finding
the key exactly in this node is unlikely, and checking ``_cmp == 0`` is a
waste of time to the same extent. It doesn't matter whether we check for
``_cmp < 0`` or ``_cmp > 0`` first, so long as we do both before worrying
about equality.
+ At the start of a routine, it's better to run this macro even if ``x->len``
is ``0`` (check for that afterwards). We just called a function and so
probably drained the pipeline. If the first thing we do then is read up
``self->len`` and check it against ``0``, we just sit there waiting for the
data to get read up, and then another immediate test-and-branch, and for a
very unlikely case (BTree nodes are rarely empty). It's better to get into
the loop right away so the normal case makes progress ASAP.
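To make the first optimization point concrete, the sketch below contrasts truncating division with a right shift. It is written in Python purely for illustration: ``c_div2`` and ``shift_div2`` are hypothetical helpers, ``c_div2`` emulates division that truncates toward zero, and Python's ``>>`` is used as a model of an arithmetic shift (how a given C compiler shifts negative values is not something this sketch can show).

```python
def c_div2(n):
    """Divide by 2, truncating toward zero as C signed division does."""
    q = abs(n) // 2
    return q if n >= 0 else -q


def shift_div2(n):
    """Shift right by one bit; Python rounds toward minus infinity."""
    return n >> 1
```

The two agree for every non-negative ``n``, which is the case the macro relies on, and differ for negative odd values: ``c_div2(-3)`` is ``-1`` while ``shift_div2(-3)`` is ``-2``.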
This has a different job than ``BTREE_SEARCH``: the key ``0`` slot is
legitimate in a bucket, and we want to find the index at which the key
belongs. If the key is larger than the bucket's largest key, a new slot at
index len is where it belongs, else it belongs at the smallest ``i`` with
``keys[i]`` >= the key we're looking for. We also need to know whether or not
the key is present (``BTREE_SEARCH`` didn't care; it only wanted to find the
next node to search).
The mechanics of the search are quite similar, though. The primary
loop invariant changes to (say we're searching for key ``k``)::
K(lo-1) < k < K(hi)
where ``K(i)`` means ``keys[i]``, and we pretend ``K(-1)`` is minus infinity
and ``K(len)`` is plus infinity.
If the bucket is empty, ``lo=hi=i=0`` at the start, the loop body is never
entered, and the macro sets ``INDEX`` to 0 and ``ABSENT`` to true. That's why
``_cmp`` is initialized to 1 (``_cmp`` becomes ``ABSENT``).
Else the bucket is not empty, lo<hi at the start, and the loop body is
entered. The invariant is obviously satisfied then, as ``lo=0`` and
``hi=len``.
If ``K[i]<k``, ``lo`` is set to ``i+1``, preserving that ``K(lo-1) = K[i] <
k``.
If ``K[i]>k``, ``hi`` is set to ``i``, preserving that ``K[hi] = K[i] > k``.
If the loop exits after either of those, ``_cmp != 0``, so ``ABSENT`` becomes
true.
If ``K[i]=k``, the loop breaks, so that ``INDEX`` becomes ``i``, and
``ABSENT`` becomes false (``_cmp=0`` in this case).
The same case analysis for ``BTREE_SEARCH`` on ``lo`` and ``hi`` holds here:
a. ``(lo == i == hi)`` if and only if ``(lo == hi)``.
b. ``(lo == i < hi)`` if and only if ``(lo+1 == hi)``.
c. ``(lo < i < hi)`` if and only if ``(lo+1 < hi)``.
So long as ``lo+1 < hi``, we're in case (c), and either break with equality
(in which case the right results are obviously computed) or narrow the range.
If equality doesn't obtain, the range eventually narrows to cases (a) or (b).
To go from (c) to (a), we must have ``lo+2==hi`` at the start, and
``K[i]=K[lo+1]<k``. Then the new lo gets set to ``i+1 = lo+2 = hi``, and the
loop exits with ``lo=hi=i`` and ``_cmp<0``. This is correct, because we know
that ``k != K(i)`` (loop invariant! we actually know something stronger, that
``k < K(hi)``; since ``i=hi``, this implies ``k != K(i)``).
Else (c) eventually falls into case (b), ``lo+1==hi`` and ``i==lo``. The
invariant tells us ``K(lo-1) < k < K(hi) = K(lo+1)``, so if the key is present
it must be at ``K(lo)``. ``i==lo`` in this case, so we test ``K(lo)`` against
``k``. As always, if equality obtains we do the right thing, else case (b)
becomes case (a).
When (b) becomes (a), the last comparison was non-equal, so ``_cmp`` is
non-zero, and the loop exits because ``lo==hi==i`` in case (a). The invariant
then tells us ``K(lo-1) < k < K(lo)``, so the key is in fact not present, it's
correct to exit with ``_cmp`` non-zero, and ``i==lo`` is again the index at
which ``k`` belongs.
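The bucket search described above can be sketched in Python as follows. This is an illustration under assumptions rather than the C macro: ``bucket_search`` is a hypothetical name, the bucket's keys are a plain sorted list, ``cmp_result`` plays the role of ``_cmp``, and the function returns the pair that the macro would store in ``INDEX`` and ``ABSENT``.

```python
def bucket_search(keys, k):
    """Return (index, absent) for key k in the sorted list keys."""
    lo, hi = 0, len(keys)
    cmp_result = 1          # stays non-zero if the bucket is empty
    i = hi >> 1
    while lo < hi:
        if keys[i] < k:
            cmp_result = -1
            lo = i + 1      # keeps K(lo-1) < k
        elif keys[i] == k:
            cmp_result = 0
            break           # found: index is i, absent is False
        else:
            cmp_result = 1
            hi = i          # keeps k < K(hi)
        i = (lo + hi) >> 1
    return i, cmp_result != 0
```

When the key is absent, the returned index is the smallest ``i`` with ``keys[i] >= k``, or ``len(keys)`` if no such slot exists, which is the same position ``bisect.bisect_left`` from the Python standard library reports.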
Optimization points:
+ As for ``BTREE_SEARCH``, shifting of signed ints is cheaper than division.
+ Unlike as for ``BTREE_SEARCH``, there's nothing special about searching an
empty bucket, and the macro computes thoroughly sensible results in that
case.
+ The order of ``_cmp`` comparisons differs from ``BTREE_SEARCH``. When
searching a bucket, it's much more likely (than when searching a BTree node)
that the key is present, so testing ``_cmp==0`` isn't a systematic waste of
cycles. At the extreme, if all searches are successful (key present), on
average this saves one comparison per search, against leaving the
determination of ``_cmp==0`` implicit (as ``BTREE_SEARCH`` does). But even
on successful searches, ``_cmp != 0`` is a more popular outcome than
``_cmp == 0`` across iterations (unless the bucket has only a few keys), so
it's important to check one of the inequality cases first. It turns out
it's better on average to check ``K(i) < key`` (than to check ``K(i) >
key``), because when it pays it narrows the range more (we get a little
boost from setting ``lo=i+1`` in this case; the other case sets ``hi=i``,
which isn't as much of a narrowing).
This document provides information for developers who maintain or
extend BTrees.
BTrees are defined using a "template", roughly akin to a a C++
template. To create a new family of BTrees, create a source file that
defines macros used to handle differences in key and value types:
Configuration Macros
A string to hold an RCS/CVS Id key to be included in compiled binaries.
A string (like "IO" or "OO") that provides the prefix used for the
module. This gets used to generate type names and the internal module
name string.
An int giving the maximum bucket size (number of key/value pairs).
When a bucket gets larger than this due to an insertion *into a BTREE*,
it splits. Inserting into a bucket directly doesn't split, and
functions that produce a bucket output (e.g., union()) also have no
bound on how large a bucket may get. Someday this will be tunable
on BTree instances.
An int giving the maximum size (number of children) of an internal
btree node. Someday this will be tunable on BTree instances.
Macros for Keys
The C type declaration for keys (e.g., int or PyObject*).
Define if KEY_TYPE is a PyObject*, else undef.
Tests whether the PyObject* K can be converted to the (C) key type
(KEY_TYPE). The macro should return a boolean (zero for false,
non-zero for true). When it returns false, its caller should probably
set a TypeError exception.
Like Python's cmp(). Compares K(ey) to T(arget), where K & T are C
values of type KEY_TYPE. V is assigned an int value depending on
the outcome:
< 0 if K < T
== 0 if K == T
> 0 if K > T
This macro acts like an 'if', where the following statement is
executed only if a Python exception has been raised because the
values could not be compared.
K is a value of KEY_TYPE. If KEY_TYPE is a flavor of PyObject*, write
this to do Py_DECREF(K). Else (e.g., KEY_TYPE is int) make it a nop.
K is a value of KEY_TYPE. If KEY_TYPE is a flavor of PyObject*, write
this to do Py_INCREF(K). Else (e.g., KEY_TYPE is int) make it a nop.
Like K=E. Copy a key from E to K, both of KEY_TYPE. Note that this
doesn't decref K or incref E when KEY_TYPE is a PyObject*; the caller
is responsible for keeping refcounts straight.
Roughly like O=K. O is a PyObject*, and the macro must build a Python
object form of K, assign it to O, and ensure that O owns the reference
to its new value. It may do this by creating a new Python object based
on K (e.g., PyInt_FromLong(K) when KEY_TYPE is int), or simply by doing
Py_INCREF(K) if KEY_TYPE is a PyObject*.
Copy an argument to the target without creating a new reference to ARG.
ARG is a PyObject*, and TARGET is of type KEY_TYPE. If this can't be
done (for example, KEY_CHECK(ARG) returns false), set a Python error
and set status to 0. If there is no error, leave status alone.
Macros for Values
The C type declaration for values (e.g., int or PyObject*).
Define if VALUE_TYPE is a PyObject*, else undef.
Like Python's cmp(). Compares X to Y, where X & Y are C values of
type VALUE_TYPE. The macro returns an int, with value
< 0 if X < Y
== 0 if X == Y
> 0 if X > Y
Bug: There is no provision for determining whether the comparison
attempt failed (set a Python exception).
Like DECREF_KEY, except applied to values of VALUE_TYPE.
Like INCREF_KEY, except applied to values of VALUE_TYPE.
Like COPY_KEY, except applied to values of VALUE_TYPE.
Like COPY_KEY_TO_OBJECT, except applied to values of VALUE_TYPE.
Like COPY_KEY_FROM_ARG, except applied to values of VALUE_TYPE.
Normalize the value, V, using the parameter MIN. This is almost
certainly a YAGNI. It is a no op for most types. For integers, V is
replaced by V/MIN only if MIN > 0.
Macros for Set Operations
A value of VALUE_TYPE specifying the value to associate with set
elements when sets are merged with mappings via weighted union or
weighted intersection.
MERGE(O1, w1, O2, w2)
Performs a weighted merge of two values, O1 and O2, using weights w1
and w2. The result must be of VALUE_TYPE. Note that weighted unions
and weighted intersections are not enabled if this macro is left
undefined.
Computes a weighted value for O. The result must be of VALUE_TYPE.
This is used for "filling out" weighted unions, i.e. to compute a
weighted value for keys that appear in only one of the input
mappings. If left undefined, MERGE_WEIGHT defaults to
#define MERGE_WEIGHT(O, w) (O)
The value doesn't matter. If defined, SetOpTemplate.c compiles
code for a multiunion() function (compute a union of many input sets
at high speed). This currently makes sense only for structures with
integer keys.
BTree Clues
More or less random bits of helpful info.
+ In papers and textbooks, this flavor of BTree is usually called
a B+-Tree, where "+" is a superscript.
+ All keys and all values live in the bucket leaf nodes. Keys in
interior (BTree) nodes merely serve to guide a search efficiently
toward the correct leaf.
+ When a key is deleted, it's physically removed from the bucket
it's in, but this doesn't propagate back up the tree: since keys
in interior nodes only serve to guide searches, it's OK-- and
saves time --to leave "stale" keys in interior nodes.
+ No attempt is made to rebalance the tree after a deletion, unless
a bucket thereby becomes entirely empty. "Classic BTrees" do
rebalance, keeping all buckets at least half full (provided there
are enough keys in the entire tree to fill half a bucket). The
tradeoffs are murky. Pathological cases in the presence of
deletion do exist. Pathologies include trees tending toward only
one key per bucket, and buckets at differing depths (all buckets
are at the same depth in a classic BTree).
+ DEFAULT_MAX_BUCKET_SIZE and DEFAULT_MAX_BTREE_SIZE are chosen
mostly to "even out" pickle sizes in storage. That's why, e.g.,
an IIBTree has larger values than an OOBTree: pickles store ints
more efficiently than they can store arbitrary Python objects.
+ In a non-empty BTree, every bucket node contains at least one key,
and every BTree node contains at least one child and a non-NULL
firstbucket pointer. However, a BTree node may not contain any keys.
+ An empty BTree consists solely of a BTree node with len==0 and
+ Although a BTree can become unbalanced under a mix of inserts and
deletes (meaning both that there's nothing stronger that can be
said about buckets than that they're not empty, and that buckets
can appear at different depths), a BTree node always has children
of the same kind: they're all buckets, or they're all BTree nodes.
For notational ease, consider a fixed BTree node x, and let
K(i) mean x->data.key[i]
C(i) mean all the keys reachable from x->data.child[i]
For each i in 0 to x->len-1 inclusive,
K(i) <= C(i) < K(i+1)
is a BTree node invariant, where we pretend that K(0) holds a key
smaller than any possible key, and K(x->len) holds a key larger
than any possible key. (Note that K(x->len) doesn't actually exist,
and K(0) is never used although space for it exists in non-empty
BTree nodes.)
When searching for a key k, then, the child pointer we want to follow
is the one at index i such that K(i) <= k < K(i+1). There can be
at most one such i, since the K(i) are strictly increasing. And there
is at least one such i provided the tree isn't empty (so that 0 < len).
For the moment, assume the tree isn't empty (we'll get back to that
later).
The macro's chief loop invariant is
K(lo) < k < K(hi)
This holds trivially at the start, since lo is set to 0, and hi to
x->len, and we pretend K(0) is minus infinity and K(len) is plus
infinity. Inside the loop, if K(i) < k we set lo to i, and if
K(i) > k we set hi to i. These obviously preserve the invariant.
If K(i) == k, the loop breaks and sets the result to i, and since
K(i) == k in that case i is obviously the correct result.
Other cases depend on how i = floor((lo + hi)/2) works, exactly.
Suppose lo + d = hi for some d >= 0. Then i = floor((lo + lo + d)/2) =
floor(lo + d/2) = lo + floor(d/2). So:
a. [d == 0] (lo == i == hi) if and only if (lo == hi).
b. [d == 1] (lo == i < hi) if and only if (lo+1 == hi).
c. [d > 1] (lo < i < hi) if and only if (lo+1 < hi).
If the node is empty (x->len == 0), then lo==i==hi==0 at the start,
and the loop exits immediately (the first "i > lo" test fails),
without entering the body.
Else lo < hi at the start, and the invariant K(lo) < k < K(hi) holds.
If lo+1 < hi, we're in case #c: i is strictly between lo and hi,
so the loop body is entered, and regardless of whether the body sets
the new lo or the new hi to i, the new lo is strictly less than the
new hi, and the difference between the new lo and new hi is strictly
less than the difference between the old lo and old hi. So long as
the new lo + 1 remains < the new hi, we stay in this case. We can't
stay in this case forever, though: because hi-lo decreases on each
trip but remains > 0, lo+1 == hi must eventually become true. (In
fact, it becomes true quickly, in about log2(x->len) trips; the
point is more that lo doesn't equal hi when the loop ends, it has to
end with lo+1==hi and i==lo).
Then we're in case #b: i==lo==hi-1 then, and the loop exits. The
invariant still holds, with lo==i and hi==lo+1==i+1:
K(i) < k < K(i+1)
so i is again the correct answer.
Optimization points:
+ Division by 2 is done via shift rather than via "/2". These are
signed ints, and almost all C compilers treat signed int division
as truncating, and shifting is not the same as truncation for
signed int division. The compiler has no way to know these values
aren't negative, so has to generate longer-winded code for "/2".
But we know these values aren't negative, and exploit it.
+ The order of _cmp comparisons matters. We're in an interior
BTree node, and are looking at only a tiny fraction of all the
keys that exist. So finding the key exactly in this node is
unlikely, and checking _cmp == 0 is a waste of time to the same
extent. It doesn't matter whether we check for _cmp < 0 or
_cmp > 0 first, so long as we do both before worrying about
equality.
+ At the start of a routine, it's better to run this macro even
if x->len is 0 (check for that afterwards). We just called a
function and so probably drained the pipeline. If the first thing
we do then is read up self->len and check it against 0, we just
sit there waiting for the data to get read up, and then another
immediate test-and-branch, and for a very unlikely case (BTree
nodes are rarely empty). It's better to get into the loop right
away so the normal case makes progress ASAP.
This has a different job than BTREE_SEARCH: the key 0 slot is
legitimate in a bucket, and we want to find the index at which the
key belongs. If the key is larger than the bucket's largest key, a
new slot at index len is where it belongs, else it belongs at the
smallest i with keys[i] >= the key we're looking for. We also need
to know whether or not the key is present (BTREE_SEARCH didn't care;
it only wanted to find the next node to search).
The mechanics of the search are quite similar, though. The primary
loop invariant changes to (say we're searching for key k):
K(lo-1) < k < K(hi)
where K(i) means keys[i], and we pretend K(-1) is minus infinity and
K(len) is plus infinity.
If the bucket is empty, lo=hi=i=0 at the start, the loop body is never
entered, and the macro sets INDEX to 0 and ABSENT to true. That's why
_cmp is initialized to 1 (_cmp becomes ABSENT).
Else the bucket is not empty, lo<hi at the start, and the loop body
is entered. The invariant is obviously satisfied then, as lo=0 and
hi=len.
If K[i]<k, lo is set to i+1, preserving that K(lo-1) = K[i] < k.
If K[i]>k, hi is set to i, preserving that K[hi] = K[i] > k.
If the loop exits after either of those, _cmp != 0, so ABSENT becomes
true.
If K[i]=k, the loop breaks, so that INDEX becomes i, and ABSENT
becomes false (_cmp=0 in this case).
The same case analysis for BTREE_SEARCH on lo and hi holds here:
a. (lo == i == hi) if and only if (lo == hi).
b. (lo == i < hi) if and only if (lo+1 == hi).
c. (lo < i < hi) if and only if (lo+1 < hi).
So long as lo+1 < hi, we're in case #c, and either break with
equality (in which case the right results are obviously computed) or
narrow the range. If equality doesn't obtain, the range eventually
narrows to cases #a or #b.
To go from #c to #a, we must have lo+2==hi at the start, and
K[i]=K[lo+1]<k. Then the new lo gets set to i+1 = lo+2 = hi, and the
loop exits with lo=hi=i and _cmp<0. This is correct, because we
know that k != K(i) (loop invariant! we actually know something
stronger, that k < K(hi); since i=hi, this implies k != K(i)).
Else #c eventually falls into case #b, lo+1==hi and i==lo. The
invariant tells us K(lo-1) < k < K(hi) = K(lo+1), so if the key
is present it must be at K(lo). i==lo in this case, so we test
K(lo) against k. As always, if equality obtains we do the right
thing, else case #b becomes case #a.
When #b becomes #a, the last comparison was non-equal, so _cmp is
non-zero, and the loop exits because lo==hi==i in case #a. The
invariant then tells us K(lo-1) < k < K(lo), so the key is in fact
not present, it's correct to exit with _cmp non-zero, and i==lo is
again the index at which k belongs.
Optimization points:
+ As for BTREE_SEARCH, shifting of signed ints is cheaper than
division.
+ Unlike as for BTREE_SEARCH, there's nothing special about searching
an empty bucket, and the macro computes thoroughly sensible results
in that case.
+ The order of _cmp comparisons differs from BTREE_SEARCH. When
searching a bucket, it's much more likely (than when searching a
BTree node) that the key is present, so testing _cmp==0 isn't a
systematic waste of cycles. At the extreme, if all searches are
successful (key present), on average this saves one comparison per
search, against leaving the determination of _cmp==0 implicit (as
BTREE_SEARCH does). But even on successful searches, _cmp != 0 is
a more popular outcome than _cmp == 0 across iterations (unless
the bucket has only a few keys), so it's important to check one
of the inequality cases first. It turns out it's better on average
to check K(i) < key (than to check K(i) > key), because when it
pays it narrows the range more (we get a little boost from setting
lo=i+1 in this case; the other case sets hi=i, which isn't as much
of a narrowing).
......@@ -45,7 +45,7 @@ typedef unsigned char char6[6];
#define INCREF_KEY(k)
#define COPY_KEY(KEY, E) (*(KEY)=*(E), (KEY)[1]=(E)[1])
#define COPY_KEY_TO_OBJECT(O, K) O=PyString_FromStringAndSize(K,2)
#define COPY_KEY_TO_OBJECT(O, K) O=PyString_FromStringAndSize((const char*)K,2)
if (KEY_CHECK(ARG)) memcpy(TARGET, PyString_AS_STRING(ARG), 2); else { \
PyErr_SetString(PyExc_TypeError, "expected two-character string key"); \
......@@ -59,7 +59,7 @@ typedef unsigned char char6[6];
#define DECREF_VALUE(k)
#define INCREF_VALUE(k)
#define COPY_VALUE(V, E) (memcpy(V, E, 6))
#define COPY_VALUE_TO_OBJECT(O, K) O=PyString_FromStringAndSize(K,6)
#define COPY_VALUE_TO_OBJECT(O, K) O=PyString_FromStringAndSize((const char*)K,6)
if ((PyString_Check(ARG) && PyString_GET_SIZE(ARG)==6)) \
memcpy(TARGET, PyString_AS_STRING(ARG), 6); else { \
......@@ -16,7 +16,7 @@
from ZODB.POSException import StorageError
class ClientStorageError(StorageError):
"""An error occured in the ZEO Client Storage."""
"""An error occurred in the ZEO Client Storage."""
class UnrecognizedResult(ClientStorageError):
"""A server call returned an unrecognized result."""
......@@ -64,7 +64,7 @@ def log(message, level=logging.INFO, label=None, exc_info=False):
logger.log(level, message, exc_info=exc_info)
class StorageServerError(StorageError):
"""Error reported when an unpickleable exception is raised."""
"""Error reported when an unpicklable exception is raised."""
class ZEOStorage:
"""Proxy to underlying storage for a single remote client."""
......@@ -22,4 +22,4 @@ ZEO is now part of ZODB; ZODB's home on the web is
# The next line must use double quotes, so recognizes it.
version = "3.6.0a3"
version = "3.7.0a0"
......@@ -176,8 +176,14 @@ def shutdown_zeo_server(adminaddr):
# superstition.
for i in range(3):
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.timeout:
# On FreeBSD 5.3 the connection just timed out
if i > 0:
except socket.error, e:
if e[0] == errno.ECONNREFUSED and i > 0:
......@@ -460,9 +460,13 @@ class Connection(smac.SizedMessageAsyncConnection, object):
return hasattr(self.obj, name)
def send_reply(self, msgid, ret):
# encode() can pass on a wide variety of exceptions from cPickle.
# While a bare `except` is generally poor practice, in this case
# it's acceptable -- we really do want to catch every exception
# cPickle may raise.
msg = self.marshal.encode(msgid, 0, REPLY, ret)
except self.marshal.errors:
except: # see above
r = short_repr(ret)
......@@ -480,9 +484,13 @@ class Connection(smac.SizedMessageAsyncConnection, object):
if type(err_value) is not types.InstanceType:
err_value = err_type, err_value
# encode() can pass on a wide variety of exceptions from cPickle.
# While a bare `except` is generally poor practice, in this case
# it's acceptable -- we really do want to catch every exception
# cPickle may raise.
msg = self.marshal.encode(msgid, 0, REPLY, (err_type, err_value))
except self.marshal.errors:
except: # see above
r = short_repr(err_value)
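
The reason for the broad catch in the two hunks above is that pickling can fail with exceptions that are not pickle-specific. A minimal stand-alone sketch follows, written for a current Python; `Unpicklable` and `encode_reply` are hypothetical names, and the sketch uses `except Exception` rather than the bare `except` in the diff, so it deliberately lets `KeyboardInterrupt` and `SystemExit` through.

```python
import pickle


class Unpicklable(object):
    """Hypothetical object whose pickling fails with a non-pickle error."""

    def __reduce__(self):
        raise RuntimeError("cannot reduce")


def encode_reply(value):
    """Return (payload, ok).

    On any pickling failure, fall back to a short text description so the
    caller can still report something instead of dropping the reply.
    """
    try:
        return pickle.dumps(value), True
    except Exception:  # broad on purpose: the error type is not predictable
        text = "unpicklable %s" % type(value).__name__
        return text.encode("ascii"), False
```

Catching only `pickle.PicklingError` would miss the `RuntimeError` raised here, which is the situation the comment in the diff is guarding against.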
......@@ -34,6 +34,7 @@ class BaseStorage(UndoLogCompatible):
A subclass must define the following methods:
......@@ -53,7 +54,6 @@ class BaseStorage(UndoLogCompatible):
If the subclass wants to implement undo, it should implement the
multiple revision methods and:
......@@ -94,9 +94,9 @@ class BaseStorage(UndoLogCompatible):
self._commit_lock_acquire = l.acquire
self._commit_lock_release = l.release
self._tid = `t`
t = time.time()
t = self._ts = TimeStamp(*(time.gmtime(t)[:5] + (t%60,)))
self._tid = repr(t)
# ._oid is the highest oid in use (0 is always in use -- it's
# a reserved oid for the root object). Our new_oid() method
......@@ -189,9 +189,11 @@ class BaseStorage(UndoLogCompatible):
if transaction is not self._transaction:
self._transaction = None
......@@ -226,7 +228,7 @@ class BaseStorage(UndoLogCompatible):
now = time.time()
t = TimeStamp(*(time.gmtime(now)[:5] + (now % 60,)))
self._ts = t = t.laterThan(self._ts)
self._tid = `t`
self._tid = repr(t)
self._ts = TimeStamp(tid)
self._tid = tid
......@@ -45,7 +45,6 @@ from ZODB.POSException import ConflictError, ReadConflictError
from ZODB.POSException import Unsupported
from ZODB.POSException import POSKeyError
from ZODB.serialize import ObjectWriter, ObjectReader, myhasattr
from ZODB.utils import DEPRECATED_ARGUMENT, deprecated36
from ZODB.utils import p64, u64, z64, oid_repr, positive_id
from ZODB import utils
......@@ -132,6 +131,9 @@ class Connection(ExportImport, object):
# will execute atomically by virtue of the GIL. But some storage
# might generate oids where hash or compare invokes Python code. In
# that case, the GIL can't save us.
# Note: since that was written, it was officially declared that the
# type of an oid is str. TODO: remove the related now-unnecessary
# critical sections (if any -- this needs careful thought).
self._inv_lock = threading.Lock()
self._invalidated = d = {}
......@@ -213,10 +215,8 @@ class Connection(ExportImport, object):
self._cache[oid] = obj
return obj
def cacheMinimize(self, dt=DEPRECATED_ARGUMENT):
def cacheMinimize(self):
"""Deactivate all unmodified objects in the cache."""
deprecated36("cacheMinimize() dt= is ignored.")
# TODO: we should test what happens when cacheGC is called mid-transaction.
......@@ -647,8 +647,8 @@ class Connection(ExportImport, object):
# Note: If we invalidate a non-justifiable object (i.e. a
# persistent class), the object will immediately reread it's
# Note: If we invalidate a non-ghostifiable object (i.e. a
# persistent class), the object will immediately reread its
# state. That means that the following call could result in a
# call to self.setstate, which, of course, must succeed. In
# general, it would be better if the read could be delayed
......@@ -787,10 +787,8 @@ class Connection(ExportImport, object):
# dict update could go on in another thread, but we don't care
# because we have to check again after the load anyway.
if (obj._p_oid in self._invalidated
and not myhasattr(obj, "_p_independent")
and not self._invalidated
if (obj._p_oid in self._invalidated and
not myhasattr(obj, "_p_independent")):
# If the object has _p_independent(), we will handle it below.
......@@ -883,16 +881,11 @@ class Connection(ExportImport, object):
assert obj._p_jar is self
if obj._p_oid is None:
# There is some old Zope code that assigns _p_jar
# directly. That is no longer allowed, but we need to
# provide support for old code that still does it.
# The actual complaint here is that an object without
# an oid is being registered. I can't think of any way to
# achieve that without assignment to _p_jar. If there is
# a way, this will be a very confusing warning.
deprecated36("Assigning to _p_jar is deprecated, and will be "
"changed to raise an exception.")
# a way, this will be a very confusing exception.
raise ValueError("assigning to _p_jar is not supported")
elif obj._p_oid in self._added:
# It was registered before it was added to _added.
......@@ -1036,48 +1029,7 @@ class Connection(ExportImport, object):
# DEPRECATED methods
def cacheFullSweep(self, dt=None):
deprecated36("cacheFullSweep is deprecated. "
"Use cacheMinimize instead.")
if dt is None:
def getTransaction(self):
"""Get the current transaction for this connection.
The transaction manager's get method works the same as this
method. You can pass a transaction manager (TM) to
to control which TM the Connection uses.
deprecated36("getTransaction() is deprecated. "
"Use the transaction_manager argument "
"to instead, or access "
".transaction_manager directly on the Connection.")
return self.transaction_manager.get()
def setLocalTransaction(self):
"""Use a transaction bound to the connection rather than the thread.
Returns the transaction manager used by the connection. You
can pass a transaction manager (TM) to to control
which TM the Connection uses.
deprecated36("setLocalTransaction() is deprecated. "
"Use the transaction_manager argument "
"to instead.")
if self.transaction_manager is transaction.manager:
if self._synch:
self.transaction_manager = transaction.TransactionManager()
if self._synch:
return self.transaction_manager
# None at present.
# DEPRECATED methods
......@@ -25,7 +25,6 @@ from ZODB.utils import z64
from ZODB.Connection import Connection
from ZODB.serialize import referencesf
from ZODB.utils import WeakSet
from ZODB.utils import DEPRECATED_ARGUMENT, deprecated36
from zope.interface import implements
from ZODB.interfaces import IDatabase
......@@ -119,6 +118,19 @@ class _ConnectionPool(object):
while len(self.available) > target:
c = self.available.pop(0)
# While application code may still hold a reference to `c`,
# there's little useful that can be done with this Connection
# anymore. Its cache may be holding on to limited resources,
# and we replace the cache with an empty one now so that we
# don't have to wait for gc to reclaim it. Note that it's not
# possible for to return `c` again: `c` can never
# be in an open state again.
# TODO: Perhaps it would be better to break the reference
# cycles between `c` and `c._cache`, so that refcounting reclaims
# both right now. But if user code _does_ have a strong
# reference to `c` now, breaking the cycle would not reclaim `c`
# now, and `c` would be left in a user-visible crazy state.
# Pop an available connection and return it, or return None if none are
# available. In the latter case, the caller should create a new
......@@ -177,9 +189,6 @@ class DB(object):
cacheFullSweep, cacheLastGCTime, cacheMinimize, cacheSize,
cacheDetailSize, getCacheSize, getVersionCacheSize, setCacheSize,
- `Deprecated Methods`: getCacheDeactivateAfter,
getVersionCacheDeactivateAfter, setVersionCacheDeactivateAfter
......@@ -189,12 +198,10 @@ class DB(object):
def __init__(self, storage,
"""Create an object database.
......@@ -206,8 +213,6 @@ class DB(object):
- `version_cache_size`: target size of Connection object cache for
version connections
- `cache_deactivate_after`: ignored
- `version_cache_deactivate_after`: ignored
# Allocate lock.
x = threading.RLock()
......@@ -222,12 +227,6 @@ class DB(object):
self._version_pool_size = version_pool_size
self._version_cache_size = version_cache_size
# warn about use of deprecated arguments
if cache_deactivate_after is not DEPRECATED_ARGUMENT:
deprecated36("cache_deactivate_after has no effect")
if version_cache_deactivate_after is not DEPRECATED_ARGUMENT:
deprecated36("version_cache_deactivate_after has no effect")
self._miv_cache = {}
# Setup storage
......@@ -494,10 +493,7 @@ class DB(object):
def objectCount(self):
return len(self._storage)
def open(self, version='',
mvcc=True, txn_mgr=DEPRECATED_ARGUMENT,
def open(self, version='', mvcc=True,
transaction_manager=None, synch=True):
"""Return a database Connection for use by application code.
......@@ -518,29 +514,6 @@ class DB(object):
register for afterCompletion() calls.
if temporary is not DEPRECATED_ARGUMENT:
deprecated36(" temporary= ignored. "
"open() no longer blocks.")
if force is not DEPRECATED_ARGUMENT:
deprecated36(" force= ignored. "
"open() no longer blocks.")
if waitflag is not DEPRECATED_ARGUMENT:
deprecated36(" waitflag= ignored. "
"open() no longer blocks.")
if transaction is not DEPRECATED_ARGUMENT:
deprecated36(" transaction= ignored.")
if txn_mgr is not DEPRECATED_ARGUMENT:
deprecated36("use transaction_manager= instead of txn_mgr=")
if transaction_manager is None:
transaction_manager = txn_mgr
raise ValueError("cannot specify both transaction_manager= "
"and txn_mgr=")
# pool <- the _ConnectionPool for this version
......@@ -706,23 +679,8 @@ class DB(object):
def versionEmpty(self, version):
return self._storage.versionEmpty(version)
# The following methods are deprecated and have no effect
def getCacheDeactivateAfter(self):
deprecated36("getCacheDeactivateAfter has no effect")
def getVersionCacheDeactivateAfter(self):
deprecated36("getVersionCacheDeactivateAfter has no effect")
def setCacheDeactivateAfter(self, v):
deprecated36("setCacheDeactivateAfter has no effect")
def setVersionCacheDeactivateAfter(self, v):
deprecated36("setVersionCacheDeactivateAfter has no effect")
resource_counter_lock = threading.Lock()
resource_counter = 0
class ResourceManager(object):
"""Transaction participation for a version or undo resource."""
......@@ -734,8 +692,20 @@ class ResourceManager(object):
self.tpc_finish = self._db._storage.tpc_finish
self.tpc_abort = self._db._storage.tpc_abort
# Get a number from a simple thread-safe counter, then
# increment it, for the purpose of sorting ResourceManagers by
# creation order. This ensures that multiple ResourceManagers
# within a transaction commit in a predictable sequence.
global resource_counter
self._count = resource_counter
resource_counter += 1
def sortKey(self):
return "%s:%s" % (self._db._storage.sortKey(), id(self))
return "%s:%016x" % (self._db._storage.sortKey(), self._count)
def tpc_begin(self, txn, sub=False):
if sub:
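
The counter-based `sortKey()` in the `ResourceManager` hunk above can be sketched on its own. This is a simplified model under stated assumptions: `ToyResourceManager`, `next_creation_number` and the plain string used in place of the storage's sort key are all hypothetical, and the lock is shown explicitly even though the flattened diff does not show how the real code uses `resource_counter_lock`.

```python
import itertools
import threading

_counter = itertools.count()
_counter_lock = threading.Lock()


def next_creation_number():
    """Hand out 0, 1, 2, ... safely across threads."""
    with _counter_lock:
        return next(_counter)


class ToyResourceManager(object):
    """Hypothetical stand-in that only models the sort key."""

    def __init__(self, storage_key):
        self._storage_key = storage_key
        self._count = next_creation_number()

    def sortKey(self):
        # Zero-padded hex keeps string order equal to numeric order.
        return "%s:%016x" % (self._storage_key, self._count)
```

Two points motivate the change from `id(self)`: a creation counter is ordered by creation time while an object id is not, and the fixed-width `%016x` format matters because unpadded hex strings do not sort numerically (`"10"` sorts before `"f"`).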
......@@ -21,7 +21,7 @@ The Demo storage serves two purposes:
- Provide a volatile storage that is useful for giving demonstrations.
The demo storage can have a "base" storage that is used in a
read-only fashion. The base storage must not not to contain version
read-only fashion. The base storage must not contain version
There are three main data structures:
......@@ -890,7 +890,7 @@ class FileStorage(BaseStorage.BaseStorage,
# Hm, an error occured writing out the data. Maybe the
# Hm, an error occurred writing out the data. Maybe the
# disk is full. We don't want any turd at the end.
......@@ -993,7 +993,11 @@ class FileStorage(BaseStorage.BaseStorage,
return "", None
def _transactionalUndoRecord(self, oid, pos, tid, pre, version):
"""Get the indo information for a data record
"""Get the undo information for a data record
'pos' points to the data header for 'oid' in the transaction
being undone. 'tid' refers to the transaction being undone.
'pre' is the 'prev' field of the same data header.
Return a 5-tuple consisting of a pickle, data pointer,
version, packed non-version data pointer, and current
......@@ -31,7 +31,7 @@ class POSKeyError(KeyError, POSError):
return oid_repr(self.args[0])
class TransactionError(POSError):
"""An error occured due to normal transaction processing."""
"""An error occurred due to normal transaction processing."""
class TransactionFailedError(POSError):
"""Cannot perform an operation on a transaction that previously failed.
......@@ -252,7 +252,7 @@ class UndoError(POSError):
return _fmt_undo(self._oid, self._reason)
class MultipleUndoErrors(UndoError):
"""Several undo errors occured during a single transaction."""
"""Several undo errors occurred during a single transaction."""
def __init__(self, errs):
# provide a reason and oid for clients that only look at that
......@@ -13,10 +13,9 @@
# The next line must use double quotes, so recognizes it.
__version__ = "3.6.0a3"
__version__ = "3.7.0a0"
import sys
import __builtin__
from persistent import TimeStamp
from persistent import list
......@@ -30,9 +29,3 @@ sys.modules['ZODB.PersistentList'] = sys.modules['persistent.list']
del mapping, list, sys
from DB import DB
# TODO: get_transaction() scheduled to go away in ZODB 3.6.
from transaction import get_transaction
__builtin__.get_transaction = get_transaction
del __builtin__
Collaboration Diagrams
This file contains several collaboration diagrams for the ZODB.
Simple fetch, modify, commit
C: ZODB.Connection.Connection
S: ZODB.FileStorage.FileStorage
T: transaction.interfaces.ITransaction
TM: transaction.interfaces.ITransactionManager
o1, o2, ...: pre-existing persistent objects
- ``DB``: ``ZODB.DB.DB``
- ``C``: ``ZODB.Connection.Connection``
- ``S``: ``ZODB.FileStorage.FileStorage``
- ``T``: ``transaction.interfaces.ITransaction``
- ``TM``: ``transaction.interfaces.ITransactionManager``
- ``o1``, ``o2``, ...: pre-existing persistent objects
"""Simple fetch, modify, commit."""
create C
......@@ -50,16 +63,23 @@ Scenario
# transactions.
Simple fetch, modify, abort
C: ZODB.Connection.Connection
S: ZODB.FileStorage.FileStorage
T: transaction.interfaces.ITransaction
TM: transaction.interfaces.ITransactionManager
o1, o2, ...: pre-existing persistent objects
- ``DB``: ``ZODB.DB.DB``
- ``C``: ``ZODB.Connection.Connection``
- ``S``: ``ZODB.FileStorage.FileStorage``
- ``T``: ``transaction.interfaces.ITransaction``
- ``TM``: ``transaction.interfaces.ITransactionManager``
- ``o1``, ``o2``, ...: pre-existing persistent objects
"""Simple fetch, modify, abort."""
create C
......@@ -91,15 +111,22 @@ Scenario
# transactions.
T: ITransaction
o1, o2, o3: some persistent objects
C1, C2, C3: resource managers
S1, S2: Transaction savepoint objects
s11, s21, s22: resource-manager savepoints
Rollback of a savepoint
- ``T``: ``transaction.interfaces.ITransaction``
- ``o1``, ``o2``, ``o3``: some persistent objects
- ``C1``, ``C2``, ``C3``: resource managers
- ``S1``, ``S2``: Transaction savepoint objects
- ``s11``, ``s21``, ``s22``: resource-manager savepoints
"""Rollback of a savepoint"""
create T
......@@ -158,9 +158,40 @@
<section type="" name="*" attribute="storage"/>
<key name="cache-size" datatype="integer" default="5000"/>
Target size, in number of objects, of each connection's
object cache.
<key name="pool-size" datatype="integer" default="7"/>
The expected maximum number of simultaneously open connections.
There is no hard limit (as many connections as are requested
will be opened, until system resources are exhausted). Exceeding
pool-size connections causes a warning message to be logged,
and exceeding twice pool-size connections causes a critical
message to be logged.
<key name="version-pool-size" datatype="integer" default="3"/>
The expected maximum number of connections simultaneously open
per version.
<key name="version-cache-size" datatype="integer" default="100"/>
Target size, in number of objects, of each version connection's
object cache.
<key name="database-name" default="unnamed"/>
When multidatabases are in use, this is the name given to this
database in the collection. The name must be unique across all
databases in the collection. The collection must also be given
a mapping from its databases' names to their databases, but that
cannot be specified in a ZODB config file. Applications using
multidatabases typically supply a way to configure the mapping in
their own config files, using the "databases" parameter of a DB
<sectiontype name="blobstorage" datatype=".BlobStorage"
......@@ -92,7 +92,7 @@ class BaseConfig:
class ZODBDatabase(BaseConfig):
def open(self, database_name='unnamed', databases=None):
def open(self, databases=None):
section = self.config
storage =
......@@ -101,8 +101,8 @@ class ZODBDatabase(BaseConfig):
Cross-Database References
......@@ -36,7 +37,7 @@ We'll have a reference to the first object:
>>> tm.commit()
Now, let's open a separate connection to database 2. We use it to
read p2, use p2 to get to p1, and verify that it is in database 1:
read `p2`, use `p2` to get to `p1`, and verify that it is in database 1:
>>> conn =
>>> p2x = conn.root()['p']
......@@ -77,8 +78,8 @@ happens. Consider:
>>> p1.p4 = p4
>>> p2.p4 = p4
In this example, the new object is reachable from both p1 in database
1 and p2 in database 2. If we commit, which database will p4 end up
In this example, the new object is reachable from both `p1` in database
1 and `p2` in database 2. If we commit, which database will `p4` end up
in? This sort of ambiguity can lead to subtle bugs. For that reason,
an error is generated if we commit changes when new objects are
reachable from multiple databases:
......@@ -141,6 +142,7 @@ cross-database references, however, there are a number of facilities
cross-database garbage collection
Garbage collection is done on a database by database basis.
If an object on a database only has references to it from other
databases, then the object will be garbage collected when its
......@@ -148,11 +150,13 @@ cross-database garbage collection
cross-database undo
Undo is only applied to a single database. Fixing this for
multiple databases is going to be extremely difficult. Undo
currently poses consistency problems, so it is not (or should not
be) widely used.
Cross-database aware (tolerant) export/import
The export/import facility needs to be aware, at least, of cross-database
Persistent Classes
......@@ -39,7 +40,7 @@ functions to make them picklable.
Also note that we explicitly set the module. Persistent classes don't
live in normal Python modules. Rather, they live in the database. We
use information in __module__ to record where in the database. When
use information in ``__module__`` to record where in the database. When
we want to use a database, we will need to supply a custom class
factory to load instances of the class.
......@@ -176,7 +177,7 @@ until we sync:
Instances of Persistent Classes
We can, of course, store instances of perstent classes in the
We can, of course, store instances of persistent classes in the
>>> c.color = 'blue'
......@@ -228,10 +229,10 @@ Now, if we try to load it, we get a broken oject:
>>> connection2.root()['obs']['p']
<persistent broken __zodb__.P instance '\x00\x00\x00\x00\x00\x00\x00\x04'>
because the module, "__zodb__" can't be loaded. We need to provide a
because the module, `__zodb__` can't be loaded. We need to provide a
class factory that knows about this special module. Here we'll supply a
sample class factory that looks up a class name in the database root
if the module is "__zodb__". It falls back to the normal class lookup
if the module is `__zodb__`. It falls back to the normal class lookup
for other modules:
>>> from ZODB.broken import find_global
......@@ -340,7 +340,7 @@ class ObjectWriter:
if self._jar.get_connection(database_name) is not obj._p_jar:
raise InvalidObjectReference(
"Attempt to store a reference to an object from "
"a separate onnection to the same database or "
"a separate connection to the same database or "
......@@ -8,44 +8,44 @@ subtransactions. When a transaction is committed, a flag is passed
indicating whether it is a subtransaction or a top-level transaction.
Consider the following example commit calls:
- commit()
- ``commit()``
A regular top-level transaction is committed.
- commit(1)
- ``commit(1)``
A subtransaction is committed. There is now one subtransaction of
the current top-level transaction.
- commit(1)
- ``commit(1)``
A subtransaction is committed. There are now two subtransactions of
the current top-level transaction.
- abort(1)
- ``abort(1)``
A subtransaction is aborted. There are still two subtransactions of
the current top-level transaction; work done since the last
commit(1) call is discarded.
``commit(1)`` call is discarded.
- commit()
- ``commit()``
We now commit a top-level transaction. The work done in the previous
two subtransactions *plus* work done since the last abort(1) call
two subtransactions *plus* work done since the last ``abort(1)`` call
is saved.
- commit(1)
- ``commit(1)``
A subtransaction is committed. There is now one subtransaction of
the current top-level transaction.
- commit(1)
- ``commit(1)``
A subtransaction is committed. There are now two subtransactions of
the current top-level transaction.
- abort()
- ``abort()``
We now abort a top-level transaction. We discard the work done in
the previous two subtransactions *plus* work done since the last
commit(1) call.
``commit(1)`` call.
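
The sequence of calls above can be modeled with a small toy class. This is a sketch of the bookkeeping only, not of the `transaction` package: `ToyTransaction`, its `work()` method and its three lists are assumptions introduced for illustration.

```python
class ToyTransaction(object):
    """Hypothetical model of the commit/abort flag semantics listed above."""

    def __init__(self):
        self.saved = []    # work made durable by top-level commits
        self.subs = []     # one list of work items per committed subtransaction
        self.pending = []  # work done since the last commit or abort

    def work(self, item):
        self.pending.append(item)

    def commit(self, sub=False):
        if sub:
            # commit(1): park the pending work as one more subtransaction.
            self.subs.append(self.pending)
        else:
            # commit(): save all subtransactions plus the pending work.
            for chunk in self.subs:
                self.saved.extend(chunk)
            self.saved.extend(self.pending)
            self.subs = []
        self.pending = []

    def abort(self, sub=False):
        if not sub:
            # abort(): discard the subtransactions as well.
            self.subs = []
        # Both forms discard work done since the last commit(1).
        self.pending = []
```

Replaying the list with one `work()` call before each step shows that `abort(1)` leaves the two earlier subtransactions intact, while a top-level `abort()` discards them.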
......@@ -272,6 +272,54 @@ first popped:
>>> len(pool.available), len(pool.all)
(0, 2)
Next: when a closed Connection is removed from .available due to exceeding
pool_size, that Connection's cache is cleared (this behavior was new in
ZODB 3.6b6). While user code may still hold a reference to that
Connection, once it vanishes from .available it's really not usable for
anything sensible (it can never be in the open state again). Waiting for
gc to reclaim the Connection and its cache eventually works, but that can
take "a long time" and caches can hold on to many objects, and limited
resources (like RDB connections), for the duration.
>>> st.close()
>>> st = Storage()
>>> db = DB(st, pool_size=2)
>>> conn0 =
>>> len(conn0._cache) # empty now
>>> import transaction
>>> conn0.root()['a'] = 1
>>> transaction.commit()
>>> len(conn0._cache) # but now the cache holds the root object
Now open more connections so that the total exceeds pool_size (2):
>>> conn1 =
>>> conn2 =
>>> pool = db._pools['']
>>> len(pool.all), len(pool.available) # all Connections are in use
(3, 0)
Return pool_size (2) Connections to the pool:
>>> conn0.close()
>>> conn1.close()
>>> len(pool.all), len(pool.available)
(3, 2)
>>> len(conn0._cache) # nothing relevant has changed yet
When we close the third connection, conn0 will be booted from .all, and
we expect its cache to be cleared then:
>>> conn2.close()
>>> len(pool.all), len(pool.available)
(2, 2)
>>> len(conn0._cache) # conn0's cache is empty again
>>> del conn0, conn1, conn2
Clean up.
>>> st.close()
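
The eviction behavior exercised by the doctest above can be sketched without ZODB. The classes below are hypothetical stand-ins (`ToyPool`, `ToyConnection`), and the choice of which available connection `open()` reuses is an assumption; only the trimming rule follows the text: once `.available` exceeds `pool_size`, the oldest closed connection leaves the pool and its cache is replaced with an empty one.

```python
class ToyConnection(object):
    """Hypothetical connection that only carries an object cache."""

    def __init__(self):
        self.cache = {}


class ToyPool(object):
    """Hypothetical pool tracking all connections and the closed ones."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        self.all = []
        self.available = []

    def open(self):
        # Reuse a closed connection if there is one, else create a new one.
        if self.available:
            return self.available.pop()
        conn = ToyConnection()
        self.all.append(conn)
        return conn

    def close(self, conn):
        self.available.append(conn)
        # Trim the oldest closed connections beyond pool_size.
        while len(self.available) > self.pool_size:
            oldest = self.available.pop(0)
            self.all.remove(oldest)
            # Drop cached objects now instead of waiting for gc.
            oldest.cache = {}
```

A caller that still holds a reference to an evicted connection sees an empty cache, which mirrors the final `len(conn0._cache)` check in the doctest.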
# Copyright (c) 2005 Zope Corporation and Contributors.
# All Rights Reserved.
# This software is subject to the provisions of the Zope Public License,
# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
Multi-database tests
Multiple Databases
Multi-database support adds the ability to tie multiple databases into a
collection. The original proposal is in the fishbowl:
......@@ -25,29 +12,29 @@ by Jim Fulton, Christian Theune, and Tim Peters. Overview:
No private attributes were added, and one new method was introduced.
- a new .database_name attribute holds the name of this database
- a new ``.database_name`` attribute holds the name of this database.
- a new .databases attribute maps from database name to DB object; all DBs
in a multi-database collection share the same .databases object
- a new ``.databases`` attribute maps from database name to ``DB`` object; all
databases in a multi-database collection share the same ``.databases`` object
- the DB constructor has new optional arguments with the same names
(database_name= and databases=).
- the ``DB`` constructor has new optional arguments with the same names
(``database_name=`` and ``databases=``).
- a new .connections attribute maps from database name to a Connection for
the database with that name; the .connections mapping object is also
shared among databases in a collection
- a new ``.connections`` attribute maps from database name to a ``Connection``
for the database with that name; the ``.connections`` mapping object is also
shared among databases in a collection.
- a new .get_connection(database_name) method returns a Connection for a
database in the collection; if a connection is already open, it's returned
(this is the value .connections[database_name]), else a new connection is
opened (and stored as .connections[database_name])
- a new ``.get_connection(database_name)`` method returns a ``Connection`` for
a database in the collection; if a connection is already open, it's returned
(this is the value ``.connections[database_name]``), else a new connection
is opened (and stored as ``.connections[database_name]``)
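
The attributes and the method in the list above fit together roughly as follows. This is a toy model, not ZODB code: `ToyDB` and `ToyMultiConnection` are hypothetical names, and details such as how `open()` seeds the shared mapping are assumptions.

```python
class ToyDB(object):
    """Hypothetical database that registers itself in a shared mapping."""

    def __init__(self, database_name, databases=None):
        if databases is None:
            databases = {}
        if database_name in databases:
            raise ValueError(
                "database_name %r already in databases" % database_name)
        self.database_name = database_name
        self.databases = databases
        databases[database_name] = self

    def open(self):
        conn = ToyMultiConnection(self)
        conn.connections[self.database_name] = conn
        return conn


class ToyMultiConnection(object):
    """Hypothetical connection sharing one `connections` mapping."""

    def __init__(self, db, connections=None):
        self.db = db
        self.connections = {} if connections is None else connections

    def get_connection(self, database_name):
        conn = self.connections.get(database_name)
        if conn is None:
            # Not open yet: open one on the named database and remember it
            # in the mapping shared by the whole collection.
            other_db = self.db.databases[database_name]
            conn = ToyMultiConnection(other_db, self.connections)
            self.connections[database_name] = conn
        return conn
```

Because every connection in the group holds the same `connections` mapping, asking any of them for a given database name returns the same object for as long as the group stays open.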
Creating a multi-database starts with creating a named DB:
Creating a multi-database starts with creating a named ``DB``:
>>> from ZODB.tests.test_storage import MinimalMemoryStorage
>>> from ZODB import DB
......@@ -69,7 +56,8 @@ Adding another database to the collection works like this:
... database_name='notroot',
... databases=dbmap)
The new db2 now shares the 'databases' dictionary with db and has two entries:
The new ``db2`` now shares the ``databases`` dictionary with db and has two
>>> db2.databases is db.databases is dbmap
......@@ -87,7 +75,7 @@ It's an error to try to insert a database with a name already in use:
ValueError: database_name 'root' already in databases
Because that failed, db.databases wasn't changed:
Because that failed, ``db.databases`` wasn't changed:
>>> len(db.databases) # still 2
......@@ -127,7 +115,7 @@ Now there are two connections in that collection:
>>> names = cn.connections.keys(); names.sort(); print names
['notroot', 'root']
So long as this database group remains open, the same Connection objects
So long as this database group remains open, the same ``Connection`` objects
are returned:
>>> cn.get_connection('root') is cn
......@@ -151,3 +139,59 @@ Clean up:
>>> for a_db in dbmap.values():
... a_db.close()
Configuration from File
The database name can also be specified in a config file, starting in
ZODB 3.6:
>>> from ZODB.config import databaseFromString
>>> config = """
... <zodb>
... <mappingstorage/>
... database-name this_is_the_name
... </zodb>
... """
>>> db = databaseFromString(config)
>>> print db.database_name
>>> db.databases.keys()
However, the ``.databases`` attribute cannot be configured from file. It
can be passed to the `ZConfig` factory. I'm not sure of the clearest way
to test that here; this is ugly:
>>> from ZODB.config import getDbSchema
>>> import ZConfig
>>> from cStringIO import StringIO
Derive a new `config2` string from the `config` string, specifying a
different database_name:
>>> config2 = config.replace("this_is_the_name", "another_name")
Now get a `ZConfig` factory from `config2`:
>>> f = StringIO(config2)
>>> zconfig, handle = ZConfig.loadConfigFile(getDbSchema(), f)
>>> factory = zconfig.database
The desired ``databases`` mapping can be passed to this factory:
>>> db2 =
>>> print db2.database_name # has the right name
>>> db.databases is db2.databases # shares .databases with `db`
>>> all = db2.databases.keys()
>>> all.sort()
>>> all # and db.database_name & db2.database_name are the keys
['another_name', 'this_is_the_name']
>>> db.close()
>>> db2.close()
Here are some tests that storage sync() methods get called at appropriate
Here are some tests that storage ``sync()`` methods get called at appropriate
times in the life of a transaction. The tested behavior is new in ZODB 3.4.
First define a lightweight storage with a sync() method:
First define a lightweight storage with a ``sync()`` method:
>>> import ZODB
>>> from ZODB.MappingStorage import MappingStorage
......@@ -27,14 +31,14 @@ Sync should not have been called yet.
sync is called by the Connection's afterCompletion() hook after the commit
``sync()`` is called by the Connection's ``afterCompletion()`` hook after the
commit completes.
>>> transaction.commit()
>>> st.sync_called # False before 3.4
sync is also called by the afterCompletion() hook after an abort.
``sync()`` is also called by the ``afterCompletion()`` hook after an abort.
>>> st.sync_called = False
>>> rt['b'] = 2
......@@ -42,8 +46,8 @@ sync is also called by the afterCompletion() hook after an abort.
>>> st.sync_called # False before 3.4
And sync is called whenever we explicitly start a new txn, via the
newTransaction() hook.
And ``sync()`` is called whenever we explicitly start a new transaction, via
the ``newTransaction()`` hook.
>>> st.sync_called = False
>>> dummy = transaction.begin()
......@@ -51,19 +55,19 @@ newTransaction() hook.
Clean up. Closing db isn't enough -- closing a DB doesn't close its
Connections. Leaving our Connection open here can cause the
SimpleStorage.sync() method to get called later, during another test, and
our doctest-synthesized module globals no longer exist then. You get
a weird traceback then ;-)
`Connections`. Leaving our `Connection` open here can cause the
``SimpleStorage.sync()`` method to get called later, during another test, and
our doctest-synthesized module globals no longer exist then. You get a weird
traceback then ;-)
>>> cn.close()
One more, very obscure. It was the case that if the first action a new
threaded transaction manager saw was a begin() call, then synchronizers
registered after that in the same transaction weren't communicated to
the Transaction object, and so the synchronizers' afterCompletion() hooks
threaded transaction manager saw was a ``begin()`` call, then synchronizers
registered after that in the same transaction weren't communicated to the
`Transaction` object, and so the synchronizers' ``afterCompletion()`` hooks
weren't called when the transaction committed. None of the test suites
(ZODB's, Zope 2.8's, or Zope3's) caught that, but apparently Zope3 takes this
(ZODB's, Zope 2.8's, or Zope3's) caught that, but apparently Zope 3 takes this
path at some point when serving pages.
>>> tm = transaction.ThreadTransactionManager()
......@@ -75,14 +79,14 @@ path at some point when serving pages.
>>> st.sync_called
Now ensure that cn.afterCompletion() -> st.sync() gets called by commit
despite that the Connection registered after the transaction began:
Now ensure that ``cn.afterCompletion() -> st.sync()`` gets called by commit
despite that the `Connection` registered after the transaction began:
>>> tm.commit()
>>> st.sync_called
And try the same thing with a non-threaded TM:
And try the same thing with a non-threaded transaction manager:
>>> cn.close()
>>> tm = transaction.TransactionManager()
......@@ -390,11 +390,11 @@ class UserMethodTests(unittest.TestCase):
def test_cache(self):
r"""doctest of cacheMinimize() and cacheFullSweep() methods.
r"""doctest of cacheMinimize().
These tests are fairly minimal, just verifying that the
methods can be called and have some effect. We need other
tests that verify the cache works as intended.
This test is minimal, just verifying that the method can be called
and has some effect. We need other tests that verify the cache works
as intended.
>>> db = databaseFromString("<zodb>\n<mappingstorage/>\n</zodb>")
>>> cn =
......@@ -403,71 +403,12 @@ class UserMethodTests(unittest.TestCase):
>>> r._p_state
The next couple of tests are involved because they have to
cater to backwards compatibility issues. The cacheMinimize()
method used to take an argument, but now ignores it.
cacheFullSweep() used to do something different than
cacheMinimize(), but it doesn't anymore. We want to verify
that these methods do something, but all cause deprecation
warnings. To do that, we need a warnings hook.
>>> hook = WarningsHook()
>>> hook.install()
More problems in case this test is run more than once: fool the
warnings module into delivering the warnings despite that they've
been seen before.
>>> import warnings
>>> warnings.filterwarnings("always", category=DeprecationWarning)
>>> r._p_activate()
>>> cn.cacheMinimize(12)
>>> r._p_state
>>> len(hook.warnings)
>>> message, category, filename, lineno = hook.warnings[0]
>>> print message
This will be removed in ZODB 3.6:
cacheMinimize() dt= is ignored.
>>> category.__name__
>>> hook.clear()
cacheFullSweep() is a doozy. It generates two deprecation
warnings, one from the Connection and one from the
cPickleCache. Maybe we should drop the cPickleCache warning,
but it's there for now. When passed an argument, it acts like
cacheGC(). When it isn't passed an argument it acts like
>>> r._p_activate()
>>> cn.cacheFullSweep(12)
>>> r._p_state
>>> r._p_state # up to date
>>> len(hook.warnings)
>>> message, category, filename, lineno = hook.warnings[0]
>>> print message
This will be removed in ZODB 3.6:
cacheFullSweep is deprecated. Use cacheMinimize instead.
>>> category.__name__
>>> message, category, filename, lineno = hook.warnings[1]
>>> message
'No argument expected'
>>> category.__name__
We have to uninstall the hook so that other warnings don't get lost.
>>> hook.uninstall()
Obscure: There is no API call for removing the filter we added, but
filters appears to be a public variable.
>>> del warnings.filters[0]
>>> cn.cacheMinimize()
>>> r._p_state # ghost again
class InvalidationTests(unittest.TestCase):
Savepoints provide a way to save to disk intermediate work done during
a transaction allowing:
Savepoints provide a way to save to disk intermediate work done during a
transaction allowing:
- partial transaction (subtransaction) rollback (abort)
- state of saved objects to be freed, freeing on-line memory for other
Savepoints make it possible to write atomic subroutines that don't
make top-level transaction commitments.
Savepoints make it possible to write atomic subroutines that don't make
top-level transaction commitments.
......@@ -39,13 +41,13 @@ and abort changes:
>>> root['name']
Now, let's look at an application that manages funds for people.
It allows deposits and debits to be entered for multiple people.
It accepts a sequence of entries and generates a sequence of status
messages. For each entry, it applies the change and then validates
the user's account. If the user's account is invalid, we roll back
the change for that entry. The success or failure of an entry is
indicated in the output status. First we'll initialize some accounts:
Now, let's look at an application that manages funds for people. It allows
deposits and debits to be entered for multiple people. It accepts a sequence
of entries and generates a sequence of status messages. For each entry, it
applies the change and then validates the user's account. If the user's
account is invalid, we roll back the change for that entry. The success or
failure of an entry is indicated in the output status. First we'll initialize
some accounts:
>>> root['bob-balance'] = 0.0
>>> root['bob-credit'] = 0.0
......@@ -59,8 +61,8 @@ Now, we'll define a validation function to validate an account:
... if root[name+'-balance'] + root[name+'-credit'] < 0:
... raise ValueError('Overdrawn', name)
And a function to apply entries. If the function fails in some
unexpected way, it rolls back all of its changes and prints the error:
And a function to apply entries. If the function fails in some unexpected
way, it rolls back all of its changes and prints the error:
>>> def apply_entries(entries):
... savepoint = transaction.savepoint()
......@@ -114,9 +116,9 @@ If we provide entries that cause an unexpected error:
Updated sally
Unexpected exception unsupported operand type(s) for +=: 'float' and 'str'
Because the apply_entries used a savepoint for the entire function,
it was able to rollback the partial changes without rolling back
changes made in the previous call to apply_entries:
Because ``apply_entries`` used a savepoint for the entire function, it was
able to roll back the partial changes without rolling back changes made in the
previous call to ``apply_entries``:
>>> root['bob-balance']
......@@ -135,6 +137,7 @@ away:
>>> root['sally-balance']
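The per-entry and whole-function rollback described above can be sketched
without the ``transaction`` package. This is only an illustration under stated
assumptions: ``SnapshotSavepoint`` is a hypothetical stand-in that copies a
plain dict, and this ``apply_entries`` takes ``root`` as an argument and returns
its status messages instead of printing them.

```python
import copy


class SnapshotSavepoint(object):
    """Hypothetical stand-in for a savepoint: remembers a copy of a dict."""

    def __init__(self, data):
        self._data = data
        self._saved = copy.deepcopy(data)

    def rollback(self):
        # Restore the dict in place to the state captured at creation time.
        self._data.clear()
        self._data.update(copy.deepcopy(self._saved))


def validate_account(root, name):
    if root[name + '-balance'] + root[name + '-credit'] < 0:
        raise ValueError('Overdrawn', name)


def apply_entries(root, entries):
    # One savepoint for the whole call, so an unexpected failure undoes
    # everything this call did but nothing from earlier calls.
    outer = SnapshotSavepoint(root)
    messages = []
    try:
        for name, amount in entries:
            # One savepoint per entry, so an invalid entry undoes only itself.
            inner = SnapshotSavepoint(root)
            root[name + '-balance'] += amount
            try:
                validate_account(root, name)
            except ValueError as error:
                inner.rollback()
                messages.append('Error %s' % (error.args,))
            else:
                messages.append('Updated %s' % name)
    except Exception as error:
        outer.rollback()
        messages.append('Unexpected exception %s' % error)
    return messages


root = {'bob-balance': 0.0, 'bob-credit': 0.0,
        'sally-balance': 0.0, 'sally-credit': 100.0}
first = apply_entries(root, [('bob', 10.0), ('sally', -150.0)])
second = apply_entries(root, [('bob', 5.0), ('sally', 'oops')])
```

After the second call, Bob's balance is back to the value left by the first
call, because only the second call's partial work was rolled back.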
Savepoint invalidation
......@@ -54,15 +54,6 @@ class DBTests(unittest.TestCase):
# make sure the basic methods are callable
def testSets(self):
# test set methods that have non-trivial implementations
warnings.filterwarnings("error", category=DeprecationWarning)
self.db.setCacheDeactivateAfter, 12)
self.db.setVersionCacheDeactivateAfter, 12)
# Obscure: There is no API call for removing the warning we just
# added, but filters appears to be a public variable.
del warnings.filters[0]
......@@ -213,58 +213,6 @@ class ZODBTests(unittest.TestCase):
def checkLocalTransactions(self):
# Test of transactions that apply to only the connection,
# not the thread.
conn1 =
conn2 =
hook = WarningsHook()
r1 = conn1.root()
r2 = conn2.root()
if r1.has_key('item'):
del r1['item']
r1['item'] = 1
self.assertEqual(r1['item'], 1)
# r2 has not seen a transaction boundary,
# so it should be unchanged.
self.assertEqual(r2.get('item'), None)
# Now r2 is updated.
self.assertEqual(r2['item'], 1)
# Now, for good measure, send an update in the other direction.
r2['item'] = 2
self.assertEqual(r1['item'], 1)
self.assertEqual(r2['item'], 2)
self.assertEqual(r1['item'], 2)
self.assertEqual(r2['item'], 2)
for msg, obj, filename, lineno in hook.warnings:
self.assert_(msg in [
"This will be removed in ZODB 3.6:\n"
"setLocalTransaction() is deprecated. "
"Use the transaction_manager argument "
"to instead.",
"This will be removed in ZODB 3.6:\n"
"getTransaction() is deprecated. "
"Use the transaction_manager argument "
"to instead, or access "
".transaction_manager directly on the Connection."])
def checkReadConflict(self):
self.obj = P()
......@@ -584,57 +532,8 @@ class ZODBTests(unittest.TestCase):
# transaction, and, in fact, when this test was written,
# Transaction.begin() didn't do anything (everything from here
# down failed).
# Oh, bleech. Since Transaction.begin is also deprecated, we have
# to goof around suppressing the deprecation warning.
import warnings
# First verify that Transaction.begin *is* deprecated, by turning
# the warning into an error.
warnings.filterwarnings("error", category=DeprecationWarning)
self.assertRaises(DeprecationWarning, transaction.get().begin)
del warnings.filters[0]
# Now ignore DeprecationWarnings for the duration. Use a
# try/finally block to ensure we reenable DeprecationWarnings
# no matter what.
warnings.filterwarnings("ignore", category=DeprecationWarning)
cn =
rt = cn.root()
rt['a'] = 1
transaction.get().begin() # should abort adding 'a' to the root
rt = cn.root()
self.assertRaises(KeyError, rt.__getitem__, 'a')
# A longstanding bug: this didn't work if changes were only in
# subtransactions.
rt = cn.root()
rt['a'] = 2
rt = cn.root()
self.assertRaises(KeyError, rt.__getitem__, 'a')
# One more time, mixing "top level" and subtransaction changes.
rt = cn.root()
rt['a'] = 3
rt['b'] = 4
rt = cn.root()
self.assertRaises(KeyError, rt.__getitem__, 'a')
self.assertRaises(KeyError, rt.__getitem__, 'b')
del warnings.filters[0]
# Later (ZODB 3.6): Transaction.begin() no longer exists, so the
# rest of this test was tossed.
def checkFailingCommitSticks(self):
# See also checkFailingSubtransactionCommitSticks.
......@@ -829,6 +728,42 @@ class ZODBTests(unittest.TestCase):
def checkMultipleUndoInOneTransaction(self):
# Verify that it's possible to perform multiple undo
# operations within a transaction. If ZODB performs the undo
# operations in a nondeterministic order, this test will often
# fail.
conn =
root = conn.root()
# Add transactions that set root["state"] to (0..5)
for state_num in range(6):
root['state'] = state_num
transaction.get().note('root["state"] = %d' % state_num)
# Undo all but the first. Note that no work is actually
# performed yet.
log = self._db.undoLog()
for i in range(5):
transaction.get().note('undo states 1 through 5')
# Now attempt all those undo operations.
# Sanity check: we should be back to the first state.
self.assertEqual(root['state'], 0)
class PoisonedError(Exception):
......@@ -56,7 +56,7 @@ database open function, but this doesn't work:
Traceback (most recent call last):
InvalidObjectReference: Attempt to store a reference to an object
from a separate onnection to the same database or multidatabase
from a separate connection to the same database or multidatabase
>>> tm.abort()
......@@ -72,7 +72,7 @@ different connections to the same database.
Traceback (most recent call last):
InvalidObjectReference: Attempt to store a reference to an object
from a separate onnection to the same database or multidatabase
from a separate connection to the same database or multidatabase
>>> tm.abort()
......@@ -37,9 +37,9 @@ MinimalMemoryStorage that implements MVCC support, but not much else.
>>> from ZODB import DB
>>> db = DB(MinimalMemoryStorage())
We will use two different connections with the experimental
setLocalTransaction() method to make sure that the connections act
independently, even though they'll be run from a single thread.
We will use two different connections with different transaction managers
to make sure that the connections act independently, even though they'll
be run from a single thread.
>>> import transaction
>>> tm1 = transaction.TransactionManager()
......@@ -14,6 +14,7 @@
"""Tools to simplify transactions within applications."""
from ZODB.POSException import ReadConflictError, ConflictError
import transaction
def _commit(note):
t = transaction.get()
......@@ -39,7 +39,6 @@ __all__ = ['z64',
......@@ -54,13 +53,6 @@ __all__ = ['z64',
# dance.
# Raise DeprecationWarning, noting that the deprecated thing will go
# away in ZODB 3.6. Point to the caller of our caller (i.e., at the
# code using the deprecated thing).
def deprecated36(msg):
warnings.warn("This will be removed in ZODB 3.6:\n%s" % msg,
DeprecationWarning, stacklevel=3)
# Raise DeprecationWarning, noting that the deprecated thing will go
# away in ZODB 3.7. Point to the caller of our caller (i.e., at the
# code using the deprecated thing).
......@@ -39,6 +39,10 @@ class Prefix:
def __cmp__(self, o):
other_path = o.split('/')
if other_path and ' ' in other_path[-1]:
# don't include logged username in comparison
pos = other_path[-1].rfind(' ')
other_path[-1] = other_path[-1][:pos]
return cmp(other_path[:self.length], self.path)
def __repr__(self):
......@@ -28,5 +28,19 @@ class PrefixTest(unittest.TestCase):
for equal in ("", "/", "/def", "/a/b", "/a/b/c", "/a/b/c/d"):
self.assertEqual(p2, equal)
def test_username_info(self):
# Zope Collector 1810; user paths have username appended
p1 = Prefix('/a/b')
for equal in ('/a/b spam', '/a/b/c spam', '/a/b/c/b spam'):
self.assertEqual(p1, equal)
for notEqual in (" spam", "/a/c spam", "/a/bbb spam", "/// spam"):
self.assertNotEqual(p1, notEqual)
p2 = Prefix("")
for equal in (" eggs", "/ eggs", "/def eggs", "/a/b eggs",
"/a/b/c eggs", "/a/b/c/d eggs"):
self.assertEqual(p2, equal)
def test_suite():
return unittest.makeSuite(PrefixTest)
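The username handling exercised by ``test_username_info`` can be tried in
isolation with a small sketch. The constructor below is an assumption (this
diff shows only ``__cmp__``), and ``__eq__`` replaces ``__cmp__`` so that the
sketch also runs on Python 3; it is not the class from the ZODB source.

```python
class Prefix(object):
    """Sketch of a path prefix that ignores a trailing ' username'."""

    def __init__(self, path):
        # Assumed constructor: keep the path as a list of segments.
        self.path = path.split('/')
        self.length = len(self.path)

    def __eq__(self, other):
        other_path = other.split('/')
        if other_path and ' ' in other_path[-1]:
            # Don't include the logged username in the comparison.
            pos = other_path[-1].rfind(' ')
            other_path[-1] = other_path[-1][:pos]
        return other_path[:self.length] == self.path

    def __ne__(self, other):
        return not self.__eq__(other)


p1 = Prefix('/a/b')
matches = [s for s in ('/a/b spam', '/a/b/c spam', '/a/bbb spam', ' spam')
           if p1 == s]
```

Only the first two strings end up in ``matches``; the username is stripped
before the segment-wise comparison, so ``/a/bbb spam`` still fails to match
``/a/b``.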
......@@ -2,18 +2,18 @@
Persistence support
(This document is under construction. More basic documentation will
eventually appear here.)
(This document is under construction. More basic documentation will eventually
appear here.)
Overriding __getattr__, __getattribute__, __setattr__, and __delattr__
Overriding `__getattr__`, `__getattribute__`, `__setattr__`, and `__delattr__`
Subclasses can override the attribute-management methods. For the
__getattr__ method, the behavior is like that for regular Python
`__getattr__` method, the behavior is like that for regular Python
classes and for earlier versions of ZODB 3.
For __getattribute__, __setattr__, and __delattr__, it is necessary to
call certain methods defined by persistent.Persistent. Detailed
For `__getattribute__`, `__setattr__`, and `__delattr__`, it is necessary
to call certain methods defined by `persistent.Persistent`. Detailed
examples and documentation are provided in the test module,
......@@ -218,7 +218,7 @@ TimeStamp_timeTime(TimeStamp *self)
static PyObject *
TimeStamp_raw(TimeStamp *self)
return PyString_FromStringAndSize(self->data, 8);
return PyString_FromStringAndSize((const char*)self->data, 8);
static PyObject *
......@@ -261,7 +261,7 @@ TimeStamp_laterThan(TimeStamp *self, PyObject *obj)
new[i] = 0;
else {
return TimeStamp_FromString(new);
return TimeStamp_FromString((const char*)new);
......@@ -39,6 +39,7 @@ class PersistentDict(persistent.Persistent, IterableUserDict):
__super_clear = IterableUserDict.clear
__super_update = IterableUserDict.update
__super_setdefault = IterableUserDict.setdefault
__super_pop = IterableUserDict.pop
__super_popitem = IterableUserDict.popitem
__super_p_init = persistent.Persistent.__init__
......@@ -72,6 +73,10 @@ class PersistentDict(persistent.Persistent, IterableUserDict):
self._p_changed = True
return self.__super_setdefault(key, failobj)
def pop(self, key, *args):
self._p_changed = True
return self.__super_pop(key, *args)
def popitem(self):
self._p_changed = True
return self.__super_popitem()
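The pattern used by the new ``pop`` method (flag the object as changed, then
delegate to the base class) can be shown without the ``persistent`` package.
``ChangeTrackingDict`` below is a hypothetical stand-in, and its ``_p_changed``
is an ordinary attribute rather than the real persistence machinery.

```python
from collections import UserDict


class ChangeTrackingDict(UserDict):
    """Hypothetical stand-in: records that a mutating method was called."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._p_changed = False

    def pop(self, key, *args):
        # Flag first, then delegate; the optional default is passed through.
        self._p_changed = True
        return super().pop(key, *args)

    def popitem(self):
        self._p_changed = True
        return super().popitem()


d = ChangeTrackingDict({'a': 1})
before = d._p_changed
value = d.pop('a')
after = d._p_changed
default = d.pop('missing', None)
```

As in the code above, the flag is set before delegating, so it is set even when
the key is absent.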
......@@ -167,7 +167,9 @@ class IPersistent(Interface):
It is up to the data manager to assign this.
The special value None is reserved to indicate that an object
id has not been assigned. Non-None object ids must be strings.
id has not been assigned. Non-None object ids must be non-empty
strings. The 8-byte string '\0'*8 (8 NUL bytes) is reserved to
identify the database root object.
_p_changed = Attribute(
......@@ -41,6 +41,8 @@ class PersistentMapping(UserDict, persistent.Persistent):
__super_clear = UserDict.clear
__super_update = UserDict.update
__super_setdefault = UserDict.setdefault
__super_pop = UserDict.pop
__super_popitem = UserDict.popitem
def __delitem__(self, key):
......@@ -66,11 +68,10 @@ class PersistentMapping(UserDict, persistent.Persistent):
self._p_changed = 1
return self.__super_setdefault(key, failobj)
__super_popitem = UserDict.popitem
except AttributeError:
def pop(self, key, *args):
self._p_changed = 1
return self.__super_pop(key, *args)
def popitem(self):
self._p_changed = 1
return self.__super_popitem()
Tests for persistent.Persistent
Tests for `persistent.Persistent`
This document is an extended doc test that covers the basics of the
Persistent base class. The test expects a class named 'P' to be
provided in its globals. The P class implements the Persistent
Persistent base class. The test expects a class named `P` to be
provided in its globals. The `P` class implements the `Persistent`
Test framework
The class P needs to behave like ExampleP. (Note that the code below
The class `P` needs to behave like `ExampleP`. (Note that the code below
is *not* part of the tests.)
class ExampleP(Persistent):
class ExampleP(Persistent):
def __init__(self):
self.x = 0
def inc(self):
......@@ -20,433 +22,437 @@ class ExampleP(Persistent):
The tests use stub data managers. A data manager is responsible for
loading and storing the state of a persistent object. It's stored in
the _p_jar attribute of a persistent object.
the ``_p_jar`` attribute of a persistent object.
>>> class DM:
... def __init__(self):
... self.called = 0
... def register(self, ob):
... self.called += 1
... def setstate(self, ob):
... ob.__setstate__({'x': 42})
>>> class DM:
... def __init__(self):
... self.called = 0
... def register(self, ob):
... self.called += 1
... def setstate(self, ob):
... ob.__setstate__({'x': 42})
>>> class BrokenDM(DM):
... def register(self,ob):
... self.called += 1
... raise NotImplementedError
... def setstate(self,ob):
... raise NotImplementedError
>>> class BrokenDM(DM):
... def register(self,ob):
... self.called += 1
... raise NotImplementedError
... def setstate(self,ob):
... raise NotImplementedError
>>> from persistent import Persistent
>>> from persistent import Persistent
Test Persistent without Data Manager
First do some simple tests of a Persistent instance that does not have
a data manager (_p_jar).
>>> p = P()
>>> p.x
>>> p._p_changed
>>> p._p_state
>>> p._p_jar
>>> p._p_oid
Verify that modifications have no effect on _p_state of _p_changed.
>>> p.x
>>> p._p_changed
>>> p._p_state
a data manager (``_p_jar``).
>>> p = P()
>>> p.x
>>> p._p_changed
>>> p._p_state
>>> p._p_jar
>>> p._p_oid
Verify that modifications have no effect on ``_p_state`` or ``_p_changed``.
>>> p.x
>>> p._p_changed
>>> p._p_state
Try all sorts of different ways to change the object's state.
>>> p._p_deactivate()
>>> p._p_state
>>> p._p_changed = True
>>> p._p_state
>>> del p._p_changed
>>> p._p_changed
>>> p._p_state
>>> p.x
>>> p._p_deactivate()
>>> p._p_state
>>> p._p_changed = True
>>> p._p_state
>>> del p._p_changed
>>> p._p_changed
>>> p._p_state
>>> p.x
Test Persistent with Data Manager
Next try some tests of an object with a data manager. The DM class is
Next try some tests of an object with a data manager. The `DM` class is
a simple testing stub.
>>> p = P()
>>> dm = DM()
>>> p._p_oid = "00000012"
>>> p._p_jar = dm
>>> p._p_changed
>>> dm.called
Modifying the object marks it as changed and registers it with the
data manager. Subsequent modifications don't have additional
>>> p._p_changed
>>> dm.called
>>> p._p_changed
>>> dm.called
>>> p = P()
>>> dm = DM()
>>> p._p_oid = "00000012"
>>> p._p_jar = dm
>>> p._p_changed
>>> dm.called
Modifying the object marks it as changed and registers it with the data
manager. Subsequent modifications don't have additional side-effects.
>>> p._p_changed
>>> dm.called
>>> p._p_changed
>>> dm.called
It's not possible to deactivate a modified object.
>>> p._p_deactivate()
>>> p._p_changed
It is possible to invalidate it. That's the key difference
between deactivation and invalidation.
>>> p._p_invalidate()
>>> p._p_state
Now that the object is a ghost, any attempt to modify it will
require that it be unghosted first. The test data manager
has the odd property that it sets the object's 'x' attribute
to 42 when it is unghosted.
>>> p.x
>>> dm.called
You can manually reset the changed field to False, although
it's not clear why you would want to do that. The object
changes to the UPTODATE state but retains its modifications.
>>> p._p_changed = False
>>> p._p_state
>>> p._p_changed
>>> p.x
>>> p._p_changed
>>> dm.called
__getstate__() and __setstate__()
The next several tests cover the __getstate__() and __setstate__()
>>> p._p_deactivate()
>>> p._p_changed
It is possible to invalidate it. That's the key difference between
deactivation and invalidation.
>>> p._p_invalidate()
>>> p._p_state
Now that the object is a ghost, any attempt to modify it will require that it
be unghosted first. The test data manager has the odd property that it sets
the object's ``x`` attribute to ``42`` when it is unghosted.
>>> p.x
>>> dm.called
You can manually reset the changed field to ``False``, although it's not clear
why you would want to do that. The object changes to the ``UPTODATE`` state
but retains its modifications.
>>> p._p_changed = False
>>> p._p_state
>>> p._p_changed
>>> p.x
>>> p._p_changed
>>> dm.called
``__getstate__()`` and ``__setstate__()``
The next several tests cover the ``__getstate__()`` and ``__setstate__()``
>>> p = P()
>>> state = p.__getstate__()
>>> isinstance(state, dict)
>>> state['x']
>>> p._p_state
>>> p = P()
>>> state = p.__getstate__()
>>> isinstance(state, dict)
>>> state['x']
>>> p._p_state
Calling setstate always leaves the object in the uptodate state?
(I'm not entirely clear on this one.)
>>> p.__setstate__({'x': 5})
>>> p._p_state
>>> p.__setstate__({'x': 5})
>>> p._p_state
Assigning to a volatile attribute has no effect on the object state.
>>> p._v_foo = 2
>>> p.__getstate__()
{'x': 5}
>>> p._p_state
>>> p._v_foo = 2
>>> p.__getstate__()
{'x': 5}
>>> p._p_state
The _p_serial attribute is not affected by calling setstate.
The ``_p_serial`` attribute is not affected by calling setstate.
>>> p._p_serial = "00000012"
>>> p.__setstate__(p.__getstate__())
>>> p._p_serial
>>> p._p_serial = "00000012"
>>> p.__setstate__(p.__getstate__())
>>> p._p_serial
Change Ghost test
If an object is a ghost and its _p_changed is set to True (any true value),
it should activate (unghostify) the object. This behavior is new in ZODB
3.6; before then, an attempt to do "ghost._p_changed = True" was ignored.
>>> p = P()
>>> p._p_jar = DM()
>>> p._p_oid = 1
>>> p._p_deactivate()
>>> p._p_changed # None
>>> p._p_state # ghost state
>>> p._p_changed = True
>>> p._p_changed
>>> p._p_state # changed state
>>> p.x
If an object is a ghost and its ``_p_changed`` is set to ``True`` (any true
value), it should activate (unghostify) the object. This behavior is new in
ZODB 3.6; before then, an attempt to do ``ghost._p_changed = True`` was
>>> p = P()
>>> p._p_jar = DM()
>>> p._p_oid = 1
>>> p._p_deactivate()
>>> p._p_changed # None
>>> p._p_state # ghost state
>>> p._p_changed = True
>>> p._p_changed
>>> p._p_state # changed state
>>> p.x
Activate, deactivate, and invalidate
Some of these tests are redundant, but are included to make sure there
are explicit and simple tests of _p_activate(), _p_deactivate(), and
>>> p = P()
>>> p._p_oid = 1
>>> p._p_jar = DM()
>>> p._p_deactivate()
>>> p._p_state
>>> p._p_activate()
>>> p._p_state
>>> p.x
>>> p.x
>>> p._p_state
>>> p._p_invalidate()
>>> p._p_state
>>> p.x
are explicit and simple tests of ``_p_activate()``, ``_p_deactivate()``, and
>>> p = P()
>>> p._p_oid = 1
>>> p._p_jar = DM()
>>> p._p_deactivate()
>>> p._p_state
>>> p._p_activate()
>>> p._p_state
>>> p.x
>>> p.x
>>> p._p_state
>>> p._p_invalidate()
>>> p._p_state
>>> p.x
Test failures
The following tests cover various error cases.
When an object is modified, it registers with its data manager. If
that registration fails, the exception is propagated and the object
stays in the up-to-date state. It shouldn't change to the modified
state, because it won't be saved when the transaction commits.
>>> p = P()
>>> p._p_oid = 1
>>> p._p_jar = BrokenDM()
>>> p._p_state
>>> p._p_jar.called
>>> p._p_changed = 1
Traceback (most recent call last):
When an object is modified, it registers with its data manager. If that
registration fails, the exception is propagated and the object stays in the
up-to-date state. It shouldn't change to the modified state, because it won't
be saved when the transaction commits.
>>> p = P()
>>> p._p_oid = 1
>>> p._p_jar = BrokenDM()
>>> p._p_state
>>> p._p_jar.called
>>> p._p_changed = 1
Traceback (most recent call last):
>>> p._p_jar.called
>>> p._p_state
Make sure that exceptions that occur inside the data manager's
setstate() method propagate out to the caller.
>>> p = P()
>>> p._p_oid = 1
>>> p._p_jar = BrokenDM()
>>> p._p_deactivate()
>>> p._p_state
>>> p._p_activate()
Traceback (most recent call last):
>>> p._p_jar.called
>>> p._p_state
Make sure that exceptions that occur inside the data manager's ``setstate()``
method propagate out to the caller.
>>> p = P()
>>> p._p_oid = 1
>>> p._p_jar = BrokenDM()
>>> p._p_deactivate()
>>> p._p_state
>>> p._p_activate()
Traceback (most recent call last):
>>> p._p_state
>>> p._p_state
Special test to cover layout of __dict__
Special test to cover layout of ``__dict__``
We once had a bug in the Persistent class that calculated an incorrect
offset for the __dict__ attribute. It assigned __dict__ and _p_jar to
the same location in memory. This is a simple test to make sure they
have different locations.
We once had a bug in the `Persistent` class that calculated an incorrect
offset for the ``__dict__`` attribute. It assigned ``__dict__`` and
``_p_jar`` to the same location in memory. This is a simple test to make sure
they have different locations.
>>> p = P()
>>> 'x' in p.__dict__
>>> p._p_jar
>>> p = P()
>>> 'x' in p.__dict__
>>> p._p_jar
Inheritance and metaclasses
Simple tests to make sure it's possible to inherit from the Persistent
base class multiple times. There used to be metaclasses involved in
Persistent that probably made this a more interesting test.
>>> class A(Persistent):
... pass
>>> class B(Persistent):
... pass
>>> class C(A, B):
... pass
>>> class D(object):
... pass
>>> class E(D, B):
... pass
>>> a = A()
>>> b = B()
>>> c = C()
>>> d = D()
>>> e = E()
Also make sure that it's possible to define Persistent classes that
have a custom metaclass.
>>> class alternateMeta(type):
... type
>>> class alternate(object):
... __metaclass__ = alternateMeta
>>> class mixedMeta(alternateMeta, type):
... pass
>>> class mixed(alternate, Persistent):
... pass
>>> class mixed(Persistent, alternate):
... pass
Simple tests to make sure it's possible to inherit from the `Persistent` base
class multiple times. There used to be metaclasses involved in `Persistent`
that probably made this a more interesting test.
>>> class A(Persistent):
... pass
>>> class B(Persistent):
... pass
>>> class C(A, B):
... pass
>>> class D(object):
... pass
>>> class E(D, B):
... pass
>>> a = A()
>>> b = B()
>>> c = C()
>>> d = D()
>>> e = E()
Also make sure that it's possible to define `Persistent` classes that have a
custom metaclass.
>>> class alternateMeta(type):
... type
>>> class alternate(object):
... __metaclass__ = alternateMeta
>>> class mixedMeta(alternateMeta, type):
... pass
>>> class mixed(alternate, Persistent):
... pass
>>> class mixed(Persistent, alternate):
... pass
Basic type structure
>>> Persistent.__dictoffset__
>>> Persistent.__weakrefoffset__
>>> Persistent.__basicsize__ > object.__basicsize__
>>> P.__dictoffset__ > 0
>>> P.__weakrefoffset__ > 0
>>> P.__dictoffset__ < P.__weakrefoffset__
>>> P.__basicsize__ > Persistent.__basicsize__
>>> Persistent.__dictoffset__
>>> Persistent.__weakrefoffset__
>>> Persistent.__basicsize__ > object.__basicsize__
>>> P.__dictoffset__ > 0
>>> P.__weakrefoffset__ > 0
>>> P.__dictoffset__ < P.__weakrefoffset__
>>> P.__basicsize__ > Persistent.__basicsize__
These are some simple tests of classes that have an __slots__
These are some simple tests of classes that have an ``__slots__``
attribute. Some of the classes should have slots, others shouldn't.
>>> class noDict(object):
... __slots__ = ['foo']
>>> class p_noDict(Persistent):
... __slots__ = ['foo']
>>> class p_shouldHaveDict(p_noDict):
... pass
>>> p_noDict.__dictoffset__
>>> x = p_noDict()
>>> = 1
>>> = 1
Traceback (most recent call last):
>>> class noDict(object):
... __slots__ = ['foo']
>>> class p_noDict(Persistent):
... __slots__ = ['foo']
>>> class p_shouldHaveDict(p_noDict):
... pass
>>> p_noDict.__dictoffset__
>>> x = p_noDict()
>>> = 1
>>> = 1
Traceback (most recent call last):
AttributeError: 'p_noDict' object has no attribute 'bar'
>>> x._v_bar = 1
Traceback (most recent call last):
AttributeError: 'p_noDict' object has no attribute 'bar'
>>> x._v_bar = 1
Traceback (most recent call last):
AttributeError: 'p_noDict' object has no attribute '_v_bar'
>>> x.__dict__
Traceback (most recent call last):
AttributeError: 'p_noDict' object has no attribute '_v_bar'
>>> x.__dict__
Traceback (most recent call last):
AttributeError: 'p_noDict' object has no attribute '__dict__'
AttributeError: 'p_noDict' object has no attribute '__dict__'
The various _p_ attributes are unaffected by slots.
>>> p._p_oid
>>> p._p_jar
>>> p._p_state
The various _p_ attributes are unaffected by slots.
>>> p._p_oid
>>> p._p_jar
>>> p._p_state
If the most-derived class does not specify
>>> p_shouldHaveDict.__dictoffset__ > 0
>>> x = p_shouldHaveDict()
>>> isinstance(x.__dict__, dict)
>>> p_shouldHaveDict.__dictoffset__ > 0
>>> x = p_shouldHaveDict()
>>> isinstance(x.__dict__, dict)
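The same slots behavior can be observed with plain ``object`` subclasses, which
may help separate what Python itself does from what `Persistent` adds. The class
names below are illustrative only and are not part of the test suite.

```python
class NoDict(object):
    __slots__ = ['foo']


class ShouldHaveDict(NoDict):
    # No __slots__ here, so instances get a __dict__ again.
    pass


def has_instance_dict(obj):
    return hasattr(obj, '__dict__')


x = NoDict()
x.foo = 1
try:
    x.bar = 1
    bar_rejected = False
except AttributeError:
    bar_rejected = True

y = ShouldHaveDict()
y.bar = 1  # allowed: stored in the instance __dict__
```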
There's actually a substantial effort involved in making subclasses of
Persistent work with plain-old pickle. The ZODB serialization layer
never calls pickle on an object; it pickles the object's class
description and its state as two separate pickles.
>>> import pickle
>>> p = P()
>>> p2 = pickle.loads(pickle.dumps(p))
>>> p2.__class__ is P
>>> p2.x == p.x
We should also test that pickle works with custom getstate and
setstate. Perhaps even reduce. The problem is that pickling depends
on finding the class in a particular module, and classes defined here
won't appear in any module. We could require each user of the tests
to define a base class, but that might be tedious.
`Persistent` work with plain-old pickle. The ZODB serialization layer never
calls pickle on an object; it pickles the object's class description and its
state as two separate pickles.
>>> import pickle
>>> p = P()
>>> p2 = pickle.loads(pickle.dumps(p))
>>> p2.__class__ is P
>>> p2.x == p.x
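The idea of storing a class description and an object's state as two separate
pickles can be sketched with the standard ``pickle`` module. This is not the
ZODB serialization layer: ``Account``, ``CLASS_REGISTRY``, ``dump_record`` and
``load_record`` are hypothetical names, and looking the class up by name in a
local registry is an assumption made to keep the sketch self-contained.

```python
import io
import pickle


class Account(object):
    def __init__(self):
        self.x = 0

    def __getstate__(self):
        return {'x': self.x}

    def __setstate__(self, state):
        self.__dict__.update(state)


# Assumed lookup table standing in for "find the class from its description".
CLASS_REGISTRY = {'Account': Account}


def dump_record(obj):
    buffer = io.BytesIO()
    pickler = pickle.Pickler(buffer, protocol=2)
    pickler.dump(type(obj).__name__)    # first pickle: class description
    pickler.dump(obj.__getstate__())    # second pickle: object state
    return buffer.getvalue()


def load_record(data):
    unpickler = pickle.Unpickler(io.BytesIO(data))
    class_name = unpickler.load()
    state = unpickler.load()
    klass = CLASS_REGISTRY[class_name]
    obj = klass.__new__(klass)          # create without calling __init__
    obj.__setstate__(state)
    return obj


original = Account()
original.x = 7
restored = load_record(dump_record(original))
```

Keeping the two pickles separate means the class description can be read
without unpickling the state.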
We should also test that pickle works with custom getstate and setstate.
Perhaps even reduce. The problem is that pickling depends on finding the
class in a particular module, and classes defined here won't appear in any
module. We could require each user of the tests to define a base class, but
that might be tedious.
Some versions of Zope and ZODB have the zope.interfaces package
available. If it is available, then persistent will be associated
with several interfaces. It's hard to write a doctest test that runs
the tests only if zope.interface is available, so this test looks a
little unusual. One problem is that the assert statements won't do
anything if you run with -O.
>>> try:
... import zope.interface
... except ImportError:
... pass
... else:
... from persistent.interfaces import IPersistent
... assert IPersistent.implementedBy(Persistent)
... p = Persistent()
... assert IPersistent.providedBy(p)
... assert IPersistent.implementedBy(P)
... p = P()
... assert IPersistent.providedBy(p)
Some versions of Zope and ZODB have the `zope.interface` package available.
If it is available, then persistent will be associated with several
interfaces. It's hard to write a doctest test that runs the tests only if
`zope.interface` is available, so this test looks a little unusual. One
problem is that the assert statements won't do anything if you run with `-O`.
>>> try:
... import zope.interface
... except ImportError:
... pass
... else:
... from persistent.interfaces import IPersistent
... assert IPersistent.implementedBy(Persistent)
... p = Persistent()
... assert IPersistent.providedBy(p)
... assert IPersistent.implementedBy(P)
... p = P()
... assert IPersistent.providedBy(p)
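One way around the ``-O`` problem mentioned above is to raise explicitly instead
of using ``assert``. The sketch below is only an illustration of that idea: the
helpers ``optional_import`` and ``check`` are hypothetical and are not part of
ZODB or of this test file.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def check(condition, message):
    """Like assert, but still active when Python runs with -O."""
    if not condition:
        raise AssertionError(message)


interface_module = optional_import('zope.interface')
if interface_module is None:
    status = 'skipped'
else:
    check(hasattr(interface_module, 'Interface'), 'Interface is missing')
    status = 'checked'
```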
......@@ -16,6 +16,9 @@ import unittest
from persistent import Persistent
from persistent.interfaces import IPersistent
# Confusing: ZODB doesn't use this file. It appears to be used only
# by Zope3, where it's imported by zope/app/schema/tests/
import zope.interface
except ImportError:
......@@ -115,8 +118,10 @@ class Test(unittest.TestCase):
self.assertEqual(dm.called, 1)
def testGhostChanged(self):
# An object is a ghost, and it's _p_changed it set to True.
# This assignment should have no effect.
# If an object is a ghost and its _p_changed is set to True (any
# true value), it should activate (unghostify) the object. This
# behavior is new in ZODB 3.6; before then, an attempt to do
# "ghost._p_changed = True" was ignored.
p = self.klass()
p._p_oid = 1
dm = DM()
......@@ -124,7 +129,7 @@ class Test(unittest.TestCase):
self.assertEqual(p._p_changed, None)
p._p_changed = True
self.assertEqual(p._p_changed, None)
self.assertEqual(p._p_changed, 1)
def testRegistrationFailure(self):
p = self.klass()
# Copyright (c) 2005 Zope Corporation and Contributors.
# All Rights Reserved.
# This software is subject to the provisions of the Zope Public License,
# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
"""Test the mapping interface to PersistentMapping
import unittest
from persistent.mapping import PersistentMapping
l0 = {}
l1 = {0:0}
l2 = {0:0, 1:1}
class TestPMapping(unittest.TestCase):
def testTheWorld(self):
# Test constructors
u = PersistentMapping()
u0 = PersistentMapping(l0)
u1 = PersistentMapping(l1)
u2 = PersistentMapping(l2)
uu = PersistentMapping(u)
uu0 = PersistentMapping(u0)
uu1 = PersistentMapping(u1)
uu2 = PersistentMapping(u2)
class OtherMapping:
def __init__(self, initmapping):
self.__data = initmapping
def items(self):
return self.__data.items()
v0 = PersistentMapping(OtherMapping(u0))
vv = PersistentMapping([(0, 0), (1, 1)])
# Test __repr__
eq = self.assertEqual
eq(str(u0), str(l0), "str(u0) == str(l0)")
eq(repr(u1), repr(l1), "repr(u1) == repr(l1)")
eq(`u2`, `l2`, "`u2` == `l2`")
# Test __cmp__ and __len__
def mycmp(a, b):
r = cmp(a, b)
if r < 0: return -1
if r > 0: return 1
return r
all = [l0, l1, l2, u, u0, u1, u2, uu, uu0, uu1, uu2]
for a in all:
for b in all:
eq(mycmp(a, b), mycmp(len(a), len(b)),
"mycmp(a, b) == mycmp(len(a), len(b))")
# Test __getitem__
for i in range(len(u2)):
eq(u2[i], i, "u2[i] == i")
# Test get
for i in range(len(u2)):
eq(u2.get(i), i, "u2.get(i) == i")
eq(u2.get(i, 5), i, "u2.get(i, 5) == i")
for i in min(u2)-1, max(u2)+1:
eq(u2.get(i), None, "u2.get(i) == None")
eq(u2.get(i, 5), 5, "u2.get(i, 5) == 5")
# Test __setitem__
uu2[0] = 0
uu2[1] = 100
uu2[2] = 200
# Test __delitem__
del uu2[1]
del uu2[0]
del uu2[0]
except KeyError:
raise TestFailed("uu2[0] shouldn't be deletable")
# Test __contains__
for i in u2:
self.failUnless(i in u2, "i in u2")
for i in min(u2)-1, max(u2)+1:
self.failUnless(i not in u2, "i not in u2")
# Test update
l = {"a":"b"}
u = PersistentMapping(l)
for i in u:
self.failUnless(i in l or i in u2, "i in l or i in u2")
for i in l:
self.failUnless(i in u, "i in u")
for i in u2:
self.failUnless(i in u, "i in u")
# Test setdefault
x = u2.setdefault(0, 5)
eq(x, 0, "u2.setdefault(0, 5) == 0")
x = u2.setdefault(5, 5)
eq(x, 5, "u2.setdefault(5, 5) == 5")
self.failUnless(5 in u2, "5 in u2")
# Test pop
x = u2.pop(1)
eq(x, 1, "u2.pop(1) == 1")
self.failUnless(1 not in u2, "1 not in u2")
except KeyError:
raise TestFailed("1 should not be poppable from u2")
x = u2.pop(1, 7)
eq(x, 7, "u2.pop(1, 7) == 7")
# Test popitem
items = u2.items()
key, value = u2.popitem()
self.failUnless((key, value) in items, "key, value in items")
self.failUnless(key not in u2, "key not in u2")
# Test clear
eq(u2, {}, "u2 == {}")
def test_suite():
return unittest.makeSuite(TestPMapping)
if __name__ == "__main__":
loader = unittest.TestLoader()
......@@ -27,6 +27,7 @@ You must specify either -p and -h or -U.
import getopt
import logging
import socket
import sys
import time
......@@ -41,6 +42,18 @@ from ZEO.ClientStorage import ClientStorage
def setup_logging():
# Set up logging to stderr which will show messages originating
# at severity ERROR or higher.
root = logging.getLogger()
fmt = logging.Formatter(
"------\n%(asctime)s %(levelname)s %(name)s %(message)s",
handler = logging.StreamHandler()
def check_server(addr, storage, write):
t0 = time.time()
if ZEO_VERSION == 2:
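The `setup_logging` helper added in this hunk appears only in part above. As a minimal sketch of what such a helper could look like with the standard `logging` module: the name `setup_logging_sketch`, the `stream` parameter, and the `setLevel`/`setFormatter`/`addHandler` calls are assumptions made for illustration, not lines taken from the commit.

```python
import logging
import sys


def setup_logging_sketch(stream=None):
    """Send records of severity ERROR or higher to ``stream`` (stderr by default)."""
    root = logging.getLogger()
    # Assumption: the root level is raised so lower-severity records are dropped.
    root.setLevel(logging.ERROR)
    fmt = logging.Formatter(
        "------\n%(asctime)s %(levelname)s %(name)s %(message)s")
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(fmt)
    root.addHandler(handler)
    # Returned so that a caller can remove the handler again.
    return handler
```

Passing an in-memory stream makes the behavior easy to inspect without touching stderr.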
......@@ -122,6 +135,7 @@ def main():
addr = host, port
check_server(addr, storage, write)
if __name__ == "__main__":
This package is currently a facade of the ZODB.Transaction module.
It exists to support:
This package contains a generic transaction implementation for Python. It is
mainly used by the ZODB, though.
- Application code that uses the ZODB 4 transaction API
- ZODB4-style data managers (transaction.interfaces.IDataManager)
Note that the data manager API, transaction.interfaces.IDataManager,
Note that the data manager API, ``transaction.interfaces.IDataManager``,
is syntactically simple, but semantically complex. The semantics
were not easy to express in the interface. This could probably use
more work. The semantics are presented in detail through examples of
a sample data manager in transaction.tests.test_SampleDataManager.
a sample data manager in ``transaction.tests.test_SampleDataManager``.
......@@ -25,10 +25,3 @@ begin = manager.begin
commit = manager.commit
abort = manager.abort
savepoint = manager.savepoint
def get_transaction():
from ZODB.utils import deprecated36
deprecated36(""" use transaction.get() instead of get_transaction().
transaction.commit() is a shortcut spelling of transaction.get().commit(),
and transaction.abort() of transaction.get().abort().""")
return get()
......@@ -30,7 +30,7 @@ registers its _p_jar attribute. TODO: explain adapter
Note: Suntransactions are deprecated! Use savepoint/rollback instead.
Note: Subtransactions are deprecated! Use savepoint/rollback instead.
A subtransaction applies the transaction notion recursively. It
allows a set of modifications within a transaction to be committed or
......@@ -115,6 +115,20 @@ pre-commit hook is available for such use cases: use addBeforeCommitHook(),
passing it a callable and arguments. The callable will be called with its
arguments at the start of the commit (but not for substransaction commits).
After-commit hook
Sometimes, applications want to execute code after a transaction is
committed or aborted. For example, one might want to launch
non-transactional code after a successful commit, or launch
asynchronous code afterwards. A post-commit hook is available for such
use cases: use addAfterCommitHook(), passing it a callable and
arguments. The callable will be called with a Boolean value
representing the status of the commit operation as its first argument
(true if successful, false if aborted), followed by its own arguments,
after the commit attempt completes (but not for subtransaction commits).
Error handling
......@@ -241,6 +255,9 @@ class Transaction(object):
# List of (hook, args, kws) tuples added by addBeforeCommitHook().
self._before_commit = []
# List of (hook, args, kws) tuples added by addAfterCommitHook().
self._after_commit = []
# Raise TransactionFailedError, due to commit()/join()/register()
# getting called when the current transaction has already suffered
# a commit/savepoint failure.
......@@ -292,7 +309,7 @@ class Transaction(object):
savepoint = Savepoint(self, optimistic, *self._resources)
self._saveCommitishError() # reraises!
self._saveAndRaiseCommitishError() # reraises!
if self._savepoint2index is None:
self._savepoint2index = weakref.WeakKeyDictionary()
......@@ -345,32 +362,25 @@ class Transaction(object):
assert id(obj) not in map(id, adapter.objects)
def begin(self):
from ZODB.utils import deprecated36
deprecated36("Transaction.begin() should no longer be used; use "
"the begin() method of a transaction manager.")
if (self._resources or self._synchronizers):
# Else aborting wouldn't do anything, except if _manager is non-None,
# in which case it would do nothing besides uselessly free() this
# transaction.
def commit(self, subtransaction=_marker, deprecation_wng=True):
if subtransaction is _marker:
subtransaction = 0
elif deprecation_wng:
from ZODB.utils import deprecated37
deprecated37("subtransactions are deprecated; use "
"transaction.savepoint() instead of "
deprecated37("subtransactions are deprecated; instead of "
"transaction.commit(1), use "
"transaction.savepoint(optimistic=True) in "
"contexts where a subtransaction abort will never "
"occur, or sp=transaction.savepoint() if later "
"rollback is possible and then sp.rollback() "
"instead of transaction.abort(1)")
if self._savepoint2index:
if subtransaction:
# TODO deprecate subtransactions
self._subtransaction_savepoint = self.savepoint(1)
self._subtransaction_savepoint = self.savepoint(optimistic=True)
if self.status is Status.COMMITFAILED:
......@@ -383,16 +393,19 @@ class Transaction(object):
self._saveCommitishError() # This raises!
self.status = Status.COMMITTED
t, v, tb = self._saveAndGetCommitishError()
raise t, v, tb
if self._manager: s: s.afterCompletion(self))
def _saveCommitishError(self):
def _saveAndGetCommitishError(self):
self.status = Status.COMMITFAILED
# Save the traceback for TransactionFailedError.
ft = self._failure_traceback = StringIO()
......@@ -403,6 +416,10 @@ class Transaction(object):
traceback.print_tb(tb, None, ft)
# Append the exception type and value.
ft.writelines(traceback.format_exception_only(t, v))
return t, v, tb
def _saveAndRaiseCommitishError(self):
t, v, tb = self._saveAndGetCommitishError()
raise t, v, tb
def getBeforeCommitHooks(self):
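The refactoring in this hunk splits failure handling into one step that records the error and another that re-raises it. A rough Python 3 sketch of that pattern follows. `FailureRecorder` and its method names are hypothetical stand-ins, and `raise v.with_traceback(tb)` is the Python 3 spelling of the Python 2 statement `raise t, v, tb` used in the diff.

```python
import sys
import traceback
from io import StringIO


class FailureRecorder:
    """Hypothetical stand-in for the failure bookkeeping shown in the diff."""

    def __init__(self):
        self.failure_traceback = None

    def save_and_get_error(self):
        # Must be called while an exception is being handled.
        t, v, tb = sys.exc_info()
        # Keep a text copy of the traceback for later reporting.
        ft = self.failure_traceback = StringIO()
        traceback.print_tb(tb, None, ft)
        # Append the exception type and value.
        ft.writelines(traceback.format_exception_only(t, v))
        return t, v, tb

    def save_and_raise_error(self):
        t, v, tb = self.save_and_get_error()
        # Re-raise the original exception with its original traceback.
        raise v.with_traceback(tb)
```

Returning the triple from the first method lets a caller do extra cleanup between recording the failure and re-raising it, which appears to be what the reworked `commit()` needs.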
......@@ -428,6 +445,44 @@ class Transaction(object):
hook(*args, **kws)
self._before_commit = []
def getAfterCommitHooks(self):
return iter(self._after_commit)
def addAfterCommitHook(self, hook, args=(), kws=None):
if kws is None:
kws = {}
self._after_commit.append((hook, tuple(args), kws))
def _callAfterCommitHooks(self, status=True):
# Avoid aborting anything at the end if no hooks are registered.
if not self._after_commit:
# Call all hooks registered, allowing further registrations
# during processing. Note that calls to addAfterCommitHook() may
# add additional hooks while hooks are running, and iterating over a
# growing list is well-defined in Python.
for hook, args, kws in self._after_commit:
# The first argument passed to the hook is a Boolean value,
# true if the commit succeeded, or false if the commit aborted.
hook(status, *args, **kws)
# We need to catch the exceptions if we want all hooks
# to be called
self.log.error("Error in after commit hook exec in %s ",
hook, exc_info=sys.exc_info())
# The transaction is already committed. It must not have
# further effects after the commit.
for rm in self._resources:
# XXX should we take further actions here ?
self.log.error("Error in abort() on manager %s",
rm, exc_info=sys.exc_info())
self._after_commit = []
self._before_commit = []
def _commitResources(self):
# Execute the two-phase commit protocol.
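The hook-calling loop above relies on two behaviors: a `for` loop over a list also visits items appended during the loop, and an exception in one hook must not stop the others. The sketch below illustrates both with a hypothetical stand-in class, `HookList`, which is not the real `Transaction` class and omits the step that aborts resource managers after the hooks have run.

```python
import logging

sketch_logger = logging.getLogger("sketch.after_commit")


class HookList:
    """Hypothetical stand-in holding only the after-commit hook logic."""

    def __init__(self):
        self._after_commit = []

    def addAfterCommitHook(self, hook, args=(), kws=None):
        if kws is None:
            kws = {}
        self._after_commit.append((hook, tuple(args), kws))

    def callAfterCommitHooks(self, status=True):
        # Hooks appended while this loop runs are picked up as well.
        for hook, args, kws in self._after_commit:
            try:
                hook(status, *args, **kws)
            except Exception:
                # Log and continue so the remaining hooks still run.
                sketch_logger.error("Error in after commit hook %r", hook,
                                    exc_info=True)
        # Registrations are consumed once the hooks have been called.
        self._after_commit = []


calls = []


def record(status, label):
    calls.append((status, label))


def failing(status):
    raise TypeError("Fake raise")


def chain(status, hooks, n):
    calls.append((status, "chain%d" % n))
    if n:
        hooks.addAfterCommitHook(chain, (hooks, n - 1))


hooks = HookList()
hooks.addAfterCommitHook(record, ("first",))
hooks.addAfterCommitHook(failing)
hooks.addAfterCommitHook(chain, (hooks, 2))
hooks.callAfterCommitHooks(True)
```

After the call, `calls` holds the entry from `record` followed by the three `chain` entries, and the `TypeError` from `failing` has only been logged.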
......@@ -450,7 +505,7 @@ class Transaction(object):
# TODO: do we need to make this warning stronger?
# TODO: It would be nice if the system could be configured
# to stop committing transactions at this point.
self.log.critical("A storage error occured during the second "
self.log.critical("A storage error occurred during the second "
"phase of the two-phase commit. Resources "
"may be in an inconsistent state.")
......@@ -694,7 +749,7 @@ class Savepoint:
# Mark the transaction as failed.
transaction._saveCommitishError() # reraises!
transaction._saveAndRaiseCommitishError() # reraises!
class AbortSavepoint:
......@@ -156,7 +156,7 @@ class ITransaction(zope.interface.Interface):
"""Add extension data to the transaction.
name is the name of the extension property to set, of Python type
str; value must be pickleable. Multiple calls may be made to set
str; value must be picklable. Multiple calls may be made to set
multiple extension properties, provided the names are distinct.
Storages record the extension data, as meta-data, when a transaction
......@@ -232,6 +232,43 @@ class ITransaction(zope.interface.Interface):
by a top-level transaction commit.
def addAfterCommitHook(hook, args=(), kws=None):
"""Register a hook to call after a transaction commit attempt.
The specified hook function will be called after the transaction
commit succeeds or aborts. The first argument passed to the hook
is a Boolean value, true if the commit succeeded, or false if the
commit aborted. `args` specifies additional positional, and `kws`
keyword, arguments to pass to the hook. `args` is a sequence of
positional arguments to be passed, defaulting to an empty tuple
(only the true/false success argument is passed). `kws` is a
dictionary of keyword argument names and values to be passed, or
the default None (no keyword arguments are passed).
Multiple hooks can be registered and will be called in the order they
were registered (first registered, first called). This method can
also be called from a hook: an executing hook can register more
hooks. Applications should take care to avoid creating infinite loops
by recursively registering hooks.
Hooks are called only for a top-level commit. A subtransaction
commit or savepoint creation does not call any hooks. Calling a
hook "consumes" its registration: hook registrations do not
persist across transactions. If it's desired to call the same
hook on every transaction commit, then addAfterCommitHook() must be
called with that hook during every transaction; in such a case
consider registering a synchronizer object via a TransactionManager's
registerSynch() method instead.
def getAfterCommitHooks():
"""Return iterable producing the registered addAfterCommit hooks.
A triple (hook, args, kws) is produced for each registered hook.
The hooks are produced in the order in which they would be invoked
by a top-level transaction commit.
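Since `args` is described as a sequence and the implementation in this commit stores `tuple(args)`, a plain string passed as `args` is split into one positional argument per character. The sketch below mirrors that handling with hypothetical helper names (`normalize_hook_args`, `call_hook`, `describe`) and makes no use of the real transaction package.

```python
def normalize_hook_args(args=(), kws=None):
    """Mirror the argument handling shown in addAfterCommitHook()."""
    if kws is None:
        kws = {}
    return tuple(args), kws


def call_hook(hook, status, args=(), kws=None):
    # The status flag always comes first, then the registered arguments.
    args, kws = normalize_hook_args(args, kws)
    return hook(status, *args, **kws)


def describe(status, arg='no_arg', kw1='no_kw1', kw2='no_kw2'):
    return "%r arg %r kw1 %r kw2 %r" % (status, arg, kw1, kw2)
```

With these helpers, `call_hook(describe, True, ('-', 1))` binds `'-'` to `arg` and `1` to `kw1`, because both values are passed positionally after the status flag.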
class ITransactionDeprecated(zope.interface.Interface):
"""Deprecated parts of the transaction API."""
......@@ -12,17 +12,17 @@ a transaction allowing:
Savepoints make it possible to write atomic subroutines that don't
make top-level transaction commitments.
To demonstrate how savepoints work with transactions, we've provided a
sample data manager implementation that provides savepoint support.
The primary purpose of this data manager is to provide code that can
be read to understand how savepoints work. The secondary purpose is to
provide support for demonstrating the correct operation of savepoint
support within the transaction system. This data manager is very
simple. It provides flat storage of named immutable values, like strings
and numbers.
To demonstrate how savepoints work with transactions, we've provided a sample
data manager implementation that provides savepoint support. The primary
purpose of this data manager is to provide code that can be read to understand
how savepoints work. The secondary purpose is to provide support for
demonstrating the correct operation of savepoint support within the
transaction system. This data manager is very simple. It provides flat
storage of named immutable values, like strings and numbers.
>>> import transaction.tests.savepointsample
>>> dm = transaction.tests.savepointsample.SampleSavepointDataManager()
......@@ -43,13 +43,13 @@ and abort changes:
>>> dm['name']
Now, let's look at an application that manages funds for people.
It allows deposits and debits to be entered for multiple people.
It accepts a sequence of entries and generates a sequence of status
messages. For each entry, it applies the change and then validates
the user's account. If the user's account is invalid, we roll back
the change for that entry. The success or failure of an entry is
indicated in the output status. First we'll initialize some accounts:
Now, let's look at an application that manages funds for people. It allows
deposits and debits to be entered for multiple people. It accepts a sequence
of entries and generates a sequence of status messages. For each entry, it
applies the change and then validates the user's account. If the user's
account is invalid, we roll back the change for that entry. The success or
failure of an entry is indicated in the output status. First we'll initialize
some accounts:
>>> dm['bob-balance'] = 0.0
>>> dm['bob-credit'] = 0.0
......@@ -63,8 +63,8 @@ Now, we'll define a validation function to validate an account:
... if dm[name+'-balance'] + dm[name+'-credit'] < 0:
... raise ValueError('Overdrawn', name)
And a function to apply entries. If the function fails in some
unexpected way, it rolls back all of its changes and prints the error:
And a function to apply entries. If the function fails in some unexpected
way, it rolls back all of its changes and prints the error:
>>> def apply_entries(entries):
... savepoint = transaction.savepoint()
......@@ -118,9 +118,9 @@ If we provide entries that cause an unexpected error:
Updated sally
Unexpected exception unsupported operand type(s) for +=: 'float' and 'str'
Because the apply_entries used a savepoint for the entire function,
it was able to rollback the partial changes without rolling back
changes made in the previous call to apply_entries:
Because ``apply_entries`` used a savepoint for the entire function, it was
able to roll back the partial changes without rolling back changes made in the
previous call to ``apply_entries``:
>>> dm['bob-balance']
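The nested use of savepoints described here can be sketched without the transaction package. In the sketch below, `SketchDataManager` and `SketchSavepoint` are hypothetical stand-ins that snapshot a plain dict. They do not model the real data manager API, nor the rule that rolling back a savepoint invalidates later ones.

```python
class SketchSavepoint:
    """Hypothetical savepoint: remembers a copy of the data."""

    def __init__(self, dm):
        self.dm = dm
        self.snapshot = dict(dm.data)

    def rollback(self):
        self.dm.data = dict(self.snapshot)


class SketchDataManager:
    """Hypothetical flat store of named immutable values."""

    def __init__(self):
        self.data = {}

    def __getitem__(self, name):
        return self.data[name]

    def __setitem__(self, name, value):
        self.data[name] = value

    def savepoint(self):
        return SketchSavepoint(self)


def validate_account(dm, name):
    if dm[name + '-balance'] + dm[name + '-credit'] < 0:
        raise ValueError('Overdrawn', name)


def apply_entries(dm, entries):
    outer = dm.savepoint()           # covers the whole call
    messages = []
    try:
        for name, amount in entries:
            entry_sp = dm.savepoint()  # covers a single entry
            try:
                dm[name + '-balance'] += amount
                validate_account(dm, name)
            except ValueError:
                entry_sp.rollback()
                messages.append('Error for %s' % name)
            else:
                messages.append('Updated %s' % name)
    except Exception as error:
        outer.rollback()             # undo every change made by this call
        messages.append('Unexpected exception %s' % error)
    return messages
```

An invalid entry undoes only its own change, while an unexpected error undoes everything the call did and leaves earlier calls untouched.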
......@@ -195,11 +195,12 @@ However, using a savepoint invalidates any savepoints that come after it:
>>> transaction.abort()
Databases without savepoint support
Normally it's an error to use savepoints with databases that don't
support savepoints:
Normally it's an error to use savepoints with databases that don't support
>>> dm_no_sp = transaction.tests.savepointsample.SampleDataManager()
>>> dm_no_sp['name'] = 'bob'
......@@ -212,10 +213,10 @@ support savepoints:
>>> transaction.abort()
However, a flag can be passed to the transaction savepoint method to
indicate that databases without savepoint support should be tolerated
until a savepoint is rolled back. This allows transactions to proceed
if there are no reasons to roll back:
However, a flag can be passed to the transaction savepoint method to indicate
that databases without savepoint support should be tolerated until a savepoint
is rolled back. This allows transactions to proceed if there are no reasons
to roll back:
>>> dm_no_sp['name'] = 'sally'
>>> savepoint = transaction.savepoint(1)
......@@ -231,13 +232,14 @@ if there are no reasons to roll back:
TypeError: ('Savepoints unsupported', {'name': 'sam'})
If a failure occurs when creating or rolling back a savepoint, the
transaction state will be uncertain and the transaction will become
uncommitable. From that point on, most transaction operations,
including commit, will fail until the transaction is aborted.
If a failure occurs when creating or rolling back a savepoint, the transaction
state will be uncertain and the transaction will become uncommittable. From
that point on, most transaction operations, including commit, will fail until
the transaction is aborted.
In the previous example, we got an error when we tried to rollback the
savepoint. If we try to commit the transaction, the commit will fail:
......@@ -254,8 +256,8 @@ We have to abort it to make any progress:
>>> transaction.abort()
Similarly, in our earlier example, where we tried to take a savepoint
with a data manager that didn't support savepoints:
Similarly, in our earlier example, where we tried to take a savepoint with a
data manager that didn't support savepoints:
>>> dm_no_sp['name'] = 'sally'
>>> dm['name'] = 'sally'
# Copyright (c) 2001, 2002 Zope Corporation and Contributors.
# Copyright (c) 2001, 2002, 2005 Zope Corporation and Contributors.
# All Rights Reserved.
# This software is subject to the provisions of the Zope Public License,
......@@ -11,7 +11,7 @@
"""Test tranasction behavior for variety of cases.
"""Test transaction behavior for variety of cases.
I wrote these unittests to investigate some odd transaction
behavior when doing unittests of integrating non sub transaction
......@@ -241,7 +241,6 @@ class TransactionTests(unittest.TestCase):
assert self.nosub1._p_jar.ctpc_abort == 1
# last test, check the hosing mechanism
## def testHoserStoppage(self):
......@@ -728,6 +727,268 @@ def test_addBeforeCommitHook():
"arg '-' kw1 'no_kw1' kw2 'no_kw2'",
>>> reset_log()
Modifying persistent objects within before commit hooks
modifies the objects, of course :)
Start a new transaction
>>> t = transaction.begin()
Create a DB instance and add a IOBTree within
>>> from ZODB.tests.util import DB
>>> from ZODB.tests.util import P
>>> db = DB()
>>> con =
>>> root = con.root()
>>> root['p'] = P('julien')
>>> p = root['p']
This hook will get the object from the `DB` instance and change
the flag attribute.
>>> def hookmodify(status, arg=None, kw1='no_kw1', kw2='no_kw2'):
... = 'jul'
Now register this hook and commit.
>>> t.addBeforeCommitHook(hookmodify, (p, 1))
>>> transaction.commit()
Nothing should have changed since it should have been aborted.
>>> db.close()
def test_addAfterCommitHook():
"""Test addAfterCommitHook.
Let's define a hook to call, and a way to see that it was called.
>>> log = []
>>> def reset_log():
... del log[:]
>>> def hook(status, arg='no_arg', kw1='no_kw1', kw2='no_kw2'):
... log.append("%r arg %r kw1 %r kw2 %r" % (status, arg, kw1, kw2))
Now register the hook with a transaction.
>>> import transaction
>>> t = transaction.begin()
>>> t.addAfterCommitHook(hook, '1')
We can see that the hook is indeed registered.
>>> [(hook.func_name, args, kws)
... for hook, args, kws in t.getAfterCommitHooks()]
[('hook', ('1',), {})]
When transaction commit is done, the hook is called, with its
>>> log
>>> t.commit()
>>> log
["True arg '1' kw1 'no_kw1' kw2 'no_kw2'"]
>>> reset_log()
A hook's registration is consumed whenever the hook is called. Since
the hook above was called, it's no longer registered:
>>> len(list(t.getAfterCommitHooks()))
>>> transaction.commit()
>>> log
The hook is only called after a full commit, not for a savepoint or
>>> t = transaction.begin()
>>> t.addAfterCommitHook(hook, 'A', dict(kw1='B'))
>>> dummy = t.savepoint()
>>> log
>>> t.commit(subtransaction=True)
>>> log
>>> t.commit()
>>> log
["True arg 'A' kw1 'B' kw2 'no_kw2'"]
>>> reset_log()
If a transaction is aborted, no hook is called.
>>> t = transaction.begin()
>>> t.addAfterCommitHook(hook, ["OOPS!"])
>>> transaction.abort()
>>> log
>>> transaction.commit()
>>> log
The hook is called after the commit is done, so even if the
commit fails the hook will have been called. To provoke failures in
commit, we'll add failing resource manager to the transaction.
>>> class CommitFailure(Exception):
... pass
>>> class FailingDataManager:
... def tpc_begin(self, txn, sub=False):
... raise CommitFailure
... def abort(self, txn):
... pass
>>> t = transaction.begin()
>>> t.join(FailingDataManager())
>>> t.addAfterCommitHook(hook, '2')
>>> t.commit()
Traceback (most recent call last):
>>> log
["False arg '2' kw1 'no_kw1' kw2 'no_kw2'"]
>>> reset_log()
Let's register several hooks.
>>> t = transaction.begin()
>>> t.addAfterCommitHook(hook, '4', dict(kw1='4.1'))
>>> t.addAfterCommitHook(hook, '5', dict(kw2='5.2'))
They are returned in the same order by getAfterCommitHooks.
>>> [(hook.func_name, args, kws) #doctest: +NORMALIZE_WHITESPACE
... for hook, args, kws in t.getAfterCommitHooks()]
[('hook', ('4',), {'kw1': '4.1'}),
('hook', ('5',), {'kw2': '5.2'})]
And commit also calls them in this order.
>>> t.commit()
>>> len(log)
>>> log #doctest: +NORMALIZE_WHITESPACE
["True arg '4' kw1 '4.1' kw2 'no_kw2'",
"True arg '5' kw1 'no_kw1' kw2 '5.2'"]
>>> reset_log()
While executing, a hook can itself add more hooks, and they will all
be called before the real commit starts.
>>> def recurse(status, txn, arg):
... log.append('rec' + str(arg))
... if arg:
... txn.addAfterCommitHook(hook, '-')
... txn.addAfterCommitHook(recurse, (txn, arg-1))
>>> t = transaction.begin()
>>> t.addAfterCommitHook(recurse, (t, 3))
>>> transaction.commit()
>>> log #doctest: +NORMALIZE_WHITESPACE
"True arg '-' kw1 'no_kw1' kw2 'no_kw2'",
"True arg '-' kw1 'no_kw1' kw2 'no_kw2'",
"True arg '-' kw1 'no_kw1' kw2 'no_kw2'",
>>> reset_log()
If an after commit hook raises an exception, the exception is logged
at error level so that any other registered hooks can still be
executed. We don't support execution dependencies at this level.
>>> mgr = transaction.TransactionManager()
>>> do = DataObject(mgr)
>>> def hookRaise(status, arg='no_arg', kw1='no_kw1', kw2='no_kw2'):
... raise TypeError("Fake raise")
>>> t = transaction.begin()
>>> t.addAfterCommitHook(hook, ('-', 1))
>>> t.addAfterCommitHook(hookRaise, ('-', 2))
>>> t.addAfterCommitHook(hook, ('-', 3))
>>> transaction.commit()
>>> log
["True arg '-' kw1 1 kw2 'no_kw2'", "True arg '-' kw1 3 kw2 'no_kw2'"]
>>> reset_log()
Test that the associated transaction manager has been cleaned up when
after commit hooks are registered
>>> mgr = transaction.TransactionManager()
>>> do = DataObject(mgr)
>>> t = transaction.begin()
>>> len(t._manager._txns)
>>> t.addAfterCommitHook(hook, ('-', 1))
>>> transaction.commit()
>>> log
["True arg '-' kw1 1 kw2 'no_kw2'"]
>>> len(t._manager._txns)
>>> reset_log()
The transaction is already committed when the after commit hooks
will be executed. Executing the hooks must not have further
effects on persistent objects.
Start a new transaction
>>> t = transaction.begin()
Create a DB instance and add a IOBTree within
>>> from ZODB.tests.util import DB
>>> from ZODB.tests.util import P
>>> db = DB()
>>> con =
>>> root = con.root()
>>> root['p'] = P('julien')
>>> p = root['p']
This hook will get the object from the `DB` instance and change
the flag attribute.
>>> def badhook(status, arg=None, kw1='no_kw1', kw2='no_kw2'):
... = 'jul'
Now register this hook and commit.
>>> t.addAfterCommitHook(badhook, (p, 1))
>>> transaction.commit()
Nothing should have changed since it should have been aborted.
>>> db.close()
def test_suite():
......@@ -36,7 +36,11 @@ if os.path.isdir(LIB_DIR):
path = LIB_DIR
print "Running tests from", path
# Insert the ZODB src dir first in the sys.path to avoid a name conflict
# with zope.whatever libraries that might be installed on the Python
# version used to launch these tests.
sys.path.insert(0, path)
from zope.testing import testrunner
# Persistence/ generates a long warning message about the