##############################################################################
#
# Copyright (c) 2004 Zope Corporation and Contributors.
# All Rights Reserved.
#
# This software is subject to the provisions of the Zope Public License,
# Version 2.0 (ZPL).  A copy of the ZPL should accompany this distribution.
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
# FOR A PARTICULAR PURPOSE.
#
##############################################################################
r"""
Multi-version concurrency control tests
=======================================

Multi-version concurrency control (MVCC) exploits storages that store
multiple revisions of an object to avoid read conflicts.  Normally
when an object is read from the storage, its most recent revision is
read.  Under MVCC, an older revision may be read so that the transaction
sees a consistent view of the database.

ZODB guarantees execution-time consistency: A single transaction will
always see a consistent view of the database while it is executing.
If transaction A is running, has already read an object O1, and a
different transaction B modifies object O2, then transaction A can no
longer read the current revision of O2.  It must either read the
revision of O2 that is consistent with O1 or raise a ReadConflictError.
When MVCC is in use, A will do the former.
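
As a rough illustration of the choice described above (not ZODB's
actual implementation), the standalone sketch below keeps every
revision of an object as a list of (tid, state) pairs and picks the
newest one committed before a given high-water mark.  The names
load_before and NoEarlierRevision are hypothetical and exist only for
this example; NoEarlierRevision stands in for ReadConflictError.

```python
class NoEarlierRevision(Exception):
    # Hypothetical stand-in for ReadConflictError in this sketch.
    pass


def load_before(revisions, high_water):
    # revisions: list of (tid, state) pairs sorted by ascending tid.
    # Return the newest revision whose tid is strictly less than the
    # high-water mark, i.e. the one a running transaction may still read.
    candidates = [(tid, state) for tid, state in revisions
                  if tid < high_water]
    if not candidates:
        raise NoEarlierRevision(high_water)
    return candidates[-1]


o2_revisions = [(10, "old"), (20, "new")]
# Transaction B committed O2 at tid 20, so A's high-water mark is 20
# and A must fall back to the revision written at tid 10.
seen_by_a = load_before(o2_revisions, 20)
```

With only one revision on record and a mark at or below its tid, the
sketch raises instead of returning inconsistent data.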

This note includes doctests that explain how MVCC is implemented (and
test that the implementation is correct).  The tests use a
MinimalMemoryStorage that implements MVCC support, but not much else.

>>> from ZODB.tests.test_storage import MinimalMemoryStorage
>>> from ZODB import DB
>>> db = DB(MinimalMemoryStorage())

We will use two different connections with the experimental
setLocalTransaction() method to make sure that the connections act
independently, even though they'll be run from a single thread.

>>> cn1 = db.open()
>>> txn1 = cn1.setLocalTransaction()

The test will just use some MinPO objects.  The next few lines just
set up an initial database state.

>>> from ZODB.tests.MinPO import MinPO
>>> r = cn1.root()
>>> r["a"] = MinPO(1)
>>> r["b"] = MinPO(1)
>>> txn1.commit()

Now open a second connection.

>>> cn2 = db.open()
>>> txn2 = cn2.setLocalTransaction()

Connection high-water mark
--------------------------

The ZODB Connection tracks a transaction high-water mark, which
bounds the latest transaction id that can be read by the current
transaction and still present a consistent view of the database.
Transactions with ids up to but not including the high-water mark
are OK to read.  When a transaction commits, the database sends
invalidations to all the other connections; the invalidation contains
the transaction id and the oids of modified objects.  The Connection
stores the high-water mark in _txn_time, which is set to None until
an invalidation arrives.

>>> cn = db.open()

>>> print cn._txn_time
None
>>> cn.invalidate(100, dict.fromkeys([1, 2]))
>>> cn._txn_time
100
>>> cn.invalidate(200, dict.fromkeys([1, 2]))
>>> cn._txn_time
100

A connection's high-water mark is set to the transaction id taken from
the first invalidation processed by the connection.  Transaction ids are
monotonically increasing, so the first one seen during the current
transaction remains the high-water mark for the duration of the
transaction.

XXX We'd like simple abort and commit calls to make txn boundaries,
but that doesn't work unless an object is modified.  sync() will abort
a transaction and process invalidations.

>>> cn.sync()
>>> print cn._txn_time  # the high-water mark got reset to None
None
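
The bookkeeping exercised above can be modelled in a few lines.  This
is only a toy sketch under simplifying assumptions, not the real
Connection class: HighWaterMark, txn_time, and invalidated are names
made up for the example, and sync() here does nothing but reset the
state at a transaction boundary.

```python
class HighWaterMark(object):
    # Toy model of the bookkeeping only; not the real Connection class.

    def __init__(self):
        self.txn_time = None
        self.invalidated = set()

    def invalidate(self, tid, oids):
        # Only the first invalidation seen in a transaction sets the
        # mark; later ones carry larger tids and leave it unchanged.
        if self.txn_time is None:
            self.txn_time = tid
        self.invalidated.update(oids)

    def sync(self):
        # A transaction boundary forgets the mark and the pending oids.
        self.txn_time = None
        self.invalidated.clear()


hwm = HighWaterMark()
hwm.invalidate(100, [1, 2])
hwm.invalidate(200, [1, 2])
mark_after_two = hwm.txn_time  # still the first tid seen
hwm.sync()
mark_after_sync = hwm.txn_time  # reset at the boundary
```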

Basic functionality
-------------------

The next bit of code includes a simple MVCC test.  One transaction
will modify "a".  The other transaction will then modify "b" and commit.

>>> r1 = cn1.root()
>>> r1["a"].value = 2
>>> cn1.getTransaction().commit()
>>> txn = db.lastTransaction()

The second connection has its high-water mark set now.

>>> cn2._txn_time == txn
True

It is safe to read "b", because it was not modified by the concurrent
transaction.

>>> r2 = cn2.root()
>>> r2["b"]._p_serial < cn2._txn_time
True
>>> r2["b"].value
1
>>> r2["b"].value = 2

It is not safe, however, to read the current revision of "a" because
it was modified at the high-water mark.  If we read it, we'll get a
non-current revision.

>>> r2["a"].value
1
>>> r2["a"]._p_serial < cn2._txn_time
True

We can confirm that we have a non-current revision by asking the
storage.

>>> db._storage.isCurrent(r2["a"]._p_oid, r2["a"]._p_serial)
False

It's possible to modify "a", but we get a conflict error when we
commit the transaction.

>>> r2["a"].value = 3
>>> txn2.commit()
Traceback (most recent call last):
 ...
ConflictError: database conflict error (oid 0000000000000001, class ZODB.tests.MinPO.MinPO)

The failed commit aborted the current transaction, so we can try
again.  This example will demonstrate that we can commit a transaction
if we only modify current revisions.

>>> print cn2._txn_time
None

>>> r1 = cn1.root()
>>> r1["a"].value = 3
>>> txn1 is cn1.getTransaction()
True
>>> cn1.getTransaction().commit()
>>> txn = db.lastTransaction()
>>> cn2._txn_time == txn
True

>>> r2["b"].value = r2["a"].value + 1
>>> r2["b"].value
3
>>> txn2.commit()
>>> print cn2._txn_time
None

Object cache
------------

A Connection keeps objects in its cache so that multiple database
references will always point to the same Python object.  At
transaction boundaries, objects modified by other transactions are
ghostified so that the next transaction doesn't see stale state.  We
need to be sure the non-current objects loaded by MVCC are always
ghosted.  It should be trivial, because MVCC is only used when an
invalidation has been received for an object.

First get the database back in an initial state.

>>> cn1.sync()
>>> r1["a"].value = 0
>>> r1["b"].value = 0
>>> cn1.getTransaction().commit()

>>> cn2.sync()
>>> r2["a"].value
0
>>> r2["b"].value = 1
>>> cn2.getTransaction().commit()

>>> r1["b"].value
0
>>> cn1.sync()  # cn2 modified 'b', so cn1 should get a ghost for b
>>> r1["b"]._p_state  # -1 means GHOST
-1

Closing the connection, committing a transaction, and aborting a
transaction should all have the same effect on non-current objects in
the cache.

>>> def testit():
...     cn1.sync()
...     r1["a"].value = 0
...     r1["b"].value = 0
...     cn1.getTransaction().commit()
...     cn2.sync()
...     r2["b"].value = 1
...     cn2.getTransaction().commit()

>>> testit()
>>> r1["b"]._p_state  # 0 means UPTODATE, although note it's an older revision
0
>>> r1["b"].value
0
>>> r1["a"].value = 1
>>> cn1.getTransaction().commit()
>>> r1["b"]._p_state
-1

When a connection is closed, it is saved by the database.  It will be
reused by the next open() call (along with its object cache).

>>> testit()
>>> r1["a"].value = 1
>>> cn1.close()
>>> cn3 = db.open()
>>> cn1 is cn3
True
>>> r1 = cn1.root()

Although "b" is a ghost in cn1 at this point (because closing a connection
has the same effect on non-current objects in the connection's cache as
committing a transaction), not every object is a ghost.  The root was in
the cache and was current, so our first reference to it doesn't return
a ghost.

>>> r1._p_state # UPTODATE
0
>>> r1["b"]._p_state # GHOST
-1

>>> cn1._transaction = None # See the Cleanup section below

Late invalidation
-----------------

The combination of ZEO and MVCC adds more complexity.  Since
invalidations are delivered asynchronously by ZEO, it is possible for
an invalidation to arrive just after a request to load the invalidated
object is sent.  The connection can't use the just-loaded data,
because the invalidation arrived first.  The complexity for MVCC is
that it must check for invalidated objects after it has loaded them,
just in case.
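
The check-after-load idea can be sketched without ZEO or ZODB at all.
The model below is purely illustrative: ToyConnection, its on_load
hook, and the use of LookupError in place of ReadConflictError are all
assumptions made for this example, not part of the real Connection
API.

```python
class ToyConnection(object):
    # Hypothetical model: revisions maps oid -> list of (tid, state)
    # pairs, oldest first.  This is not ZODB's Connection.

    def __init__(self, revisions):
        self.revisions = revisions
        self.invalidated = {}  # oid -> tid of the invalidating txn
        self.on_load = None  # optional hook run while a load is in flight

    def invalidate(self, tid, oid):
        self.invalidated.setdefault(oid, tid)

    def load(self, oid):
        tid, state = self.revisions[oid][-1]
        if self.on_load is not None:
            self.on_load(oid, tid)
        # Re-check after the load: an invalidation may have arrived
        # while the request was outstanding.
        if oid in self.invalidated:
            mark = self.invalidated[oid]
            older = [r for r in self.revisions[oid] if r[0] < mark]
            if not older:
                raise LookupError("no earlier revision of %r" % (oid,))
            return older[-1]
        return tid, state


cn = ToyConnection({"b": [(1, 0), (2, 1)]})
# Simulate an invalidation that lands during the load of "b".
cn.on_load = lambda oid, tid: cn.invalidate(tid, oid)
late = cn.load("b")  # falls back to the revision written at tid 1
```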

Rather than add all the complexity of ZEO to these tests, the
MinimalMemoryStorage has a hook.  We'll write a subclass that will
deliver an invalidation when it loads an object.  The hook allows us
to test the Connection code.

>>> class TestStorage(MinimalMemoryStorage):
...    def __init__(self):
...        self.hooked = {}
...        self.count = 0
...        super(TestStorage, self).__init__()
...    def registerDB(self, db, limit):
...        self.db = db
...    def hook(self, oid, tid, version):
...        if oid in self.hooked:
...            self.db.invalidate(tid, {oid:1})
...            self.count += 1

We can execute this test with a single connection, because we're
synthesizing the invalidation that is normally generated by the second
connection.  We need to create two revisions so that there is a
non-current revision to load.

>>> ts = TestStorage()
>>> db = DB(ts)
>>> cn1 = db.open()
>>> txn1 = cn1.setLocalTransaction()
>>> r1 = cn1.root()
>>> r1["a"] = MinPO(0)
>>> r1["b"] = MinPO(0)
>>> cn1.getTransaction().commit()
>>> r1["b"].value = 1
>>> cn1.getTransaction().commit()
>>> cn1.cacheMinimize()

>>> oid = r1["b"]._p_oid
>>> ts.hooked[oid] = 1

Once the oid is hooked, an invalidation will be delivered the next
time it is activated.  The code below activates the object, then
confirms that the hook worked and that the old state was retrieved.

>>> oid in cn1._invalidated
False
>>> r1["b"]._p_state
-1
>>> r1["b"]._p_activate()
>>> oid in cn1._invalidated
True
>>> ts.count
1
>>> r1["b"].value
0

No earlier revision available
-----------------------------

We'll reuse the code from the example above, except that there will
only be a single revision of "b".  As a result, the attempt to
activate "b" will result in a ReadConflictError.

>>> ts = TestStorage()
>>> db = DB(ts)
>>> cn1 = db.open()
>>> txn1 = cn1.setLocalTransaction()
>>> r1 = cn1.root()
>>> r1["a"] = MinPO(0)
>>> r1["b"] = MinPO(0)
>>> cn1.getTransaction().commit()
>>> cn1.cacheMinimize()

>>> oid = r1["b"]._p_oid
>>> ts.hooked[oid] = 1

Once the oid is hooked, an invalidation will be delivered the next
time it is activated.  The code below tries to activate the object,
then confirms that the hook worked and that, with no earlier revision
to fall back on, a ReadConflictError was raised.

>>> oid in cn1._invalidated
False
>>> r1["b"]._p_state
-1
>>> r1["b"]._p_activate()
Traceback (most recent call last):
 ...
ReadConflictError: database read conflict error (oid 0000000000000002, class ZODB.tests.MinPO.MinPO)
>>> oid in cn1._invalidated
True
>>> ts.count
1

Cleanup
-------

The setLocalTransaction() feature creates cyclic trash involving the
Connection and Transaction.  The Transaction has a __del__ method,
which prevents the cycle from being collected.  There's no API for
clearing the Connection's local transaction.

>>> cn1._transaction = None
>>> cn2._transaction = None

"""

import doctest

def test_suite():
    return doctest.DocTestSuite()