1. 31 Jul, 2020 3 commits
      loadAt · eb02632c
      Kirill Smelkov authored
      loadAt is a new optional storage interface that is intended to replace loadBefore
      with cleaner and more uniform semantics. Compared to loadBefore, loadAt:
      
      1) returns data=None and the serial of the removal when the loaded object is
         found to be deleted. loadBefore returns only data=None in such a case.
         This loadAt property allows fixing the DemoStorage data corruption where
         whiteouts in the overlay part were not correctly taken into account:
      
         https://github.com/zopefoundation/ZODB/issues/318
      
      2) for regular data records, does not require storages to return next_serial
         in addition to (data, serial). The loadBefore requirement to return both
         serial and next_serial constrains storages unnecessarily: while for
         FileStorage it is free to implement, for other storages it is not - for
         example, for NEO and RelStorage, finding out next_serial after looking up
         the oid@at data record costs one more SQL query:
      
         https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
         https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482
      
         https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
         https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199
      
         next_serial is not only about execution overhead - it is semantically
         redundant and can be removed from the load return value. The reason I
         say that next_serial can be removed is that in ZODB/py the only place
         that I could find where next_serial is used on the client side is in
         the client cache (e.g. in the NEO client cache), and that cache can be
         remade to work without using next_serial at all. In simple words, after a

           loadAt(oid, at)  ->  (data, serial)

         query, the cache can remember data for oid in the [serial, at] range.
      
         Next, when an invalidation message from the server is received, cache
         entries that had at == client_head are extended (at -> new_head) for
         oids that are not present in the invalidation message, while for oids
         that are present in the invalidation message no such extension is done
         (see the sketch below). This allows maintaining the cache in a correct
         state, invalidating it when there is a need to invalidate, and not
         throwing away cache entries that should remain live. This of course
         requires the ZODB server to include both modified and just-created
         objects in invalidation messages
      
           ( https://github.com/zopefoundation/ZEO/pull/160 ,
             https://github.com/zopefoundation/ZODB/pull/319 ).
      
         Switching to loadAt should thus allow storages like NEO and, maybe,
         RelStorage, to issue half as many SQL queries on every object access.
      
         https://github.com/zopefoundation/ZODB/issues/318#issuecomment-657685745
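
         To make the cache scheme above concrete, here is a minimal sketch in
         Python (hypothetical structure, not the actual NEO cache code) of a
         client cache that works without next_serial:

           class ClientCache:
               def __init__(self, head):
                   self.head = head      # last transaction this client knows about
                   self.entries = {}     # {oid: (data, serial, at)}; data valid in [serial, at]

               def remember(self, oid, at, data, serial):
                   # called after loadAt(oid, at) -> (data, serial)
                   self.entries[oid] = (data, serial, at)

               def invalidate(self, tid, changed_oids):
                   # server: transaction tid modified or created objects in changed_oids
                   for oid, (data, serial, at) in list(self.entries.items()):
                       if at == self.head and oid not in changed_oids:
                           # entry is still current - extend its validity to the new head
                           self.entries[oid] = (data, serial, tid)
                   self.head = tid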
      
      In other words, loadAt unifies the return signature to always be
      
         (data, serial)
      
      instead of
      
         POSKeyError				object does not exist at all
         None					object was removed
         (data, serial, next_serial)		regular data record
      
      used by loadBefore.
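
      As a hedged sketch (the authoritative definition is in the patched
      ZODB/interfaces.py; the interface name and the does-not-exist return
      value are assumptions here), the new method can be pictured as:

         from zope.interface import Interface

         class IStorageLoadAt(Interface):  # name assumed for illustration
             def loadAt(oid, at):          # no "self" in zope.interface methods
                 """Load object data as observed at given database state.

                 Always returns (data, serial):
                   (data, serial)  regular data record, created by transaction serial
                   (None, serial)  object was deleted by transaction serial
                   (None, z64)     object did not exist at all as of at (assumed)
                 """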
      
      This patch:
      
      - introduces new interface.
      - introduces a ZODB.utils.loadAt helper that uses storage.loadAt or, if
        the storage does not implement the loadAt interface, mimics loadAt
        semantics via storage.loadBefore to the extent possible and emits a
        corresponding warning (see the sketch after this list).
      - converts MVCCAdapter to use loadAt instead of loadBefore.
      - changes DemoStorage to use loadAt, and this way fixes the
        above-mentioned data corruption issue; adds a corresponding test;
        converts DemoStorage.loadBefore into a wrapper around DemoStorage.loadAt.
      - adds loadAt implementation to FileStorage and MappingStorage.
      - adapts other tests/code correspondingly.
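
      A minimal sketch of such a fallback helper (hedged: the real
      implementation is ZODB.utils.loadAt in this patch; the removal-serial
      limitation is exactly what the warning is about):

         import warnings
         from ZODB.utils import p64, u64, z64
         from ZODB.POSException import POSKeyError

         def loadAt(storage, oid, at):
             """loadAt(oid, at) -> (data, serial), mimicked via loadBefore."""
             if hasattr(storage, 'loadAt'):
                 return storage.loadAt(oid, at)
             # loadBefore uses a strict "< before" bound; "<= at" is "< at+1"
             try:
                 r = storage.loadBefore(oid, p64(u64(at) + 1))
             except POSKeyError:
                 return None, z64            # object does not exist at all
             if r is None:
                 # object was deleted, but loadBefore cannot tell us the serial
                 # of the removal - the very information that loadAt adds
                 warnings.warn("loadBefore cannot provide the removal serial")
                 return None, z64
             data, serial, _next_serial = r  # next_serial is dropped
             return data, serial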
      
      /cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
      MVCCAdapter.load: Raise ReadConflictError only if pack is running simultaneously · ddfe57eb
      Kirill Smelkov authored
      Currently, when load(oid) finds that the object was deleted, it raises
      ReadConflictError - not POSKeyError - because a pack could be running
      simultaneously and the deletion could result from the pack. In that case
      we want the corresponding transaction to be retried - not failed - by
      raising a ConflictError subclass, for backward-compatibility reasons.
      However, from a semantic point of view it is more logically correct to
      raise POSKeyError when an object is found to be deleted or not yet
      created, and to raise ReadConflictError only if a pack was actually
      running simultaneously and the deletion could result from that pack.
      
      -> Fix MVCCAdapter.load to do this - it now raises ReadConflictError
      only if the MVCCAdapterInstance view appears before the storage pack
      time, which indicates that there could indeed be a conflict between the
      read access and the pack removing the object.
      
      To detect whether a pack was running beyond the MVCCAdapterInstance
      view, we need to teach storage drivers to provide a way to know what the
      last pack time/transaction was. Add an optional IStorageLastPack
      interface with a .lastPack() method to do so. If a storage does not
      implement lastPack, we take the conservative approach and raise
      ReadConflictError unconditionally, as before.
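
      In pseudocode, the new decision looks approximately like this (a sketch
      only; attribute names such as self._at are assumptions for illustration):

          from ZODB.POSException import POSKeyError, ReadConflictError

          def load(self, oid):   # method of MVCCAdapterInstance (sketch)
              data, serial = self._storage.loadAt(oid, self._at)  # self._at: our view
              if data is None:
                  lastPack = getattr(self._storage, 'lastPack', None)
                  if lastPack is None or self._at < lastPack():
                      # a pack ran (or might have run) beyond our view: the
                      # missing revision could have been removed by it
                      # -> retryable conflict
                      raise ReadConflictError(oid=oid)
                  # no pack can explain it: object is deleted or not yet created
                  raise POSKeyError(oid)
              return data, serial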
      
      Add/adapt corresponding tests.
      
      Teach FileStorage, MappingStorage and DemoStorage to implement the new
      interface.
      
      NOTE: storages that implement IMVCCStorage natively already raise
      POSKeyError - not ReadConflictError - when load(oid) finds a deleted
      object. This is so because IMVCCStorages natively provide isolation, via
      e.g. the RDBMS in the case of RelStorage. The isolation property provided
      by the RDBMS guarantees that the connection's view of the database is not
      affected by other changes - e.g. a pack - until the connection's
      transaction is complete.
      
      /cc @jimfulton
      Kill leftovers of pre-MVCC read conflicts · 3a493b01
      Kirill Smelkov authored
      In the early days, before MVCC was introduced, ZODB used to raise
      ReadConflictError on access to an object that was simultaneously changed
      by another client in a concurrent transaction. However, as
      doc/articles/ZODB-overview.rst says:
      
      	Since Zope 2.8 ZODB has implemented **Multi Version Concurrency Control**.
      	This means no more ReadConflictErrors, each transaction is guaranteed to be
      	able to load any object as it was when the transaction begun.
      
      So today the only ways to get a ReadConflictError should be

        1) at commit time, for an object that was requested to stay unchanged
           via checkCurrentSerialInTransaction, and

        2) at plain access time, if a pack running simultaneously with the
           current transaction removes the object revision that we try to load.
      
      The second point is a bit unfortunate: when load discovers that an
      object was deleted or not yet created, it is logically cleaner to raise
      POSKeyError. However, for backward compatibility we still want to raise
      ReadConflictError in this case - please see the comments added to
      MVCCAdapter for details.
      
      Anyway, let's remove the leftovers of handling regular read conflicts
      from the pre-MVCC era:

      Adjust the docstring of ReadConflictError to explicitly describe that
      this error can only happen at commit time for objects requested to be
      current, or at plain access time if a pack is running simultaneously
      under the connection's feet.
      
      There were also leftover code, comment and test bits in Connection,
      interfaces, testmvcc and testZODB, which are corrected/removed
      correspondingly. testZODB actually had ReadConflictTests that were
      completely deactivated: commit b0f992fd ("Removed the mvcc option...";
      2007) moved read-conflict-on-access tests out of ZODBTests, but did not
      activate the moved parts at all because, as that commit says, when MVCC
      is always on unconditionally there are no on-access conflicts:
      
          Removed the mvcc option.  Everybody wants mvcc and removing us lets us
          simplify the code a little. (We'll be able to simplify more when we
          stop supporting versions.)
      
      Today, if I try to manually activate those ReadConflictTests via
      
          @@ -637,6 +637,7 @@ def __init__(self, poisonedjar):
           def test_suite():
               return unittest.TestSuite((
                   unittest.makeSuite(ZODBTests, 'check'),
          +        unittest.makeSuite(ReadConflictTests, 'check'),
                   ))
      
           if __name__ == "__main__":
      
      it fails in a dumb way, showing that these tests have been unmaintained for ages:
      
          Error in test checkReadConflict (ZODB.tests.testZODB.ReadConflictTests)
          Traceback (most recent call last):
            File "/usr/lib/python2.7/unittest/case.py", line 320, in run
              self.setUp()
            File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/testZODB.py", line 451, in setUp
              ZODB.tests.utils.TestCase.setUp(self)
          AttributeError: 'module' object has no attribute 'utils'
      
      Since today ZODB always uses MVCC and there is no way to get a
      ReadConflictError on concurrent plain read/write access, those tests
      should also be gone, together with the old pre-MVCC way of handling
      concurrency.
      
      /cc @jimfulton
      /reviewed-on https://github.com/zopefoundation/ZODB/pull/320
      /reviewed-by @jamadden
  5. 09 Jun, 2020 1 commit
      mvccadapter: check if the last TID changed without invalidation · 4a6b0283
      Julien Muchembled authored
      Since commit b5895a5c ("mvccadapter:
      fix race with invalidations when starting a new transaction"),
      a ZEO test fails as follows:
      
          File "src/ZEO/tests/drop_cache_rather_than_verify.txt", line 114, in drop_cache_rather_than_verify.txt
          Failed example:
              conn.root()[1].x
          Expected:
              6
          Got:
              1
      
      Earlier in the test, the ZEO server is restarted and then another
      client commits. While disconnected, the first client does not receive
      invalidations anymore, and the connection gets stuck in the past until
      there's a new commit after it has reconnected. It was possible to make
      the test pass with the following patch:
      
      --- a/src/ZEO/ClientStorage.py
      +++ b/src/ZEO/ClientStorage.py
      @@ -357,6 +357,7 @@ def notify_connected(self, conn, info):
      
               # invalidate our db cache
               if self._db is not None:
      +            self._db.invalidate(self.lastTransaction(), ())
                   self._db.invalidateCache()
      
               logger.info("%s %s to storage: %s",
      
      Other implementations like NEO are probably affected the same way.
      
      Rather than changing interfaces in a backward-incompatible way,
      this commit reverts to the original behaviour, and all the changes
      that were done in existing tests are reverted.

      However, the interfaces are clarified about the fact that storage
      implementations must update the value that is returned by
      lastTransaction() at a precise moment: just after invalidate() or
      the tpc_finish callback.
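
      A minimal sketch (hypothetical storage internals, for illustration only)
      of where that update has to happen on the commit path:

          def tpc_finish(self, transaction, func=lambda tid: None):
              tid = self._tid_for(transaction)   # hypothetical: tid decided during vote
              self._store_commit_record(tid)     # hypothetical: make the data durable
              func(tid)                          # let the caller invalidate its caches ...
              self._ltid = tid                   # ... then publish tid via lastTransaction()
              return tid

          def lastTransaction(self):
              return self._ltid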
  8. 31 Mar, 2020 1 commit
      FileStorage: Save committed transaction to disk even if changed data is empty · bb9bf539
      Kirill Smelkov authored
      ZODB tries to avoid saving empty transactions to storage on
      `transaction.commit()`. The way it works is: if no objects were changed
      during the ongoing transaction, ZODB.Connection does not join the current
      TransactionManager, and transaction.commit() performs the two-phase
      commit protocol only on the joined DataManagers. In other words, if no
      objects were changed, no tpc_*() methods are called at all on
      ZODB.Connection at transaction.commit() time.
      
      This way application servers like Zope/ZServer/ERP5/... can have
      something like
      
          try:
              # process incoming request
              transaction.commit()    # processed ok
          except:
              transaction.abort()
              # problem: log + reraise
      
      in top-level code to process requests without creating many on-disk
      transactions with empty data changes just because read-only requests
      were served.
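
      For illustration, a minimal runnable sketch of that mechanism (using
      MappingStorage for brevity):

          import transaction
          from ZODB import DB
          from ZODB.MappingStorage import MappingStorage

          db = DB(MappingStorage())
          conn = db.open()
          _ = conn.root()          # read-only access: no object is modified
          transaction.commit()     # Connection never joined -> no tpc_*() on storage
          conn.close()
          db.close()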
      
      Everything is working as intended.
      
      However, at the storage level, FileStorage currently also checks whether
      the transaction being committed comes with empty data changes, and
      _skips_ saving the transaction to disk *at all* in such cases, even if
      it has been explicitly told to commit the transaction via two-phase
      commit protocol calls made at the storage level.
      
      This creates a situation where, contrary to the promise in
      ZODB/interfaces.py(*), after successful tpc_begin/tpc_vote/tpc_finish()
      calls made at the storage level, the transaction is _not_ made permanent,
      despite the tid of the "committed" transaction being returned to the
      caller. In other words, FileStorage, when asked to commit a transaction,
      even one with empty data changes, reports "ok" and gives a transaction
      ID to the caller without creating a corresponding transaction record on
      disk.
      
      This behaviour

      a) is redundant with the application-level avoidance of empty
         transactions on storage described in the beginning, and

      b) creates problems:
      
      The first problem is that an application that works at the storage level
      might be interested in persisting a transaction, even one with empty data
      changes, just because it wants to save the metadata, similarly to e.g.
      `git commit --allow-empty`.
      
      The other problem is that the application's view and the data in the
      database become inconsistent: the application is told that a transaction
      was created with a corresponding transaction ID, but if the storage is
      actually inspected, e.g. by iteration, the transaction is not there.
      This, in particular, can create problems if the TID of the committed
      transaction is reported elsewhere and a second database client does not
      find the transaction it was told should exist.
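
      A minimal sketch of the scenario (assuming the ZODB 5 storage-level API):

          from ZODB.FileStorage import FileStorage
          from ZODB.Connection import TransactionMetaData

          stor = FileStorage('data.fs')
          txn = TransactionMetaData(description=u'empty, but explicitly committed')

          stor.tpc_begin(txn)         # no store() calls: empty data changes
          stor.tpc_vote(txn)
          tid = stor.tpc_finish(txn)  # tid of the "committed" transaction

          # Before this patch the transaction record was silently skipped and
          # the iteration below would not find tid; after it, the record is on
          # disk as the interface promises.
          print(any(t.tid == tid for t in stor.iterator()))
          stor.close()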
      
      I hit this particular problem with wendelin.core. In wendelin.core,
      there is a custom virtual memory layer that keeps memory in sync with
      data in ZODB. At commit time, the memory is inspected for being dirtied,
      and if a page was changed, the virtual memory layer joins the current
      transaction _and_ forces the corresponding ZODB.Connection - via which
      it will be saving data into ZODB objects - to join the transaction too,
      because it would be too late to join ZODB.Connection after the 2PC
      process has begun(+). One of the formats in which data is saved tries to
      optimize disk space usage, and it can actually happen that even if data
      in RAM was dirtied, the data itself stayed the same, so nothing should
      be saved into ZODB. However, ZODB.Connection is already joined to the
      transaction, and it is hard not to join it, because joining a
      DataManager when the 2PC is already ongoing does not work.
      
      This used to work OK with wendelin.core 1, but with wendelin.core 2 -
      where a separate virtual filesystem is also connected to the database to
      provide the base layer for array mappings - this creates a problem:
      when wcfs (the filesystem) is told to synchronize to view the database
      @tid of the committed transaction, it can wait forever for that, or a
      later, transaction to appear on disk in the database, creating an
      application-level deadlock.
      
      I agree that more effort might be made on the wendelin.core side to
      avoid committing transactions with empty data at the storage level.
      
      However, the cleanest way to fix this problem, in my view, is to fix
      FileStorage itself: if at the storage level it was asked to commit
      something, it should not silently skip doing so, dropping even non-empty
      metadata while returning ok and a committed transaction ID to the caller.
      
      As described in the beginning, this should not create problems for
      application-level ZODB users, while at the storage level the
      implementation now consistently matches the interface and common sense.
      
      ----
      
      (*) tpc_finish: Finish the transaction, making any transaction changes permanent.
          Changes must be made permanent at this point.
          ...
      
          https://github.com/zopefoundation/ZODB/blob/5.5.1-35-gb5895a5c2/src/ZODB/interfaces.py#L828-L831
      
      (+) https://lab.nexedi.com/kirr/wendelin.core/blob/9ff5ed32/bigfile/file_zodb.py#L788-822