1. 31 Jul, 2020 3 commits
      loadAt · eb02632c
      Kirill Smelkov authored
      loadAt is a new optional storage interface that is intended to replace loadBefore
      with cleaner and more uniform semantics. Compared to loadBefore, loadAt:
      
      1) returns data=None and the serial of the removal when the loaded object is
         found to be deleted. loadBefore returns only data=None in such a case.
         This loadAt property allows fixing the DemoStorage data corruption where
         whiteouts in the overlay part were not correctly taken into account:
      
         https://github.com/zopefoundation/ZODB/issues/318
      
      2) for regular data records, does not require storages to return next_serial
         in addition to (data, serial). The loadBefore requirement to return both
         serial and next_serial constrains storages unnecessarily: while for
         FileStorage it is free to implement, for other storages it is not - for
         example, for NEO and RelStorage, finding out next_serial after looking up
         the oid@at data record costs one more SQL query:
      
         https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
         https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482
      
         https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
         https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199
      
         next_serial is not only about execution overhead - it is semantically
         redundant and can be removed from the load return value. The reason I
         say that next_serial can be removed is that in ZODB/py the only place
         that I could find where next_serial is used on the client side is in
         the client cache (e.g. in the NEO client cache), and that cache can be
         remade to work without using next_serial at all. In simple words, after a

           loadAt(oid, at)  ->  (data, serial)

         query, the cache can remember data for oid in the [serial, at] range.
      
         Next, when an invalidation message from the server is received, cache
         entries that had at == client_head are extended (at -> new_head) for
         oids that are not present in the invalidation message, while for oids
         that are present in the invalidation message no such extension is done
         (see the sketch below). This allows maintaining the cache in a correct
         state, invalidating it when there is a need to invalidate, and not
         throwing away cache entries that should remain live. This of course
         requires the ZODB server to include both modified and just-created
         objects in invalidation messages
      
           ( https://github.com/zopefoundation/ZEO/pull/160 ,
             https://github.com/zopefoundation/ZODB/pull/319 ).
      
         Switching to loadAt should thus allow storages like NEO and, maybe,
         RelStorage, to issue half as many SQL queries on every object access.
      
         https://github.com/zopefoundation/ZODB/issues/318#issuecomment-657685745
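
         To make the cache scheme above concrete, here is a minimal sketch in
         Python (hypothetical structure, not the actual NEO cache code) of a
         client cache that works without next_serial:

           class ClientCache:
               def __init__(self, head):
                   self.head = head      # last transaction this client knows about
                   self.entries = {}     # {oid: (data, serial, at)}; data valid in [serial, at]

               def remember(self, oid, at, data, serial):
                   # called after loadAt(oid, at) -> (data, serial)
                   self.entries[oid] = (data, serial, at)

               def invalidate(self, tid, changed_oids):
                   # server: transaction tid modified or created objects in changed_oids
                   for oid, (data, serial, at) in list(self.entries.items()):
                       if at == self.head and oid not in changed_oids:
                           # entry is still current - extend its validity to the new head
                           self.entries[oid] = (data, serial, tid)
                   self.head = tid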
      
      In other words, loadAt unifies the return signature to always be
      
         (data, serial)
      
      instead of
      
         POSKeyError				object does not exist at all
         None					object was removed
         (data, serial, next_serial)		regular data record
      
      used by loadBefore.
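
      As a hedged sketch (the authoritative definition is in the patched
      ZODB/interfaces.py; the interface name and the does-not-exist return
      value are assumptions here), the new method can be pictured as:

         from zope.interface import Interface

         class IStorageLoadAt(Interface):  # name assumed for illustration
             def loadAt(oid, at):          # no "self" in zope.interface methods
                 """Load object data as observed at given database state.

                 Always returns (data, serial):
                   (data, serial)  regular data record, created by transaction serial
                   (None, serial)  object was deleted by transaction serial
                   (None, z64)     object did not exist at all as of at (assumed)
                 """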
      
      This patch:
      
      - introduces new interface.
      - introduces a ZODB.utils.loadAt helper that uses storage.loadAt or, if
        the storage does not implement the loadAt interface, mimics loadAt
        semantics via storage.loadBefore to the extent possible and emits a
        corresponding warning (see the sketch after this list).
      - converts MVCCAdapter to use loadAt instead of loadBefore.
      - changes DemoStorage to use loadAt, and this way fixes the
        above-mentioned data corruption issue; adds a corresponding test;
        converts DemoStorage.loadBefore into a wrapper around DemoStorage.loadAt.
      - adds loadAt implementation to FileStorage and MappingStorage.
      - adapts other tests/code correspondingly.
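
      A minimal sketch of such a fallback helper (hedged: the real
      implementation is ZODB.utils.loadAt in this patch; the removal-serial
      limitation is exactly what the warning is about):

         import warnings
         from ZODB.utils import p64, u64, z64
         from ZODB.POSException import POSKeyError

         def loadAt(storage, oid, at):
             """loadAt(oid, at) -> (data, serial), mimicked via loadBefore."""
             if hasattr(storage, 'loadAt'):
                 return storage.loadAt(oid, at)
             # loadBefore uses a strict "< before" bound; "<= at" is "< at+1"
             try:
                 r = storage.loadBefore(oid, p64(u64(at) + 1))
             except POSKeyError:
                 return None, z64            # object does not exist at all
             if r is None:
                 # object was deleted, but loadBefore cannot tell us the serial
                 # of the removal - the very information that loadAt adds
                 warnings.warn("loadBefore cannot provide the removal serial")
                 return None, z64
             data, serial, _next_serial = r  # next_serial is dropped
             return data, serial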
      
      /cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
      MVCCAdapter.load: Raise ReadConflictError only if pack is running simultaneously · ddfe57eb
      Kirill Smelkov authored
      Currently, when load(oid) finds that the object was deleted, it raises
      ReadConflictError - not POSKeyError - because a pack could be running
      simultaneously and the deletion could result from the pack. In that case
      we want the corresponding transaction to be retried - not failed - by
      raising a ConflictError subclass, for backward-compatibility reasons.
      However, from a semantic point of view it is more logically correct to
      raise POSKeyError when an object is found to be deleted or not yet
      created, and to raise ReadConflictError only if a pack was actually
      running simultaneously and the deletion could result from that pack.
      
      -> Fix MVCCAdapter.load to do this - it now raises ReadConflictError
      only if the MVCCAdapterInstance view appears before the storage pack
      time, which indicates that there could indeed be a conflict between the
      read access and the pack removing the object.
      
      To detect whether a pack was running beyond the MVCCAdapterInstance
      view, we need to teach storage drivers to provide a way to know what the
      last pack time/transaction was. Add an optional IStorageLastPack
      interface with a .lastPack() method to do so. If a storage does not
      implement lastPack, we take the conservative approach and raise
      ReadConflictError unconditionally, as before.
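
      In pseudocode, the new decision looks approximately like this (a sketch
      only; attribute names such as self._at are assumptions for illustration):

          from ZODB.POSException import POSKeyError, ReadConflictError

          def load(self, oid):   # method of MVCCAdapterInstance (sketch)
              data, serial = self._storage.loadAt(oid, self._at)  # self._at: our view
              if data is None:
                  lastPack = getattr(self._storage, 'lastPack', None)
                  if lastPack is None or self._at < lastPack():
                      # a pack ran (or might have run) beyond our view: the
                      # missing revision could have been removed by it
                      # -> retryable conflict
                      raise ReadConflictError(oid=oid)
                  # no pack can explain it: object is deleted or not yet created
                  raise POSKeyError(oid)
              return data, serial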
      
      Add/adapt corresponding tests.
      
      Teach FileStorage, MappingStorage and DemoStorage to implement the new
      interface.
      
      NOTE: storages that implement IMVCCStorage natively already raise
      POSKeyError - not ReadConflictError - when load(oid) finds a deleted
      object. This is so because IMVCCStorages natively provide isolation, via
      e.g. the RDBMS in the case of RelStorage. The isolation property provided
      by the RDBMS guarantees that the connection's view of the database is not
      affected by other changes - e.g. a pack - until the connection's
      transaction is complete.
      
      /cc @jimfulton
      Kill leftovers of pre-MVCC read conflicts · 3a493b01
      Kirill Smelkov authored
      In the early days, before MVCC was introduced, ZODB used to raise
      ReadConflictError on access to an object that was simultaneously changed
      by another client in a concurrent transaction. However, as
      doc/articles/ZODB-overview.rst says:
      
      	Since Zope 2.8 ZODB has implemented **Multi Version Concurrency Control**.
      	This means no more ReadConflictErrors, each transaction is guaranteed to be
      	able to load any object as it was when the transaction begun.
      
      So today the only ways to get a ReadConflictError should be

        1) at commit time, for an object that was requested to stay unchanged
           via checkCurrentSerialInTransaction, and

        2) at plain access time, if a pack running simultaneously with the
           current transaction removes the object revision that we try to load.
      
      The second point is a bit unfortunate: when load discovers that an
      object was deleted or not yet created, it is logically cleaner to raise
      POSKeyError. However, for backward compatibility we still want to raise
      ReadConflictError in this case - please see the comments added to
      MVCCAdapter for details.
      
      Anyway, let's remove the leftovers of handling regular read conflicts
      from the pre-MVCC era:

      Adjust the docstring of ReadConflictError to explicitly describe that
      this error can only happen at commit time for objects requested to be
      current, or at plain access time if a pack is running simultaneously
      under the connection's feet.
      
      There were also leftover code, comment and test bits in Connection,
      interfaces, testmvcc and testZODB, which are corrected/removed
      correspondingly. testZODB actually had ReadConflictTests that were
      completely deactivated: commit b0f992fd ("Removed the mvcc option...";
      2007) moved read-conflict-on-access tests out of ZODBTests, but did not
      activate the moved parts at all because, as that commit says, when MVCC
      is always on unconditionally there are no on-access conflicts:
      
          Removed the mvcc option.  Everybody wants mvcc and removing us lets us
          simplify the code a little. (We'll be able to simplify more when we
          stop supporting versions.)
      
      Today, if I try to manually activate those ReadConflictTests via
      
          @@ -637,6 +637,7 @@ def __init__(self, poisonedjar):
           def test_suite():
               return unittest.TestSuite((
                   unittest.makeSuite(ZODBTests, 'check'),
          +        unittest.makeSuite(ReadConflictTests, 'check'),
                   ))
      
           if __name__ == "__main__":
      
      it fails in a dumb way, showing that these tests have been unmaintained for ages:
      
          Error in test checkReadConflict (ZODB.tests.testZODB.ReadConflictTests)
          Traceback (most recent call last):
            File "/usr/lib/python2.7/unittest/case.py", line 320, in run
              self.setUp()
            File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/testZODB.py", line 451, in setUp
              ZODB.tests.utils.TestCase.setUp(self)
          AttributeError: 'module' object has no attribute 'utils'
      
      Since today ZODB always uses MVCC and there is no way to get a
      ReadConflictError on concurrent plain read/write access, those tests
      should also be gone, together with the old pre-MVCC way of handling
      concurrency.
      
      /cc @jimfulton
      /reviewed-on https://github.com/zopefoundation/ZODB/pull/320
      /reviewed-by @jamadden
  5. 09 Jun, 2020 1 commit
      mvccadapter: check if the last TID changed without invalidation · 4a6b0283
      Julien Muchembled authored
      Since commit b5895a5c ("mvccadapter:
      fix race with invalidations when starting a new transaction"),
      a ZEO test fails as follows:
      
          File "src/ZEO/tests/drop_cache_rather_than_verify.txt", line 114, in drop_cache_rather_than_verify.txt
          Failed example:
              conn.root()[1].x
          Expected:
              6
          Got:
              1
      
      Earlier in the test, the ZEO server is restarted and then another
      client commits. While disconnected, the first client does not receive
      invalidations anymore, and the connection gets stuck in the past until
      there's a new commit after it has reconnected. It was possible to make
      the test pass with the following patch:
      
      --- a/src/ZEO/ClientStorage.py
      +++ b/src/ZEO/ClientStorage.py
      @@ -357,6 +357,7 @@ def notify_connected(self, conn, info):
      
               # invalidate our db cache
               if self._db is not None:
      +            self._db.invalidate(self.lastTransaction(), ())
                   self._db.invalidateCache()
      
               logger.info("%s %s to storage: %s",
      
      Other implementations like NEO are probably affected the same way.
      
      Rather than changing interfaces in a backward-incompatible way,
      this commit reverts to the original behaviour, and all the changes
      that were done in existing tests are reverted.

      However, the interfaces are clarified about the fact that storage
      implementations must update the value that is returned by
      lastTransaction() at a precise moment: just after invalidate() or
      the tpc_finish callback.
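
      A minimal sketch (hypothetical storage internals, for illustration only)
      of where that update has to happen on the commit path:

          def tpc_finish(self, transaction, func=lambda tid: None):
              tid = self._tid_for(transaction)   # hypothetical: tid decided during vote
              self._store_commit_record(tid)     # hypothetical: make the data durable
              func(tid)                          # let the caller invalidate its caches ...
              self._ltid = tid                   # ... then publish tid via lastTransaction()
              return tid

          def lastTransaction(self):
              return self._ltid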
  8. 31 Mar, 2020 1 commit
      FileStorage: Save committed transaction to disk even if changed data is empty · bb9bf539
      Kirill Smelkov authored
      ZODB tries to avoid saving empty transactions to storage on
      `transaction.commit()`. The way it works is: if no objects were changed
      during the ongoing transaction, ZODB.Connection does not join the current
      TransactionManager, and transaction.commit() performs the two-phase
      commit protocol only on the joined DataManagers. In other words, if no
      objects were changed, no tpc_*() methods are called at all on
      ZODB.Connection at transaction.commit() time.
      
      This way application servers like Zope/ZServer/ERP5/... can have
      something like
      
          try:
              # process incoming request
              transaction.commit()    # processed ok
          except:
              transaction.abort()
              # problem: log + reraise
      
      in top-level code to process requests without creating many on-disk
      transactions with empty data changes just because read-only requests
      were served.
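
      For illustration, a minimal runnable sketch of that mechanism (using
      MappingStorage for brevity):

          import transaction
          from ZODB import DB
          from ZODB.MappingStorage import MappingStorage

          db = DB(MappingStorage())
          conn = db.open()
          _ = conn.root()          # read-only access: no object is modified
          transaction.commit()     # Connection never joined -> no tpc_*() on storage
          conn.close()
          db.close()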
      
      Everything is working as intended.
      
      However, at the storage level, FileStorage currently also checks whether
      the transaction being committed comes with empty data changes, and
      _skips_ saving the transaction to disk *at all* in such cases, even if
      it has been explicitly told to commit the transaction via two-phase
      commit protocol calls made at the storage level.
      
      This creates a situation where, contrary to the promise in
      ZODB/interfaces.py(*), after successful tpc_begin/tpc_vote/tpc_finish()
      calls made at the storage level, the transaction is _not_ made permanent,
      despite the tid of the "committed" transaction being returned to the
      caller. In other words, FileStorage, when asked to commit a transaction,
      even one with empty data changes, reports "ok" and gives a transaction
      ID to the caller without creating a corresponding transaction record on
      disk.
      
      This behaviour

      a) is redundant with the application-level avoidance of empty
         transactions on storage described in the beginning, and

      b) creates problems:
      
      The first problem is that an application that works at the storage level
      might be interested in persisting a transaction, even one with empty data
      changes, just because it wants to save the metadata, similarly to e.g.
      `git commit --allow-empty`.
      
      The other problem is that the application's view and the data in the
      database become inconsistent: the application is told that a transaction
      was created with a corresponding transaction ID, but if the storage is
      actually inspected, e.g. by iteration, the transaction is not there.
      This, in particular, can create problems if the TID of the committed
      transaction is reported elsewhere and a second database client does not
      find the transaction it was told should exist.
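
      A minimal sketch of the scenario (assuming the ZODB 5 storage-level API):

          from ZODB.FileStorage import FileStorage
          from ZODB.Connection import TransactionMetaData

          stor = FileStorage('data.fs')
          txn = TransactionMetaData(description=u'empty, but explicitly committed')

          stor.tpc_begin(txn)         # no store() calls: empty data changes
          stor.tpc_vote(txn)
          tid = stor.tpc_finish(txn)  # tid of the "committed" transaction

          # Before this patch the transaction record was silently skipped and
          # the iteration below would not find tid; after it, the record is on
          # disk as the interface promises.
          print(any(t.tid == tid for t in stor.iterator()))
          stor.close()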
      
      I hit this particular problem with wendelin.core. In wendelin.core,
      there is a custom virtual memory layer that keeps memory in sync with
      data in ZODB. At commit time, the memory is inspected for being dirtied,
      and if a page was changed, the virtual memory layer joins the current
      transaction _and_ forces the corresponding ZODB.Connection - via which
      it will be saving data into ZODB objects - to join the transaction too,
      because it would be too late to join ZODB.Connection after the 2PC
      process has begun(+). One of the formats in which data is saved tries to
      optimize disk space usage, and it can actually happen that even if data
      in RAM was dirtied, the data itself stayed the same, so nothing should
      be saved into ZODB. However, ZODB.Connection is already joined to the
      transaction, and it is hard not to join it, because joining a
      DataManager when the 2PC is already ongoing does not work.
      
      This used to work OK with wendelin.core 1, but with wendelin.core 2 -
      where a separate virtual filesystem is also connected to the database to
      provide the base layer for array mappings - this creates a problem:
      when wcfs (the filesystem) is told to synchronize to view the database
      @tid of the committed transaction, it can wait forever for that, or a
      later, transaction to appear on disk in the database, creating an
      application-level deadlock.
      
      I agree that more effort might be made on the wendelin.core side to
      avoid committing transactions with empty data at the storage level.
      
      However, the cleanest way to fix this problem, in my view, is to fix
      FileStorage itself: if at the storage level it was asked to commit
      something, it should not silently skip doing so, dropping even non-empty
      metadata while returning ok and a committed transaction ID to the caller.
      
      As described in the beginning, this should not create problems for
      application-level ZODB users, while at the storage level the
      implementation now consistently matches the interface and common sense.
      
      ----
      
      (*) tpc_finish: Finish the transaction, making any transaction changes permanent.
          Changes must be made permanent at this point.
          ...
      
          https://github.com/zopefoundation/ZODB/blob/5.5.1-35-gb5895a5c2/src/ZODB/interfaces.py#L828-L831
      
      (+) https://lab.nexedi.com/kirr/wendelin.core/blob/9ff5ed32/bigfile/file_zodb.py#L788-822