    [ZEO4] Include both modified and just created objects into invalidations · f2fae122
    Kirill Smelkov authored
    ( Upstream commit ab86bd72 )
    
    This is ZEO4 backport of https://github.com/zopefoundation/ZEO/pull/160
    
    It changes ZEO4 to include created objects in invalidation messages
    and, consequently, to no longer skip sending an invalidation message
    when a committed transaction only creates objects. Please see the
    original description below for details.
    
    Extra changes compared to ZEO5 patch:
    
    - tests/servertesting.py: provide callAsyncNoSend to avoid the following crash:
    
      File ".../ZEO/tests/testZEO2.py", line 156, in ZEO.tests.testZEO2.proper_handling_of_errors_in_restart
      Failed example:
          zs1.tpc_finish('1').set_sender(0, conn1)
      Exception raised:
          Traceback (most recent call last):
            File "/usr/lib/python2.7/doctest.py", line 1315, in __run
              compileflags, 1) in test.globs
            File "<doctest ZEO.tests.testZEO2.proper_handling_of_errors_in_restart[18]>", line 1, in <module>
              zs1.tpc_finish('1').set_sender(0, conn1)
            File ".../ZEO/StorageServer.py", line 408, in tpc_finish
              self.storage.tpc_finish(self.transaction, self._invalidate)
            File ".../ZODB/FileStorage/FileStorage.py", line 741, in tpc_finish
              f(self._tid)
            File ".../ZEO/StorageServer.py", line 418, in _invalidate
              self.invalidated, self.get_size_info())
            File ".../ZEO/StorageServer.py", line 1110, in invalidate
              p.client.invalidateTransaction(tid, invalidated)
            File ".../ZEO/StorageServer.py", line 1466, in invalidateTransaction
              self.rpc.callAsyncNoSend('invalidateTransaction', tid, args)
          AttributeError: Connection instance has no attribute 'callAsyncNoSend'
    
    - testZEO2.proper_handling_of_errors_in_restart: adjust it, since an
      invalidation message is now sent; previously it was skipped entirely
      because that test only creates objects.
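
    As an illustration of the first item, a test double for the
    connection only needs to accept the callAsyncNoSend call seen in the
    traceback above. The sketch below is hypothetical: StubConnection
    and its `calls` attribute are names invented here, not the actual
    contents of tests/servertesting.py.

```python
class StubConnection:
    """Hypothetical stand-in for a test connection object (assumption)."""

    def __init__(self, name="connection"):
        self.name = name
        self.calls = []  # recorded (method, args) pairs, for assertions

    def callAsync(self, meth, *args):
        # Record the call instead of sending anything over the wire.
        self.calls.append((meth, args))

    # The server uses callAsyncNoSend when delivering invalidations;
    # for a test double the two can behave identically.
    callAsyncNoSend = callAsync


conn = StubConnection()
conn.callAsyncNoSend("invalidateTransaction", b"tid", [b"oid"])
```

    With such an attribute in place the AttributeError no longer occurs.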
    
    ---- 8< ---- (original description)
    
    Since 1999 (b3805a2f "just getting started"), only modified objects,
    not newly created ones, have been included in ZEO invalidation messages:
    
    https://github.com/zopefoundation/ZEO/commit/b3805a2f#diff-52fb76aaf08a1643cdb8fdaf69e37802R126-R127
    
    In 2000 this behaviour was changed further: no invalidation message
    was sent at all if a transaction only created objects:
    
    https://github.com/zopefoundation/ZEO/commit/230ffbe8#diff-52fb76aaf08a1643cdb8fdaf69e37802L163-R163
    
    In 2016 the latter was reconsidered as a bug and fixed in ZEO5, because
    ZODB5 relies more heavily on MVCC semantics and needs to be notified
    about every transaction committed to the storage in order to properly
    update the ZODB.Connection view:
    
    https://github.com/zopefoundation/ZEO/commit/02943acd#diff-52fb76aaf08a1643cdb8fdaf69e37802L889-R834
    https://github.com/zopefoundation/ZEO/commit/9613f09b
    
    In 2020, with this patch, I'm proposing to reconsider the initial "send
    only modified, not created objects" behaviour as a bug as well, and to
    include both modified and just created objects in invalidation
    messages, for at least the following reasons:
    
    - a ZODB client (not necessarily the native ZODB/py client) can maintain
      a raw cache for the storage. If such a client tries to load an oid at
      a database view where that object did not exist yet, gets a "no object"
      reply, and stores that information in its raw cache, then to properly
      invalidate the cache it needs an invalidation message from the ZODB
      server that _includes_ the created object.
    
    - tools like `zodb watch` [1,2,3] don't work properly (give incorrect
      output) unless all objects modified or created by a transaction are
      included in invalidation messages.

    - similarly to `zodb watch`, a monitoring tool that wants to be
      notified of all created/modified objects won't see the full
      database-change picture, and so won't work properly, without knowing
      which objects were created.
    
    - wendelin.core 2, which exposes data from ZODB BTrees and data objects
      as a virtual filesystem, needs to get invalidation messages with both
      modified and created objects to properly implement its own lazy
      invalidation and isolation protocol for file blocks in the OS cache:
      when a block of a file is accessed, all clients that have this block
      mmapped need to be notified and asked to remmap that block to a
      particular revision of the file, depending on each client's view of
      the filesystem and database [4,5].
    
      To compute where a client needs to remmap the block, the WCFS server
      (which in turn acts as a ZODB client wrt the ZEO/NEO server) needs to
      be able to see whether the client's view of the filesystem is before
      object creation (and then ask that client to pin the block to a hole),
      or after creation (and then ask the client to pin the block to the
      corresponding revision).

      This computation needs the ZODB server to send invalidation messages
      in full: with both modified and just created objects.
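
    The raw-cache case from the first reason above can be illustrated
    with a toy model. This is a sketch under stated assumptions, not ZEO
    or ZODB code: RawCache, its methods, and the dict standing in for the
    server are hypothetical, and revisions (tids) are ignored for brevity.

```python
class RawCache:
    """Toy client-side cache; a None entry records a "no object" reply."""

    def __init__(self):
        self._entries = {}

    def load(self, oid, fetch):
        # Ask the server only on a cache miss; negative replies are cached too.
        if oid not in self._entries:
            self._entries[oid] = fetch(oid)
        return self._entries[oid]

    def invalidate(self, oids):
        # Drop every entry named in an invalidation message.
        for oid in oids:
            self._entries.pop(oid, None)


server = {}                     # hypothetical stand-in for the storage
cache = RawCache()
oid = b"\0" * 7 + b"\x01"

cache.load(oid, server.get)     # object does not exist yet: None is cached
server[oid] = b"data"           # another client creates the object

cache.invalidate([])            # message without created oids
stale = cache.load(oid, server.get)

cache.invalidate([oid])         # message that includes the created oid
fresh = cache.load(oid, server.get)
```

    In this model `stale` is still the cached "no object" answer, while
    `fresh` is the created object's data.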
    
    The patch is simple: it removes the `if serial != b"\0\0\0\0\0\0\0\0"`
    check before queuing an oid into ZEOStorage.invalidated, and adjusts
    the tests accordingly. In my experience, this patch should cause
    neither compatibility breaks nor performance regressions.
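
    The effect of removing that check can be sketched as follows. This is
    a simplified model rather than the actual StorageServer.py code: the
    function names and the (oid, serial) list are hypothetical, and the
    all-zero serial is assumed to mark an object that did not exist before
    the transaction.

```python
Z64 = b"\0\0\0\0\0\0\0\0"  # assumed: previous serial of a just created object


def invalidated_before_patch(stores):
    # Old behaviour: created objects (serial == Z64) are left out.
    return [oid for oid, serial in stores if serial != Z64]


def invalidated_after_patch(stores):
    # New behaviour: every stored oid is queued, created or modified.
    return [oid for oid, serial in stores]


stores = [
    (b"created", Z64),                   # new object
    (b"modified", b"\0" * 7 + b"\x05"),  # existing object, previous serial 5
]
old = invalidated_before_patch(stores)
new = invalidated_after_patch(stores)
```

    Here `old` contains only the modified oid, while `new` contains both.
    For a transaction that only creates objects, `old` is empty, which is
    the case where no invalidation message used to be sent.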
    
    Thanks beforehand,
    Kirill
    
    /cc @jimfulton
    
    [1] https://lab.nexedi.com/kirr/neo/blob/ea53a795/go/zodb/zodbtools/watch.go
    [2] neo@e0d59f5d
    [3] neo@c41c2907
    
    [4] https://lab.nexedi.com/kirr/wendelin.core/blob/1efb5876/wcfs/wcfs.go#L94-182
    [5] https://lab.nexedi.com/kirr/wendelin.core/blob/1efb5876/wcfs/client/wcfs.h#L20-71