• Kirill Smelkov's avatar
    bigfile/zodb: ZODB.Connection can migrate between threads on close/open and we have to care · c7c01ce4
    Kirill Smelkov authored
    Intro
    -----
    
    ZODB maintains pool of opened-to-DB connections. For each request Zope
    opens 1 connection and, after request handling is done, returns the
    connection back to ZODB pool (via Connection.close()). The same
    connection will be opened again for handling some future next request at
    some future time. This next open can happen in different-from-first
    request worker thread.
    
    TransactionManager  (as accessed by transaction.{get,commit,abort,...})
    is thread-local, that is e.g. transaction.get() returns different
    transaction for threads T1 and T2.
    
    When _ZBigFileH hooks into txn_manager to get a chance to run its
    .beforeCompletion() when transaction.commit() is run, it hooks into
    _current_ _thread_ transaction manager.
    
    Without unhooking on connection close, and circumstances where
    connection migrates to different thread this can lead to
    dissynchronization between ZBigFileH managing fileh pages and Connection
    with ZODB objects. And even to data corruption, e.g.
    
        T1              T2
    
        open
        zarray[0] = 11
        commit
        close
    
                        open                # opens connection as closed in T1
        open
                        zarray[0] = 21
        commit
                        abort
    
        close           close
    
    Here zarray[0]=21 _will_ be committed by T1 as part of T1 transaction -
    because when T1 does commit .beforeCompletion() for zarray is invoked,
    sees there is dirty data and propagate changes to zodb objects in
    connection for T2, joins connection for T2 into txn for T1, and then txn
    for t1 when doing two-phase-commit stores modified objects to DB ->
    oops.
    
    ----------------------------------------
    
    To prevent such dissynchronization _ZBigFileH needs to be a DataManager
    which works in sync with the connection it was initially created under -
    on connection close, unregister from transaction_manager, and on
    connection open, register to transaction manager in current, possibly
    different, thread context. Then there won't be incorrect
    beforeCompletion() notification and corruption.
    
    This issue, besides possible data corruption, was probably also exposing
    itself via following ways we've seen in production (everywhere
    connection was migrated from T1 to T2):
    
    1. Exception ZODB.POSException.ConnectionStateError:
            ConnectionStateError('Cannot close a connection joined to a transaction',)
            in <bound method Cleanup.__del__ of <App.ZApplication.Cleanup instance at 0x7f10f4bab050>> ignored
    
         T1          T2
    
                     modify zarray
                     commit/abort    # does not join zarray to T2.txn,
                                     # because .beforeCompletion() is
                                     # registered in T1.txn_manager
    
         commit                      # T1 invokes .beforeCompletion()
         ...                         # beforeCompletion() joins ZBigFileH and zarray._p_jar (=T2.conn) to T1.txn
         ...                         # commit is going on in progress
         ...
         ...         close           # T2 thinks request handling is done and
         ...                         # and closes connection. But T2.conn is
         ...                         # still joined to T1.txn
    
    2. Traceback (most recent call last):
         File ".../wendelin/bigfile/file_zodb.py", line 121, in storeblk
           def storeblk(self, blk, buf):   return self.zself.storeblk(blk, buf)
         File ".../wendelin/bigfile/file_zodb.py", line 220, in storeblk
           zblk._v_blkdata = bytes(buf)    # FIXME does memcpy
         File ".../ZODB/Connection.py", line 857, in setstate
           raise ConnectionStateError(msg)
       ZODB.POSException.ConnectionStateError: Shouldn't load state for 0x1f23a5 when the connection is closed
    
       Similar to "1", but close in T2 happens sooner, so that when T1 does
       the commit and tries to store object to database, Connection refuses to
       do the store:
    
         T1          T2
    
                     modify zarray
                     commit/abort
    
         commit
         ...         close
         ...
         ...
         . obj.store()
         ...
         ...
    
    3. Traceback (most recent call last):
         File ".../wendelin/bigfile/file_zodb.py", line 121, in storeblk
           def storeblk(self, blk, buf):   return self.zself.storeblk(blk, buf)
         File ".../wendelin/bigfile/file_zodb.py", line 221, in storeblk
           zblk._p_changed = True          # if zblk was already in DB: _p_state -> CHANGED
         File ".../ZODB/Connection.py", line 979, in register
           self._register(obj)
         File ".../ZODB/Connection.py", line 989, in _register
           self.transaction_manager.get().join(self)
         File ".../transaction/_transaction.py", line 220, in join
           Status.ACTIVE, Status.DOOMED, self.status))
       ValueError: expected txn status 'Active' or 'Doomed', but it's 'Committing'
    
      ( storeblk() does zblk._p_changed -> Connection.register(zblk) ->
        txn.join() but txn is already committing
    
        IOW storeblk() was invoked with txn.state being already 'Committing' )
    
        T1          T2
    
                    modify obj      # this way T2.conn joins T2.txn
                    modify zarray
    
        commit                      # T1 invokes .beforeCompletion()
        ...                         # beforeCompletion() joins only _ZBigFileH to T1.txn
        ...                         # (because T2.conn is already marked as joined)
        ...
        ...         commit/abort    # T2 does commit/abort - this touches only T2.conn, not ZBigFileH
        ...                         # in particular T2.conn is now reset to be not joined
        ...
        . tpc_begin                 # actual active commit phase of T1 was somehow delayed a bit
        . tpc_commit                # when changes from RAM propagate to ZODB objects associated
        .  storeblk                 # connection (= T2.conn !) is notified again,
        .   zblk = ...              # wants to join txn for it thinks its transaction_manager,
                                    # which when called from under T1 returns *T1* transaction manager for
                                    # which T1.txn is already in state='Committing'
    
    4. Empty transaction committed to NEO
    
       ( different from doing just transaction.commit() without changing
         any data - a connection was joined to txn, but set of modified
         object turned out to be empty )
    
       This is probably a race in Connection._register when both T1 and T2
       go to it at the same time:
    
       https://github.com/zopefoundation/ZODB/blob/3.10/src/ZODB/Connection.py#L988
    
       def _register(self, obj=None):
            if self._needs_to_join:
                self.transaction_manager.get().join(self)
                self._needs_to_join = False
    
        T1                          T2
    
                                    modify zarray
        commit
        ...
        .beforeCompletion           modify obj
        . if T2.conn.needs_join      if T2.conn.needs_join      # race here
        .   T2.conn.join(T1.txn)       T2.conn.join(T2.txn)     # as a result T2.conn joins both T1.txn and T2.txn
        .
        commit finishes             # T2.conn registered-for-commit object list is now empty
    
                                    commit
                                     tpc_begin
                                      storage.tpc_begin
                                     tpc_commit
                                      # no object stored, because for-commit-list is empty
    
    /cc @jm, @klaus, @Tyagov, @vpelletier
    c7c01ce4
file_zodb.py 17.3 KB