    Minimize the amount of work during tpc_finish · 7eb7cf1b
    Julien Muchembled authored
    NEO did not ensure that all data and metadata were written to disk before
    tpc_finish, so it was vulnerable to ENOSPC errors, for example.
    Consequently, some work had to be moved to tpc_vote:
    
    - In tpc_vote, all involved storage nodes are now asked to write all metadata
      to ttrans/tobj and _commit_. Because the final tid is not known yet, the tid
      column of ttrans and tobj now contains NULL and the ttid respectively.
    
    - In tpc_finish, AskLockInformation is still required for read locking;
      ttrans.tid is updated with the final value and this change is _committed_.
    
    - The verification phase is greatly simplified, more reliable and faster. For
      all voted transactions, we can know if a tpc_finish was started by getting
      the final tid from the ttid, either from ttrans or from trans. And we know
      that such transactions can't be partial so we don't need to check oids.
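
The three steps above can be sketched as follows. This is only an illustration, not the code from mysqldb.py: it uses sqlite3 instead of MySQL, the tables are reduced to the columns mentioned in this message, and the function names (setup, vote, finish, get_final_tid) are hypothetical.

```python
import sqlite3

def setup():
    # Simplified, hypothetical schema: only the columns discussed above.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ttrans (tid INTEGER, ttid INTEGER NOT NULL);
        CREATE TABLE tobj (tid INTEGER NOT NULL, oid INTEGER NOT NULL);
        CREATE TABLE trans (tid INTEGER NOT NULL, ttid INTEGER NOT NULL);
    """)
    return conn

def vote(conn, ttid, oids):
    # tpc_vote: the final tid is unknown, so ttrans.tid is NULL
    # and tobj.tid temporarily holds the ttid. Everything is committed.
    conn.execute("INSERT INTO ttrans (tid, ttid) VALUES (NULL, ?)", (ttid,))
    conn.executemany("INSERT INTO tobj (tid, oid) VALUES (?, ?)",
                     [(ttid, oid) for oid in oids])
    conn.commit()

def finish(conn, ttid, tid):
    # tpc_finish: only the final tid has to be written and committed.
    conn.execute("UPDATE ttrans SET tid = ? WHERE ttid = ?", (tid, ttid))
    conn.commit()

def get_final_tid(conn, ttid):
    # Verification: a tpc_finish was started if and only if a final tid
    # can be found for this ttid, either in ttrans or in trans.
    for table in ("ttrans", "trans"):
        row = conn.execute(
            "SELECT tid FROM %s WHERE ttid = ?" % table, (ttid,)).fetchone()
        if row is not None and row[0] is not None:
            return row[0]
    return None
```

In this sketch, a voted transaction for which get_final_tid returns None never reached tpc_finish, and no per-oid check is needed to decide that.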
    
    So in addition to minimizing the risk of failures during tpc_finish, we also
    fix a bug causing the verification phase to discard transactions with objects
    for which readCurrent was called.
    
    On the performance side:
    
    - Although tpc_vote now asks all involved storages, instead of only those
      storing the transaction metadata, the client has been improved to do this
      in parallel. The additional commits are also all done in parallel.
    
    - A possible improvement to compensate for the additional commits is to
      delay the commit done by the unlock.
    
    - By minimizing the time to lock transactions, objects are read-locked for a
      much shorter period. This matters all the more because locked transactions
      must be unlocked in the same order.
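
As a rough illustration of the first point, the sketch below sends the vote request to every involved storage at once and then waits for all answers. It is an assumption-laden example: a thread pool stands in for whatever networking the client really uses, and FakeStorage and vote_all are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

class FakeStorage:
    """Hypothetical stand-in for a connection to one storage node."""
    def __init__(self, name):
        self.name = name
        self.voted = []

    def vote(self, ttid):
        # A real storage would write its metadata and commit here.
        self.voted.append(ttid)
        return self.name

def vote_all(storages, ttid):
    # Submit the vote to all involved storages before waiting for any
    # answer, so that their commits overlap instead of running serially.
    with ThreadPoolExecutor(max_workers=len(storages)) as pool:
        futures = [pool.submit(s.vote, ttid) for s in storages]
        # result() re-raises any exception, so one failed vote fails the call.
        return [f.result() for f in futures]
```

With this shape, the duration of the vote is bounded by the slowest storage rather than by the sum of all commits.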
    
    Transactions with too many modified objects will now time out inside tpc_vote
    instead of tpc_finish. Of course, such transactions may still cause other
    transactions to time out in tpc_finish.
mysqldb.py 30.9 KB