    bigfile/virtmem: Do storeblk() with virtmem lock released · fb4bfb32
    Kirill Smelkov authored
    Like with loadblk (see f49c11a3 "bigfile/virtmem: Do loadblk() with
    virtmem lock released" for the reference) storeblk() calls are
    potentially slow and external code that serves the call can take other
    locks in addition to virtmem lock taken by virtmem subsystem.
    If those "other locks" are also taken in a different codepath before
    external code calls into virtmem, e.g. via fileh_invalidate_page(), a
    deadlock can happen:
    
          T1                  T2
    
          commit              invalidation-from-server received
          V -> storeblk
                              Z   <- ClientStorage.invalidateTransaction()
          Z -> zeo.store
                              V   <- fileh_invalidate_page (of unrelated page)
    
    The solution to avoid deadlock, like for loadblk case, is to call storeblk()
    with virtmem lock released.
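
    The unlock/call/relock pattern can be sketched as below. This is only an
    illustration under stated assumptions: virt_lock_sketch, storeblk_fn,
    storeblk_unlocked() and probe_storeblk() are hypothetical names, and a
    plain pthread mutex stands in for the real virtmem lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the virtmem lock (not the real one). */
static pthread_mutex_t virt_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical signature of the external, potentially slow store callback. */
typedef int (*storeblk_fn)(void *file, long blk, const void *buf);

/*
 * Must be called with virt_lock_sketch held; returns with it held again.
 * The lock is dropped only for the duration of the external call, so the
 * callback is free to take its own locks without creating a V -> Z ordering.
 */
static int storeblk_unlocked(storeblk_fn storeblk, void *file, long blk,
                             const void *buf)
{
    int err;

    pthread_mutex_unlock(&virt_lock_sketch);
    err = storeblk(file, blk, buf);
    pthread_mutex_lock(&virt_lock_sketch);

    /* The caller must now revalidate any state read before the unlock. */
    return err;
}

/* Example callback: succeeds only if the lock is free while it runs. */
static int probe_storeblk(void *file, long blk, const void *buf)
{
    (void)file; (void)blk; (void)buf;

    if (pthread_mutex_trylock(&virt_lock_sketch) != 0)
        return -1;      /* lock was still held by the caller */
    pthread_mutex_unlock(&virt_lock_sketch);
    return 0;
}
```

    Because other threads can run while the lock is dropped, anything the
    caller looked at before the call has to be treated as possibly stale
    afterwards, which is what the rules in this message address.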
    
    However, unlike loadblk, which can be invoked at any time, storeblk is
    invoked at commit time only. So for the storeblk case we use different
    rules to make sure virtmem stays consistent after the virtmem lock is
    retaken:
    
    1. We disallow several parallel writeouts for one fileh. This way the
       dirty-pages handling logic cannot get confused. This restriction is
       also consistent with the ZODB two-phase commit protocol, where commit
       logic for a transaction is invoked/handled from only one thread.
    
    2. For the same reason we disallow discard while writeout is in
       progress. This is also consistent with the ZODB two-phase commit
       protocol, where txn.tpc_abort() is not expected to be called at the
       same time as txn.commit().
    
    3. While writeout is in progress, for that fileh we disallow page
       modifications and page invalidations, because both operations would
       change at least the fileh dirty pages list, which the writeout code
       iterates over while releasing/retaking the virtmem lock. By
       disallowing them we make sure the fileh dirty pages list stays
       constant during the whole fileh writeout.
    
       These restrictions are also consistent with ZODB commit semantics:
    
       - while an object is being stored into ZODB it is not expected it
         will be further modified or explicitly invalidated by client via
         ._p_invalidate()
    
       - server initiated invalidations come into effect only at transaction
         boundaries - when new transaction is started, not during commit time.
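
    The three rules above can be sketched as a per-fileh "writeout in
    progress" flag that every conflicting operation checks. This is a
    simplified illustration, not the code of this commit: struct
    fileh_sketch and all function names are hypothetical, and returning
    -EBUSY is an assumption made so the behaviour is easy to observe; the
    real code may react to a violation differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified file handle; the real one carries more state. */
struct fileh_sketch {
    bool writeout_inprogress;   /* set for the whole duration of a writeout */
    int  ndirty;                /* stand-in for the dirty pages list */
};

/* All functions are assumed to be called with the virtmem lock held. */

/* Rule 1: only one writeout per fileh at a time. */
static int writeout_begin(struct fileh_sketch *fh)
{
    if (fh->writeout_inprogress)
        return -EBUSY;
    fh->writeout_inprogress = true;
    return 0;
}

static void writeout_end(struct fileh_sketch *fh)
{
    fh->writeout_inprogress = false;
}

/* Rule 2: no discard while writeout is in progress. */
static int discard_sketch(struct fileh_sketch *fh)
{
    if (fh->writeout_inprogress)
        return -EBUSY;
    fh->ndirty = 0;
    return 0;
}

/* Rule 3: no page modification while writeout is in progress. */
static int mark_dirty_sketch(struct fileh_sketch *fh)
{
    if (fh->writeout_inprogress)
        return -EBUSY;
    fh->ndirty++;
    return 0;
}

/* Rule 3: no page invalidation while writeout is in progress. */
static int invalidate_sketch(struct fileh_sketch *fh)
{
    if (fh->writeout_inprogress)
        return -EBUSY;
    if (fh->ndirty > 0)
        fh->ndirty--;
    return 0;
}
```

    With such a flag the writeout loop can drop and retake the virtmem lock
    around each store call and still find the dirty list unchanged.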
    
    Also, since storeblk is now called with the virtmem lock released, we can
    no longer directly use an existing page mapping in some vma as the buffer
    to store, because those mappings can go away while the virtmem lock is
    released.
    
    Fixes: nexedi/wendelin.core#6
virtmem.c 27.7 KB