    bigfile/virtmem: Do loadblk() with virtmem lock released · f49c11a3
    Kirill Smelkov authored
    loadblk() calls are potentially slow, and the external code that serves the
    call can take other locks in addition to the virtmem lock taken by the
    virtmem subsystem. If those "other locks" are also taken before external
    code calls e.g. fileh_invalidate_page() in a different codepath, a deadlock
    can happen, e.g.
    
          T1                  T2
    
          page-access         invalidation-from-server received
          V -> loadblk
                              Z   <- ClientStorage.invalidateTransaction()
          Z -> zeo.load
                              V   <- fileh_invalidate_page
    
    The solution to avoid deadlock is to call loadblk() with virtmem lock released
    and upon loadblk() completion recheck virtmem data structures carefully.
    
    To make that happen:
    
    - a new page state is introduced:
    
        PAGE_LOADING                (file content loading is in progress)
    
    - virtmem releases virt_lock before calling loadblk() when serving pagefault
    
    - because loading is now done with virtmem lock released:
    
    1. After loading completes we need to recheck fileh/vma data structures
    
       The recheck is done in full - vma_on_pagefault() just asks its driver (see
       VM_RETRY and VM_HANDLED codes) to retry handling the fault completely. This
       should work as the freshly loaded page was just inserted into fileh->pagemap
       and should be found there in the cache on next lookup.
    
       This also works correctly if there was a concurrent change, e.g. the vma
       was unmapped while we were loading the data: the fault is still processed
       correctly, and the loaded data stays in fileh->pagemap (if not used, it
       will eventually be evicted by RAM reclaim).
    
    2. A similar retrying mechanism is used for cases when two threads
       concurrently access the same page and would both try to load the
       corresponding block - only one thread issues the actual loadblk() and the
       other waits for the load to complete by polling with VM_RETRY.
    
    3. To correctly invalidate loading-in-progress pages another new page state
       is introduced:
    
        PAGE_LOADING_INVALIDATED    (file content loading was in progress
                                     while request to invalidate the page came in)
    
       which fileh_invalidate_page() uses to propagate invalidation message to
       loadblk() caller.
    
    4. Block loading can now happen in parallel with other block loading and
       other virtmem operations - e.g. invalidation. Tests for such cases are
       added to test_thread.py
    
    5. virtmem lock now becomes just a regular lock; previously it was
       recursive.
    
       A recursive virtmem lock was needed for cases when code under loadblk()
       could trigger other virtmem calls, e.g. due to GC running and calling
       another VMA dtor that would want to lock virtmem while the virtmem lock
       was already held.
    
       This is no longer needed.
    
    6. To catch double faults we can no longer use just one static variable
       in_on_pagefault. That variable thus becomes thread-local.
    
    7. Old test in test_thread to "test that access vs access don't overlap" no
       longer holds true - and is thus removed.
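    
    The scheme from items 1-3 can be sketched as a small Python model. This is
    an illustration only, not the C code in virtmem: the names FileH, Page,
    PageState and access() are hypothetical, and dropping an invalidated page
    once its load completes is an assumption about how the invalidation reaches
    the loadblk() caller.

```python
import threading
from enum import Enum

class PageState(Enum):
    LOADING = 1               # models PAGE_LOADING
    LOADING_INVALIDATED = 2   # models PAGE_LOADING_INVALIDATED
    LOADED = 3                # hypothetical "content is present" state

VM_RETRY, VM_HANDLED = "VM_RETRY", "VM_HANDLED"

class Page:
    def __init__(self):
        self.state = PageState.LOADING
        self.data = None

class FileH:
    """Hypothetical model of a file handle with a page cache."""
    def __init__(self, loadblk):
        self.lock = threading.Lock()   # regular (non-recursive) lock
        self.pagemap = {}              # pgoffset -> Page
        self.loadblk = loadblk

    def on_pagefault(self, pgoff):
        with self.lock:
            page = self.pagemap.get(pgoff)
            if page is not None:
                if page.state is PageState.LOADED:
                    return VM_HANDLED, page.data
                # another thread is loading this page: the caller polls
                return VM_RETRY, None
            page = Page()              # mark the page as loading
            self.pagemap[pgoff] = page

        # slow external call is made with the lock released
        data = self.loadblk(pgoff)

        with self.lock:
            if page.state is PageState.LOADING_INVALIDATED:
                # invalidated while loading: drop stale data (assumption)
                del self.pagemap[pgoff]
            else:
                page.state = PageState.LOADED
                page.data = data
        # ask the driver to redo the fault from scratch
        return VM_RETRY, None

    def invalidate_page(self, pgoff):
        with self.lock:
            page = self.pagemap.get(pgoff)
            if page is None:
                return
            if page.state is PageState.LOADING:
                page.state = PageState.LOADING_INVALIDATED
            elif page.state is PageState.LOADED:
                del self.pagemap[pgoff]

def access(fileh, pgoff):
    """Hypothetical fault driver: retry until the fault is handled."""
    while True:
        code, data = fileh.on_pagefault(pgoff)
        if code == VM_HANDLED:
            return data
```

    In this model invalidate_page() can run while loadblk() is in progress,
    even from inside loadblk() itself, without blocking on the lock; the
    faulting side then reloads the block on the next retry.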
    
    /cc @Tyagov, @klaus