    mm: separate vma->lock from vm_area_struct · c7f8f31c
    Suren Baghdasaryan authored
    Keeping vma->lock inside vm_area_struct causes a performance regression
    during page faults: under contention the lock's count and owner fields
    are constantly updated, and because other vm_area_struct fields used
    during page fault handling sit next to them, the shared cache line
    bounces constantly.  Fix that by moving the lock outside of
    vm_area_struct.
    
    All attempts to keep vma->lock inside vm_area_struct on a separate cache
    line still produced a performance regression, especially on NUMA
    machines.  The smallest regression was achieved when the lock was placed
    in the fourth cache line, but that bloats vm_area_struct to 256 bytes.
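
The size trade-off described above can be illustrated with a small userspace sketch. It is not the kernel's layout: `toy_lock`, `toy_vma_embedded`, `toy_vma_separate`, the 160 bytes of placeholder fields, and the 64-byte cache line are all assumptions chosen only to show why pushing the lock onto the fourth cache line rounds the structure up to 256 bytes, while a pointer to a separately allocated lock does not.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumption: 64-byte cache lines */

/* Hypothetical stand-in for the lock; the kernel's rw_semaphore differs. */
struct toy_lock {
    long count;
    void *owner;
};

/* Variant A: lock embedded and forced onto its own (fourth) cache line. */
struct toy_vma_embedded {
    char other_fields[160];                /* placeholder for the hot VMA fields */
    _Alignas(CACHE_LINE) struct toy_lock lock;
};

/* Variant B: only a pointer lives in the VMA; the lock is allocated elsewhere. */
struct toy_vma_separate {
    char other_fields[160];                /* same placeholder fields */
    struct toy_lock *lock;
};

/* The aligned member starts at the next 64-byte boundary after 160, i.e. 192. */
static_assert(offsetof(struct toy_vma_embedded, lock) == 3 * CACHE_LINE,
              "lock starts on the fourth cache line");

/* The whole struct is padded up to a multiple of its 64-byte alignment. */
static_assert(sizeof(struct toy_vma_embedded) == 4 * CACHE_LINE,
              "embedded variant occupies four cache lines");
```

With these assumed sizes the embedded variant is 256 bytes, whereas the pointer variant is 168 bytes on a typical LP64 target. The separately allocated lock still costs memory of its own, which matches the commit's note that the per-VMA footprint grows.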
    
    Considering the performance and memory impact, a separate lock looks
    like the best option.  It increases the memory footprint of each VMA,
    but that can be optimized later if the new size causes issues.  Note
    that after this change vma_init() no longer allocates or initializes
    vma->lock.  A number of drivers allocate a pseudo VMA on the stack but
    never use the VMA's lock, so it does not need to be allocated for them.
    Future drivers that need the VMA lock should use
    vm_area_alloc()/vm_area_free() to allocate the VMA.
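
The allocation rule in the paragraph above can be sketched in userspace C. This is not the kernel code: `sketch_vma`, `sketch_vma_lock`, and the `sketch_*` functions are hypothetical stand-ins, and malloc()/calloc() take the place of the kernel's allocators. The sketch only shows the split the commit describes: the init helper leaves the lock alone, and the alloc/free pair owns the lock's lifetime.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the separately allocated per-VMA lock. */
struct sketch_vma_lock {
    int locked;
};

/* Hypothetical, heavily reduced VMA: only the fields the sketch needs. */
struct sketch_vma {
    unsigned long vm_start;
    unsigned long vm_end;
    struct sketch_vma_lock *lock; /* NULL unless set by sketch_vm_area_alloc() */
};

/* Like the vma_init() described above: does not allocate or set up the lock. */
static void sketch_vma_init(struct sketch_vma *vma)
{
    memset(vma, 0, sizeof(*vma));
}

/* Allocates the VMA and its lock together; returns NULL on failure. */
static struct sketch_vma *sketch_vm_area_alloc(void)
{
    struct sketch_vma *vma = malloc(sizeof(*vma));

    if (!vma)
        return NULL;
    sketch_vma_init(vma);
    vma->lock = calloc(1, sizeof(*vma->lock));
    if (!vma->lock) {
        free(vma);
        return NULL;
    }
    return vma;
}

/* Releases the lock first, then the VMA that pointed to it. */
static void sketch_vm_area_free(struct sketch_vma *vma)
{
    if (!vma)
        return;
    free(vma->lock);
    free(vma);
}
```

A pseudo VMA declared on the stack and passed only through sketch_vma_init() ends up with a NULL lock pointer, which is acceptable only as long as nothing tries to take that lock; code that needs the lock goes through the alloc/free pair.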
    
    Link: https://lkml.kernel.org/r/20230227173632.3292573-34-surenb@google.com
    Signed-off-by: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
fork.c 83.9 KB