    powerpc/64s/radix: Optimize flush_tlb_range · cbf09c83
    Nicholas Piggin authored
    Currently for radix, flush_tlb_range flushes the entire PID, because
    the Linux mm code does not tell us about page size here for THP vs
    regular pages. This is quite sub-optimal for small mremap / mprotect
    / change_protection.
    
    So implement va range flushes with two flush passes, one for each
    page size (regular and THP). The second flush has an order of magnitude
    fewer tlbie instructions than the first, so it is a relatively small
    additional cost.
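
As a rough illustration of why the second pass is cheap, the sketch below counts the per-page invalidations each pass would issue for a given range. The 64K base page and 2M THP sizes, the helper names, and the inward alignment of the THP range are assumptions made for illustration; they are not taken from the patch itself.

```c
#include <stdint.h>

/* Assumed page sizes for illustration: 64K base pages, 2M THP. */
#define BASE_PAGE_SHIFT 16
#define THP_PAGE_SHIFT  21

/* Hypothetical helper: invalidations needed by the base-page pass.
 * The range [start, end) is aligned outward to base page boundaries. */
static uint64_t base_pass_flushes(uint64_t start, uint64_t end)
{
    uint64_t size = 1ULL << BASE_PAGE_SHIFT;
    uint64_t first = start & ~(size - 1);
    uint64_t last = (end + size - 1) & ~(size - 1);

    return (last - first) >> BASE_PAGE_SHIFT;
}

/* Hypothetical helper: invalidations needed by the THP pass.
 * Assumption: only huge pages fully contained in [start, end) need a
 * flush, so the range is aligned inward to THP boundaries. */
static uint64_t thp_pass_flushes(uint64_t start, uint64_t end)
{
    uint64_t size = 1ULL << THP_PAGE_SHIFT;
    uint64_t hstart = (start + size - 1) & ~(size - 1);
    uint64_t hend = end & ~(size - 1);

    return hstart < hend ? (hend - hstart) >> THP_PAGE_SHIFT : 0;
}
```

Under these assumptions, an aligned 4 MiB range needs 64 invalidations in the base-page pass and 2 in the THP pass, while a range smaller than one huge page needs no THP invalidations at all.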
    
    There is still room for improvement here with some changes to generic
    APIs. In particular, if mostly THP pages are to be invalidated, the
    small page flushes could be reduced.
    
    Time to mprotect 1 page of memory (after mmap, touch):
    vanilla 2.9us   1.8us
    patched 1.2us   1.6us
    
    Time to mprotect 30 pages of memory (after mmap, touch):
    vanilla 8.2us   7.2us
    patched 6.9us   17.9us
    
    Time to mprotect 34 pages of memory (after mmap, touch):
    vanilla 9.1us   8.0us
    patched 9.0us   8.0us
    
    34 pages is the point at which the invalidation switches from va
    to entire PID, which tlbie can do in a single instruction. This is
    why in the case of 30 pages, the new code runs slower for this test.
    This is a deliberate tradeoff already present in the unmap and THP
    promotion code; the idea is that the benefit comes from avoiding
    flushing the entire TLB for this PID on all threads in the system.
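
The switch-over described above can be sketched as a simple threshold check. The ceiling of 33 is inferred from the statement that 34 pages is where the PID flush takes over; the constant name, enum, and function are hypothetical and only illustrate the decision, not the actual code in tlb-radix.c.

```c
#include <stdint.h>

/* Inferred from the commit message: up to 33 pages are flushed by va,
 * and 34 pages or more fall back to a whole-PID flush. */
#define SINGLE_PAGE_FLUSH_CEILING 33

enum flush_kind {
    FLUSH_VA_RANGE,  /* one tlbie per page, per page size */
    FLUSH_WHOLE_PID  /* a single tlbie for the entire PID */
};

/* Hypothetical helper: pick the flush strategy for a range of nr_pages. */
static enum flush_kind choose_flush(uint64_t nr_pages)
{
    if (nr_pages > SINGLE_PAGE_FLUSH_CEILING)
        return FLUSH_WHOLE_PID;
    return FLUSH_VA_RANGE;
}
```

With this threshold, the 1-page and 30-page tests take the va path, and the 34-page test takes the whole-PID path, which matches the unchanged timings reported for 34 pages.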
    
    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
tlb-radix.c 16.6 KB