    x86/mm/tlb: Make lazy TLB mode lazier · 145f573b
    Rik van Riel authored
    Lazy TLB mode can result in an idle CPU being woken up by a TLB flush,
    when all it really needs to do is reload %CR3 at the next context switch,
    assuming no page table pages got freed.
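
    The flush side of that idea can be sketched in plain C. This is an
    illustrative user-space model, not the kernel's code: CPU masks are plain
    bitmasks, the per-CPU lazy flag is an array, and flush_tlb_others() is a
    hypothetical stand-in for the narrowed native_flush_tlb_others() path that
    skips lazy CPUs instead of sending them an IPI.

    ```c
    #include <stdio.h>

    #define NR_CPUS 8

    /* Illustrative per-CPU lazy-TLB flags (assumption: modelled as an array,
     * where the kernel uses per-CPU tlbstate). */
    static int cpu_is_lazy[NR_CPUS];

    /* Sketch of the new flush policy: only CPUs actively using the mm
     * (present in the mask and not lazy) get a flush IPI; lazy CPUs will
     * notice the stale TLB via .tlb_gen at their next context switch. */
    static unsigned flush_tlb_others(unsigned mm_cpumask)
    {
        unsigned ipi_mask = 0;

        for (int cpu = 0; cpu < NR_CPUS; cpu++) {
            if (!(mm_cpumask & (1u << cpu)))
                continue;          /* CPU is not using this mm */
            if (cpu_is_lazy[cpu])
                continue;          /* will reload %CR3 at next switch */
            ipi_mask |= 1u << cpu; /* needs a flush IPI now */
        }
        return ipi_mask;
    }

    int main(void)
    {
        /* CPUs 0-3 are in mm_cpumask(mm); CPUs 1 and 3 are lazy. */
        cpu_is_lazy[1] = 1;
        cpu_is_lazy[3] = 1;
        printf("IPI mask: 0x%x\n", flush_tlb_others(0x0f));  /* 0x5 */
        return 0;
    }
    ```

    Only CPUs 0 and 2 receive an IPI; the lazy CPUs are left asleep, which is
    the whole point of the change.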
    
    Memory ordering is used to prevent race conditions between switch_mm_irqs_off(),
    which checks whether .tlb_gen changed, and the TLB invalidation code, which
    increments .tlb_gen whenever page table entries get invalidated.
    
    The atomic increment in inc_mm_tlb_gen is its own barrier; the context
    switch code adds an explicit barrier between reading tlbstate.is_lazy and
    next->context.tlb_gen.
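
    That ordering can be modelled with C11 atomics. A minimal sketch, assuming
    simplified stand-ins for the kernel structures (the field and function
    names mirror the commit message but are not the kernel's exact
    definitions): the seq_cst read-modify-write in inc_mm_tlb_gen() is a full
    barrier on its own, and the reader places an explicit fence between
    loading is_lazy and loading tlb_gen.

    ```c
    #include <stdatomic.h>
    #include <stdio.h>

    /* Hypothetical, simplified analogue of mm->context. */
    struct mm_context {
        atomic_ulong tlb_gen;   /* bumped on every PTE invalidation */
    };

    /* Hypothetical, simplified analogue of per-CPU tlbstate. */
    struct tlb_state {
        _Atomic int is_lazy;          /* CPU is in lazy TLB mode */
        unsigned long seen_tlb_gen;   /* generation last loaded into the TLB */
    };

    /* Analogue of inc_mm_tlb_gen(): the seq_cst RMW is its own full
     * memory barrier, like the kernel's atomic64_inc_return(). */
    static unsigned long inc_mm_tlb_gen(struct mm_context *ctx)
    {
        return atomic_fetch_add(&ctx->tlb_gen, 1) + 1;
    }

    /* Analogue of the switch_mm_irqs_off() check: load is_lazy, then a
     * full fence, then load tlb_gen, so the two loads cannot be reordered
     * against the flusher's increment. */
    static int needs_flush(struct tlb_state *ts, struct mm_context *ctx)
    {
        int lazy = atomic_load(&ts->is_lazy);
        atomic_thread_fence(memory_order_seq_cst);  /* the explicit barrier */
        unsigned long gen = atomic_load(&ctx->tlb_gen);

        return lazy && gen != ts->seen_tlb_gen;
    }

    int main(void)
    {
        struct mm_context ctx = { .tlb_gen = 1 };
        struct tlb_state ts = { .is_lazy = 1, .seen_tlb_gen = 1 };

        printf("before invalidation: flush=%d\n", needs_flush(&ts, &ctx));
        inc_mm_tlb_gen(&ctx);   /* another CPU invalidated PTEs */
        printf("after invalidation:  flush=%d\n", needs_flush(&ts, &ctx));
        return 0;
    }
    ```

    Either the flusher sees the CPU as non-lazy and sends an IPI, or the lazy
    CPU's later load of tlb_gen observes the increment and it flushes itself
    at the context switch; the barriers close the window between the two.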
    
    CPUs in lazy TLB mode remain part of the mm_cpumask(mm), both because
    that allows TLB flush IPIs to be sent at page table freeing time, and
    because the cache line bouncing on the mm_cpumask(mm) was responsible
    for about half the CPU use in switch_mm_irqs_off().
    
    We can change native_flush_tlb_others() without touching other
    (paravirt) implementations of flush_tlb_others() because we'll be
    flushing less. The existing implementations flush more and are
    therefore still correct.
    
    Cc: npiggin@gmail.com
    Cc: mingo@kernel.org
    Cc: will.deacon@arm.com
    Cc: kernel-team@fb.com
    Cc: luto@kernel.org
    Cc: hpa@zytor.com
    Tested-by: Song Liu <songliubraving@fb.com>
    Signed-off-by: Rik van Riel <riel@surriel.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/20180926035844.1420-8-riel@surriel.com