Commit 61c77326 authored by Shaohua Li, committed by H. Peter Anvin

x86, mm: Avoid unnecessary TLB flush

On x86, the access and dirty bits are set automatically by the CPU when it accesses
memory. By the time we reach the flush_tlb_fix_spurious_fault() call site below,
the dirty bit has already been set in the pte, so no TLB flush is needed. Some
CPUs may still hold a TLB entry without the dirty bit set, but that doesn't
matter: on the next write those CPUs recheck the bit in hardware, with no
software involvement.
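
For context, the hook fires at the tail of handle_pte_fault() in the write-fault
path shown in the last hunk below. The following is a simplified sketch of that
call site (reconstructed, not the complete function), showing only the branch
this patch affects:

	if (ptep_set_access_flags(vma, address, pte, entry,
				  flags & FAULT_FLAG_WRITE)) {
		/* The pte was actually updated: refresh the MMU caches. */
		update_mmu_cache(vma, address, pte);
	} else {
		/*
		 * Spurious write fault: the pte already carries the dirty
		 * bit.  The generic hook still does flush_tlb_page() here;
		 * on x86 the hook is now a no-op, because the CPU rechecks
		 * the dirty bit in hardware on the next write.
		 */
		if (flags & FAULT_FLAG_WRITE)
			flush_tlb_fix_spurious_fault(vma, address);
	}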

On the other hand, flushing the TLB at this point is harmful. In a test that
creates as many threads as there are CPUs, with every thread writing to the same
randomly chosen address within one vma, we measure the total time. On a 4-socket
system the original time is 1.96s, while with the patch it drops to 0.8s; a
2-socket system also shows a 20% reduction. perf shows that a large share of the
time is spent sending and handling the IPIs for the TLB flush.
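
To make the test setup concrete, below is a minimal userspace sketch of the kind
of microbenchmark described above. The commit text only specifies the thread
count (one per CPU) and that all threads write the same address in one vma; the
use of madvise(MADV_DONTNEED) to force repeated write faults, the iteration
count, and all names are illustrative assumptions, not the original test program.

/*
 * Sketch: CPU-count threads all dirtying the same byte of one shared vma,
 * with madvise(MADV_DONTNEED) used (as an assumption) to clear the pte so
 * the write-fault path keeps being exercised.  Total time is printed.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define PAGE_SZ		4096
#define ITERATIONS	100000

static char *page;	/* one shared vma; all threads hit the same byte */

static void *writer(void *arg)
{
	long i;

	for (i = 0; i < ITERATIONS; i++) {
		page[0] = (char)i;			/* dirty the pte   */
		madvise(page, PAGE_SZ, MADV_DONTNEED);	/* force a refault */
	}
	return NULL;
}

int main(void)
{
	long i, cpus = sysconf(_SC_NPROCESSORS_ONLN);
	pthread_t *tids = calloc(cpus, sizeof(*tids));
	struct timespec t0, t1;

	page = mmap(NULL, PAGE_SZ, PROT_READ | PROT_WRITE,
		    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (!tids || page == MAP_FAILED)
		return 1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < cpus; i++)
		pthread_create(&tids[i], NULL, writer, NULL);
	for (i = 0; i < cpus; i++)
		pthread_join(tids[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%ld threads, total time %.2fs\n", cpus,
	       (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
	free(tids);
	return 0;
}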
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
LKML-Reference: <20100816011655.GA362@sli10-desk.sh.intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andrea Archangeli <aarcange@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
parent 76be97c1
@@ -603,6 +603,8 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm,
 	pte_update(mm, addr, ptep);
 }
 
+#define flush_tlb_fix_spurious_fault(vma, address)
+
 /*
  * clone_pgd_range(pgd_t *dst, pgd_t *src, int count);
  *
...
@@ -129,6 +129,10 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 #define move_pte(pte, prot, old_addr, new_addr)	(pte)
 #endif
 
+#ifndef flush_tlb_fix_spurious_fault
+#define flush_tlb_fix_spurious_fault(vma, address) flush_tlb_page(vma, address)
+#endif
+
 #ifndef pgprot_noncached
 #define pgprot_noncached(prot)	(prot)
 #endif
...
@@ -3147,7 +3147,7 @@ static inline int handle_pte_fault(struct mm_struct *mm,
 		 * with threads.
 		 */
 		if (flags & FAULT_FLAG_WRITE)
-			flush_tlb_page(vma, address);
+			flush_tlb_fix_spurious_fault(vma, address);
 	}
 unlock:
 	pte_unmap_unlock(pte, ptl);
...