Commit d03853d5 authored by Olof Johansson, committed by Linus Torvalds

[PATCH] PPC64: Remove hot busy-wait loop in __hash_page

It turns out that our current __hash_page code will do a very hot busy-wait
loop waiting on _PAGE_BUSY to be cleared.  It even does ldarx/stdcx in the
loop, which will bounce reservations around like crazy if there's more than
one CPU spinning on the same PTE (or even another PTE in the same
reservation granule).  The end result is that each fault takes longer when
there's contention, which in turn increases the chance of another thread
hitting the same fault and also piling up.  Not pretty.

There are two options here:
1. Do an out-of-line busy loop a la spinlocks with just loads (no
   reservations)
2. Just bail and refault if needed.

(2) makes sense here: If the PTE is busy, chances are it's in flux anyway
and the other code path making a change might just be ready to hash it.
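To make option (2) concrete, here is a minimal sketch of the "bail instead of spin" pattern. It is not the kernel code: the real path is PPC64 assembly using ldarx/stdcx., while this uses portable C11 atomics as a stand-in. The names SKETCH_PAGE_BUSY and pte_try_lock, and the bit value, are hypothetical and chosen only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for _PAGE_BUSY; the real bit value differs. */
#define SKETCH_PAGE_BUSY 0x1u

/*
 * Try to take ownership of a PTE by setting the BUSY bit.
 * Returns true with the previous value in *old_out on success.
 * Returns false right away if another CPU already holds BUSY, so the
 * caller can bail and let the access refault rather than spin on it.
 */
static bool pte_try_lock(_Atomic uint64_t *ptep, uint64_t *old_out)
{
    uint64_t old = atomic_load_explicit(ptep, memory_order_relaxed);

    for (;;) {
        if (old & SKETCH_PAGE_BUSY)
            return false;   /* someone else is changing it: bail */

        /* On failure the CAS reloads 'old', and BUSY is re-checked. */
        if (atomic_compare_exchange_weak_explicit(
                ptep, &old, old | SKETCH_PAGE_BUSY,
                memory_order_acquire, memory_order_relaxed)) {
            *old_out = old;
            return true;
        }
    }
}
```

The loop only retries when the compare-and-swap itself fails. Once BUSY is observed it returns immediately, which mirrors the intent of branching to bail_ok instead of back to the load.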

This fixes a stampede seen on a large-ish system where a multithreaded
HPC app faults in the same text pages on several cpus at the same time.

Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent 66faf984
@@ -85,7 +85,10 @@ _GLOBAL(__hash_page)
 	bne-	htab_wrong_access
 	/* Check if PTE is busy */
 	andi.	r0,r31,_PAGE_BUSY
-	bne-	1b
+	/* If so, just bail out and refault if needed. Someone else
+	 * is changing this PTE anyway and might hash it.
+	 */
+	bne-	bail_ok
 	/* Prepare new PTE value (turn access RW into DIRTY, then
 	 * add BUSY,HASHPTE and ACCESSED)
 	 */
@@ -215,6 +218,10 @@ _GLOBAL(htab_call_hpte_remove)
 	/* Try all again */
 	b	htab_insert_pte
 
+bail_ok:
+	li	r3,0
+	b	bail
+
 htab_pte_insert_ok:
 	/* Insert slot number & secondary bit in PTE */
 	rldimi	r30,r3,12,63-15