• Peter Zijlstra's avatar
    locking/percpu-rwsem: Optimize readers and reduce global impact · 80127a39
    Peter Zijlstra authored
    Currently the percpu-rwsem switches to (global) atomic ops while a
    writer is waiting; which could be quite a while and slows down
    releasing the readers.
    
    This patch cures this problem by ordering the reader-state vs
    reader-count (see the comments in __percpu_down_read() and
    percpu_down_write()). This changes a global atomic op into a full
    memory barrier, which doesn't have the global cacheline contention.
    
    This also enables using the percpu-rwsem with rcu_sync disabled in order
    to bias the implementation differently, reducing the writer latency by
    adding some cost to readers.
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: default avatarOleg Nesterov <oleg@redhat.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: linux-kernel@vger.kernel.org
    [ Fixed modular build. ]
    Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    80127a39
percpu-rwsem.c 5.14 KB