Commit d3681e26 authored by Waiman Long, committed by Ingo Molnar

locking/rwsem: Wake up almost all readers in wait queue

When the front of the wait queue is a reader, other readers
immediately following it are woken up at the same time. However, if
there is a writer in between, the readers behind that writer will not
be woken up.

Because of optimistic spinning, the lock acquisition order is not FIFO
anyway. The lock handoff mechanism will ensure that lock starvation
will not happen.

Assuming that the lock hold times of the readers still in the queue
are about the same as those of the readers being woken up, there is
really not much additional cost beyond the extra latency of waking
more tasks in the waker. Therefore, when the first waiter is a reader,
all the readers in the queue, up to a maximum of 256, are woken up to
improve reader throughput. This is somewhat similar in concept to a
phase-fair R/W lock.
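To illustrate the selection logic, here is a minimal user-space sketch
(the struct, helper name and singly-linked list are illustrative
stand-ins, not the kernel's types): the scan now skips writers instead
of stopping at the first one, and caps the batch at MAX_READERS_WAKEUP.

#include <stdio.h>

#define MAX_READERS_WAKEUP	0x100

enum waiter_type { WAITING_FOR_READ, WAITING_FOR_WRITE };

struct waiter {
	enum waiter_type type;
	struct waiter *next;
};

/*
 * Walk the whole wait queue: writers are skipped rather than ending
 * the scan, and at most 'cap' readers are collected for wakeup.
 * Returns the number of readers selected.
 */
static int collect_readers(struct waiter *head, struct waiter **wake_list,
			   int cap)
{
	int woken = 0;
	struct waiter *w;

	for (w = head; w; w = w->next) {
		if (w->type == WAITING_FOR_WRITE)
			continue;	/* writer in between: skip, don't stop */

		wake_list[woken++] = w;
		if (woken >= cap)	/* bound the waker's work */
			break;
	}
	return woken;
}

int main(void)
{
	/* queue: reader -> writer -> reader; both readers get selected now */
	struct waiter r2 = { WAITING_FOR_READ,  NULL };
	struct waiter w1 = { WAITING_FOR_WRITE, &r2  };
	struct waiter r1 = { WAITING_FOR_READ,  &w1  };
	struct waiter *wake_list[MAX_READERS_WAKEUP];

	printf("readers woken: %d\n",
	       collect_readers(&r1, wake_list, MAX_READERS_WAKEUP));
	return 0;
}

With the pre-patch behavior, the scan would have stopped at w1 and
woken only one reader; here both readers are selected in one pass.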

With a locking microbenchmark running on a 5.1-based kernel, the total
locking rates (in kops/s) on an 8-socket IvyBridge-EX system with
equal numbers of readers and writers before and after this patch were
as follows:

   # of Threads  Pre-Patch   Post-patch
   ------------  ---------   ----------
        4          1,641        1,674
        8            731        1,062
       16            564          924
       32             78          300
       64             38          195
      240             50          149

There is no performance gain at low contention levels. At high
contention levels, however, this patch gives a pretty decent
performance boost.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Link: https://lkml.kernel.org/r/20190520205918.22251-11-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
parent 990fa738
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -254,6 +254,14 @@ enum writer_wait_state {
  */
 #define RWSEM_WAIT_TIMEOUT	DIV_ROUND_UP(HZ, 250)
 
+/*
+ * Magic number to batch-wakeup waiting readers, even when writers are
+ * also present in the queue. This both limits the amount of work the
+ * waking thread must do and also prevents any potential counter overflow,
+ * however unlikely.
+ */
+#define MAX_READERS_WAKEUP	0x100
+
 /*
  * handle the lock release when processes blocked on it that can now run
  * - if we come here from up_xxxx(), then the RWSEM_FLAG_WAITERS bit must
@@ -329,11 +337,17 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
 	}
 
 	/*
-	 * Grant an infinite number of read locks to the readers at the front
-	 * of the queue. We know that woken will be at least 1 as we accounted
+	 * Grant up to MAX_READERS_WAKEUP read locks to all the readers in the
+	 * queue. We know that the woken will be at least 1 as we accounted
 	 * for above. Note we increment the 'active part' of the count by the
 	 * number of readers before waking any processes up.
 	 *
+	 * This is an adaptation of the phase-fair R/W locks where at the
+	 * reader phase (first waiter is a reader), all readers are eligible
+	 * to acquire the lock at the same time irrespective of their order
+	 * in the queue. The writers acquire the lock according to their
+	 * order in the queue.
+	 *
 	 * We have to do wakeup in 2 passes to prevent the possibility that
 	 * the reader count may be decremented before it is incremented. It
 	 * is because the to-be-woken waiter may not have slept yet. So it
@@ -345,13 +359,20 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
 	 * 2) For each waiters in the new list, clear waiter->task and
 	 *    put them into wake_q to be woken up later.
 	 */
-	list_for_each_entry(waiter, &sem->wait_list, list) {
+	INIT_LIST_HEAD(&wlist);
+	list_for_each_entry_safe(waiter, tmp, &sem->wait_list, list) {
 		if (waiter->type == RWSEM_WAITING_FOR_WRITE)
-			break;
+			continue;
 
 		woken++;
+		list_move_tail(&waiter->list, &wlist);
+
+		/*
+		 * Limit # of readers that can be woken up per wakeup call.
+		 */
+		if (woken >= MAX_READERS_WAKEUP)
+			break;
 	}
-	list_cut_before(&wlist, &sem->wait_list, &waiter->list);
 
 	adjustment = woken * RWSEM_READER_BIAS - adjustment;
 	lockevent_cond_inc(rwsem_wake_reader, woken);
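For reference, a user-space approximation of the two-pass ordering the
comment above describes (READER_BIAS, the xwaiter struct and the task
flag are illustrative stand-ins, not the kernel's definitions): the
reader count is bumped for the whole batch before any waiter is
released, so a to-be-woken waiter that has not yet slept still
observes its grant in the count.

#include <stdatomic.h>
#include <stdio.h>

#define READER_BIAS	0x100	/* stand-in for RWSEM_READER_BIAS */

struct xwaiter {
	atomic_int task;	/* nonzero while blocked; 0 means granted */
	struct xwaiter *next;
};

static atomic_long count;	/* stand-in for sem->count */

/*
 * Pass 1: publish the grant for the whole batch with a single atomic
 * add, before any waiter is released. Pass 2: release each waiter on
 * the private list with a release store, loosely mirroring the
 * smp_store_release() of waiter->task in rwsem_mark_wake().
 */
static void wake_readers(struct xwaiter *wlist, int woken)
{
	atomic_fetch_add(&count, (long)woken * READER_BIAS);	/* pass 1 */

	for (struct xwaiter *w = wlist; w; w = w->next)		/* pass 2 */
		atomic_store_explicit(&w->task, 0, memory_order_release);
}

int main(void)
{
	struct xwaiter w2 = { 1, NULL };
	struct xwaiter w1 = { 1, &w2 };

	wake_readers(&w1, 2);
	printf("count = %ld\n", atomic_load(&count));	/* 512 = 2 * 0x100 */
	return 0;
}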