    io-wq: decouple work_list protection from the big wqe->lock · 42abc95f
    Hao Xu authored
    wqe->lock is abused: it currently protects acct->work_list, the hash
    state, nr_workers, wqe->free_list and so on. Let's first get the
    work_list out of the wqe->lock mess by introducing a specific lock for
    the work list. This is the first step towards solving the huge
    contention between work insertion and work consumption.
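    As a rough sketch of the idea (a hedged illustration only: the type,
    field and helper names mirror fs/io-wq.c, but this is not the exact
    patch), each io_wqe_acct carries its own raw spinlock that covers
    nothing but its work_list, so enqueue no longer serializes on
    wqe->lock:

        /* Illustrative only; the real definitions live in fs/io-wq.c. */
        struct io_wqe_acct {
                unsigned nr_workers;
                unsigned max_workers;
                int index;                      /* bound or unbound */
                atomic_t nr_running;
                raw_spinlock_t lock;            /* new: protects work_list only */
                struct io_wq_work_list work_list;
                unsigned long flags;
        };

        static void io_wqe_insert_work(struct io_wqe *wqe,
                                       struct io_wq_work *work)
        {
                struct io_wqe_acct *acct = io_work_get_acct(wqe, work);

                /* Insertion takes the per-acct lock, not wqe->lock. */
                raw_spin_lock(&acct->lock);
                wq_list_add_tail(&work->list, &acct->work_list);
                raw_spin_unlock(&acct->lock);
        }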
    Good things:
      - split locking for the bound and unbound work lists
      - reduced contention between work_list access and the (workers') free_list
    
    For the hash stuff: since a work for a given file can never sit in both
    the bound and the unbound work list, the two lists never touch the same
    hash entry, so the new per-list lock is also enough to protect the hash
    state.
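    Consumption follows the same pattern. A hedged, simplified sketch
    (helper names follow fs/io-wq.c; the actual hash-tail bookkeeping is
    omitted) of pulling work under nothing but acct->lock:

        static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
                                                   struct io_worker *worker)
        {
                struct io_wq_work_node *node, *prev;
                struct io_wq_work *work;

                raw_spin_lock(&acct->lock);
                wq_list_for_each(node, prev, &acct->work_list) {
                        work = container_of(node, struct io_wq_work, list);

                        /* Non-hashed work can be taken right away. */
                        if (!io_wq_is_hashed(work)) {
                                wq_list_del(&acct->work_list, node, prev);
                                raw_spin_unlock(&acct->lock);
                                return work;
                        }

                        /*
                         * Hashed work: every work hashed to the same key sits
                         * in this one list, so the hash bookkeeping (left out
                         * of this sketch) can also be done under acct->lock
                         * alone, without taking wqe->lock.
                         */
                }
                raw_spin_unlock(&acct->lock);
                return NULL;
        }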
    
    Results:
    Set max_unbound_worker = 4 and test with an echo server:
    nice -n -15 ./io_uring_echo_server -p 8081 -f -n 1000 -l 16
    (-n: number of connections, -l: workload)
    before this patch:
    Samples: 2M of event 'cycles:ppp', Event count (approx.): 1239982111074
    Overhead  Command          Shared Object         Symbol
      28.59%  iou-wrk-10021    [kernel.vmlinux]      [k] native_queued_spin_lock_slowpath
       8.89%  io_uring_echo_s  [kernel.vmlinux]      [k] native_queued_spin_lock_slowpath
       6.20%  iou-wrk-10021    [kernel.vmlinux]      [k] _raw_spin_lock
       2.45%  io_uring_echo_s  [kernel.vmlinux]      [k] io_prep_async_work
       2.36%  iou-wrk-10021    [kernel.vmlinux]      [k] _raw_spin_lock_irqsave
       2.29%  iou-wrk-10021    [kernel.vmlinux]      [k] io_worker_handle_work
       1.29%  io_uring_echo_s  [kernel.vmlinux]      [k] io_wqe_enqueue
       1.06%  iou-wrk-10021    [kernel.vmlinux]      [k] io_wqe_worker
       1.06%  io_uring_echo_s  [kernel.vmlinux]      [k] _raw_spin_lock
       1.03%  iou-wrk-10021    [kernel.vmlinux]      [k] __schedule
       0.99%  iou-wrk-10021    [kernel.vmlinux]      [k] tcp_sendmsg_locked
    
    with this patch:
    Samples: 1M of event 'cycles:ppp', Event count (approx.): 708446691943
    Overhead  Command          Shared Object         Symbol
  16.86%  iou-wrk-10893    [kernel.vmlinux]      [k] native_queued_spin_lock_slowpath
       9.10%  iou-wrk-10893    [kernel.vmlinux]      [k] _raw_spin_lock
   4.53%  io_uring_echo_s  [kernel.vmlinux]      [k] native_queued_spin_lock_slowpath
       2.87%  iou-wrk-10893    [kernel.vmlinux]      [k] io_worker_handle_work
       2.57%  iou-wrk-10893    [kernel.vmlinux]      [k] _raw_spin_lock_irqsave
       2.56%  io_uring_echo_s  [kernel.vmlinux]      [k] io_prep_async_work
       1.82%  io_uring_echo_s  [kernel.vmlinux]      [k] _raw_spin_lock
       1.33%  iou-wrk-10893    [kernel.vmlinux]      [k] io_wqe_worker
       1.26%  io_uring_echo_s  [kernel.vmlinux]      [k] try_to_wake_up
    
    spin_lock contention (native_queued_spin_lock_slowpath) drops from
    28.59% + 8.89% = 37.48% to 16.86% + 4.53% = 21.39%.
    TPS is similar, while CPU usage drops from almost 400% to about 350%.
    Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
    Link: https://lore.kernel.org/r/20220206095241.121485-2-haoxu@linux.alibaba.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>