1. 16 Mar, 2022 3 commits
    • io_uring: cache req->apoll->events in req->cflags · 81459350
      Jens Axboe authored
      When we arm poll on behalf of a different type of request, like a network
      receive, then we allocate req->apoll as our poll entry. Running network
      workloads shows io_poll_check_events() as the most expensive part of
      io_uring, and it's all due to having to pull in req->apoll instead of
      just the request which we have hot already.
      
      Cache poll->events in req->cflags, which isn't used until the request
      completes anyway. This isn't strictly needed for regular poll, where
      req->poll.events is used and thus already hot, but for the sake of
      unification we do it all around.
      
      This saves 3-4% of overhead in certain request workloads.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
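The caching idea in this commit can be illustrated with a minimal user-space sketch. It is not the kernel code: `demo_req`, `demo_poll`, and the `demo_*` helpers are hypothetical stand-ins, and the real structure layouts differ. The sketch only shows the pattern of mirroring a value from a separately allocated object into a field of the already-hot request.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct demo_poll {
    uint32_t events;            /* poll mask, stored in a separate allocation */
};

struct demo_req {
    struct demo_poll *apoll;    /* cold: following this pointer may miss the cache */
    uint32_t cflags;            /* hot: unused until completion, so it can hold a copy */
};

/* Arm poll: store the mask in the poll entry and mirror it into the request. */
static int demo_arm_poll(struct demo_req *req, uint32_t events)
{
    assert(req != NULL);
    req->apoll = malloc(sizeof(*req->apoll));
    if (!req->apoll)
        return -1;
    req->apoll->events = events;
    req->cflags = events;       /* cached copy, read on the hot path */
    return 0;
}

/* Without the cache: the check has to dereference req->apoll. */
static uint32_t demo_events_slow(const struct demo_req *req)
{
    return req->apoll->events;
}

/* With the cache: the check touches only the request itself. */
static uint32_t demo_events_fast(const struct demo_req *req)
{
    return req->cflags;
}

/* Release the poll entry once the request is done with it. */
static void demo_disarm_poll(struct demo_req *req)
{
    free(req->apoll);
    req->apoll = NULL;
}
```

Both accessors return the same mask; the difference is only in which memory they touch. The cached field has to be overwritten with the real completion flags before the request completes, which is why the commit picks a field that is unused until then.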
    • io_uring: move req->poll_refs into previous struct hole · 521d61fc
      Jens Axboe authored
      This serves two purposes:
      
      - We now have the last cacheline mostly unused for generic workloads,
        instead of having to pull in the poll refs explicitly for workloads
        that rely on poll arming.
      
      - It shrinks the io_kiocb from 232 to 224 bytes.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
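How moving a field into a padding hole shrinks a structure can be shown with a small sketch. The two structs below are hypothetical and far smaller than io_kiocb; only the field name `poll_refs` is borrowed from the commit. The exact sizes depend on the ABI, so the code asserts only what holds generally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with a hole: on a typical LP64 ABI the compiler
 * inserts 4 bytes of padding after 'flags' so that 'file' is 8-byte aligned,
 * and 4 more bytes of tail padding after 'poll_refs'. */
struct demo_before {
    uint32_t flags;
    void    *file;
    uint64_t user_data;
    uint32_t poll_refs;
};

/* Same fields, with 'poll_refs' moved into the former hole after 'flags'. */
struct demo_after {
    uint32_t flags;
    uint32_t poll_refs;
    void    *file;
    uint64_t user_data;
};

/* Reordering never makes this particular layout larger. */
_Static_assert(sizeof(struct demo_after) <= sizeof(struct demo_before),
               "filling the hole must not grow the struct");

/* Number of bytes recovered by the reordering on the current ABI. */
static size_t demo_bytes_saved(void)
{
    return sizeof(struct demo_before) - sizeof(struct demo_after);
}
```

On x86-64 Linux the two structs are typically 32 and 24 bytes, so `demo_bytes_saved()` returns 8, the same difference as the 232 to 224 change described above. In a kernel tree, a tool such as pahole is the usual way to find holes like this.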
    • io_uring: make tracing format consistent · 052ebf1f
      Dylan Yudaken authored
      Make the tracing formatting for user_data and flags consistent.
      
      Having consistent formatting allows one for example to grep for a specific
      user_data/flags and be able to trace a single sqe through easily.
      
      Change user_data to 0x%llx and flags to 0x%x everywhere. The '0x' is
      useful to disambiguate for example "user_data 100".
      
      Additionally remove the '=' for flags in io_uring_req_failed, again for consistency.
      Signed-off-by: Dylan Yudaken <dylany@fb.com>
      Link: https://lore.kernel.org/r/20220316095204.2191498-1-dylany@fb.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
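The effect of the `0x%llx` and `0x%x` specifiers can be sketched in user space with `snprintf`. The helper name `demo_format_trace` and the surrounding message text are assumptions for illustration; the real tracepoint format strings are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format user_data and flags the way the commit describes: hexadecimal with
 * an explicit "0x" prefix, so that a value such as 0x100 cannot be mistaken
 * for decimal 100. Returns the snprintf() result. */
static int demo_format_trace(char *buf, size_t len,
                             unsigned long long user_data, unsigned int flags)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "user_data 0x%llx, flags 0x%x", user_data, flags);
}
```

Calling `demo_format_trace(buf, sizeof(buf), 256, 4)` writes `user_data 0x100, flags 0x4` into `buf`. With one fixed format across all trace events, a single search for `user_data 0x100` matches every line for that sqe.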
  2. 15 Mar, 2022 1 commit
    • io_uring: recycle apoll_poll entries · 4d9237e3
      Jens Axboe authored
      Particularly for networked workloads, io_uring intensively uses its
      poll based backend to get a notification when data/space is available.
      Profiling workloads, we see 3-4% of alloc+free that is directly attributed
      to just the apoll allocation and free (and the rest being skb alloc+free).
      
      For the fast path, we have ctx->uring_lock held already for both issue
      and the inline completions, and we can utilize that to avoid any extra
      locking needed to have a basic recycling cache for the apoll entries on
      both the alloc and free side.
      
      Double poll still requires an allocation. But those are rare and not
      a fast path item.
      
      With the simple cache in place, we see a 3-4% reduction in overhead for
      the workload.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
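A lock-free-by-construction recycling cache of this kind can be sketched as a simple free list. This is a user-space illustration, not the kernel implementation: `demo_ctx`, `demo_apoll`, the `demo_*` helpers, and the `DEMO_CACHE_MAX` cap are all hypothetical. The sketch assumes the caller already holds the lock that protects the context, which stands in for ctx->uring_lock, so the cache itself takes no lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_CACHE_MAX 32       /* arbitrary cap chosen for this sketch */

struct demo_apoll {
    struct demo_apoll *next;    /* link, used only while the entry is cached */
    unsigned int events;
};

struct demo_ctx {
    struct demo_apoll *cache_head;  /* singly linked list of free entries */
    unsigned int cached;            /* number of entries in the list */
};

/* Get an entry: reuse a cached one if possible, otherwise allocate.
 * Caller must hold the context lock. */
static struct demo_apoll *demo_apoll_get(struct demo_ctx *ctx)
{
    struct demo_apoll *ap = ctx->cache_head;

    assert(ctx != NULL);
    if (ap) {
        ctx->cache_head = ap->next;
        ctx->cached--;
        return ap;
    }
    return malloc(sizeof(*ap));
}

/* Return an entry: keep it for reuse unless the cache is full.
 * Caller must hold the context lock. */
static void demo_apoll_put(struct demo_ctx *ctx, struct demo_apoll *ap)
{
    if (ctx->cached >= DEMO_CACHE_MAX) {
        free(ap);
        return;
    }
    ap->next = ctx->cache_head;
    ctx->cache_head = ap;
    ctx->cached++;
}

/* Free everything still cached, for example at context teardown. */
static void demo_cache_flush(struct demo_ctx *ctx)
{
    while (ctx->cache_head) {
        struct demo_apoll *ap = ctx->cache_head;

        ctx->cache_head = ap->next;
        free(ap);
    }
    ctx->cached = 0;
}
```

A put followed by a get hands back the same entry without touching the allocator, which is the saving the commit message measures. Paths that cannot rely on the lock being held would still need a plain allocation and free.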
  3. 12 Mar, 2022 1 commit
  4. 10 Mar, 2022 25 commits
  5. 06 Mar, 2022 5 commits
    • Linux 5.17-rc7 · ffb217a1
      Linus Torvalds authored
    • Merge tag 'for-5.17-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 3ee65c0f
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "A few more fixes for various problems that have user visible effects
        or seem to be urgent:
      
         - fix corruption when combining DIO and non-blocking io_uring over
           multiple extents (seen on MariaDB)
      
         - fix relocation crash due to premature return from commit
      
         - fix quota deadlock between rescan and qgroup removal
      
         - fix item data bounds checks in tree-checker (found on a fuzzed
           image)
      
         - fix fsync of prealloc extents after EOF
      
         - add missing run of delayed items after unlink during log replay
      
         - don't start relocation until snapshot drop is finished
      
         - fix reversed condition for subpage writers locking
      
         - fix warning on page error"
      
      * tag 'for-5.17-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fallback to blocking mode when doing async dio over multiple extents
        btrfs: add missing run of delayed items after unlink during log replay
        btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
        btrfs: fix relocation crash due to premature return from btrfs_commit_transaction()
        btrfs: do not start relocation until in progress drops are done
        btrfs: tree-checker: use u64 for item data end to avoid overflow
        btrfs: do not WARN_ON() if we have PageError set
        btrfs: fix lost prealloc extents beyond eof after full fsync
        btrfs: subpage: fix a wrong check on subpage->writers
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f81664f7
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "x86 guest:
      
         - Tweaks to the paravirtualization code, to avoid using them when
           they're pointless or harmful
      
        x86 host:
      
         - Fix for SRCU lockdep splat
      
         - Brown paper bag fix for the propagation of errno"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: pull kvm->srcu read-side to kvm_arch_vcpu_ioctl_run
        KVM: x86/mmu: Passing up the error state of mmu_alloc_shadow_roots()
        KVM: x86: Yield to IPI target vCPU only if it is busy
        x86/kvmclock: Fix Hyper-V Isolated VM's boot issue when vCPUs > 64
        x86/kvm: Don't waste memory if kvmclock is disabled
        x86/kvm: Don't use PV TLB/yield when mwait is advertised
    • Merge tag 'powerpc-5.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 9bdeaca1
      Linus Torvalds authored
      Pull powerpc fix from Michael Ellerman:
       "Fix build failure when CONFIG_PPC_64S_HASH_MMU is not set.
      
        Thanks to Murilo Opsfelder Araujo, and Erhard F"
      
      * tag 'powerpc-5.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s: Fix build failure when CONFIG_PPC_64S_HASH_MMU is not set
    • Merge tag 'trace-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · f40a33f5
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Fix sorting on old "cpu" value in histograms
      
       - Fix return value of __setup() boot parameter handlers
      
      * tag 'trace-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix return value of __setup handlers
        tracing/histogram: Fix sorting on old "cpu" value
  6. 05 Mar, 2022 5 commits