1. 01 May, 2024 6 commits
    • Documentation: tracing: add new type '%pd' and '%pD' for kprobe · 5e37460f
      Ye Bin authored
      Similar to printk(), '%pd' fetches the dentry's name from a struct dentry
      pointer, and '%pD' fetches the file's name from a struct file pointer.
      
      Link: https://lore.kernel.org/all/20240322064308.284457-4-yebin10@huawei.com/
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    • tracing/probes: support '%pD' type for print struct file's name · 20fe4d07
      Ye Bin authored
      Like the '%pd' type, this patch adds the print type '%pD' for printing a
      file's name. For example, "name=$arg1:%pD" casts `$arg1` to (struct file *),
      dereferences the "file.f_path.dentry.d_name.name" field and stores it in the
      "name" argument as a kernel string.
      Here is an example:
      [tracing]# echo 'p:testprobe vfs_read name=$arg1:%pD' > kprobe_events
      [tracing]# echo 1 > events/kprobes/testprobe/enable
      [tracing]# grep -q "1" events/kprobes/testprobe/enable
      [tracing]# echo 0 > events/kprobes/testprobe/enable
      [tracing]# grep "vfs_read" trace | grep "enable"
                  grep-15108   [003] .....  5228.328609: testprobe: (vfs_read+0x4/0xbb0) name="enable"
      
      Note that this expects the given argument (e.g. $arg1) to be the address of
      a struct file. The user must ensure this.
      
      Link: https://lore.kernel.org/all/20240322064308.284457-3-yebin10@huawei.com/
      [Masami: replaced "previous patch" with '%pd' type]
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    • tracing/probes: support '%pd' type for print struct dentry's name · d9b15224
      Ye Bin authored
      When locating a fault, the file name often has to be printed from a
      dentry address, and calculating the field offset by hand each time is
      tedious. Similar to printk(), kprobe events now support the print type
      '%pd' for printing a dentry's name. For example, "name=$arg1:%pd" casts
      `$arg1` to (struct dentry *), dereferences the "d_name.name" field and
      stores it in the "name" argument as a kernel string.
      Here is an example:
      [tracing]# echo 'p:testprobe dput name=$arg1:%pd' > kprobe_events
      [tracing]# echo 1 > events/kprobes/testprobe/enable
      [tracing]# grep -q "1" events/kprobes/testprobe/enable
      [tracing]# echo 0 > events/kprobes/testprobe/enable
      [tracing]# cat trace | grep "enable"
                  bash-14844   [002] ..... 16912.889543: testprobe: (dput+0x4/0x30) name="enable"
                  grep-15389   [003] ..... 16922.834182: testprobe: (dput+0x4/0x30) name="enable"
                  grep-15389   [003] ..... 16922.836103: testprobe: (dput+0x4/0x30) name="enable"
                  bash-14844   [001] ..... 16931.820909: testprobe: (dput+0x4/0x30) name="enable"
      
      Note that this expects the given argument (e.g. $arg1) to be the address of
      a struct dentry. The user must ensure this.
      
      Link: https://lore.kernel.org/all/20240322064308.284457-2-yebin10@huawei.com/
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
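The two fetch types described above amount to fixed pointer walks. The following is a userspace mock rather than kernel code: the `mock_*` structs keep only the fields named in the commit messages, and `fetch_pd()` and `fetch_pD()` are hypothetical helpers standing in for the dereference the probe performs. The NULL checks are an addition for this sketch.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the fields
 * named in the commit messages are modelled (assumption). */
struct mock_qstr   { const char *name; };
struct mock_dentry { struct mock_qstr d_name; };
struct mock_path   { struct mock_dentry *dentry; };
struct mock_file   { struct mock_path f_path; };

/* '%pd': treat the argument as (struct dentry *) and follow d_name.name. */
static const char *fetch_pd(const void *arg)
{
	const struct mock_dentry *d = arg;

	return d ? d->d_name.name : NULL;
}

/* '%pD': treat the argument as (struct file *) and follow
 * f_path.dentry.d_name.name. */
static const char *fetch_pD(const void *arg)
{
	const struct mock_file *f = arg;

	return (f && f->f_path.dentry) ? f->f_path.dentry->d_name.name : NULL;
}
```

Because the cast is unchecked, an argument that is not really a dentry or file pointer would be walked just the same, which is why both commit messages leave that guarantee to the user.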
    • uprobes: add speculative lockless system-wide uprobe filter check · cdf355cc
      Andrii Nakryiko authored
      BPF-based uprobe/uretprobe use cases very commonly rely on system-wide
      (not PID-specific) probes. In this case the uprobe's
      trace_uprobe_filter->nr_systemwide counter is bumped at registration
      time, and the actual filtering is short-circuited when the
      uprobe/uretprobe is triggered.
      
      This is a great optimization. The only issue is that, just to get to
      this counter check, the uprobe subsystem takes the read side of
      trace_uprobe_filter->rwlock. This is actually noticeable in
      profiles and is just another point of contention when a uprobe is
      triggered on multiple CPUs simultaneously.
      
      This patch moves the nr_systemwide check outside of the filter list's
      rwlock scope, as the rwlock is meant to protect list modification, while
      the nr_systemwide-based check is speculative and racy already, despite the
      lock (as discussed in [0]). trace_uprobe_filter_remove() and
      trace_uprobe_filter_add() already check filter->nr_systemwide
      explicitly outside of __uprobe_perf_filter, so no modifications are
      required there.
      
      Confirmed with benchmarks based on BPF selftests.
      
      BEFORE (based on changes in previous patch)
      ===========================================
      uprobe-nop     :    2.732 ± 0.022M/s
      uprobe-push    :    2.621 ± 0.016M/s
      uprobe-ret     :    1.105 ± 0.007M/s
      uretprobe-nop  :    1.396 ± 0.007M/s
      uretprobe-push :    1.347 ± 0.008M/s
      uretprobe-ret  :    0.800 ± 0.006M/s
      
      AFTER
      =====
      uprobe-nop     :    2.878 ± 0.017M/s (+5.5%, total +8.3%)
      uprobe-push    :    2.753 ± 0.013M/s (+5.3%, total +10.2%)
      uprobe-ret     :    1.142 ± 0.010M/s (+3.8%, total +3.8%)
      uretprobe-nop  :    1.444 ± 0.008M/s (+3.5%, total +6.5%)
      uretprobe-push :    1.410 ± 0.010M/s (+4.8%, total +7.1%)
      uretprobe-ret  :    0.816 ± 0.002M/s (+2.0%, total +3.9%)
      
      In the above, first percentage value is based on top of previous patch
      (lazy uprobe buffer optimization), while the "total" percentage is
      based on kernel without any of the changes in this patch set.
      
      As can be seen, we get about 4% - 10% speed up, in total, with both lazy
      uprobe buffer and speculative filter check optimizations.
      
        [0] https://lore.kernel.org/bpf/20240313131926.GA19986@redhat.com/
      
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/all/20240318181728.2795838-4-andrii@kernel.org/
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
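The reordering described in this commit can be sketched in userspace C. This is an illustration under assumptions, not the kernel implementation: `toy_rwlock` is a stand-in that only counts read acquisitions (it provides no real locking), and `toy_filter`, `toy_list_match()` and `toy_filter_match()` are hypothetical names. The point is that the counter is read speculatively before any lock is taken, so the system-wide case never touches the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy read lock that only counts acquisitions so the effect is visible.
 * It is NOT a real lock; the kernel uses an rwlock here. */
struct toy_rwlock { int read_acquisitions; };

static void toy_read_lock(struct toy_rwlock *l)   { l->read_acquisitions++; }
static void toy_read_unlock(struct toy_rwlock *l) { (void)l; }

struct toy_filter {
	struct toy_rwlock lock;    /* protects the per-task filter list */
	atomic_int nr_systemwide;  /* > 0 means "trace every task" */
	int filtered_pid;          /* stand-in for the filter list */
};

/* Slow path: consult the list under the read lock. */
static bool toy_list_match(struct toy_filter *f, int pid)
{
	bool hit;

	toy_read_lock(&f->lock);
	hit = (f->filtered_pid == pid);
	toy_read_unlock(&f->lock);
	return hit;
}

/* Fast path first: a racy, speculative read of the counter, done
 * before (and without) taking the lock. */
static bool toy_filter_match(struct toy_filter *f, int pid)
{
	if (atomic_load_explicit(&f->nr_systemwide, memory_order_relaxed) > 0)
		return true;

	return toy_list_match(f, pid);
}
```

With `nr_systemwide` set, `toy_filter_match()` returns without incrementing `read_acquisitions`; only the per-task path pays for the lock.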
    • uprobes: prepare uprobe args buffer lazily · 1b8f85de
      Andrii Nakryiko authored
      uprobe_cpu_buffer and corresponding logic to store uprobe args into it
      are used for uprobes/uretprobes that are created through tracefs or
      perf events.
      
      BPF is yet another user of the uprobe/uretprobe infrastructure, but it
      doesn't need uprobe_cpu_buffer and associated data. For BPF-only use cases
      this buffer handling and preparation is pure overhead. At the same time,
      BPF-only uprobe/uretprobe usage is very common in practice. Also, in
      a lot of cases applications are very sensitive to performance overhead,
      as they might be tracing very high-frequency functions like
      malloc()/free(), so every bit of performance improvement matters.
      
      All that is to say that this uprobe_cpu_buffer preparation is an
      unnecessary overhead that each BPF user of uprobes/uretprobes has to pay.
      This patch changes this by making uprobe_cpu_buffer preparation
      optional. It will happen only if either a tracefs-based or a perf event-based
      uprobe/uretprobe consumer is registered for the given uprobe/uretprobe. For
      BPF-only use cases this step will be skipped.
      
      We used the uprobe/uretprobe benchmark which is part of BPF selftests (see [0])
      to estimate the improvements. We have 3 uprobe and 3 uretprobe
      scenarios, which vary the instruction that is replaced by the uprobe: nop
      (fastest uprobe case), `push rbp` (typical case), and non-simulated
      `ret` instruction (slowest case). The benchmark thread constantly calls
      a user space function in a tight loop. The user space function has an attached
      BPF uprobe or uretprobe program doing nothing but atomic counter
      increments to count the number of triggering calls. The benchmark emits
      throughput in millions of executions per second.
      
      BEFORE these changes
      ====================
      uprobe-nop     :    2.657 ± 0.024M/s
      uprobe-push    :    2.499 ± 0.018M/s
      uprobe-ret     :    1.100 ± 0.006M/s
      uretprobe-nop  :    1.356 ± 0.004M/s
      uretprobe-push :    1.317 ± 0.019M/s
      uretprobe-ret  :    0.785 ± 0.007M/s
      
      AFTER these changes
      ===================
      uprobe-nop     :    2.732 ± 0.022M/s (+2.8%)
      uprobe-push    :    2.621 ± 0.016M/s (+4.9%)
      uprobe-ret     :    1.105 ± 0.007M/s (+0.5%)
      uretprobe-nop  :    1.396 ± 0.007M/s (+2.9%)
      uretprobe-push :    1.347 ± 0.008M/s (+2.3%)
      uretprobe-ret  :    0.800 ± 0.006M/s (+1.9%)
      
      So the improvements on this particular machine seem to be between 2% and 5%.
      
        [0] https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/benchs/bench_trigger.c
      
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/all/20240318181728.2795838-3-andrii@kernel.org/
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
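The idea of making buffer preparation optional can be illustrated with a small lazy-initialization sketch in userspace C. Everything here is hypothetical (`toy_buf`, `toy_prepare_buf()`, `toy_dispatch()`, the consumer enum, the "args" payload); it only shows the pattern of starting with no buffer and preparing it the first time a consumer actually asks for it.

```c
#include <stddef.h>
#include <string.h>

struct toy_buf {
	char data[64];
};

static int toy_prepare_calls; /* counts how often the costly step ran */

/* Prepare the buffer on first use only; later callers reuse it. */
static struct toy_buf *toy_prepare_buf(struct toy_buf *storage,
				       struct toy_buf **cached)
{
	if (*cached)
		return *cached;

	toy_prepare_calls++;
	memset(storage->data, 0, sizeof(storage->data));
	memcpy(storage->data, "args", 5); /* placeholder for stored args */
	*cached = storage;
	return storage;
}

enum toy_consumer { TOY_CONSUMER_BPF, TOY_CONSUMER_TRACEFS };

/* One probe hit: only the tracefs-style consumer asks for the buffer. */
static void toy_dispatch(const enum toy_consumer *consumers, size_t n,
			 struct toy_buf *storage)
{
	struct toy_buf *buf = NULL; /* nothing prepared yet */
	size_t i;

	for (i = 0; i < n; i++) {
		if (consumers[i] == TOY_CONSUMER_TRACEFS)
			(void)toy_prepare_buf(storage, &buf);
		/* BPF-style consumers run without touching the buffer. */
	}
}
```

A hit with only BPF-style consumers leaves `toy_prepare_calls` unchanged, while a hit with several tracefs-style consumers prepares the buffer once and reuses it.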
    • uprobes: encapsulate preparation of uprobe args buffer · 3eaea21b
      Andrii Nakryiko authored
      Move the logic of fetching temporary per-CPU uprobe buffer and storing
      uprobes args into it to a new helper function. Store data size as part
      of this buffer, simplifying interfaces a bit, as now we only pass single
      uprobe_cpu_buffer reference around, instead of pointer + dsize.
      
      This logic was duplicated across uprobe_dispatcher and uretprobe_dispatcher,
      and now will be centralized. All this is also in preparation to make
      this uprobe_cpu_buffer handling logic optional in the next patch.
      
      Link: https://lore.kernel.org/all/20240318181728.2795838-2-andrii@kernel.org/
      [Masami: update for v6.9-rc3 kernel]
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
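The interface simplification described in this commit, passing one buffer reference instead of a pointer plus `dsize`, can be sketched as follows. This is a userspace illustration with hypothetical names (`toy_cpu_buffer`, `toy_prepare()`, `toy_consume()`, `TOY_BUF_MAX`), not the kernel's actual layout; the clamping is an assumption made to keep the sketch safe.

```c
#include <stddef.h>
#include <string.h>

#define TOY_BUF_MAX 32

struct toy_cpu_buffer {
	char   buf[TOY_BUF_MAX];
	size_t dsize; /* number of valid bytes in buf */
};

/* Hypothetical helper: stores the args and records their size in one place. */
static struct toy_cpu_buffer *toy_prepare(struct toy_cpu_buffer *ucb,
					  const char *args, size_t len)
{
	if (len > TOY_BUF_MAX)
		len = TOY_BUF_MAX; /* clamp instead of overflowing */
	memcpy(ucb->buf, args, len);
	ucb->dsize = len;
	return ucb;
}

/* Consumers take the single reference; no separate size parameter. */
static size_t toy_consume(const struct toy_cpu_buffer *ucb,
			  char *out, size_t outlen)
{
	size_t n = ucb->dsize < outlen ? ucb->dsize : outlen;

	memcpy(out, ucb->buf, n);
	return n;
}
```

Keeping the size next to the data means a caller cannot pass a stale or mismatched length, and every dispatcher goes through the same preparation helper.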
  2. 28 Apr, 2024 6 commits
  3. 27 Apr, 2024 9 commits
    • Merge tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux · 2c815938
      Linus Torvalds authored
      Pull Rust fixes from Miguel Ojeda:
      
       - Soundness: make internal functions generated by the 'module!' macro
         inaccessible, do not implement 'Zeroable' for 'Infallible' and
         require 'Send' for the 'Module' trait.
      
       - Build: avoid errors with "empty" files and work around a 'rustdoc' ICE.
      
       - Kconfig: depend on '!CFI_CLANG' and avoid selecting 'CONSTRUCTORS'.
      
       - Code docs: remove non-existing key from 'module!' macro example.
      
       - Docs: trivial rendering fix in arch table.
      
      * tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux:
        rust: remove `params` from `module` macro example
        kbuild: rust: force `alloc` extern to allow "empty" Rust files
        kbuild: rust: remove unneeded `@rustc_cfg` to avoid ICE
        rust: kernel: require `Send` for `Module` implementations
        rust: phy: implement `Send` for `Registration`
        rust: make mutually exclusive with CFI_CLANG
        rust: macros: fix soundness issue in `module!` macro
        rust: init: remove impl Zeroable for Infallible
        docs: rust: fix improper rendering in Arch Support page
        rust: don't select CONSTRUCTORS
    • Merge tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 57865f39
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel
         separation
      
       - A fix to avoid loading rv64/NOMMU kernel past the start of RAM
      
       - A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer
         overflow in the bitmask
      
       - The sud_test kselftest has been fixed to properly swizzle the syscall
         number into the return register, which are not the same on RISC-V
      
       - A fix for a build warning in the perf tools on rv32
      
       - A fix for the CBO selftests, to avoid non-constants leaking into the
         inline asm
      
       - A pair of fixes for T-Head PBMT errata probing, which has been
         renamed MAE by the vendor
      
      * tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2
        perf riscv: Fix the warning due to the incompatible type
        riscv: T-Head: Test availability bit before enabling MAE errata
        riscv: thead: Rename T-Head PBMT to MAE
        selftests: sud_test: return correct emulated syscall value on RISC-V
        riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN
        riscv: Fix loading 64-bit NOMMU kernels past the start of RAM
        riscv: Fix TASK_SIZE on 64-bit NOMMU
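The RISCV_HWPROBE_EXT_ZVFHMIN item above concerns a bitmask whose top 32-bit bit ends up in a signed type. The sketch below shows the general hazard only and is not the actual patch: it assumes a mask bit at position 31 and uses `INT32_MIN` to model a signed 32-bit value with just that bit set. The function names are hypothetical.

```c
#include <stdint.h>

/* A 32-bit signed value with only bit 31 set, widened to 64 bits.
 * The conversion sign-extends, so the upper 32 bits all become 1. */
static uint64_t widen_signed_bit31(void)
{
	int32_t bit = INT32_MIN; /* bit pattern 0x80000000 */

	return (uint64_t)(int64_t)bit;
}

/* Building the mask in an unsigned 64-bit type keeps the upper bits clear. */
static uint64_t widen_unsigned_bit31(void)
{
	return UINT64_C(1) << 31;
}
```

The first function yields a value with 33 bits set instead of one, which would falsely advertise every extension above bit 31 in a 64-bit mask; the second yields exactly the intended single bit.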
    • Merge tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · d43df69f
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Three smb3 client fixes, all also for stable:
      
         - two small locking fixes spotted by Coverity
      
         - FILE_ALL_INFO and network_open_info packing fix"
      
      * tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        smb3: fix lock ordering potential deadlock in cifs_sync_mid_result
        smb3: missing lock when picking channel
        smb: client: Fix struct_group() usage in __packed structs
    • Merge tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 5d12ed4b
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Fix a race condition in the at24 eeprom handler, a NULL pointer
        exception in the I2C core for controllers only using target modes,
        drop a MAINTAINERS entry, and fix an incorrect DT binding for at24"
      
      * tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: smbus: fix NULL function pointer dereference
        MAINTAINERS: Drop entry for PCA9541 bus master selector
        eeprom: at24: fix memory corruption race condition
        dt-bindings: eeprom: at24: Fix ST M24C64-D compatible schema
    • profiling: Remove create_prof_cpu_mask(). · 2e5449f4
      Tetsuo Handa authored
      create_prof_cpu_mask() is no longer used after commit 1f44a225 ("s390:
      convert interrupt handling to use generic hardirq").
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire · 8a5c3ef7
      Linus Torvalds authored
      Pull soundwire fix from Vinod Koul:
      
       - Single AMD driver fix for wake interrupt handling in clockstop mode
      
      * tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
        soundwire: amd: fix for wake interrupt handling for clockstop mode
    • Merge tag 'dmaengine-fix-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · 6fba14a7
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
      
       - Revert pl330 issue_pending waits until WFP state due to regression
         reported in Bluetooth loading
      
       - Xilinx driver fixes for synchronization, buffer offsets, locking and
         kdoc
      
       - idxd fixes for spinlock and preventing the migration of the perf
         context to an invalid target
      
       - idma driver fix for interrupt handling when powered off
      
       - Tegra driver residual calculation fix
      
       - Owl driver register access fix
      
      * tag 'dmaengine-fix-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
        dmaengine: idxd: Fix oops during rmmod on single-CPU platforms
        dmaengine: xilinx: xdma: Clarify kdoc in XDMA driver
        dmaengine: xilinx: xdma: Fix synchronization issue
        dmaengine: xilinx: xdma: Fix wrong offsets in the buffers addresses in dma descriptor
        dma: xilinx_dpdma: Fix locking
        dmaengine: idxd: Convert spinlock to mutex to lock evl workqueue
        idma64: Don't try to serve interrupts when device is powered off
        dmaengine: tegra186: Fix residual calculation
        dmaengine: owl: fix register access functions
        dmaengine: Revert "dmaengine: pl330: issue_pending waits until WFP state"
    • Merge tag 'phy-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · 63407d30
      Linus Torvalds authored
      Pull phy fixes from Vinod Koul:
      
       - static checker (array size, bounds) fix for marvell driver
      
       - Rockchip rk3588 pcie fixes for bifurcation and mux
      
       - Qualcomm qmp-combo fix for VCO, register base and regulator name for
         m31 driver
      
       - charger det crash fix for ti driver
      
      * tag 'phy-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
        phy: ti: tusb1210: Resolve charger-det crash if charger psy is unregistered
        phy: qcom: qmp-combo: fix VCO div offset on v5_5nm and v6
        phy: phy-rockchip-samsung-hdptx: Select CONFIG_RATIONAL
        phy: qcom: m31: match requested regulator name with dt schema
        phy: qcom: qmp-combo: Fix register base for QSERDES_DP_PHY_MODE
        phy: qcom: qmp-combo: Fix VCO div offset on v3
        phy: rockchip: naneng-combphy: Fix mux on rk3588
        phy: rockchip-snps-pcie3: fix clearing PHP_GRF_PCIESEL_CON bits
        phy: rockchip-snps-pcie3: fix bifurcation on rk3588
        phy: freescale: imx8m-pcie: fix pcie link-up instability
        phy: marvell: a3700-comphy: Fix hardcoded array size
        phy: marvell: a3700-comphy: Fix out of bounds read
    • i2c: smbus: fix NULL function pointer dereference · 91811a31
      Wolfram Sang authored
      Baruch reported an OOPS when using the designware controller as target
      only. Target-only modes break the assumption of one transfer function
      always being available. Fix this by always checking the pointer in
      __i2c_transfer.
      Reported-by: Baruch Siach <baruch@tkos.co.il>
      Closes: https://lore.kernel.org/r/4269631780e5ba789cf1ae391eec1b959def7d99.1712761976.git.baruch@tkos.co.il
      Fixes: 4b1acc43 ("i2c: core changes for slave support")
      [wsa: dropped the simplification in core-smbus to avoid theoretical regressions]
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Tested-by: Baruch Siach <baruch@tkos.co.il>
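The fix described in this commit follows a common defensive pattern: check an optional callback before calling through it. The sketch below is a userspace illustration with hypothetical types and names (`toy_algo`, `toy_adapter`, `toy_transfer()`), and the `-EOPNOTSUPP` return value is an assumption chosen for the example rather than a statement about the kernel code.

```c
#include <errno.h>
#include <stddef.h>

struct toy_algo {
	/* May be NULL for a controller that only supports target mode. */
	int (*xfer)(const unsigned char *msg, size_t len);
};

struct toy_adapter {
	const struct toy_algo *algo;
};

/* Check the callback before calling it, instead of assuming that a
 * transfer function is always available. */
static int toy_transfer(const struct toy_adapter *adap,
			const unsigned char *msg, size_t len)
{
	if (!adap->algo->xfer)
		return -EOPNOTSUPP; /* error code chosen for this sketch */

	return adap->algo->xfer(msg, len);
}
```

A target-only adapter then gets an error code back from `toy_transfer()` rather than a jump through a NULL pointer.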
  4. 26 Apr, 2024 19 commits