1. 07 May, 2024 6 commits
    • page_pool: make sure frag API fields don't span between cachelines · 1f20a576
      Alexander Lobakin authored
      After commit 5027ec19 ("net: page_pool: split the page_pool_params
      into fast and slow"), which made &page_pool contain only "hot" params at
      the start, a cacheline boundary once again chops the frag API fields
      group in the middle.
      To avoid revisiting this each time the fast params get expanded or
      shrunk, just align the group to `4 * sizeof(long)`, the closest power
      of 2 above its actual size (2 longs + 1 int). This ensures 16-byte
      alignment on 32-bit architectures and 32-byte alignment on 64-bit
      ones, avoiding unnecessary false sharing.
      ::page_state_hold_cnt is used quite intensively on the hotpath whether
      or not the frag API is used, so move it to the newly created hole in
      the first cacheline.
      Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • iommu/dma: avoid expensive indirect calls for sync operations · ea01fa70
      Alexander Lobakin authored
      When IOMMU is on, the actual synchronization happens in the same cases
      as with the direct DMA. Advertise %DMA_F_CAN_SKIP_SYNC in IOMMU DMA to
      skip sync ops calls (indirect) for non-SWIOTLB buffers.
      
      perf profile before the patch:
      
          18.53%  [kernel]       [k] gq_rx_skb
          14.77%  [kernel]       [k] napi_reuse_skb
           8.95%  [kernel]       [k] skb_release_data
           5.42%  [kernel]       [k] dev_gro_receive
           5.37%  [kernel]       [k] memcpy
      <*>  5.26%  [kernel]       [k] iommu_dma_sync_sg_for_cpu
           4.78%  [kernel]       [k] tcp_gro_receive
      <*>  4.42%  [kernel]       [k] iommu_dma_sync_sg_for_device
           4.12%  [kernel]       [k] ipv6_gro_receive
           3.65%  [kernel]       [k] gq_pool_get
           3.25%  [kernel]       [k] skb_gro_receive
           2.07%  [kernel]       [k] napi_gro_frags
           1.98%  [kernel]       [k] tcp6_gro_receive
           1.27%  [kernel]       [k] gq_rx_prep_buffers
           1.18%  [kernel]       [k] gq_rx_napi_handler
           0.99%  [kernel]       [k] csum_partial
           0.74%  [kernel]       [k] csum_ipv6_magic
           0.72%  [kernel]       [k] free_pcp_prepare
           0.60%  [kernel]       [k] __napi_poll
           0.58%  [kernel]       [k] net_rx_action
           0.56%  [kernel]       [k] read_tsc
      <*>  0.50%  [kernel]       [k] __x86_indirect_thunk_r11
           0.45%  [kernel]       [k] memset
      
      After the patch, the lines marked <*> no longer show up, and overall
      CPU usage looks much better (~60% instead of ~72%):
      
          25.56%  [kernel]       [k] gq_rx_skb
           9.90%  [kernel]       [k] napi_reuse_skb
           7.39%  [kernel]       [k] dev_gro_receive
           6.78%  [kernel]       [k] memcpy
           6.53%  [kernel]       [k] skb_release_data
           6.39%  [kernel]       [k] tcp_gro_receive
           5.71%  [kernel]       [k] ipv6_gro_receive
           4.35%  [kernel]       [k] napi_gro_frags
           4.34%  [kernel]       [k] skb_gro_receive
           3.50%  [kernel]       [k] gq_pool_get
           3.08%  [kernel]       [k] gq_rx_napi_handler
           2.35%  [kernel]       [k] tcp6_gro_receive
           2.06%  [kernel]       [k] gq_rx_prep_buffers
           1.32%  [kernel]       [k] csum_partial
           0.93%  [kernel]       [k] csum_ipv6_magic
           0.65%  [kernel]       [k] net_rx_action
      
      iavf yields +10% of Mpps on Rx. This also unblocks batched allocations
      of XSk buffers when IOMMU is active.
      Co-developed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • dma: avoid redundant calls for sync operations · f406c8e4
      Alexander Lobakin authored
      Quite often, devices do not need dma_sync operations, on x86_64 at
      least. Indeed, when dev_is_dma_coherent(dev) is true and
      dev_use_swiotlb(dev) is false, iommu_dma_sync_single_for_cpu()
      and friends do nothing.
      
      However, indirectly calling them when CONFIG_RETPOLINE=y consumes about
      10% of cycles on a CPU receiving packets from softirq at a ~100Gbit
      rate. Even when CONFIG_RETPOLINE is not set, the cost is about 3%.
      
      Add a dev->need_dma_sync boolean and turn it off during device
      initialization (dma_set_mask()) depending on the setup:
      dev_is_dma_coherent() for direct DMA, !(sync_single_for_device ||
      sync_single_for_cpu), or the new dma_map_ops flag, %DMA_F_CAN_SKIP_SYNC,
      advertised for non-NULL DMA ops.
      Then later, if/when swiotlb is used for the first time, the flag
      is turned back on, from swiotlb_tbl_map_single().
      
      On iavf, the UDP trafficgen with XDP_DROP in skb mode test shows
      +3-5% increase for direct DMA.
      
      Suggested-by: Christoph Hellwig <hch@lst.de> # direct DMA shortcut
      Co-developed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • dma: compile-out DMA sync op calls when not used · fe7514b1
      Alexander Lobakin authored
      Some platforms do have DMA, but DMA there is always direct and coherent.
      Currently, even on such platforms DMA sync operations are compiled and
      called.
      Add a new hidden Kconfig symbol, DMA_NEED_SYNC, and set it only when
      either sync operations are needed, or there are DMA ops, or swiotlb
      or DMA debug is enabled. Compile the global dma_sync_*() and
      dma_need_sync() only when it is set; otherwise, provide empty inline
      stubs.
      The change allows for future optimizations of DMA sync calls depending
      on runtime conditions.
      Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • iommu/dma: fix zeroing of bounce buffer padding used by untrusted devices · 2650073f
      Michael Kelley authored
      iommu_dma_map_page() allocates swiotlb memory as a bounce buffer when an
      untrusted device wants to map only part of the memory in a granule.  The
      goal is to prevent the untrusted device from having DMA access to
      unrelated kernel data that may be sharing the granule.  To meet this
      goal, the bounce buffer itself is zeroed, and any additional swiotlb
      memory up to alloc_size after the bounce buffer end (i.e., the
      "post-padding") is also zeroed.
      
      However, as of commit 901c7280 ("Reinstate some of "swiotlb: rework
      "fix info leak with DMA_FROM_DEVICE"""), swiotlb_tbl_map_single() always
      initializes the contents of the bounce buffer to the original memory.
      Zeroing the bounce buffer is redundant and probably wrong per the
      discussion in that commit. Only the post-padding needs to be zeroed.
      
      Also, when the DMA min_align_mask is non-zero, the allocated bounce
      buffer space may not start on a granule boundary.  The swiotlb memory
      from the granule boundary to the start of the allocated bounce buffer
      might belong to some unrelated bounce buffer. So as described in the
      "second issue" in [1], it can't be zeroed to protect against untrusted
      devices. But as of commit af133562 ("swiotlb: extend buffer
      pre-padding to alloc_align_mask if necessary"), swiotlb_tbl_map_single()
      allocates pre-padding slots when necessary to meet min_align_mask
      requirements, making it possible to zero the pre-padding area as well.
      
      Finally, iommu_dma_map_page() uses the swiotlb for untrusted devices
      and also for certain kmalloc() memory. Current code does the zeroing
      for both cases, but it is needed only for the untrusted device case.
      
      Fix all of this by updating iommu_dma_map_page() to zero both the
      pre-padding and post-padding areas, but not the actual bounce buffer.
      Do this only in the case where the bounce buffer is used because
      of an untrusted device.
      
      [1] https://lore.kernel.org/all/20210929023300.335969-1-stevensd@google.com/
      Signed-off-by: Michael Kelley <mhklinux@outlook.com>
      Reviewed-by: Petr Tesarik <petr@tesarici.cz>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • swiotlb: remove alloc_size argument to swiotlb_tbl_map_single() · 327e2c97
      Michael Kelley authored
      Currently swiotlb_tbl_map_single() takes alloc_align_mask and
      alloc_size arguments to specify an swiotlb allocation that is larger
      than mapping_size.  This larger allocation is used solely by
      iommu_dma_map_page() to handle untrusted devices that should not have
      DMA visibility to memory pages that are partially used for unrelated
      kernel data.
      
      Having two arguments to specify the allocation is redundant. While
      alloc_align_mask naturally specifies the alignment of the starting
      address of the allocation, it can also implicitly specify the size
      by rounding up the mapping_size to that alignment.
      
      Additionally, the current approach has an edge case bug.
      iommu_dma_map_page() already does the rounding up to compute the
      alloc_size argument. But swiotlb_tbl_map_single() then calculates the
      alignment offset based on the DMA min_align_mask, and adds that offset to
      alloc_size. If the offset is non-zero, the addition may result in a value
      that is larger than the max the swiotlb can allocate.  If the rounding up
      is done _after_ the alignment offset is added to the mapping_size (and
      the original mapping_size conforms to the value returned by
      swiotlb_max_mapping_size), then the max that the swiotlb can allocate
      will not be exceeded.
      
      In view of these issues, simplify the swiotlb_tbl_map_single() interface
      by removing the alloc_size argument. Most call sites pass the same value
      for mapping_size and alloc_size, and they pass alloc_align_mask as zero.
      Just remove the redundant argument from these callers, as they will see
      no functional change. For iommu_dma_map_page() also remove the alloc_size
      argument, and have swiotlb_tbl_map_single() compute the alloc_size by
      rounding up mapping_size after adding the offset based on min_align_mask.
      This has the side effect of fixing the edge case bug but with no other
      functional change.
      
      Also add a sanity test on the alloc_align_mask. While IOMMU code
      currently ensures the granule is not larger than PAGE_SIZE, if that
      guarantee were to be removed in the future, the downstream effect on the
      swiotlb might go unnoticed until strange allocation failures occurred.
      
      Tested on an ARM64 system with 16K page size and some kernel test-only
      hackery to allow modifying the DMA min_align_mask and the granule size
      that becomes the alloc_align_mask. Tested these combinations with a
      variety of original memory addresses and sizes, including those that
      reproduce the edge case bug:
      
       * 4K granule and 0 min_align_mask
       * 4K granule and 0xFFF min_align_mask (4K - 1)
       * 16K granule and 0xFFF min_align_mask
       * 64K granule and 0xFFF min_align_mask
       * 64K granule and 0x3FFF min_align_mask (16K - 1)
      
      With the changes, all combinations pass.
      Signed-off-by: Michael Kelley <mhklinux@outlook.com>
      Reviewed-by: Petr Tesarik <petr@tesarici.cz>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  2. 02 May, 2024 1 commit
  3. 28 Apr, 2024 6 commits
  4. 27 Apr, 2024 9 commits
    • Merge tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux · 2c815938
      Linus Torvalds authored
      Pull Rust fixes from Miguel Ojeda:
      
       - Soundness: make internal functions generated by the 'module!' macro
         inaccessible, do not implement 'Zeroable' for 'Infallible' and
         require 'Send' for the 'Module' trait.
      
       - Build: avoid errors with "empty" files and work around a 'rustdoc' ICE.
      
       - Kconfig: depend on '!CFI_CLANG' and avoid selecting 'CONSTRUCTORS'.
      
       - Code docs: remove non-existing key from 'module!' macro example.
      
       - Docs: trivial rendering fix in arch table.
      
      * tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux:
        rust: remove `params` from `module` macro example
        kbuild: rust: force `alloc` extern to allow "empty" Rust files
        kbuild: rust: remove unneeded `@rustc_cfg` to avoid ICE
        rust: kernel: require `Send` for `Module` implementations
        rust: phy: implement `Send` for `Registration`
        rust: make mutually exclusive with CFI_CLANG
        rust: macros: fix soundness issue in `module!` macro
        rust: init: remove impl Zeroable for Infallible
        docs: rust: fix improper rendering in Arch Support page
        rust: don't select CONSTRUCTORS
    • Merge tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 57865f39
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel
         separation
      
       - A fix to avoid loading rv64/NOMMU kernel past the start of RAM
      
       - A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer
         overflow in the bitmask
      
       - The sud_test kselftest has been fixed to properly swizzle the
         syscall number into the return register, since those are not the
         same register on RISC-V
      
       - A fix for a build warning in the perf tools on rv32
      
       - A fix for the CBO selftests, to avoid non-constants leaking into the
         inline asm
      
       - A pair of fixes for T-Head PBMT errata probing, which has been
         renamed MAE by the vendor
      
      * tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2
        perf riscv: Fix the warning due to the incompatible type
        riscv: T-Head: Test availability bit before enabling MAE errata
        riscv: thead: Rename T-Head PBMT to MAE
        selftests: sud_test: return correct emulated syscall value on RISC-V
        riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN
        riscv: Fix loading 64-bit NOMMU kernels past the start of RAM
        riscv: Fix TASK_SIZE on 64-bit NOMMU
    • Merge tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · d43df69f
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Three smb3 client fixes, all also for stable:
      
         - two small locking fixes spotted by Coverity
      
         - FILE_ALL_INFO and network_open_info packing fix"
      
      * tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        smb3: fix lock ordering potential deadlock in cifs_sync_mid_result
        smb3: missing lock when picking channel
        smb: client: Fix struct_group() usage in __packed structs
    • Merge tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 5d12ed4b
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Fix a race condition in the at24 eeprom handler, a NULL pointer
        exception in the I2C core for controllers only using target modes,
        drop a MAINTAINERS entry, and fix an incorrect DT binding for at24"
      
      * tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: smbus: fix NULL function pointer dereference
        MAINTAINERS: Drop entry for PCA9541 bus master selector
        eeprom: at24: fix memory corruption race condition
        dt-bindings: eeprom: at24: Fix ST M24C64-D compatible schema
    • profiling: Remove create_prof_cpu_mask(). · 2e5449f4
      Tetsuo Handa authored
      create_prof_cpu_mask() is no longer used after commit 1f44a225 ("s390:
      convert interrupt handling to use generic hardirq").
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire · 8a5c3ef7
      Linus Torvalds authored
      Pull soundwire fix from Vinod Koul:
      
       - Single AMD driver fix for wake interrupt handling in clockstop mode
      
      * tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
        soundwire: amd: fix for wake interrupt handling for clockstop mode
    • Merge tag 'dmaengine-fix-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · 6fba14a7
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
      
       - Revert "pl330: issue_pending waits until WFP state" due to a
         regression reported in Bluetooth loading
      
       - Xilinx driver fixes for synchronization, buffer offsets, locking and
         kdoc
      
       - idxd fixes for spinlock and preventing the migration of the perf
         context to an invalid target
      
       - idma driver fix for interrupt handling when powered off
      
       - Tegra driver residual calculation fix
      
       - Owl driver register access fix
      
      * tag 'dmaengine-fix-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
        dmaengine: idxd: Fix oops during rmmod on single-CPU platforms
        dmaengine: xilinx: xdma: Clarify kdoc in XDMA driver
        dmaengine: xilinx: xdma: Fix synchronization issue
        dmaengine: xilinx: xdma: Fix wrong offsets in the buffers addresses in dma descriptor
        dma: xilinx_dpdma: Fix locking
        dmaengine: idxd: Convert spinlock to mutex to lock evl workqueue
        idma64: Don't try to serve interrupts when device is powered off
        dmaengine: tegra186: Fix residual calculation
        dmaengine: owl: fix register access functions
        dmaengine: Revert "dmaengine: pl330: issue_pending waits until WFP state"
    • Merge tag 'phy-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · 63407d30
      Linus Torvalds authored
      Pull phy fixes from Vinod Koul:
      
       - static checker (array size, bounds) fix for the marvell driver
      
       - Rockchip rk3588 pcie fixes for bifurcation and mux
      
       - Qualcomm qmp-combo fix for VCO, register base and regulator name for
         m31 driver
      
       - charger det crash fix for ti driver
      
      * tag 'phy-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
        phy: ti: tusb1210: Resolve charger-det crash if charger psy is unregistered
        phy: qcom: qmp-combo: fix VCO div offset on v5_5nm and v6
        phy: phy-rockchip-samsung-hdptx: Select CONFIG_RATIONAL
        phy: qcom: m31: match requested regulator name with dt schema
        phy: qcom: qmp-combo: Fix register base for QSERDES_DP_PHY_MODE
        phy: qcom: qmp-combo: Fix VCO div offset on v3
        phy: rockchip: naneng-combphy: Fix mux on rk3588
        phy: rockchip-snps-pcie3: fix clearing PHP_GRF_PCIESEL_CON bits
        phy: rockchip-snps-pcie3: fix bifurcation on rk3588
        phy: freescale: imx8m-pcie: fix pcie link-up instability
        phy: marvell: a3700-comphy: Fix hardcoded array size
        phy: marvell: a3700-comphy: Fix out of bounds read
    • i2c: smbus: fix NULL function pointer dereference · 91811a31
      Wolfram Sang authored
      Baruch reported an OOPS when using the designware controller as target
      only. Target-only modes break the assumption of one transfer function
      always being available. Fix this by always checking the pointer in
      __i2c_transfer.
      Reported-by: Baruch Siach <baruch@tkos.co.il>
      Closes: https://lore.kernel.org/r/4269631780e5ba789cf1ae391eec1b959def7d99.1712761976.git.baruch@tkos.co.il
      Fixes: 4b1acc43 ("i2c: core changes for slave support")
      [wsa: dropped the simplification in core-smbus to avoid theoretical regressions]
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Tested-by: Baruch Siach <baruch@tkos.co.il>
  5. 26 Apr, 2024 18 commits