Commits · fd73a4395d477ae134f319f7368a9f8a6264fd8b · Kirill Smelkov / linux

11 Aug, 2023 3 commits

erofs: boost negative xattr lookup with bloom filter · fd73a439

Jingbo Xu authored Jul 22, 2023

Optimise the negative xattr lookup with bloom filter.

The bit value for the bloom filter map has a reverse semantics for
compatibility. That is, the bit value of 0 indicates existence, while
the bit value of 1 indicates the absence of corresponding xattr.

The initial version is _only_ enabled when xattr_filter_reserved is
zero. The filter map internals may change in the future, in which case
the reserved flag will be set non-zero and we don't need bothering the
compatible bits again at that time. For now disable the optimization if
this reserved flag is non-zero.
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230722094538.11754-3-jefflexu@linux.alibaba.comSigned-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

fd73a439

erofs: update on-disk format for xattr name filter · 3f339920

Jingbo Xu authored Jul 22, 2023

The xattr name bloom filter feature is going to be introduced to speed
up the negative xattr lookup, e.g. system.posix_acl_[access|default]
lookup when running "ls -lR" workload.

There are some commonly used extended attributes (n) and the total
number of these is approximately 30.

	trusted.overlay.opaque
	trusted.overlay.redirect
	trusted.overlay.origin
	trusted.overlay.impure
	trusted.overlay.nlink
	trusted.overlay.upper
	trusted.overlay.metacopy
	trusted.overlay.protattr
	user.overlay.opaque
	user.overlay.redirect
	user.overlay.origin
	user.overlay.impure
	user.overlay.nlink
	user.overlay.upper
	user.overlay.metacopy
	user.overlay.protattr
	security.evm
	security.ima
	security.selinux
	security.SMACK64
	security.SMACK64IPIN
	security.SMACK64IPOUT
	security.SMACK64EXEC
	security.SMACK64TRANSMUTE
	security.SMACK64MMAP
	security.apparmor
	security.capability
	system.posix_acl_access
	system.posix_acl_default
	user.mime_type

Given the number of bits of the bloom filter (m) is 32, the optimal
value for the number of the hash functions (k) is 1 (ln2 * m/n = 0.74).

The single hash function is implemented as:

	xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)

where `index` represents the index of corresponding predefined short name
prefix, while `name` represents the name string after stripping the above
predefined name prefix.

The constant magic number EROFS_XATTR_FILTER_SEED, i.e. 0x25BBE08F, is
used to give a better spread when mapping these 30 extended attributes
into 32-bit bloom filter as:

	bit  0: security.ima
	bit  1:
	bit  2: trusted.overlay.nlink
	bit  3:
	bit  4: user.overlay.nlink
	bit  5: trusted.overlay.upper
	bit  6: user.overlay.origin
	bit  7: trusted.overlay.protattr
	bit  8: security.apparmor
	bit  9: user.overlay.protattr
	bit 10: user.overlay.opaque
	bit 11: security.selinux
	bit 12: security.SMACK64TRANSMUTE
	bit 13: security.SMACK64
	bit 14: security.SMACK64MMAP
	bit 15: user.overlay.impure
	bit 16: security.SMACK64IPIN
	bit 17: trusted.overlay.redirect
	bit 18: trusted.overlay.origin
	bit 19: security.SMACK64IPOUT
	bit 20: trusted.overlay.opaque
	bit 21: system.posix_acl_default
	bit 22:
	bit 23: user.mime_type
	bit 24: trusted.overlay.impure
	bit 25: security.SMACK64EXEC
	bit 26: user.overlay.redirect
	bit 27: user.overlay.upper
	bit 28: security.evm
	bit 29: security.capability
	bit 30: system.posix_acl_access
	bit 31: trusted.overlay.metacopy, user.overlay.metacopy

h_name_filter is introduced to the on-disk per-inode xattr header to
place the corresponding xattr name filter, where bit value 1 indicates
non-existence for compatibility.

This feature is indicated by EROFS_FEATURE_COMPAT_XATTR_FILTER
compatible feature bit.

Reserve one byte in on-disk superblock as the on-disk format for xattr
name filter may change in the future.  With this flag we don't need
bothering these compatible bits again at that time.
Suggested-by: Alexander Larsson <alexl@redhat.com>
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230722094538.11754-2-jefflexu@linux.alibaba.comSigned-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

3f339920

erofs: DEFLATE compression support · ffa09b3b

Gao Xiang authored Aug 10, 2023

Add DEFLATE compression as the 3rd supported algorithm.

DEFLATE is a popular generic-purpose compression algorithm for quite
long time (many advanced formats like gzip, zlib, zip, png are all
based on that) as Apple documentation written "If you require
interoperability with non-Apple devices, use COMPRESSION_ZLIB. [1]".

Due to its popularity, there are several hardware on-market DEFLATE
accelerators, such as (s390) DFLTCC, (Intel) IAA/QAT, (HiSilicon) ZIP
accelerator, etc. In addition, there are also several high-performence
IP cores and even open-source FPGA approches available for DEFLATE.
Therefore, it's useful to support DEFLATE compression in order to find
a way to utilize these accelerators for asynchronous I/Os and get
benefits from these later.

Besides, it's a good choice to trade off between compression ratios
and performance compared to LZ4 and LZMA. The DEFLATE core format is
simple as well as easy to understand, therefore the code size of its
decompressor is small even for the bootloader use cases. The runtime
memory consumption is quite limited too (e.g. 32K + ~7K for each zlib
stream). As usual, EROFS ourperforms similar approaches too.

Alternatively, DEFLATE could still be used for some specific files
since EROFS supports multiple compression algorithms in one image.

[1] https://developer.apple.com/documentation/compression/compression_algorithmReviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230810154859.118330-1-hsiangkao@linux.alibaba.com

ffa09b3b

06 Aug, 2023 8 commits

Linux 6.5-rc5 · 52a93d39
Linus Torvalds authored Aug 06, 2023

52a93d39

Merge tag 'v6.5-rc5.vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 0108963f

Linus Torvalds authored Aug 06, 2023

Pull vfs fixes from Christian Brauner:

 - Fix a wrong check for O_TMPFILE during RESOLVE_CACHED lookup

 - Clean up directory iterators and clarify file_needs_f_pos_lock()

* tag 'v6.5-rc5.vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: rely on ->iterate_shared to determine f_pos locking
  vfs: get rid of old '->iterate' directory operation
  proc: fix missing conversion to 'iterate_shared'
  open: make RESOLVE_CACHED correctly test for O_TMPFILE

0108963f

fs: rely on ->iterate_shared to determine f_pos locking · 7d84d1b9

Christian Brauner authored Aug 06, 2023

Now that we removed ->iterate we don't need to check for either
->iterate or ->iterate_shared in file_needs_f_pos_lock(). Simply check
for ->iterate_shared instead. This will tell us whether we need to
unconditionally take the lock. Not just does it allow us to avoid
checking f_inode's mode it also actually clearly shows that we're
locking because of readdir.
Signed-off-by: Christian Brauner <brauner@kernel.org>

7d84d1b9

vfs: get rid of old '->iterate' directory operation · 3e327154

Linus Torvalds authored Aug 05, 2023

All users now just use '->iterate_shared()', which only takes the
directory inode lock for reading.

Filesystems that never got convered to shared mode now instead use a
wrapper that drops the lock, re-takes it in write mode, calls the old
function, and then downgrades the lock back to read mode.

This way the VFS layer and other callers no longer need to care about
filesystems that never got converted to the modern era.

The filesystems that use the new wrapper are ceph, coda, exfat, jfs,
ntfs, ocfs2, overlayfs, and vboxsf.

Honestly, several of them look like they really could just iterate their
directories in shared mode and skip the wrapper entirely, but the point
of this change is to not change semantics or fix filesystems that
haven't been fixed in the last 7+ years, but to finally get rid of the
dual iterators.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>

3e327154

proc: fix missing conversion to 'iterate_shared' · 0a2c2baa

Linus Torvalds authored Aug 05, 2023

I'm looking at the directory handling due to the discussion about f_pos
locking (see commit 79796425: "file: reinstate f_pos locking
optimization for regular files"), and wanting to clean that up.

And one source of ugliness is how we were supposed to move filesystems
over to the '->iterate_shared()' function that only takes the inode lock
for reading many many years ago, but several filesystems still use the
bad old '->iterate()' that takes the inode lock for exclusive access.

See commit 61922694 ("introduce a parallel variant of ->iterate()")
that also added some documentation stating

      Old method is only used if the new one is absent; eventually it will
      be removed.  Switch while you still can; the old one won't stay.

and that was back in April 2016.  Here we are, many years later, and the
old version is still clearly sadly alive and well.

Now, some of those old style iterators are probably just because the
filesystem may end up having per-inode mutable data that it uses for
iterating a directory, but at least one case is just a mistake.

Al switched over most filesystems to use '->iterate_shared()' back when
it was introduced.  In particular, the /proc filesystem was converted as
one of the first ones in commit f50752ea ("switch all procfs
directories ->iterate_shared()").

But then later one new user of '->iterate()' was then re-introduced by
commit 6d9c939d ("procfs: add smack subdir to attrs").

And that's clearly not what we wanted, since that new case just uses the
same 'proc_pident_readdir()' and 'proc_pident_lookup()' helper functions
that other /proc pident directories use, and they are most definitely
safe to use with the inode lock held shared.

So just fix it.

This still leaves a fair number of oddball filesystems using the
old-style directory iterator (ceph, coda, exfat, jfs, ntfs, ocfs2,
overlayfs, and vboxsf), but at least we don't have any remaining in the
core filesystems.

I'm going to add a wrapper function that just drops the read-lock and
takes it as a write lock, so that we can clean up the core vfs layer and
make all the ugly 'this filesystem needs exclusive inode locking' be
just filesystem-internal warts.

I just didn't want to make that conversion when we still had a core user
left.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>

0a2c2baa

open: make RESOLVE_CACHED correctly test for O_TMPFILE · a0fc452a

Aleksa Sarai authored Aug 06, 2023

O_TMPFILE is actually __O_TMPFILE|O_DIRECTORY. This means that the old
fast-path check for RESOLVE_CACHED would reject all users passing
O_DIRECTORY with -EAGAIN, when in fact the intended test was to check
for __O_TMPFILE.

Cc: stable@vger.kernel.org # v5.12+
Fixes: 99668f61 ("fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Message-Id: <20230806-resolve_cached-o_tmpfile-v1-1-7ba16308465e@cyphar.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

a0fc452a

Merge tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux · f0ab9f34

Linus Torvalds authored Aug 05, 2023

Pull rust fixes from Miguel Ojeda:

 - Allocator: prevent mis-aligned allocation

 - Types: delete 'ForeignOwnable::borrow_mut'. A sound replacement is
   planned for the merge window

 - Build: fix bindgen error with UBSAN_BOUNDS_STRICT

* tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux:
  rust: fix bindgen build error with UBSAN_BOUNDS_STRICT
  rust: delete `ForeignOwnable::borrow_mut`
  rust: allocator: Prevent mis-aligned allocation

f0ab9f34

Merge tag 'ata-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata · fb0d9199

Linus Torvalds authored Aug 05, 2023

Pull ata fix from Damien Le Moal:

 - Prevent the scsi disk driver from issuing a START STOP UNIT command
   for ATA devices during system resume as this causes various issues
   reported by multiple users.

* tag 'ata-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata,scsi: do not issue START STOP UNIT on resume

fb0d9199

05 Aug, 2023 5 commits

Merge tag '6.5-rc4-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6 · f6a69168

Linus Torvalds authored Aug 05, 2023

Pull smb client fix from Steve French:

 - Fix DFS interlink problem (different namespace)

* tag '6.5-rc4-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix dfs link mount against w2k8

f6a69168

Merge tag 'powerpc-6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 251a94f1

Linus Torvalds authored Aug 05, 2023

Pull powerpc fixes from Michael Ellerman:

 - Fix vmemmap altmap boundary check which could cause memory hotunplug
   failure

 - Create a dummy stackframe to fix ftrace stack unwind

 - Fix secondary thread bringup for Book3E ELFv2 kernels

 - Use early_ioremap/unmap() in via_calibrate_decr()

Thanks to Aneesh Kumar K.V, Benjamin Gray, Christophe Leroy, David
Hildenbrand, and Naveen N Rao.

* tag 'powerpc-6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powermac: Use early_* IO variants in via_calibrate_decr()
  powerpc/64e: Fix secondary thread bringup for ELFv2 kernels
  powerpc/ftrace: Create a dummy stackframe to fix stack unwind
  powerpc/mm/altmap: Fix altmap boundary check

251a94f1

Merge tag 'parisc-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 947c2a83

Linus Torvalds authored Aug 05, 2023

Pull parisc architecture fixes from Helge Deller:

 - early fixmap preallocation to fix boot failures on kernel >= 6.4

 - remove DMA leftover code in parport_gsc

 - drop old comments and code style fixes

* tag 'parisc-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: unaligned: Add required spaces after ','
  parport: gsc: remove DMA leftover code
  parisc: pci-dma: remove unused and dead EISA code and comment
  parisc/mm: preallocate fixmap page tables at init

947c2a83

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · c9d26d8d

Linus Torvalds authored Aug 04, 2023

Pull clk fixes from Stephen Boyd:
 "A few clk driver fixes for some SoC clk drivers:

   - Change a usleep() to udelay() to avoid scheduling while atomic in
     the Amlogic PLL code

   - Revert a patch to the Mediatek MT8183 driver that caused an
     out-of-bounds write

   - Return the right error value when devm_of_iomap() fails in
     imx93_clocks_probe()

   - Constrain the Kconfig for the fixed mmio clk so that it depends on
     HAS_IOMEM and can't be compiled on architectures such as s390"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: fixed-mmio: make COMMON_CLK_FIXED_MMIO depend on HAS_IOMEM
  clk: imx93: Propagate correct error in imx93_clocks_probe()
  clk: mediatek: mt8183: Add back SSPM related clocks
  clk: meson: change usleep_range() to udelay() for atomic context

c9d26d8d

Merge tag 'hyperv-fixes-signed-20230804' of... · 024ff300

Linus Torvalds authored Aug 04, 2023

Merge tag 'hyperv-fixes-signed-20230804' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - Fix a bug in a python script for Hyper-V (Ani Sinha)

 - Workaround a bug in Hyper-V when IBT is enabled (Michael Kelley)

 - Fix an issue parsing MP table when Linux runs in VTL2 (Saurabh
   Sengar)

 - Several cleanup patches (Nischala Yelchuri, Kameron Carr, YueHaibing,
   ZhiHu)

* tag 'hyperv-fixes-signed-20230804' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Remove unused extern declaration vmbus_ontimer()
  x86/hyperv: add noop functions to x86_init mpparse functions
  vmbus_testing: fix wrong python syntax for integer value comparison
  x86/hyperv: fix a warning in mshyperv.h
  x86/hyperv: Disable IBT when hypercall page lacks ENDBR instruction
  x86/hyperv: Improve code for referencing hyperv_pcpu_input_arg
  Drivers: hv: Change hv_free_hyperv_page() to take void * argument

024ff300

04 Aug, 2023 13 commits

Merge tag 'riscv-for-linus-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · e661f98c

Linus Torvalds authored Aug 04, 2023

Pull RISC-V fixes from Palmer Dabbelt:

 - A pair of fixes for build-related failures in the selftests

 - A fix for a sparse warning in acpi_os_ioremap()

 - A fix to restore the kernel PA offset in vmcoreinfo, to fix crash
   handling

* tag 'riscv-for-linus-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  Documentation: kdump: Add va_kernel_pa_offset for RISCV64
  riscv: Export va_kernel_pa_offset in vmcoreinfo
  RISC-V: ACPI: Fix acpi_os_ioremap to return iomem address
  selftests: riscv: Fix compilation error with vstate_exec_nolibc.c
  selftests/riscv: fix potential build failure during the "emit_tests" step

e661f98c

Merge tag 'pm-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · ea4f142f

Linus Torvalds authored Aug 04, 2023

Pull power management fix from Rafael Wysocki:
 "Fix a sparse warning triggered by the TPMI interface recently added to
  the Intel RAPL power capping driver (Zhang Rui)"

* tag 'pm-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  powercap: intel_rapl: Fix a sparse warning in TPMI interface

ea4f142f

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · e6fda526

Linus Torvalds authored Aug 04, 2023

Pull arm64 fixes from Catalin Marinas:
 "More SVE/SME fixes for ptrace() and for the (potentially future) case
  where SME is implemented in hardware without SVE support"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/fpsimd: Sync and zero pad FPSIMD state for streaming SVE
  arm64/fpsimd: Sync FPSIMD state with SVE for SME only systems
  arm64/ptrace: Don't enable SVE when setting streaming SVE
  arm64/ptrace: Flush FP state when setting ZT0
  arm64/fpsimd: Clear SME state in the target task when setting the VL

e6fda526

Merge tag 'mtd/fixes-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · c8273a25

Linus Torvalds authored Aug 04, 2023

Pull mtd fixes from Miquel Raynal:
 "Raw NAND fixes:
   - fsl_upm: Fix an off-by one test in fun_exec_op()
   - Rockchip:
       - Align hwecc vs. raw page helper layouts
       - Fix oobfree offset and description
   - Meson: Fix OOB available bytes for ECC
   - Omap ELM: Fix incorrect type in assignment

  SPI-NOR fix:
   - Avoid holes in struct spi_mem_op

  Hyperbus fix:
   - Add Tudor as reviewer in MAINTAINERS

  SPI-NAND fixes:
   - Winbond and Toshiba: Fix ecc_get_status"

* tag 'mtd/fixes-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: fsl_upm: Fix an off-by one test in fun_exec_op()
  mtd: spi-nor: avoid holes in struct spi_mem_op
  MAINTAINERS: Add myself as reviewer for HYPERBUS
  mtd: rawnand: rockchip: Align hwecc vs. raw page helper layouts
  mtd: rawnand: rockchip: fix oobfree offset and description
  mtd: rawnand: meson: fix OOB available bytes for ECC
  mtd: rawnand: omap_elm: Fix incorrect type in assignment
  mtd: spinand: winbond: Fix ecc_get_status
  mtd: spinand: toshiba: Fix ecc_get_status

c8273a25

Merge tag 'drm-fixes-2023-08-04' of git://anongit.freedesktop.org/drm/drm · 4142fc67

Linus Torvalds authored Aug 04, 2023

Pull drm fixes from Dave Airlie:
 "Small set of fixes this week, i915 and a few misc ones. I didn't see
  an amd pull so maybe next week it'll have a few more on that driver.

  ttm:
   - NULL ptr deref fix

  panel:
   - add missing MODULE_DEVICE_TABLE

  imx/ipuv3:
   - timing fix

  i915:
   - Fix bug in getting msg length in AUX CH registers handler
   - Gen12 AUX invalidation fixes
   - Fix premature release of request's reusable memory"

* tag 'drm-fixes-2023-08-04' of git://anongit.freedesktop.org/drm/drm:
  drm/panel: samsung-s6d7aa0: Add MODULE_DEVICE_TABLE
  drm/i915: Fix premature release of request's reusable memory
  drm/i915/gt: Support aux invalidation on all engines
  drm/i915/gt: Poll aux invalidation register bit on invalidation
  drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control and in the CS
  drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  drm/i915/gt: Ensure memory quiesced before invalidation
  drm/i915: Add the gen12_needs_ccs_aux_inv helper
  drm/i915/gt: Cleanup aux invalidation registers
  drm/i915/gvt: Fix bug in getting msg length in AUX CH registers handler
  drm/imx/ipuv3: Fix front porch adjustment upon hactive aligning
  drm/ttm: check null pointer before accessing when swapping

4142fc67

Merge tag 'ceph-for-6.5-rc5' of https://github.com/ceph/ceph-client · 4593f3c2

Linus Torvalds authored Aug 04, 2023

Pull ceph fixes from Ilya Dryomov:
 "Two patches to improve RBD exclusive lock interaction with
  osd_request_timeout option and another fix to reduce the potential for
  erroneous blocklisting -- this time in CephFS. All going to stable"

* tag 'ceph-for-6.5-rc5' of https://github.com/ceph/ceph-client:
  libceph: fix potential hang in ceph_osdc_notify()
  rbd: prevent busy loop when requesting exclusive lock
  ceph: defer stopping mdsc delayed_work

4593f3c2

file: reinstate f_pos locking optimization for regular files · 79796425

Linus Torvalds authored Aug 03, 2023

In commit 20ea1e7d ("file: always lock position for
FMODE_ATOMIC_POS") we ended up always taking the file pos lock, because
pidfd_getfd() could get a reference to the file even when it didn't have
an elevated file count due to threading of other sharing cases.

But Mateusz Guzik reports that the extra locking is actually measurable,
so let's re-introduce the optimization, and only force the locking for
directory traversal.

Directories need the lock for correctness reasons, while regular files
only need it for "POSIX semantics". Since pidfd_getfd() is about
debuggers etc special things that are _way_ outside of POSIX, we can
relax the rules for that case.
Reported-by: Mateusz Guzik <mjguzik@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/linux-fsdevel/20230803095311.ijpvhx3fyrbkasul@f/Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

79796425

arm64/fpsimd: Sync and zero pad FPSIMD state for streaming SVE · 69af56ae

Mark Brown authored Aug 03, 2023

We have a function sve_sync_from_fpsimd_zeropad() which is used by the
ptrace code to update the SVE state when the user writes to the the
FPSIMD register set. Currently this checks that the task has SVE
enabled but this will miss updates for tasks which have streaming SVE
enabled if SVE has not been enabled for the thread, also do the
conversion if the task has streaming SVE enabled.

Fixes: e12310a0 ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-3-49df214bfb3e@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

69af56ae

arm64/fpsimd: Sync FPSIMD state with SVE for SME only systems · 507ea5dd

Mark Brown authored Aug 03, 2023

Currently we guard FPSIMD/SVE state conversions with a check for the system
supporting SVE but SME only systems may need to sync streaming mode SVE
state so add a check for SME support too. These functions are only used
by the ptrace code.

Fixes: e12310a0 ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-2-49df214bfb3e@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

507ea5dd

arm64/ptrace: Don't enable SVE when setting streaming SVE · 045aecdf

Mark Brown authored Aug 03, 2023

Systems which implement SME without also implementing SVE are
architecturally valid but were not initially supported by the kernel,
unfortunately we missed one issue in the ptrace code.

The SVE register setting code is shared between SVE and streaming mode
SVE. When we set full SVE register state we currently enable TIF_SVE
unconditionally, in the case where streaming SVE is being configured on a
system that supports vanilla SVE this is not an issue since we always
initialise enough state for both vector lengths but on a system which only
support SME it will result in us attempting to restore the SVE vector
length after having set streaming SVE registers.

Fix this by making the enabling of SVE conditional on setting SVE vector
state. If we set streaming SVE state and SVE was not already enabled this
will result in a SVE access trap on next use of normal SVE, this will cause
us to flush our register state but this is fine since the only way to
trigger a SVE access trap would be to exit streaming mode which will cause
the in register state to be flushed anyway.

Fixes: e12310a0 ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-1-49df214bfb3e@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

045aecdf

rust: fix bindgen build error with UBSAN_BOUNDS_STRICT · b0554488

Andrea Righi authored Jul 11, 2023

With commit 2d47c695 ("ubsan: Tighten UBSAN_BOUNDS on GCC") if
CONFIG_UBSAN is enabled and gcc supports -fsanitize=bounds-strict, we
can trigger the following build error due to bindgen lacking support for
this additional build option:

   BINDGEN rust/bindings/bindings_generated.rs
 error: unsupported argument 'bounds-strict' to option '-fsanitize='

Fix by adding -fsanitize=bounds-strict to the list of skipped gcc flags
for bindgen.

Fixes: 2d47c695 ("ubsan: Tighten UBSAN_BOUNDS on GCC")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Link: https://lore.kernel.org/r/20230711071914.133946-1-andrea.righi@canonical.comSigned-off-by: Miguel Ojeda <ojeda@kernel.org>

b0554488

rust: delete `ForeignOwnable::borrow_mut` · 1d24eb2d

Alice Ryhl authored Jul 06, 2023

We discovered that the current design of `borrow_mut` is problematic.
This patch removes it until a better solution can be found.

Specifically, the current design gives you access to a `&mut T`, which
lets you change where the `ForeignOwnable` points (e.g., with
`core::mem::swap`). No upcoming user of this API intended to make that
possible, making all of them unsound.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Fixes: 0fc4424d ("rust: types: introduce `ForeignOwnable`")
Link: https://lore.kernel.org/r/20230706094615.3080784-1-aliceryhl@google.comSigned-off-by: Miguel Ojeda <ojeda@kernel.org>

1d24eb2d

rust: allocator: Prevent mis-aligned allocation · b3d8aa84

Boqun Feng authored Jul 29, 2023

Currently the rust allocator simply passes the size of the type Layout
to krealloc(), and in theory the alignment requirement from the type
Layout may be larger than the guarantee provided by SLAB, which means
the allocated object is mis-aligned.

Fix this by adjusting the allocation size to the nearest power of two,
which SLAB always guarantees a size-aligned allocation. And because Rust
guarantees that the original size must be a multiple of alignment and
the alignment must be a power of two, then the alignment requirement is
satisfied.
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Co-developed-by: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
Signed-off-by: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: stable@vger.kernel.org # v6.1+
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 247b365d ("rust: add `kernel` crate")
Link: https://github.com/Rust-for-Linux/linux/issues/974
Link: https://lore.kernel.org/r/20230730012905.643822-2-boqun.feng@gmail.com
[ Applied rewording of comment as discussed in the mailing list. ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

b3d8aa84

03 Aug, 2023 11 commits

Merge tag 'drm-intel-fixes-2023-08-03' of... · 1958b0f9

Dave Airlie authored Aug 04, 2023

Merge tag 'drm-intel-fixes-2023-08-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix bug in getting msg length in AUX CH registers handler [gvt] (Yan Zhao)
- Gen12 AUX invalidation fixes [gt] (Andi Shyti, Jonathan Cavitt)
- Fix premature release of request's reusable memory (Janusz Krzysztofik)

- Merge tag 'gvt-fixes-2023-08-02' of https://github.com/intel/gvt-linux into drm-intel-fixes (Tvrtko Ursulin)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZMtkxWGuUKpaRMmo@tursulin-desk

1958b0f9

Merge tag 'drm-misc-fixes-2023-08-03' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes · 062ff85b

Dave Airlie authored Aug 04, 2023

A NULL pointer dereference fix for TTM, a timings fix for imx/ipuv3 and
the addition of a MODULE_DEVICE_TABLE for the samsung-s6d7aa0 panel.
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ztfogof2dhtlvjwe73mvd2jp5kbldhkkav7k5culuseqblwpti@qfobohwx3c3j

062ff85b

Merge tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of... · c1a515d3

Linus Torvalds authored Aug 03, 2023

Merge tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix segfault in the powerpc specific arch_skip_callchain_idx
   function. The patch doing the reference count init/exit that went
   into 6.5 missed this function.

 - Fix regression reading the arm64 PMU cpu slots in sysfs, a patch
   removing some code duplication ended up duplicating the /sysfs prefix
   for these files.

 - Fix grouping of events related to topdown, addressing a regression on
   the CSV output produced by 'perf stat' noticed on the downstream tool
   toplev.

 - Fix the uprobe_from_different_cu 'perf test' entry, it is failing
   when gcc isn't available, so we need to check that and skip the test
   if it is not installed.

* tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf test parse-events: Test complex name has required event format
  perf pmus: Create placholder regardless of scanning core_only
  perf test uprobe_from_different_cu: Skip if there is no gcc
  perf parse-events: Only move force grouped evsels when sorting
  perf parse-events: When fixing group leaders always set the leader
  perf parse-events: Extra care around force grouped events
  perf callchain powerpc: Fix addr location init during arch_skip_callchain_idx function
  perf pmu arm64: Fix reading the PMU cpu slots in sysfs

c1a515d3

Merge tag 'cxl-fixes-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl · 638c1913

Linus Torvalds authored Aug 03, 2023

Pull cxl fixes from Vishal Verma:

 - Fixup the Sanitixe device ABI that was merged for v6.5 to hide some
   sysfs files when the necessary support is missing. Update the ABI
   documentation around this as well.

* tag 'cxl-fixes-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/memdev: Only show sanitize sysfs files when supported
  cxl/memdev: Document security state in kern-doc
  cxl/memdev: Improve sanitize ABI descriptions

638c1913

Merge tag 'net-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 999f6631

Linus Torvalds authored Aug 03, 2023

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf and wireless.

  Nothing scary here. Feels like the first wave of regressions from v6.5
  is addressed - one outstanding fix still to come in TLS for the
  sendpage rework.

  Current release - regressions:

   - udp: fix __ip_append_data()'s handling of MSG_SPLICE_PAGES

   - dsa: fix older DSA drivers using phylink

  Previous releases - regressions:

   - gro: fix misuse of CB in udp socket lookup

   - mlx5: unregister devlink params in case interface is down

   - Revert "wifi: ath11k: Enable threaded NAPI"

  Previous releases - always broken:

   - sched: cls_u32: fix match key mis-addressing

   - sched: bind logic fixes for cls_fw, cls_u32 and cls_route

   - add bound checks to a number of places which hand-parse netlink

   - bpf: disable preemption in perf_event_output helpers code

   - qed: fix scheduling in a tasklet while getting stats

   - avoid using APIs which are not hardirq-safe in couple of drivers,
     when we may be in a hard IRQ (netconsole)

   - wifi: cfg80211: fix return value in scan logic, avoid page
     allocator warning

   - wifi: mt76: mt7615: do not advertise 5 GHz on first PHY of MT7615D
     (DBDC)

  Misc:

   - drop handful of inactive maintainers, put some new in place"

* tag 'net-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (98 commits)
  MAINTAINERS: update TUN/TAP maintainers
  test/vsock: remove vsock_perf executable on `make clean`
  tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
  tcp_metrics: annotate data-races around tm->tcpm_net
  tcp_metrics: annotate data-races around tm->tcpm_vals[]
  tcp_metrics: annotate data-races around tm->tcpm_lock
  tcp_metrics: annotate data-races around tm->tcpm_stamp
  tcp_metrics: fix addr_same() helper
  prestera: fix fallback to previous version on same major version
  udp: Fix __ip_append_data()'s handling of MSG_SPLICE_PAGES
  net/mlx5e: Set proper IPsec source port in L4 selector
  net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio
  net/mlx5: fs_core: Make find_closest_ft more generic
  wifi: brcmfmac: Fix field-spanning write in brcmf_scan_params_v2_to_v1()
  vxlan: Fix nexthop hash size
  ip6mr: Fix skb_under_panic in ip6mr_cache_report()
  s390/qeth: Don't call dev_close/dev_open (DOWN/UP)
  net: tap_open(): set sk_uid from current_fsuid()
  net: tun_chr_open(): set sk_uid from current_fsuid()
  net: dcb: choose correct policy to parse DCB_ATTR_BCN
  ...

999f6631

MAINTAINERS: update TUN/TAP maintainers · 0765c5f2

Jakub Kicinski authored Aug 02, 2023

Willem and Jason have agreed to take over the maintainer
duties for TUN/TAP, thank you!

There's an existing entry for TUN/TAP which only covers
the user mode Linux implementation.
Since we haven't heard from Maxim on the list for almost
a decade, extend that entry and take it over, rather than
adding a new one.
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230802182843.4193099-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

0765c5f2

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 3932f227

Jakub Kicinski authored Aug 03, 2023

Martin KaFai Lau says:

====================
pull-request: bpf 2023-08-03

We've added 5 non-merge commits during the last 7 day(s) which contain
a total of 3 files changed, 37 insertions(+), 20 deletions(-).

The main changes are:

1) Disable preemption in perf_event_output helpers code,
   from Jiri Olsa

2) Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing,
   from Lin Ma

3) Multiple warning splat fixes in cpumap from Hou Tao

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, cpumap: Handle skb as well when clean up ptr_ring
  bpf, cpumap: Make sure kthread is running before map update returns
  bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing
  bpf: Disable preemption in bpf_event_output
  bpf: Disable preemption in bpf_perf_event_output
====================

Link: https://lore.kernel.org/r/20230803181429.994607-1-martin.lau@linux.devSigned-off-by: Jakub Kicinski <kuba@kernel.org>

3932f227

Merge tag 'wireless-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · 0d48a84b

Jakub Kicinski authored Aug 03, 2023

Kalle Valo says:

====================
wireless fixes for v6.5

We did some house cleaning in MAINTAINERS file so several patches
about that. Few regressions fixed and also fix some recently enabled
memcpy() warnings. Only small commits and nothing special standing
out.

* tag 'wireless-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: brcmfmac: Fix field-spanning write in brcmf_scan_params_v2_to_v1()
  wifi: ray_cs: Replace 1-element array with flexible array
  MAINTAINERS: add Jeff as ath10k, ath11k and ath12k maintainer
  MAINTAINERS: wifi: mark mlw8k as orphan
  MAINTAINERS: wifi: mark b43 as orphan
  MAINTAINERS: wifi: mark zd1211rw as orphan
  MAINTAINERS: wifi: mark wl3501 as orphan
  MAINTAINERS: wifi: mark rndis_wlan as orphan
  MAINTAINERS: wifi: mark ar5523 as orphan
  MAINTAINERS: wifi: mark cw1200 as orphan
  MAINTAINERS: wifi: atmel: mark as orphan
  MAINTAINERS: wifi: rtw88: change Ping as the maintainer
  Revert "wifi: ath6k: silence false positive -Wno-dangling-pointer warning on GCC 12"
  wifi: cfg80211: Fix return value in scan logic
  Revert "wifi: ath11k: Enable threaded NAPI"
  MAINTAINERS: Update mwifiex maintainer list
  wifi: mt76: mt7615: do not advertise 5 GHz on first phy of MT7615D (DBDC)
====================

Link: https://lore.kernel.org/r/20230803140058.57476C433C9@smtp.kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

0d48a84b

test/vsock: remove vsock_perf executable on `make clean` · 3c50c8b2

Stefano Garzarella authored Aug 03, 2023

We forgot to add vsock_perf to the rm command in the `clean`
target, so now we have a left over after `make clean` in
tools/testing/vsock.

Fixes: 8abbffd2 ("test/vsock: vsock_perf utility")
Cc: AVKrasnov@sberdevices.ru
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://lore.kernel.org/r/20230803085454.30897-1-sgarzare@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

3c50c8b2

Merge branch 'tcp_metrics-series-of-fixes' · 374297e8

Jakub Kicinski authored Aug 03, 2023

Eric Dumazet says:

====================
tcp_metrics: series of fixes

This series contains a fix for addr_same() and various
data-race annotations.

We still have to address races over tm->tcpm_saddr and
tm->tcpm_daddr later.
====================

Link: https://lore.kernel.org/r/20230802131500.1478140-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

374297e8

tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen · ddf251fa

Eric Dumazet authored Aug 02, 2023

Whenever tcpm_new() reclaims an old entry, tcpm_suck_dst()
would overwrite data that could be read from tcp_fastopen_cache_get()
or tcp_metrics_fill_info().

We need to acquire fastopen_seqlock to maintain consistency.

For newly allocated objects, tcpm_new() can switch to kzalloc()
to avoid an extra fastopen_seqlock acquisition.

Fixes: 1fe4c481 ("net-tcp: Fast Open client - cookie cache")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-7-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

ddf251fa