Commits · 0dc9f430027b8bd9073fdafdfcdeb1a073ab5594 · Kirill Smelkov / linux

20 May, 2024 8 commits

sunrpc: fix NFSACL RPC retry on soft mount · 0dc9f430

Dan Aloni authored Apr 25, 2024

It used to be quite awhile ago since 1b63a751 ('SUNRPC: Refactor
rpc_clone_client()'), in 2012, that `cl_timeout` was copied in so that
all mount parameters propagate to NFSACL clients. However since that
change, if mount options as follows are given:

    soft,timeo=50,retrans=16,vers=3

The resultant NFSACL client receives:

    cl_softrtry: 1
    cl_timeout: to_initval=60000, to_maxval=60000, to_increment=0, to_retries=2, to_exponential=0

These values lead to NFSACL operations not being retried under the
condition of transient network outages with soft mount. Instead, getacl
call fails after 60 seconds with EIO.

The simple fix is to pass the existing client's `cl_timeout` as the new
client timeout.

Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/all/20231105154857.ryakhmgaptq3hb6b@gmail.com/T/
Fixes: 1b63a751 ('SUNRPC: Refactor rpc_clone_client()')
Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

0dc9f430

SUNRPC: fix handling expired GSS context · 9b62ef6d

Olga Kornievskaia authored Apr 18, 2024

In the case where we have received a successful reply to an RPC request,
but while processing the reply the client in rpc_decode_header() finds
an expired context, the code ends up propagating the error to the caller
instead of getting a new context and retrying the request.

To give more details, in rpc_decode_header() we call rpcauth_checkverf()
will call into the gss and internally will at some point call
gss_validate() which has a check if the current’s context lifetime
expired, and it would fail. The reason for the failure gets ‘scrubbed’
and translated to EACCES so when we get back to rpc_decode_header() we
just go to “out_verifier” which for that error would get converted to
“out_garbage” (ie it’s treated as garballed reply) and the next
action is call_encode. Which (1) doesn’t reencode or re-send (not to
mention no upcall happens because context expires as that reason just
not known) and it again fails in the same decoding process. After
re-trying it 3 times the error is propagated back to the caller
(ie nfs4_write_done_cb() in the case a failing write).

To fix this, instead we need to look to the case where the server
decides that context has expired and replies with an RPC auth error.
In that case, the rpc_decode_header() goes to "out_msg_denied" in that
we return EKEYREJECTED which in call_decode() is sent to “call_reserve”
which triggers an upcalls and a re-try of the operation.

The proposed fix is in case of a failed rpc_decode_header() to check
if credentials were set to be invalid and use that as a proxy for
deciding that context has expired and then treat is same way as
receiving an auth error.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

9b62ef6d

nfs: keep server info for remounts · b322bf9e

Martin Kaiser authored Apr 14, 2024

With newer kernels that use fs_context for nfs mounts, remounts fail with
-EINVAL.

$ mount -t nfs -o nolock 10.0.0.1:/tmp/test /mnt/test/
$ mount -t nfs -o remount /mnt/test/
mount: mounting 10.0.0.1:/tmp/test on /mnt/test failed: Invalid argument

For remounts, the nfs server address and port are populated by
nfs_init_fs_context and later overwritten with 0x00 bytes by
nfs23_parse_monolithic. The remount then fails as the server address is
invalid.

Fix this by not overwriting nfs server info in nfs23_parse_monolithic if
we're doing a remount.

Fixes: f2aedb71 ("NFS: Add fs_context support.")
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

b322bf9e

NFSv4: Fixup smatch warning for ambiguous return · 37ffe065

Benjamin Coddington authored Apr 17, 2024

Dan Carpenter reports smatch warning for nfs4_try_migration() when a memory
allocation failure results in a zero return value. In this case, a
transient allocation failure error will likely be retried the next time the
server responds with NFS4ERR_MOVED.

We can fixup the smatch warning with a small refactor: attempt all three
allocations before testing and returning on a failure.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: c3ed2227 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

37ffe065

NFS: make sure lock/nolock overriding local_lock mount option · bf95f82e

Chen Hanxiao authored Apr 02, 2024

Currently, mount option lock/nolock and local_lock option
may override NFS_MOUNT_LOCAL_FLOCK NFS_MOUNT_LOCAL_FCNTL flags
when passing in different order:

mount -o vers=3,local_lock=all,lock:
	local_lock=none

mount -o vers=3,lock,local_lock=all:
	local_lock=all

This patch will let lock/nolock override local_lock option
as nfs(5) suggested.
Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

bf95f82e

NFS: add atomic_open for NFSv3 to handle O_TRUNC correctly. · 7c6c5249

NeilBrown authored Mar 25, 2024

With two clients, each with NFSv3 mounts of the same directory, the sequence:

   client1            client2
  ls -l afile
                      echo hello there > afile
  echo HELLO > afile
  cat afile

will show
   HELLO
   there

because the O_TRUNC requested in the final 'echo' doesn't take effect.
This is because the "Negative dentry, just create a file" section in
lookup_open() assumes that the file *does* get created since the dentry
was negative, so it sets FMODE_CREATED, and this causes do_open() to
clear O_TRUNC and so the file doesn't get truncated.

Even mounting with -o lookupcache=none does not help as
nfs_neg_need_reval() always returns false if LOOKUP_CREATE is set.

This patch fixes the problem by providing an atomic_open inode operation
for NFSv3 (and v2).  The code is largely the code from the branch in
lookup_open() when atomic_open is not provided.  The significant change
is that the O_TRUNC flag is passed a new nfs_do_create() which add
'trunc' handling to nfs_create().

With this change we also optimise away an unnecessary LOOKUP before the
file is created.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

7c6c5249

pNFS/filelayout: Specify the layout segment range in LAYOUTGET · 464b424f

Anna Schumaker authored Mar 20, 2024

Move from only requesting full file layout segments to requesting layout
segments that match our I/O size. This means the server is still free to
return a full file layout if it wants, but partial layouts will no
longer cause an error.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

464b424f

pNFS/filelayout: Remove the whole file layout requirement · 9c75576e

Anna Schumaker authored Mar 20, 2024

Layout segments have been supported in pNFS for years, so remove the
requirement that the server always sends whole file layouts.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

9c75576e

12 May, 2024 5 commits

Linux 6.9 · a38297e3
Linus Torvalds authored May 12, 2024

a38297e3

Merge tag 'kselftest-fix-vfork-2024-05-12' of... · af300a39

Linus Torvalds authored May 12, 2024

Merge tag 'kselftest-fix-vfork-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull Kselftest fixes from Mickaël Salaün:
"Fix Kselftest's vfork() side effects.

As reported by Kernel Test Robot and Sean Christopherson, some
tests fail since v6.9-rc1 . This is due to the use of vfork() which
introduced some side effects. Similarly, while making it more generic,
a previous commit made some Landlock file system tests flaky, and
subject to the host's file system mount configuration.

This fixes all these side effects by replacing vfork() with clone3()
and CLONE_VFORK, which is cleaner (no arbitrary shared memory) and
makes the Kselftest framework more robust"

Link: https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com
Link: https://lore.kernel.org/r/ZjPelW6-AbtYvslu@google.com
Link: https://lore.kernel.org/r/20240511171445.904356-1-mic@digikod.net

* tag 'kselftest-fix-vfork-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
selftests/harness: Handle TEST_F()'s explicit exit codes
selftests/harness: Fix vfork() side effects
selftests/harness: Share _metadata between forked processes
selftests/pidfd: Fix wrong expectation
selftests/harness: Constify fixture variants
selftests/landlock: Do not allocate memory in fixture data
selftests/harness: Fix interleaved scheduling leading to race conditions
selftests/harness: Fix fixture teardown
selftests/landlock: Fix FS tests when run on a private mount point
selftests/pidfd: Fix config for pidfd_setns_test

af300a39

Merge tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 2842076b

Linus Torvalds authored May 12, 2024

Pull kvm fix from Paolo Bonzini:

 - Fix NULL pointer read on s390 in ioctl(KVM_CHECK_EXTENSION) for
   /dev/kvm

* tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: Check kvm pointer when testing KVM_CAP_S390_HPAGE_1M

2842076b

Merge tag 'edac_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · ba16c1cf

Linus Torvalds authored May 12, 2024

Pull EDAC fix from Borislav Petkov:

 - Fix a race condition when clearing error count bits and toggling the
   error interrupt throug the same register, in synopsys_edac

* tag 'edac_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/synopsys: Fix ECC status and IRQ control race condition

ba16c1cf

Merge tag 'x86_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 775a0eca

Linus Torvalds authored May 12, 2024

Pull x86 fixes from Borislav Petkov:

 - Add a new PCI ID which belongs to a new AMD CPU family 0x1a

 - Ensure that that last level cache ID is set in all cases, in the AMD
   CPU topology parsing code, in order to prevent invalid scheduling
   domain CPU masks

* tag 'x86_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/topology/amd: Ensure that LLC ID is initialized
  x86/amd_nb: Add new PCI IDs for AMD family 0x1a

775a0eca

11 May, 2024 10 commits

selftests/harness: Handle TEST_F()'s explicit exit codes · 323feb3b

Mickaël Salaün authored May 11, 2024

If TEST_F() explicitly calls exit(code) with code different than 0, then
_metadata->exit_code is set to this code (e.g. KVM_ONE_VCPU_TEST()).  We
need to keep in mind that _metadata->exit_code can be KSFT_SKIP while
the process exit code is 0.

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Reported-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Closes: https://lore.kernel.org/r/ZjPelW6-AbtYvslu@google.com
Fixes: 0710a1a7 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Link: https://lore.kernel.org/r/20240511171445.904356-11-mic@digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

323feb3b

selftests/harness: Fix vfork() side effects · f453cc30

Mickaël Salaün authored May 11, 2024

Setting the time namespace with CLONE_NEWTIME returns -EUSERS if the
calling thread shares memory with another thread (because of the shared
vDSO), which is the case when it is created with vfork().

Fix pidfd_setns_test by replacing test harness's vfork() call with a
clone3() call with CLONE_VFORK, and an explicit sharing of the
_metadata and self objects.

Replace _metadata->teardown_parent with a new FIXTURE_TEARDOWN_PARENT()
helper that can replace FIXTURE_TEARDOWN().  This is a cleaner approach
and it enables to selectively share the fixture data between the child
process running tests and the parent process running the fixture
teardown.  This also avoids updating several tests to not rely on the
self object's copy-on-write property (e.g. storing the returned value of
a fork() call).

Cc: Christian Brauner <brauner@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Günther Noack <gnoack@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com
Fixes: 0710a1a7 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-10-mic@digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

f453cc30

selftests/harness: Share _metadata between forked processes · 24cf65a6

Mickaël Salaün authored May 11, 2024

Unconditionally share _metadata between all forked processes, which
enables to actually catch errors which were previously ignored.

This is required for a following commit replacing vfork() with clone3()
and CLONE_VFORK (i.e. not sharing the full memory) .  It should also be
useful to share _metadata to extend expectations to test process's
forks.  For instance, this change identified a wrong expectation in
pidfd_setns_test.

Because this _metadata is used by the new XFAIL_ADD(), use a global
pointer initialized in TEST_F().  This is OK because only XFAIL_ADD()
use it, and XFAIL_ADD() already depends on TEST_F().

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Will Drewry <wad@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-9-mic@digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

24cf65a6

selftests/pidfd: Fix wrong expectation · 821bc4a8

Mickaël Salaün authored May 11, 2024

Replace a wrong EXPECT_GT(self->child_pid_exited, 0) with EXPECT_GE(),
which will be actually tested on the parent and child sides with a
following commit.

Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20240511171445.904356-8-mic@digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

821bc4a8

selftests/harness: Constify fixture variants · cc80aa9a

Mickaël Salaün authored May 11, 2024

FIXTURE_VARIANT_ADD() types are passed as const pointers to
FIXTURE_TEARDOWN().  Make that explicit by constifying the variants
declarations.

Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Will Drewry <wad@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-7-mic@digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

cc80aa9a

selftests/landlock: Do not allocate memory in fixture data · 3656bc23

Mickaël Salaün authored May 11, 2024

Do not allocate self->dir_path in the test process because this would
not be visible in the FIXTURE_TEARDOWN() process when relying on
fork()/clone3() instead of vfork().

This change is required for a following commit removing vfork() call to
not break the layout3_fs.* test cases.

Cc: Günther Noack <gnoack@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-6-mic@digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

3656bc23

selftests/harness: Fix interleaved scheduling leading to race conditions · a86f1890

Mickaël Salaün authored May 11, 2024

Fix a race condition when running several FIXTURE_TEARDOWN() managing
the same resource.  This fixes a race condition in the Landlock file
system tests when creating or unmounting the same directory.

Using clone3() with CLONE_VFORK guarantees that the child and grandchild
test processes are sequentially scheduled.  This is implemented with a
new clone3_vfork() helper replacing the fork() call.

This avoids triggering this error in __wait_for_test():
  Test ended in some other way [127]

Cc: Christian Brauner <brauner@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Günther Noack <gnoack@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Fixes: 41cca054 ("selftests/harness: Fix TEST_F()'s vfork handling")
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-5-mic@digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

a86f1890

selftests/harness: Fix fixture teardown · fff37bd3

Mickaël Salaün authored May 11, 2024

Make sure fixture teardowns are run when test cases failed, including
when _metadata->teardown_parent is set to true.

Make sure only one fixture teardown is run per test case, handling the
case where the test child forks.

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Shengyu Li <shengyu.li.evgeny@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Fixes: 72d7cb5c ("selftests/harness: Prevent infinite loop due to Assert in FIXTURE_TEARDOWN")
Fixes: 0710a1a7 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-4-mic@digikod.net
Rule: add
Link: https://lore.kernel.org/stable/20240506165518.474504-4-mic%40digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

fff37bd3

selftests/landlock: Fix FS tests when run on a private mount point · 7e4042ab

Mickaël Salaün authored May 11, 2024

According to the test environment, the mount point of the test's working
directory may be shared or not, which changes the visibility of the
nested "tmp" mount point for the test's parent process calling
umount("tmp").

This was spotted while running tests in containers [1], where mount
points are private.

Cc: Günther Noack <gnoack@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Link: https://github.com/landlock-lsm/landlock-test-tools/pull/4 [1]
Fixes: 41cca054 ("selftests/harness: Fix TEST_F()'s vfork handling")
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-3-mic@digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

7e4042ab

selftests/pidfd: Fix config for pidfd_setns_test · 37dc2e0d

Mickaël Salaün authored May 11, 2024

Required by switch_timens() to open /proc/self/ns/time_for_children.

CONFIG_GENERIC_VDSO_TIME_NS is not available on UML, so pidfd_setns_test
cannot be run successfully on this architecture.

Cc: Shuah Khan <skhan@linuxfoundation.org>
Fixes: 2b40c5db ("selftests/pidfd: add pidfd setns tests")
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20240511171445.904356-2-mic@digikod.netSigned-off-by: Mickaël Salaün <mic@digikod.net>

37dc2e0d

10 May, 2024 17 commits

Merge tag 'drm-fixes-2024-05-11' of https://gitlab.freedesktop.org/drm/kernel · cf87f46f

Linus Torvalds authored May 10, 2024

Pull drm fixes from Dave Airlie:
 "This should be the last set of fixes for 6.9, i915, xe and amdgpu are
  the bulk here, one of the previous nouveau fixes turned up an issue,
  so reverting it, otherwise one core and a couple of meson fixes.

  core:
   - fix connector debugging output

  i915:
   - Automate CCS Mode setting during engine resets
   - Fix audio time stamp programming for DP
   - Fix parsing backlight BDB data

  xe:
   - Fix use zero-length element array
   - Move more from system wq to ordered private wq
   - Do not ignore return for drmm_mutex_init

  amdgpu:
   - DCN 3.5 fix
   - MST DSC fixes
   - S0i3 fix
   - S4 fix
   - HDP MMIO mapping fix
   - Fix a regression in visible vram handling

  amdkfd:
   - Spatial partition fix

  meson:
   - dw-hdmi: power-up fixes
   - dw-hdmi: add badngap setting for g12

  nouveau:
   - revert SG_DEBUG fix that has a side effect"

* tag 'drm-fixes-2024-05-11' of https://gitlab.freedesktop.org/drm/kernel:
  Revert "drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()"
  drm/amdgpu: Fix comparison in amdgpu_res_cpu_visible
  drm/amdkfd: don't allow mapping the MMIO HDP page with large pages
  drm/xe: Use ordered WQ for G2H handler
  drm/xe/guc: Check error code when initializing the CT mutex
  drm/xe/ads: Use flexible-array
  Revert "drm/amdkfd: Add partition id field to location_id"
  dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users
  drm/amd/display: MST DSC check for older devices
  drm/amd/display: Fix idle optimization checks for multi-display and dual eDP
  drm/amd/display: Fix DSC-re-computing
  drm/amd/display: Enable urgent latency adjustments for DCN35
  drm/connector: Add \n to message about demoting connector force-probes
  drm/i915/bios: Fix parsing backlight BDB data
  drm/i915/audio: Fix audio time stamp programming for DP
  drm/i915/gt: Automate CCS Mode setting during engine resets
  drm/meson: dw-hdmi: add bandgap setting for g12
  drm/meson: dw-hdmi: power up phy on device init

cf87f46f

Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of... · c22c3e07

Linus Torvalds authored May 10, 2024

Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM fixes from Andrew Morton:
 "18 hotfixes, 7 of which are cc:stable.

  More fixups for this cycle's page_owner updates. And a few userfaultfd
  fixes. Otherwise, random singletons - see the individual changelogs
  for details"

* tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add entry for Barry Song
  selftests/mm: fix powerpc ARCH check
  mailmap: add entry for John Garry
  XArray: set the marks correctly when splitting an entry
  selftests/vDSO: fix runtime errors on LoongArch
  selftests/vDSO: fix building errors on LoongArch
  mm,page_owner: don't remove __GFP_NOLOCKDEP in add_stack_record_to_list
  fs/proc/task_mmu: fix uffd-wp confusion in pagemap_scan_pmd_entry()
  fs/proc/task_mmu: fix loss of young/dirty bits during pagemap scan
  mm/vmalloc: fix return value of vb_alloc if size is 0
  mm: use memalloc_nofs_save() in page_cache_ra_order()
  kmsan: compiler_types: declare __no_sanitize_or_inline
  lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add()
  tools: fix userspace compilation with new test_xarray changes
  MAINTAINERS: update URL's for KEYS/KEYRINGS_INTEGRITY and TPM DEVICE DRIVER
  mm: page_owner: fix wrong information in dump_page_owner
  maple_tree: fix mas_empty_area_rev() null pointer dereference
  mm/userfaultfd: reset ptes when close() for wr-protected ones

c22c3e07

Revert "drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()" · a222a647

Dave Airlie authored May 11, 2024

This reverts commit 52a6947b.

This causes loading failures in
[    0.367379] nouveau 0000:01:00.0: NVIDIA GP104 (134000a1)
[    0.474499] nouveau 0000:01:00.0: bios: version 86.04.50.80.13
[    0.474620] nouveau 0000:01:00.0: pmu: firmware unavailable
[    0.474977] nouveau 0000:01:00.0: fb: 8192 MiB GDDR5
[    0.484371] nouveau 0000:01:00.0: sec2(acr): mbox 00000001 00000000
[    0.484377] nouveau 0000:01:00.0: sec2(acr):load: boot failed: -5
[    0.484379] nouveau 0000:01:00.0: acr: init failed, -5
[    0.484466] nouveau 0000:01:00.0: init failed with -5
[    0.484468] nouveau: DRM-master:00000000:00000080: init failed with -5
[    0.484470] nouveau 0000:01:00.0: DRM-master: Device allocation failed: -5
[    0.485078] nouveau 0000:01:00.0: probe with driver nouveau failed with error -50

I tried tracking it down but ran out of time this week, will revisit next week.
Reported-by: Dan Moulding <dan@danm.net>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

a222a647

Merge tag 'drm-misc-fixes-2024-05-10' of... · b61821bb

Dave Airlie authored May 11, 2024

Merge tag 'drm-misc-fixes-2024-05-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

core:
- fix connector debugging output

meson:
- dw-hdmi: power-up fixes
- dw-hdmi: add badngap setting for g12
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240510072027.GA9131@linux.fritz.box

b61821bb

Merge tag 'gpio-fixes-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · cfb4be1a

Linus Torvalds authored May 10, 2024

Pull gpio fixes from Bartosz Golaszewski:
 "Some last-minute fixes for this release from the GPIO subsystem.

  The first two address a regression in performance reported to me after
  the conversion to using SRCU in GPIOLIB that was merged during the
  v6.9 merge window. The second patch is not technically a fix but since
  after the first one we no longer need to use a per-descriptor SRCU
  struct, I think it's worth to simplify the code before it gets
  released on Sunday.

  The next two commits fix two memory issues: one use-after-free bug and
  one instance of possibly leaking kernel stack memory to user-space.

  Summary:

   - fix a performance regression in GPIO requesting and releasing after
     the conversion to SRCU

   - fix a use-after-free bug due to a race-condition

   - fix leaking stack memory to user-space in a GPIO uABI corner case"

* tag 'gpio-fixes-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: cdev: fix uninitialised kfifo
  gpiolib: cdev: Fix use after free in lineinfo_changed_notify
  gpiolib: use a single SRCU struct for all GPIO descriptors
  gpiolib: fix the speed of descriptor label setting with SRCU

cfb4be1a

Merge tag 'amd-drm-fixes-6.9-2024-05-10' of... · 06fbf84f

Dave Airlie authored May 11, 2024

Merge tag 'amd-drm-fixes-6.9-2024-05-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.9-2024-05-10:

amdgpu:
- DCN 3.5 fix
- MST DSC fixes
- S0i3 fix
- S4 fix
- HDP MMIO mapping fix
- Fix a regression in visible vram handling

amdkfd:
- Spatial partition fix
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240510171110.1394940-1-alexander.deucher@amd.com

06fbf84f

mailmap: add entry for Barry Song · 672614a3

Barry Song authored May 06, 2024

Include a .mailmap entry to synchronize with both my past and current
emails. Among them, three business mailboxes are dead.

Link: https://lkml.kernel.org/r/20240506042009.10854-1-21cnbao@gmail.comSigned-off-by: Barry Song <v-songbaohua@oppo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

672614a3

selftests/mm: fix powerpc ARCH check · 7e642344

Michael Ellerman authored May 06, 2024

In commit 0518dbe9 ("selftests/mm: fix cross compilation with LLVM")
the logic to detect the machine architecture in the Makefile was changed
to use ARCH, and only fallback to uname -m if ARCH is unset.  However the
tests of ARCH were not updated to account for the fact that ARCH is
"powerpc" for powerpc builds, not "ppc64".

Fix it by changing the checks to look for "powerpc", and change the
uname -m logic to convert "ppc64.*" into "powerpc".

With that fixed the following tests now build for powerpc again:
 * protection_keys
 * va_high_addr_switch
 * virtual_address_range
 * write_to_hugetlbfs

Link: https://lkml.kernel.org/r/20240506115825.66415-1-mpe@ellerman.id.au
Fixes: 0518dbe9 ("selftests/mm: fix cross compilation with LLVM")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mark Brown <broonie@kernel.org>
Cc: <stable@vger.kernel.org>	[6.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

7e642344

Merge tag 'block-6.9-20240510' of git://git.kernel.dk/linux · f4345f05

Linus Torvalds authored May 10, 2024

Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
     - nvme target fixes (Sagi, Dan, Maurizo)
     - new vendor quirk for broken MSI (Sean)

 - Virtual boundary fix for a regression in this merge window (Ming)

* tag 'block-6.9-20240510' of git://git.kernel.dk/linux:
  nvmet-rdma: fix possible bad dereference when freeing rsps
  nvmet: prevent sprintf() overflow in nvmet_subsys_nsid_exists()
  nvmet: make nvmet_wq unbound
  nvmet-auth: return the error code to the nvmet_auth_ctrl_hash() callers
  nvme-pci: Add quirk for broken MSIs
  block: set default max segment size in case of virt_boundary

f4345f05

Merge tag 'spi-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · ed44935c

Linus Torvalds authored May 10, 2024

Pull spi fixes from Mark Brown:
 "Two device specific fixes here, one avoiding glitches on chip select
  with the STM32 driver and one for incorrectly configured clocks on the
  Microchip QSPI controller"

* tag 'spi-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: microchip-core-qspi: fix setting spi bus clock rate
  spi: stm32: enable controller before asserting CS

ed44935c

Merge tag 'regulator-fix-v6.9-rc7' of... · 99dff484

Linus Torvalds authored May 10, 2024

Merge tag 'regulator-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "Two fixes here, one from Johan which fixes error handling when we
  attempt to create duplicate debugfs files and one for an incorrect
  specification of ramp_delay with the rtq2208"

* tag 'regulator-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: fix debugfs creation regression
  regulator: rtq2208: Fix the BUCK ramp_delay range to maximum of 16mVstep/us

99dff484

Merge tag 'timers-urgent-2024-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 92d50301

Linus Torvalds authored May 10, 2024

Pull timer fix from Ingo Molnar:
 "Fix possible (but unlikely) out-of-bounds access in the timer
  migration per-CPU-init code"

* tag 'timers-urgent-2024-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timers/migration: Prevent out of bounds access on failure

92d50301

Merge tag 'iommu-fixes-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 98957025

Linus Torvalds authored May 10, 2024

Pull iommu fixes from Joerg Roedel:

 - Fix offset miscalculation on ARM-SMMU driver

 - AMD IOMMU fix for initializing state of untrusted devices

* tag 'iommu-fixes-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/arm-smmu: Use the correct type in nvidia_smmu_context_fault()
  iommu/amd: Enhance def_domain_type to handle untrusted device

98957025

drm/amdgpu: Fix comparison in amdgpu_res_cpu_visible · 8d2c9307

Michel Dänzer authored May 08, 2024

It incorrectly claimed a resource isn't CPU visible if it's located at
the very end of CPU visible VRAM.

Fixes: a6ff969f ("drm/amdgpu: fix visible VRAM handling during faults")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3343Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reported-and-Tested-by: Jeremy Day <jsday@noreason.ca>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org

8d2c9307

drm/amdkfd: don't allow mapping the MMIO HDP page with large pages · be4a2a81

Alex Deucher authored Apr 14, 2024

We don't get the right offset in that case.  The GPU has
an unused 4K area of the register BAR space into which you can
remap registers.  We remap the HDP flush registers into this
space to allow userspace (CPU or GPU) to flush the HDP when it
updates VRAM.  However, on systems with >4K pages, we end up
exposing PAGE_SIZE of MMIO space.

Fixes: d8e408a8 ("drm/amdkfd: Expose HDP registers to user space")
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

be4a2a81

x86/topology/amd: Ensure that LLC ID is initialized · 5754ace3

Thomas Gleixner authored May 08, 2024

The original topology evaluation code initialized cpu_data::topo::llc_id
with the die ID initialy and then eventually overwrite it with information
gathered from a CPUID leaf.

The conversion analysis failed to spot that particular detail and omitted
this initial assignment under the assumption that each topology evaluation
path will set it up. That assumption is mostly correct, but turns out to be
wrong in case that the CPUID leaf 0x80000006 does not provide a LLC ID.

In that case, LLC ID is invalid and as a consequence the setup of the
scheduling domain CPU masks is incorrect which subsequently causes the
scheduler core to complain about it during CPU hotplug:

BUG: arch topology borken
the CLS domain not a subset of the MC domain

Cure it by reusing legacy_set_llc() and assigning the die ID if the LLC ID
is invalid after all possible parsers have been tried.

Fixes: f7fb3b2d ("x86/cpu: Provide an AMD/HYGON specific topology parser")
Reported-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Link: https://lore.kernel.org/r/PUZPR04MB63168AC442C12627E827368581292@PUZPR04MB6316.apcprd04.prod.outlook.com

5754ace3

gpiolib: cdev: fix uninitialised kfifo · ee0166b6

Kent Gibson authored May 10, 2024

If a line is requested with debounce, and that results in debouncing
in software, and the line is subsequently reconfigured to enable edge
detection then the allocation of the kfifo to contain edge events is
overlooked. This results in events being written to and read from an
uninitialised kfifo. Read events are returned to userspace.

Initialise the kfifo in the case where the software debounce is
already active.

Fixes: 65cff704 ("gpiolib: cdev: support setting debounce")
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Link: https://lore.kernel.org/r/20240510065342.36191-1-warthog618@gmail.comSigned-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

ee0166b6