1. 26 Jun, 2022 4 commits
    • Darrick J. Wong's avatar
      xfs: clean up the end of xfs_attri_item_recover · f94e08b6
      Darrick J. Wong authored
      The end of this function could use some cleanup -- the EAGAIN
      conditionals make it harder to figure out what's going on with the
      disposal of xattri_leaf_bp, and the dual error/ret variables aren't
      needed.  Turn the EAGAIN case into a separate block documenting all the
      subtleties of recovering in the middle of an xattr update chain, which
      makes the rest of the prologue much simpler.
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      f94e08b6
    • Darrick J. Wong's avatar
      xfs: always free xattri_leaf_bp when cancelling a deferred op · b822ea17
      Darrick J. Wong authored
      While running the following fstest with logged xattrs DISabled, I
      noticed the following:
      
      # FSSTRESS_AVOID="-z -f unlink=1 -f rmdir=1 -f creat=2 -f mkdir=2 -f
      getfattr=3 -f listfattr=3 -f attr_remove=4 -f removefattr=4 -f
      setfattr=20 -f attr_set=60" ./check generic/475
      
      INFO: task u9:1:40 blocked for more than 61 seconds.
            Tainted: G           O      5.19.0-rc2-djwx #rc2
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      task:u9:1            state:D stack:12872 pid:   40 ppid:     2 flags:0x00004000
      Workqueue: xfs-cil/dm-0 xlog_cil_push_work [xfs]
      Call Trace:
       <TASK>
       __schedule+0x2db/0x1110
       schedule+0x58/0xc0
       schedule_timeout+0x115/0x160
       __down_common+0x126/0x210
       down+0x54/0x70
       xfs_buf_lock+0x2d/0xe0 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
       xfs_buf_item_unpin+0x227/0x3a0 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
       xfs_trans_committed_bulk+0x18e/0x320 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
       xlog_cil_committed+0x2ea/0x360 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
       xlog_cil_push_work+0x60f/0x690 [xfs 0532c1cb1d67dd81d15cb79ac6e415c8dec58f73]
       process_one_work+0x1df/0x3c0
       worker_thread+0x53/0x3b0
       kthread+0xea/0x110
       ret_from_fork+0x1f/0x30
       </TASK>
      
      This appears to be the result of shortform_to_leaf creating a new leaf
      buffer as part of adding an xattr to a file.  The new leaf buffer is
      held and attached to the xfs_attr_intent structure, but then the
      filesystem shuts down.  Instead of the usual path (which adds the attr
      to the held leaf buffer which releases the hold), we instead cancel the
      entire deferred operation.
      
      Unfortunately, xfs_attr_cancel_item doesn't release any attached leaf
      buffers, so we leak the locked buffer.  The CIL cannot do anything
      about that, and hangs.  Fix this by teaching it to release leaf buffers,
      and make XFS a little more careful about not leaving a dangling
      reference.
      
      The prologue of xfs_attri_item_recover is (in this author's opinion) a
      little hard to figure out, so I'll clean that up in the next patch.
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      b822ea17
    • Kaixu Xia's avatar
      xfs: use invalidate_lock to check the state of mmap_lock · 82af8806
      Kaixu Xia authored
      We should use invalidate_lock and XFS_MMAPLOCK_SHARED to check the state
      of mmap_lock rw_semaphore in xfs_isilocked(), rather than i_rwsem and
      XFS_IOLOCK_SHARED.
      
      Fixes: 2433480a ("xfs: Convert to use invalidate_lock")
      Signed-off-by: default avatarKaixu Xia <kaixuxia@tencent.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      82af8806
    • Kaixu Xia's avatar
      xfs: factor out the common lock flags assert · ca76a761
      Kaixu Xia authored
      There are similar lock flags assert in xfs_ilock(), xfs_ilock_nowait(),
      xfs_iunlock(), thus we can factor it out into a helper that is clear.
      Signed-off-by: default avatarKaixu Xia <kaixuxia@tencent.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      ca76a761
  2. 23 Jun, 2022 2 commits
    • Dave Chinner's avatar
      xfs: introduce xfs_inodegc_push() · 5e672cd6
      Dave Chinner authored
      The current blocking mechanism for pushing the inodegc queue out to
      disk can result in systems becoming unusable when there is a long
      running inodegc operation. This is because the statfs()
      implementation currently issues a blocking flush of the inodegc
      queue and a significant number of common system utilities will call
      statfs() to discover something about the underlying filesystem.
      
      This can result in userspace operations getting stuck on inodegc
      progress, and when trying to remove a heavily reflinked file on slow
      storage with a full journal, this can result in delays measuring in
      hours.
      
      Avoid this problem by adding "push" function that expedites the
      flushing of the inodegc queue, but doesn't wait for it to complete.
      
      Convert xfs_fs_statfs() and xfs_qm_scall_getquota() to use this
      mechanism so they don't block but still ensure that queued
      operations are expedited.
      
      Fixes: ab23a776 ("xfs: per-cpu deferred inode inactivation queues")
      Reported-by: default avatarChris Dunlop <chris@onthe.net.au>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      [djwong: fix _getquota_next to use _inodegc_push too]
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      5e672cd6
    • Dave Chinner's avatar
      xfs: bound maximum wait time for inodegc work · 7cf2b0f9
      Dave Chinner authored
      Currently inodegc work can sit queued on the per-cpu queue until
      the workqueue is either flushed of the queue reaches a depth that
      triggers work queuing (and later throttling). This means that we
      could queue work that waits for a long time for some other event to
      trigger flushing.
      
      Hence instead of just queueing work at a specific depth, use a
      delayed work that queues the work at a bound time. We can still
      schedule the work immediately at a given depth, but we no long need
      to worry about leaving a number of items on the list that won't get
      processed until external events prevail.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      7cf2b0f9
  3. 16 Jun, 2022 3 commits
    • Darrick J. Wong's avatar
      xfs: preserve DIFLAG2_NREXT64 when setting other inode attributes · e89ab76d
      Darrick J. Wong authored
      It is vitally important that we preserve the state of the NREXT64 inode
      flag when we're changing the other flags2 fields.
      
      Fixes: 9b7d16e3 ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers")
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChandan Babu R <chandan.babu@oracle.com>
      Reviewed-by: default avatarAllison Henderson <allison.henderson@oracle.com>
      e89ab76d
    • Darrick J. Wong's avatar
      xfs: fix variable state usage · 10930b25
      Darrick J. Wong authored
      The variable @args is fed to a tracepoint, and that's the only place
      it's used.  This is fine for the kernel, but for userspace, tracepoints
      are #define'd out of existence, which results in this warning on gcc
      11.2:
      
      xfs_attr.c: In function ‘xfs_attr_node_try_addname’:
      xfs_attr.c:1440:42: warning: unused variable ‘args’ [-Wunused-variable]
       1440 |         struct xfs_da_args              *args = attr->xattri_da_args;
            |                                          ^~~~
      
      Clean this up.
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarAllison Henderson <allison.henderson@oracle.com>
      10930b25
    • Darrick J. Wong's avatar
      xfs: fix TOCTOU race involving the new logged xattrs control knob · f4288f01
      Darrick J. Wong authored
      I found a race involving the larp control knob, aka the debugging knob
      that lets developers enable logging of extended attribute updates:
      
      Thread 1			Thread 2
      
      echo 0 > /sys/fs/xfs/debug/larp
      				setxattr(REPLACE)
      				xfs_has_larp (returns false)
      				xfs_attr_set
      
      echo 1 > /sys/fs/xfs/debug/larp
      
      				xfs_attr_defer_replace
      				xfs_attr_init_replace_state
      				xfs_has_larp (returns true)
      				xfs_attr_init_remove_state
      
      				<oops, wrong DAS state!>
      
      This isn't a particularly severe problem right now because xattr logging
      is only enabled when CONFIG_XFS_DEBUG=y, and developers *should* know
      what they're doing.
      
      However, the eventual intent is that callers should be able to ask for
      the assistance of the log in persisting xattr updates.  This capability
      might not be required for /all/ callers, which means that dynamic
      control must work correctly.  Once an xattr update has decided whether
      or not to use logged xattrs, it needs to stay in that mode until the end
      of the operation regardless of what subsequent parallel operations might
      do.
      
      Therefore, it is an error to continue sampling xfs_globals.larp once
      xfs_attr_change has made a decision about larp, and it was not correct
      for me to have told Allison that ->create_intent functions can sample
      the global log incompat feature bitfield to decide to elide a log item.
      
      Instead, create a new op flag for the xfs_da_args structure, and convert
      all other callers of xfs_has_larp and xfs_sb_version_haslogxattrs within
      the attr update state machine to look for the operations flag.
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Reviewed-by: default avatarAllison Henderson <allison.henderson@oracle.com>
      f4288f01
  4. 12 Jun, 2022 10 commits
  5. 11 Jun, 2022 9 commits
    • Linus Torvalds's avatar
      Merge tag 'gpio-fixes-for-v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 7a68065e
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
       "A set of fixes. Most address the new warning we emit at build time
        when irq chips are not immutable with some additional tweaks to
        gpio-crystalcove from Andy and a small tweak to gpio-dwapd.
      
         - make irq_chip structs immutable in several Diolan and intel drivers
           to get rid of the new warning we emit when fiddling with irq chips
      
         - don't print error messages on probe deferral in gpio-dwapb"
      
      * tag 'gpio-fixes-for-v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: dwapb: Don't print error on -EPROBE_DEFER
        gpio: dln2: make irq_chip immutable
        gpio: sch: make irq_chip immutable
        gpio: merrifield: make irq_chip immutable
        gpio: wcove: make irq_chip immutable
        gpio: crystalcove: Join function declarations and long lines
        gpio: crystalcove: Use specific type and API for IRQ number
        gpio: crystalcove: make irq_chip immutable
      7a68065e
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · cecb3540
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Driver fixes and and one core patch.
      
        Nine of the driver patches are minor fixes and reworks to lpfc and the
        rest are trivial and minor fixes elsewhere"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: pmcraid: Fix missing resource cleanup in error case
        scsi: ipr: Fix missing/incorrect resource cleanup in error case
        scsi: mpt3sas: Fix out-of-bounds compiler warning
        scsi: lpfc: Update lpfc version to 14.2.0.4
        scsi: lpfc: Allow reduced polling rate for nvme_admin_async_event cmd completion
        scsi: lpfc: Add more logging of cmd and cqe information for aborted NVMe cmds
        scsi: lpfc: Fix port stuck in bypassed state after LIP in PT2PT topology
        scsi: lpfc: Resolve NULL ptr dereference after an ELS LOGO is aborted
        scsi: lpfc: Address NULL pointer dereference after starget_to_rport()
        scsi: lpfc: Resolve some cleanup issues following SLI path refactoring
        scsi: lpfc: Resolve some cleanup issues following abort path refactoring
        scsi: lpfc: Correct BDE type for XMIT_SEQ64_WQE in lpfc_ct_reject_event()
        scsi: vmw_pvscsi: Expand vcpuHint to 16 bits
        scsi: sd: Fix interpretation of VPD B9h length
      cecb3540
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · abe71eb3
      Linus Torvalds authored
      Pull virtio fixes from Michael Tsirkin:
       "Fixes all over the place, most notably fixes for latent bugs in
        drivers that got exposed by suppressing interrupts before DRIVER_OK,
        which in turn has been done by 8b4ec69d ("virtio: harden vring
        IRQ")"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        um: virt-pci: set device ready in probe()
        vdpa: make get_vq_group and set_group_asid optional
        virtio: Fix all occurences of the "the the" typo
        vduse: Fix NULL pointer dereference on sysfs access
        vringh: Fix loop descriptors check in the indirect cases
        vdpa/mlx5: clean up indenting in handle_ctrl_vlan()
        vdpa/mlx5: fix error code for deleting vlan
        virtio-mmio: fix missing put_device() when vm_cmdline_parent registration failed
        vdpa/mlx5: Fix syntax errors in comments
        virtio-rng: make device ready before making request
      abe71eb3
    • Linus Torvalds's avatar
      Merge tag 'loongarch-fixes-5.19-1' of... · 0678afa6
      Linus Torvalds authored
      Merge tag 'loongarch-fixes-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch fixes from Huacai Chen.
       "Fix build errors and a stale comment"
      
      * tag 'loongarch-fixes-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch: Remove MIPS comment about cycle counter
        LoongArch: Fix copy_thread() build errors
        LoongArch: Fix the !CONFIG_SMP build
      0678afa6
    • Linus Torvalds's avatar
      iov_iter: fix build issue due to possible type mis-match · 1c27f1fc
      Linus Torvalds authored
      Commit 6c776766 ("iov_iter: Fix iter_xarray_get_pages{,_alloc}()")
      introduced a problem on some 32-bit architectures (at least arm, xtensa,
      csky,sparc and mips), that have a 'size_t' that is 'unsigned int'.
      
      The reason is that we now do
      
          min(nr * PAGE_SIZE - offset, maxsize);
      
      where 'nr' and 'offset' and both 'unsigned int', and PAGE_SIZE is
      'unsigned long'.  As a result, the normal C type rules means that the
      first argument to 'min()' ends up being 'unsigned long'.
      
      In contrast, 'maxsize' is of type 'size_t'.
      
      Now, 'size_t' and 'unsigned long' are always the same physical type in
      the kernel, so you'd think this doesn't matter, and from an actual
      arithmetic standpoint it doesn't.
      
      But on 32-bit architectures 'size_t' is commonly 'unsigned int', even if
      it could also be 'unsigned long'.  In that situation, both are unsigned
      32-bit types, but they are not the *same* type.
      
      And as a result 'min()' will complain about the distinct types (ignore
      the "pointer types" part of the error message: that's an artifact of the
      way we have made 'min()' check types for being the same):
      
        lib/iov_iter.c: In function 'iter_xarray_get_pages':
        include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror]
           20 |         (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
              |                                   ^~
        lib/iov_iter.c:1464:16: note: in expansion of macro 'min'
         1464 |         return min(nr * PAGE_SIZE - offset, maxsize);
              |                ^~~
      
      This was not visible on 64-bit architectures (where we always define
      'size_t' to be 'unsigned long').
      
      Force these cases to use 'min_t(size_t, x, y)' to make the type explicit
      and avoid the issue.
      
      [ Nit-picky note: technically 'size_t' doesn't have to match 'unsigned
        long' arithmetically. We've certainly historically seen environments
        with 16-bit address spaces and 32-bit 'unsigned long'.
      
        Similarly, even in 64-bit modern environments, 'size_t' could be its
        own type distinct from 'unsigned long', even if it were arithmetically
        identical.
      
        So the above type commentary is only really descriptive of the kernel
        environment, not some kind of universal truth for the kinds of wild
        and crazy situations that are allowed by the C standard ]
      Reported-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Link: https://lore.kernel.org/all/YqRyL2sIqQNDfky2@debian/
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1c27f1fc
    • Jason A. Donenfeld's avatar
      wireguard: selftests: use maximum cpu features and allow rng seeding · 17b0128a
      Jason A. Donenfeld authored
      By forcing the maximum CPU that QEMU has available, we expose additional
      capabilities, such as the RNDR instruction, which increases test
      coverage. This then allows the CI to skip the fake seeding step in some
      cases. Also enable STRICT_KERNEL_RWX to catch issues related to early
      jump labels when the RNG is initialized at boot.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      17b0128a
    • Kuan-Ying Lee's avatar
      scripts/gdb: change kernel config dumping method · 1f7a6cf6
      Kuan-Ying Lee authored
      MAGIC_START("IKCFG_ST") and MAGIC_END("IKCFG_ED") are moved out
      from the kernel_config_data variable.
      
      Thus, we parse kernel_config_data directly instead of considering
      offset of MAGIC_START and MAGIC_END.
      
      Fixes: 13610aa9 ("kernel/configs: use .incbin directive to embed config_data.gz")
      Signed-off-by: default avatarKuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      1f7a6cf6
    • Vincent Whitchurch's avatar
      um: virt-pci: set device ready in probe() · eacea844
      Vincent Whitchurch authored
      Call virtio_device_ready() to make this driver work after commit
      b4ec69d7e09 ("virtio: harden vring IRQ"), since the driver uses the
      virtqueues in the probe function.  (The virtio core sets the device
      ready when probe returns.)
      
      Fixes: 8b4ec69d ("virtio: harden vring IRQ")
      Fixes: 68f5d3f3 ("um: add PCI over virtio emulation driver")
      Signed-off-by: default avatarVincent Whitchurch <vincent.whitchurch@axis.com>
      Message-Id: <20220610151203.3492541-1-vincent.whitchurch@axis.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Tested-by: default avatarJohannes Berg <johannes@sipsolutions.net>
      eacea844
    • Linus Torvalds's avatar
      Merge tag 'nfsd-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · 0885eacd
      Linus Torvalds authored
      Pull nfsd fixes from Chuck Lever:
       "Notable changes:
      
         - There is now a backup maintainer for NFSD
      
        Notable fixes:
      
         - Prevent array overruns in svc_rdma_build_writes()
      
         - Prevent buffer overruns when encoding NFSv3 READDIR results
      
         - Fix a potential UAF in nfsd_file_put()"
      
      * tag 'nfsd-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()
        SUNRPC: Clean up xdr_get_next_encode_buffer()
        SUNRPC: Clean up xdr_commit_encode()
        SUNRPC: Optimize xdr_reserve_space()
        SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
        SUNRPC: Trap RDMA segment overflows
        NFSD: Fix potential use-after-free in nfsd_file_put()
        MAINTAINERS: reciprocal co-maintainership for file locking and nfsd
      0885eacd
  6. 10 Jun, 2022 12 commits