  1. 29 Jun, 2023 5 commits
    • xfs: allow extent free intents to be retried · 0853b5de
      Dave Chinner authored
      Extent freeing needs to be able to avoid a busy extent deadlock
      when the transaction itself holds the only busy extents in the
      allocation group. This may occur if we have an EFI that contains
      multiple extents to be freed, and freeing the second extent
      requires the space the first extent free released to expand the
      AGFL. If we block on the busy extent at this point, we deadlock.
      
      We hold a dirty transaction that contains an entire atomic extent
      free operation within it, so if we can abort the extent free
      operation and commit the progress that we've made, the busy extent
      can be resolved by a log force. Hence we can restart the aborted
      extent free with a new transaction and continue to make
      progress without risking deadlocks.
      
      To enable this, we need the EFI processing code to be able to handle
      an -EAGAIN error to tell it to commit the current transaction and
      retry. This mechanism is already built into the defer ops
      processing (used by the refcount btree modification intents), so
      there's relatively little handling we need to add to the EFI code to
      enable this.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
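The retry-on-EAGAIN idea in the commit above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel's code: every name here (`toy_trans`, `free_item`, `toy_free_extent`, `toy_roll`, `toy_finish_items`) is hypothetical, and the model assumes that committing progress resolves all busy extents held by the transaction.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of one pending extent-free work item. */
struct free_item {
	int needs_busy_space;	/* nonzero: freeing needs space that is still busy */
	int done;		/* set once the extent has been freed */
};

/* Hypothetical model of a transaction and the busy extents it holds. */
struct toy_trans {
	int busy_extents;	/* extents freed here whose free is not yet stable */
	int commits;		/* how many times progress was committed */
};

/* Free one extent; return -EAGAIN instead of blocking on our own busy extents. */
static int toy_free_extent(struct toy_trans *tp, struct free_item *item)
{
	if (item->needs_busy_space && tp->busy_extents > 0)
		return -EAGAIN;
	item->done = 1;
	tp->busy_extents++;
	return 0;
}

/* Commit progress; in this model that resolves every busy extent. */
static void toy_roll(struct toy_trans *tp)
{
	tp->busy_extents = 0;
	tp->commits++;
}

/* Process all items, committing progress and retrying on -EAGAIN. */
static int toy_finish_items(struct toy_trans *tp, struct free_item *items,
			    size_t n)
{
	size_t i = 0;

	assert(tp != NULL && (items != NULL || n == 0));
	while (i < n) {
		int error = toy_free_extent(tp, &items[i]);

		if (error == -EAGAIN) {
			toy_roll(tp);	/* commit, then retry the same item */
			continue;
		}
		if (error)
			return error;
		i++;
	}
	return 0;
}
```

With two items where the second needs space released by the first, the model commits once between them and then completes both, instead of blocking.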
    • xfs: pass alloc flags through to xfs_extent_busy_flush() · 6a2a9d77
      Dave Chinner authored
      To avoid blocking in xfs_extent_busy_flush() when freeing extents
      and the only busy extents are held by the current transaction, we
      need to pass the XFS_ALLOC_FLAG_FREEING flag context all the way
      into xfs_extent_busy_flush().
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
    • xfs: use deferred frees for btree block freeing · b742d7b4
      Dave Chinner authored
      Btrees that aren't freespace management trees use the normal extent
      allocation and freeing routines for their blocks. Hence when a btree
      block is freed, a direct call to xfs_free_extent() is made and the
      extent is immediately freed. This puts all of the free space
      management btrees under this path, so we are stacking btrees on
      btrees in the call stack. The inobt, finobt and refcount btrees
      all do this.
      
      However, the bmap btree does not do this - it calls
      xfs_free_extent_later() to defer the extent free operation via an
      XEFI and hence it gets processed in deferred operation processing
      during the commit of the primary transaction (i.e. via intent
      chaining).
      
      We need to change xfs_free_extent() to behave in a non-blocking
      manner so that we can avoid deadlocks with busy extents near ENOSPC
      in transactions that free multiple extents. Inserting or removing a
      record from a btree can cause a multi-level tree merge operation and
      that will free multiple blocks from the btree in a single
      transaction. That is, we can call xfs_free_extent() multiple times, and
      hence the btree manipulation transaction is vulnerable to this busy
      extent deadlock vector.
      
      To fix this, convert all the remaining callers of xfs_free_extent()
      to use xfs_free_extent_later() to queue XEFIs and hence defer
      processing of the extent frees to a context that can be safely
      restarted if a deadlock condition is detected.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
    • xfs: don't reverse order of items in bulk AIL insertion · 939bd50d
      Dave Chinner authored
      XFS has strict metadata ordering requirements. One of the things it
      does is maintain the commit order of items from transaction commit
      through the CIL and into the AIL. That is, if a transaction logs
      item A before item B in a modification, then they will be inserted
      into the CIL in the order {A, B}. These items are then written into
      the iclog during checkpointing in the order {A, B}. When the
      checkpoint commits, they are supposed to be inserted into the AIL in
      the order {A, B}, and when they are pushed from the AIL, they are
      pushed in the order {A, B}.
      
      If we crash, log recovery then replays the two items from the
      checkpoint in the order {A, B}, resulting in the objects the items
      apply to being queued for writeback at the end of the checkpoint
      in the order {A, B}. This means recovery behaves the same way as the
      runtime code.
      
      In places, we have subtle dependencies on this ordering being
      maintained. One such place is performing intent recovery from the
      log. It assumes that recovering an intent will result in a
      non-intent object being the first thing that is modified in the
      recovery transaction, and so when the transaction commits and the
      journal flushes, the first object inserted into the AIL beyond the
      intent recovery range will be a non-intent item. It uses the
      transition from intent items to non-intent items to stop the
      recovery pass.
      
      A recent log recovery issue indicated that an intent was appearing
      as the first item in the AIL beyond the recovery range, hence
      breaking the existing end-of-recovery detection.
      
      Tracing indicated insertion of the items into the AIL was apparently
      occurring in the right order (the intent was last in the commit item
      list), but the intent was appearing first in the AIL. IOWs, the
      order of items in the AIL was {D,C,B,A}, not {A,B,C,D}, and bulk
      insertion was reversing the order of the items in the batch of items
      being inserted.
      
      Lucky for us, all the items fed to bulk insertion have the same LSN,
      so the reversal of order does not affect the log head/tail tracking
      that is based on the contents of the AIL. It only affects code
      that has implicit, subtle dependencies on object order, and AFAICT
      only the intent recovery loop is impacted by it.
      
      Make sure bulk AIL insertion does not reorder items incorrectly.
      
      Fixes: 0e57f6a3 ("xfs: bulk AIL insertion during transaction commit")
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
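The {D,C,B,A} versus {A,B,C,D} outcome described in the commit above is what a list sees when each item of a batch is inserted directly after the same fixed position. The sketch below is a generic illustration of that effect using a hypothetical singly linked list; it is not the AIL code and does not claim to show the actual fix. All names (`node`, `insert_batch_fixed`, `insert_batch_advancing`, `list_names`) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; 'name' identifies the item for inspection. */
struct node {
	char name;
	struct node *next;
};

/* Insert every batch item right after the same position: reverses the batch. */
static void insert_batch_fixed(struct node *pos, struct node *batch, size_t n)
{
	assert(pos != NULL);
	for (size_t i = 0; i < n; i++) {
		batch[i].next = pos->next;
		pos->next = &batch[i];
	}
}

/* Advance the insertion point after each item: preserves batch order. */
static void insert_batch_advancing(struct node *pos, struct node *batch,
				   size_t n)
{
	assert(pos != NULL);
	for (size_t i = 0; i < n; i++) {
		batch[i].next = pos->next;
		pos->next = &batch[i];
		pos = &batch[i];
	}
}

/* Copy the names of the items following head into out, NUL-terminated. */
static void list_names(const struct node *head, char *out, size_t cap)
{
	size_t i = 0;

	assert(head != NULL && out != NULL && cap > 0);
	for (const struct node *p = head->next; p != NULL && i + 1 < cap;
	     p = p->next)
		out[i++] = p->name;
	out[i] = '\0';
}
```

Feeding the batch A, B, C, D to `insert_batch_fixed()` leaves the list reading D, C, B, A, while `insert_batch_advancing()` keeps it as A, B, C, D.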
    • xfs: remove redundant initializations of pointers drop_leaf and save_leaf · 347eb95b
      Colin Ian King authored
      Pointers drop_leaf and save_leaf are initialized with values that are never
      read; they are re-assigned later on just before they are used. Remove
      the redundant early initializations and keep the later assignments at the
      point where they are used. Cleans up two clang scan build warnings:
      
      fs/xfs/libxfs/xfs_attr_leaf.c:2288:29: warning: Value stored to 'drop_leaf'
      during its initialization is never read [deadcode.DeadStores]
      fs/xfs/libxfs/xfs_attr_leaf.c:2289:29: warning: Value stored to 'save_leaf'
      during its initialization is never read [deadcode.DeadStores]
      Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
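The dead-store pattern that the commit above removes can be shown in miniature. This is a hypothetical sketch, not the code from xfs_attr_leaf.c: `toy_leaf`, `pick_before` and `pick_after` are invented names, and only the shape of the change (initializers dropped, later assignments kept) follows the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a leaf block; not the kernel's structure. */
struct toy_leaf {
	int id;
};

/* Before: both pointers get initial values that every path overwrites. */
static int pick_before(struct toy_leaf *a, struct toy_leaf *b, int drop_a)
{
	struct toy_leaf *drop_leaf = a;	/* dead store: never read */
	struct toy_leaf *save_leaf = b;	/* dead store: never read */

	assert(a != NULL && b != NULL);
	if (drop_a) {
		drop_leaf = a;
		save_leaf = b;
	} else {
		drop_leaf = b;
		save_leaf = a;
	}
	return drop_leaf->id * 10 + save_leaf->id;
}

/* After: declare only, and assign where the values are actually needed. */
static int pick_after(struct toy_leaf *a, struct toy_leaf *b, int drop_a)
{
	struct toy_leaf *drop_leaf;
	struct toy_leaf *save_leaf;

	assert(a != NULL && b != NULL);
	if (drop_a) {
		drop_leaf = a;
		save_leaf = b;
	} else {
		drop_leaf = b;
		save_leaf = a;
	}
	return drop_leaf->id * 10 + save_leaf->id;
}
```

Both functions behave identically; the second simply gives a static analyzer nothing to flag.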
  2. 13 Jun, 2023 4 commits
    • xfs: fix ag count overflow during growfs · c3b880ac
      Long Li authored
      I found a corruption during growfs:
      
       XFS (loop0): Internal error agbno >= mp->m_sb.sb_agblocks at line 3661 of
         file fs/xfs/libxfs/xfs_alloc.c.  Caller __xfs_free_extent+0x28e/0x3c0
       CPU: 0 PID: 573 Comm: xfs_growfs Not tainted 6.3.0-rc7-next-20230420-00001-gda8c95746257
       Call Trace:
        <TASK>
        dump_stack_lvl+0x50/0x70
        xfs_corruption_error+0x134/0x150
        __xfs_free_extent+0x2c1/0x3c0
        xfs_ag_extend_space+0x291/0x3e0
        xfs_growfs_data+0xd72/0xe90
        xfs_file_ioctl+0x5f9/0x14a0
        __x64_sys_ioctl+0x13e/0x1c0
        do_syscall_64+0x39/0x80
        entry_SYSCALL_64_after_hwframe+0x63/0xcd
       XFS (loop0): Corruption detected. Unmount and run xfs_repair
       XFS (loop0): Internal error xfs_trans_cancel at line 1097 of file
         fs/xfs/xfs_trans.c.  Caller xfs_growfs_data+0x691/0xe90
       CPU: 0 PID: 573 Comm: xfs_growfs Not tainted 6.3.0-rc7-next-20230420-00001-gda8c95746257
       Call Trace:
        <TASK>
        dump_stack_lvl+0x50/0x70
        xfs_error_report+0x93/0xc0
        xfs_trans_cancel+0x2c0/0x350
        xfs_growfs_data+0x691/0xe90
        xfs_file_ioctl+0x5f9/0x14a0
        __x64_sys_ioctl+0x13e/0x1c0
        do_syscall_64+0x39/0x80
        entry_SYSCALL_64_after_hwframe+0x63/0xcd
       RIP: 0033:0x7f2d86706577
      
      The bug can be reproduced with the following sequence:
      
       # truncate -s  1073741824 xfs_test.img
       # mkfs.xfs -f -b size=1024 -d agcount=4 xfs_test.img
       # truncate -s 2305843009213693952  xfs_test.img
       # mount -o loop xfs_test.img /mnt/test
       # xfs_growfs -D  1125899907891200  /mnt/test
      
      The root cause is that during growfs, user space passed a large value
      of newblocks to xfs_growfs_data_private(). Because the current
      sb_agblocks is too small, the new AG count exceeds UINT_MAX. The AG
      number type is unsigned int, so it overflows, which makes nagcount much
      smaller than the actual value. When extending AG space, the delta blocks
      in xfs_resizefs_init_new_ags() will then be much larger than the actual
      value due to the incorrect nagcount, and can even exceed UINT_MAX. This
      causes corruption that is detected in __xfs_free_extent. Fix it by
      growing the filesystem up to the maximally allowed number of AGs, rather
      than returning EINVAL, when the new AG count overflows.
      Signed-off-by: Long Li <leo.lilong@huawei.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
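The truncation described in the commit above can be reproduced with plain integer arithmetic. The sketch below is an illustration only: the function names and `TOY_MAX_AGCOUNT` are hypothetical, not the kernel's, and the per-AG size of 262144 blocks is an assumption derived from the reproducer (a 1 GiB image with 1024-byte blocks split into 4 AGs). It also ignores any partial trailing AG.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap on the AG count for this sketch; not the kernel's constant. */
#define TOY_MAX_AGCOUNT	((uint64_t)UINT32_MAX)

/* Buggy shape: the 64-bit quotient is silently truncated to 32 bits. */
static uint32_t agcount_truncating(uint64_t nblocks, uint32_t agblocks)
{
	assert(agblocks != 0);
	return (uint32_t)(nblocks / agblocks);
}

/* Clamped shape: divide in 64 bits, then cap at the maximum AG count. */
static uint32_t agcount_clamped(uint64_t nblocks, uint32_t agblocks)
{
	uint64_t count;

	assert(agblocks != 0);
	count = nblocks / agblocks;
	if (count > TOY_MAX_AGCOUNT)
		count = TOY_MAX_AGCOUNT;
	return (uint32_t)count;
}
```

With the reproducer's target of 1125899907891200 blocks (2^50 + 2^20) and 2^18 blocks per AG, the true quotient is 2^32 + 4, so the truncating version returns 4 while the clamped version returns the cap.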
    • xfs: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method · b2943499
      Christoph Hellwig authored
      Since commit a2ad63da ("VFS: add FMODE_CAN_ODIRECT file flag") file
      systems can just set the FMODE_CAN_ODIRECT flag at open time instead of
      wiring up a dummy direct_IO method to indicate support for direct I/O.
      Do that for xfs so that noop_direct_IO can eventually be removed.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    • xfs: drop EXPERIMENTAL tag for large extent counts · 61d7e827
      Darrick J. Wong authored
      This feature has been baking in upstream for ~10mo with no bug reports.
      It seems to work fine here, let's get rid of the scary warnings?
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
    • xfs: don't deplete the reserve pool when trying to shrink the fs · 06f3ef6e
      Darrick J. Wong authored
      Every now and then, xfs/168 fails with this logged in dmesg:
      
      Reserve blocks depleted! Consider increasing reserve pool size.
      EXPERIMENTAL online shrink feature in use. Use at your own risk!
      Per-AG reservation for AG 1 failed.  Filesystem may run out of space.
      Per-AG reservation for AG 1 failed.  Filesystem may run out of space.
      Error -28 reserving per-AG metadata reserve pool.
      Corruption of in-memory data (0x8) detected at xfs_ag_shrink_space+0x23c/0x3b0 [xfs] (fs/xfs/libxfs/xfs_ag.c:1007).  Shutting down filesystem.
      
      It's silly to deplete the reserved blocks pool just to shrink the
      filesystem, particularly since the fs goes down after that.
      
      Fixes: fb2fc172 ("xfs: support shrinking unused space in the last AG")
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
  3. 11 Jun, 2023 3 commits
    • Linux 6.4-rc6 · 858fd168
      Linus Torvalds authored
    • Merge tag 'x86_urgent_for_v6.4_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4c605260
      Linus Torvalds authored
      Pull x86 fix from Borislav Petkov:
      
       - Set up the kernel CS earlier in the boot process in case EFI boots
         the kernel after bypassing the decompressor and the CS descriptor
         used ends up being the EFI one which is not mapped in the identity
         page table, leading to early SEV/SNP guest communication exceptions
         resulting in the guest crashing
      
      * tag 'x86_urgent_for_v6.4_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/head/64: Switch to KERNEL_CS as soon as new GDT is installed
    • Merge tag '6.4-rc5-smb3-server-fixes' of git://git.samba.org/ksmbd · 65d7ca59
      Linus Torvalds authored
      Pull smb server fixes from Steve French:
       "Five smb3 server fixes, all also for stable:
      
         - Fix four slab out of bounds warnings: improve checks for protocol
           id, and for small packet length, and for create context parsing,
           and for negotiate context parsing
      
         - Fix incorrect dereferencing of POSIX ACLs"
      
      * tag '6.4-rc5-smb3-server-fixes' of git://git.samba.org/ksmbd:
        ksmbd: validate smb request protocol id
        ksmbd: check the validation of pdu_size in ksmbd_conn_handler_loop
        ksmbd: fix posix_acls and acls dereferencing possible ERR_PTR()
        ksmbd: fix out-of-bound read in parse_lease_state()
        ksmbd: fix out-of-bound read in deassemble_neg_contexts()
  4. 10 Jun, 2023 3 commits
    • Merge tag 'i2c-for-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 022ce886
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Biggest news is that Andi Shyti steps in for maintaining the
        controller drivers. Thank you very much!
      
        Other than that, one new driver maintainer and the rest is usual
        driver bugfixes. at24 has a Kconfig dependency fix"
      
      * tag 'i2c-for-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        MAINTAINERS: Add entries for Renesas RZ/V2M I2C driver
        eeprom: at24: also select REGMAP
        i2c: sprd: Delete i2c adapter in .remove's error path
        i2c: mv64xxx: Fix reading invalid status value in atomic mode
        i2c: designware: fix idx_write_cnt in read loop
        i2c: mchp-pci1xxxx: Avoid cast to incompatible function type
        i2c: img-scb: Fix spelling mistake "innacurate" -> "inaccurate"
        MAINTAINERS: Add myself as I2C host drivers maintainer
    • Merge tag 'soundwire-6.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire · 6be5e47b
      Linus Torvalds authored
      Pull soundwire fixes from Vinod Koul:
       "Core fix for missing flag clear, error path handling in qcom driver
        and BIOS quirk for HP Spectre x360:
      
         - HP Spectre x360 soundwire DMI quirk
      
         - Error path handling for qcom driver
      
         - Core fix for missing clear of alloc_slave_rt"
      
      * tag 'soundwire-6.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
        soundwire: stream: Add missing clear of alloc_slave_rt
        soundwire: qcom: add proper error paths in qcom_swrm_startup()
        soundwire: dmi-quirks: add new mapping for HP Spectre x360
    • Merge tag 'arm-fixes-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 859c7459
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "Most of the changes this time are for the Qualcomm Snapdragon
        platforms.
      
        There are bug fixes for error handling in Qualcomm icc-bwmon,
        rpmh-rsc, ramp_controller and rmtfs driver as well as the AMD tee
        firmware driver and a missing initialization in the Arm ff-a firmware
        driver. The Qualcomm RPMh and EDAC drivers need some rework to work
        correctly on all supported chips.
      
        The DT fixes include:
      
         - i.MX8 fixes for gpio, pinmux and clock settings
      
         - ADS touchscreen gpio polarity settings in several machines
      
         - Address dtb warnings for caches, panel and input-enable properties
           on Qualcomm platforms
      
         - Incorrect data on Qualcomm platforms for SA8155P power domains,
           SM8550 LLCC, SC7180-lite SDRAM frequencies and SM8550 soundwire
      
         - Remoteproc firmware paths are corrected for Sony Xperia 10 IV"
      
      * tag 'arm-fixes-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (36 commits)
        firmware: arm_ffa: Set handle field to zero in memory descriptor
        ARM: dts: Fix erroneous ADS touchscreen polarities
        arm64: dts: imx8mn-beacon: Fix SPI CS pinmux
        arm64: dts: imx8-ss-dma: assign default clock rate for lpuarts
        arm64: dts: imx8qm-mek: correct GPIOs for USDHC2 CD and WP signals
        EDAC/qcom: Get rid of hardcoded register offsets
        EDAC/qcom: Remove superfluous return variable assignment in qcom_llcc_core_setup()
        arm64: dts: qcom: sm8550: Use the correct LLCC register scheme
        dt-bindings: cache: qcom,llcc: Fix SM8550 description
        arm64: dts: qcom: sc7180-lite: Fix SDRAM freq for misidentified sc7180-lite boards
        arm64: dts: qcom: sm8550: use uint16 for Soundwire interval
        soc: qcom: rpmhpd: Add SA8155P power domains
        arm64: dts: qcom: Split out SA8155P and use correct RPMh power domains
        dt-bindings: power: qcom,rpmpd: Add SA8155P
        soc: qcom: Rename ice to qcom_ice to avoid module name conflict
        soc: qcom: rmtfs: Fix error code in probe()
        soc: qcom: ramp_controller: Fix an error handling path in qcom_ramp_controller_probe()
        ARM: dts: at91: sama7g5ek: fix debounce delay property for shdwc
        ARM: at91: pm: fix imbalanced reference counter for ethernet devices
        arm64: dts: qcom: sm6375-pdx225: Fix remoteproc firmware paths
        ...
  5. 09 Jun, 2023 23 commits
  6. 08 Jun, 2023 2 commits