  1. 01 Jun, 2021 5 commits
  2. 27 May, 2021 2 commits
    • xfs: bunmapi has unnecessary AG lock ordering issues · 0fe0bbe0
      Dave Chinner authored
      Large directory block size operations are assert failing because
      xfs_bunmapi() is not completely removing fragmented directory blocks
      like so:
      
      XFS: Assertion failed: done, file: fs/xfs/libxfs/xfs_dir2.c, line: 677
      ....
      Call Trace:
       xfs_dir2_shrink_inode+0x1a8/0x210
       xfs_dir2_block_to_sf+0x2ae/0x410
       xfs_dir2_block_removename+0x21a/0x280
       xfs_dir_removename+0x195/0x1d0
       xfs_rename+0xb79/0xc50
       ? avc_has_perm+0x8d/0x1a0
       ? avc_has_perm_noaudit+0x9a/0x120
       xfs_vn_rename+0xdb/0x150
       vfs_rename+0x719/0xb50
       ? __lookup_hash+0x6a/0xa0
       do_renameat2+0x413/0x5e0
       __x64_sys_rename+0x45/0x50
       do_syscall_64+0x3a/0x70
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      We are aborting the bunmapi() pass because of this specific chunk of
      code:
      
                      /*
                       * Make sure we don't touch multiple AGF headers out of order
                       * in a single transaction, as that could cause AB-BA deadlocks.
                       */
                      if (!wasdel && !isrt) {
                              agno = XFS_FSB_TO_AGNO(mp, del.br_startblock);
                              if (prev_agno != NULLAGNUMBER && prev_agno > agno)
                                      break;
                              prev_agno = agno;
                      }
      
      This is designed to prevent deadlocks in AGF locking when freeing
      multiple extents by ensuring that we only ever lock in increasing
      AG number order. Unfortunately, this also violates the "bunmapi will
      always succeed" semantic that some high level callers depend on,
      such as xfs_dir2_shrink_inode(), xfs_da_shrink_inode() and
      xfs_inactive_symlink_rmt().
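
      The effect of that check can be reproduced in a small userspace model.
      The sketch below is illustrative only: unmap_in_ag_order() and the bare
      array of AG numbers are hypothetical stand-ins rather than kernel code,
      and NULLAGNUMBER is redefined locally. It shows the walk stopping as
      soon as an extent sits in a lower-numbered AG than the previous one,
      which leaves the caller with a partially unmapped range.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's "no AG seen yet" sentinel (assumption). */
#define NULLAGNUMBER ((uint32_t)-1)

/*
 * Toy model: visit extents, identified only by their AG number, in the
 * order they would be unmapped. Stop at the first extent whose AG number
 * is lower than the previous one, mirroring the quoted check.
 * Returns how many extents were processed before stopping.
 */
size_t unmap_in_ag_order(const uint32_t *agnos, size_t count)
{
	uint32_t prev_agno = NULLAGNUMBER;
	size_t done = 0;

	for (size_t i = 0; i < count; i++) {
		uint32_t agno = agnos[i];

		/* Same condition as the quoted kernel snippet. */
		if (prev_agno != NULLAGNUMBER && prev_agno > agno)
			break;
		prev_agno = agno;
		done++;
	}
	return done;
}
```

      With AG numbers {3, 3, 1, 2} the model processes only the first two
      extents, while {0, 1, 2} is processed in full.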
      
      This AG lock ordering was introduced back in 2017 to fix deadlocks
      triggered by generic/299 as reported here:
      
      https://lore.kernel.org/linux-xfs/800468eb-3ded-9166-20a4-047de8018582@gmail.com/
      
      That fix is old enough that it predates the deferral of all
      AG based extent freeing from within xfs_bunmapi(). That is, we never
      actually lock AGs in xfs_bunmapi() any more - every non-rt based
      extent free is added to the defer ops list, as is all BMBT block
      freeing. And RT extents are not AG based, so there are no lock
      ordering issues associated with them.
      
      Hence this AGF lock ordering code is both broken and dead. Let's
      just remove it so that the large directory block code works reliably
      again.
      
      Tested against xfs/538 and generic/299 which is the original test
      that exposed the deadlocks that this code fixed.
      
      Fixes: 5b094d6d ("xfs: fix multi-AG deadlock in xfs_bunmapi")
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    • xfs: btree format inode forks can have zero extents · 991c2c59
      Dave Chinner authored
      xfs/538 is assert failing with this trace when testing with
      directory block sizes of 64kB:
      
      XFS: Assertion failed: !xfs_need_iread_extents(ifp), file: fs/xfs/libxfs/xfs_bmap.c, line: 608
      ....
      Call Trace:
       xfs_bmap_btree_to_extents+0x2a9/0x470
       ? kmem_cache_alloc+0xe7/0x220
       __xfs_bunmapi+0x4ca/0xdf0
       xfs_bunmapi+0x1a/0x30
       xfs_dir2_shrink_inode+0x71/0x210
       xfs_dir2_block_to_sf+0x2ae/0x410
       xfs_dir2_block_removename+0x21a/0x280
       xfs_dir_removename+0x195/0x1d0
       xfs_remove+0x244/0x460
       xfs_vn_unlink+0x53/0xa0
       ? selinux_inode_unlink+0x13/0x20
       vfs_unlink+0x117/0x220
       do_unlinkat+0x1a2/0x2d0
       __x64_sys_unlink+0x42/0x60
       do_syscall_64+0x3a/0x70
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      This is a check to ensure that the extents have been read into
      memory before we do an ifork btree manipulation. This assert
      is bogus in the above case.
      
      We have a fragmented directory block that has more extents in it
      than can fit in extent format, so the inode data fork is in btree
      format. xfs_dir2_shrink_inode() asks to remove all remaining 16
      filesystem blocks from the inode so it can convert to short form,
      and __xfs_bunmapi() removes all the extents. We now have a data fork
      in btree format but have zero extents in the fork. This incorrectly
      trips the xfs_need_iread_extents() assert because it assumes that an
      empty extent btree means the extent tree has not been read into
      memory yet. This is clearly not the case with xfs_bunmapi(), as it
      has an explicit call to xfs_iread_extents() in it to pull the
      extents into memory before it starts unmapping.
      
      Also, the assert directly after this bogus one is:
      
      	ASSERT(ifp->if_format == XFS_DINODE_FMT_BTREE);
      
      Which covers the context in which it is legal to call
      xfs_bmap_btree_to_extents just fine. Hence we should just remove the
      bogus assert as it is clearly wrong and causes a regression.
      
      This returns the test behaviour to the pre-existing assert failure in
      xfs_dir2_shrink_inode() that indicates xfs_bunmapi() has failed to
      remove all the extents in the range it was asked to unmap.

      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
  3. 26 May, 2021 1 commit
  4. 25 May, 2021 3 commits
    • xfs: validate extsz hints against rt extent size when rtinherit is set · 603f000b
      Darrick J. Wong authored
      The RTINHERIT bit can be set on a directory so that newly created
      regular files will have the REALTIME bit set to store their data on the
      realtime volume.  If an extent size hint (and EXTSZINHERIT) are set on
      the directory, the hint will also be copied into the new file.
      
      As pointed out in previous patches, for realtime files we require the
      extent size hint be an integer multiple of the realtime extent, but we
      don't perform the same validation on a directory with both RTINHERIT and
      EXTSZINHERIT set, even though the only use-case of that combination is
      to propagate extent size hints into new realtime files.  This leads to
      inode corruption errors when the bad values are propagated.
      
      Because there may be existing filesystems with such a configuration, we
      cannot simply amend the inode verifier to trip on these directories and
      call it a day because that will cause previously "working" filesystems
      to start throwing errors abruptly.  Note that it's valid to have
      directories with rtinherit set even if there is no realtime volume, in
      which case the problem does not manifest because rtinherit is ignored if
      there's no realtime device; and it's possible that someone set the flag,
      crashed, repaired the filesystem (which clears the hint on the realtime
      file) and continued.
      
      Therefore, mitigate this issue in several ways: First, if we try to
      write out an inode with both rtinherit/extszinherit set and an unaligned
      extent size hint, turn off the hint to correct the error.  Second, if
      someone tries to misconfigure a directory via the fssetxattr ioctl, fail
      the ioctl.  Third, reverify both extent size hint values when we
      propagate heritable inode attributes from parent to child, to prevent
      misconfigurations from spreading.
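
      The first two mitigations can be sketched in userspace as follows. This
      is a simplified model, not the kernel change itself: the toy_* names,
      the flag values and the -EINVAL return are assumptions made for
      illustration. The third mitigation would reuse the same predicate at
      the point where attributes are copied from parent to child.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the flags named in the commit. */
#define TOY_RTINHERIT    (1u << 0)
#define TOY_EXTSZINHERIT (1u << 1)

struct toy_dir {
	uint32_t flags;
	uint32_t extsize;	/* extent size hint, in filesystem blocks */
};

/* True if the hint would be misaligned once copied into a realtime file. */
bool toy_hint_misaligned(uint32_t flags, uint32_t extsize, uint32_t rextsize)
{
	const uint32_t both = TOY_RTINHERIT | TOY_EXTSZINHERIT;

	/* Only the combination of both flags is of interest here. */
	if ((flags & both) != both || rextsize == 0)
		return false;
	return extsize % rextsize != 0;
}

/* First mitigation: when writing the inode out, turn the bad hint off. */
void toy_fixup_on_writeout(struct toy_dir *dir, uint32_t rextsize)
{
	if (toy_hint_misaligned(dir->flags, dir->extsize, rextsize)) {
		dir->flags &= ~TOY_EXTSZINHERIT;
		dir->extsize = 0;
	}
}

/* Second mitigation: reject a misconfiguring request up front. */
int toy_setxattr(struct toy_dir *dir, uint32_t flags, uint32_t extsize,
		 uint32_t rextsize)
{
	if (toy_hint_misaligned(flags, extsize, rextsize))
		return -EINVAL;	/* assumed error code */
	dir->flags = flags;
	dir->extsize = extsize;
	return 0;
}
```

      In this model a directory carrying a hint of 7 blocks on a volume with
      a realtime extent size of 4 has its hint cleared on write-out, and a
      later request to set a hint of 6 is refused while a hint of 8 is
      accepted.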

      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: standardize extent size hint validation · 6b69e485
      Darrick J. Wong authored
      While chasing a bug involving invalid extent size hints being propagated
      into newly created realtime files, I noticed that the xfs_ioctl_setattr
      checks for the extent size hints weren't the same as the ones now
      encoded in libxfs and used for validation in repair and mkfs.
      
      Because the checks in libxfs are more stringent than the ones in the
      ioctl, it's possible for a live system to set inode flags that
      immediately result in corruption warnings.  Specifically, it's possible
      to set an extent size hint on an rtinherit directory without checking if
      the hint is aligned to the realtime extent size, which makes no sense
      since that combination is used only to seed new realtime files.
      
      Replace the open-coded and inadequate checks with the libxfs verifier
      versions and update the code comments a bit.

      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • xfs: check free AG space when making per-AG reservations · 0f934251
      Darrick J. Wong authored
      The new online shrink code exposed a gap in the per-AG reservation
      code, which is that we only return ENOSPC to callers if the entire fs
      doesn't have enough free blocks.  Except for debugging mode, the
      reservation init code doesn't ever check that there's enough free space
      in that AG to cover the reservation.
      
      Not having enough space is not considered an immediate fatal error that
      requires filesystem offlining because (a) it shouldn't be possible to
      wind up in that state through normal file operations and (b) even if
      one did, freeing data blocks would recover the situation.
      
      However, online shrink now needs to know if shrinking would not leave
      enough space so that it can abort the shrink operation.  Hence we need
      to promote this assertion into an actual error return.
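
      The idea of turning the check into an error return can be modelled
      roughly as below. This is a sketch under stated assumptions: the toy_*
      functions and their parameters are hypothetical and reduce the per-AG
      accounting to a single free-block counter, which is much simpler than
      the real reservation code.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Toy model of setting up a per-AG reservation. Returns 0 on success or
 * -ENOSPC when this AG does not have enough free blocks to cover it.
 */
int toy_ag_resv_init(uint64_t ag_free_blocks, uint64_t blocks_to_reserve)
{
	if (blocks_to_reserve > ag_free_blocks)
		return -ENOSPC;
	return 0;
}

/*
 * Toy model of shrinking an AG by shrink_by blocks: the shrink is aborted
 * when the remaining free space could no longer cover the reservation.
 */
int toy_shrink_ag(uint64_t ag_free_blocks, uint64_t shrink_by,
		  uint64_t blocks_to_reserve)
{
	if (shrink_by > ag_free_blocks)
		return -ENOSPC;
	return toy_ag_resv_init(ag_free_blocks - shrink_by, blocks_to_reserve);
}
```

      With 100 free blocks and a 40-block reservation, shrinking by 50 blocks
      succeeds in this model, while shrinking by 70 is refused with -ENOSPC.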
      
      Observed by running xfs/168 with a 1k block size, though in theory this
      could happen with any configuration.

      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
  5. 20 May, 2021 3 commits
  6. 17 May, 2021 1 commit
    • xfs: adjust rt allocation minlen when extszhint > rtextsize · 9d5e8492
      Darrick J. Wong authored
      xfs_bmap_rtalloc doesn't handle realtime extent files with extent size
      hints larger than the rt volume's extent size properly, because
      xfs_bmap_extsize_align can adjust the offset/length parameters to try to
      fit the extent size hint.
      
      Under these conditions, minlen has to be large enough so that any
      allocation returned by xfs_rtallocate_extent will be large enough to
      cover at least one of the blocks that the caller asked for.  If the
      allocation is too short, bmapi_write will return no mapping for the
      requested range, which causes ENOSPC errors in other parts of the
      filesystem.
      
      Therefore, adjust minlen upwards to fix this.  This can be found by
      running generic/263 (g/127 or g/522) with a realtime extent size hint
      that's larger than the rt volume extent size.

      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
  7. 16 May, 2021 7 commits
    • Linux 5.13-rc2 · d07f6ca9
      Linus Torvalds authored
    • Merge tag 'driver-core-5.13-rc2' of... · 28183dbf
      Linus Torvalds authored
      Merge tag 'driver-core-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fixes from Greg KH:
       "Here are two driver fixes for driver core changes that happened in
        5.13-rc1.
      
        The clk driver fix resolves a many-reported issue with booting some
        devices, and the USB typec fix resolves the reported problem of USB
        systems on some embedded boards.
      
        Both of these have been in linux-next this week with no reported
        issues"
      
      * tag 'driver-core-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        clk: Skip clk provider registration when np is NULL
        usb: typec: tcpm: Don't block probing of consumers of "connector" nodes
    • Merge tag 'staging-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 6942d81a
      Linus Torvalds authored
      Pull staging and IIO driver fixes from Greg KH:
       "Here are some small IIO driver fixes and one Staging driver fix for
        5.13-rc2.
      
        Nothing major, just some resolutions for reported problems:
      
         - gcc-11 bogus warning fix for rtl8723bs
      
         - iio driver tiny fixes
      
        All of these have been in linux-next for many days with no reported
        issues"
      
      * tag 'staging-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        iio: tsl2583: Fix division by a zero lux_val
        iio: core: return ENODEV if ioctl is unknown
        iio: core: fix ioctl handlers removal
        iio: gyro: mpu3050: Fix reported temperature value
        iio: hid-sensors: select IIO_TRIGGERED_BUFFER under HID_SENSOR_IIO_TRIGGER
        iio: proximity: pulsedlight: Fix rumtime PM imbalance on error
        iio: light: gp2ap002: Fix rumtime PM imbalance on error
        staging: rtl8723bs: avoid bogus gcc warning
    • Merge tag 'usb-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 4a668429
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some small USB fixes for 5.13-rc2. They consist of a number
        of resolutions for reported issues:
      
         - typec fixes for found problems
      
         - xhci fixes and quirk additions
      
         - dwc3 driver fixes
      
         - minor fixes found by Coverity
      
         - cdc-wdm fixes for reported problems
      
        All of these have been in linux-next for a few days with no reported
        issues"
      
      * tag 'usb-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
        usb: core: hub: fix race condition about TRSMRCY of resume
        usb: typec: tcpm: Fix SINK_DISCOVERY current limit for Rp-default
        xhci: Add reset resume quirk for AMD xhci controller.
        usb: xhci: Increase timeout for HC halt
        xhci: Do not use GFP_KERNEL in (potentially) atomic context
        xhci: Fix giving back cancelled URBs even if halted endpoint can't reset
        xhci-pci: Allow host runtime PM as default for Intel Alder Lake xHCI
        usb: musb: Fix an error message
        usb: typec: tcpm: Fix wrong handling for Not_Supported in VDM AMS
        usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work
        usb: typec: ucsi: Retrieve all the PDOs instead of just the first 4
        usb: fotg210-hcd: Fix an error message
        docs: usb: function: Modify path name
        usb: dwc3: omap: improve extcon initialization
        usb: typec: ucsi: Put fwnode in any case during ->probe()
        usb: typec: tcpm: Fix wrong handling in GET_SINK_CAP
        usb: dwc2: Remove obsolete MODULE_ constants from platform.c
        usb: dwc3: imx8mp: fix error return code in dwc3_imx8mp_probe()
        usb: dwc3: imx8mp: detect dwc3 core node via compatible string
        usb: dwc3: gadget: Return success always for kick transfer in ep queue
        ...
    • Merge tag 'timers-urgent-2021-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8ce36481
      Linus Torvalds authored
      Pull timer fixes from Thomas Gleixner:
       "Two fixes for timers:
      
         - Use the ALARM feature check in the alarmtimer core code instead of
           the old method of checking for the set_alarm() callback.
      
           Drivers can have that callback set but the feature bit cleared. If
           such an RTC device is selected then alarms won't work.
      
         - Use a proper define to let the preprocessor check whether Hyper-V
           VDSO clocksource should be active.
      
           The code used a constant in an enum with #ifdef, which evaluates to
           always false and disabled the clocksource for VDSO"
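
      The second fix rests on a general C property: the preprocessor runs
      before enum constants exist, so #ifdef on an enumerator is always
      false. The sketch below demonstrates this with hypothetical names
      (TOY_CLOCKMODE_HVCLOCK, TOY_HAVE_HVCLOCK); it does not reproduce the
      Hyper-V clocksource code.

```c
#include <stdbool.h>

enum toy_clock_mode {
	TOY_CLOCKMODE_NONE,
	TOY_CLOCKMODE_HVCLOCK,	/* an enumerator, not a macro */
};

/* Broken pattern: the preprocessor never sees enum constants. */
bool toy_hvclock_enabled_broken(void)
{
#ifdef TOY_CLOCKMODE_HVCLOCK
	return true;
#else
	return false;	/* this branch is always the one compiled */
#endif
}

/* Fixed pattern: test a real preprocessor define instead. */
#define TOY_HAVE_HVCLOCK 1

bool toy_hvclock_enabled_fixed(void)
{
#ifdef TOY_HAVE_HVCLOCK
	return true;
#else
	return false;
#endif
}
```

      Here toy_hvclock_enabled_broken() returns false even though the
      enumerator is declared, while toy_hvclock_enabled_fixed() returns true.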
      
      * tag 'timers-urgent-2021-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource/drivers/hyper-v: Re-enable VDSO_CLOCKMODE_HVCLOCK on X86
        alarmtimer: Check RTC features instead of ops
    • Merge tag 'for-linus-5.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · f44e58bb
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - two patches for error path fixes
      
       - a small series for fixing a regression with swiotlb with Xen on Arm
      
      * tag 'for-linus-5.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/swiotlb: check if the swiotlb has already been initialized
        arm64: do not set SWIOTLB_NO_FORCE when swiotlb is required
        xen/arm: move xen_swiotlb_detect to arm/swiotlb-xen.h
        xen/unpopulated-alloc: fix error return code in fill_list()
        xen/gntdev: fix gntdev_mmap() error exit path
    • Merge tag 'x86_urgent_for_v5.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ccb013c2
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
       "The three SEV commits are not really urgent material. But we figured
        since getting them in now will avoid a huge amount of conflicts
        between future SEV changes touching tip, the kvm and probably other
        trees, sending them to you now would be best.
      
        The idea is that the tip, kvm etc branches for 5.14 will all base
        on top of -rc2 and thus everything will be peachy. What is more, those
        changes are purely mechanical and defines movement so they should be
        fine to go now (famous last words).
      
        Summary:
      
         - Enable -Wundef for the compressed kernel build stage
      
         - Reorganize SEV code to streamline and simplify future development"
      
      * tag 'x86_urgent_for_v5.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot/compressed: Enable -Wundef
        x86/msr: Rename MSR_K8_SYSCFG to MSR_AMD64_SYSCFG
        x86/sev: Move GHCB MSR protocol and NAE definitions in a common header
        x86/sev-es: Rename sev-es.{ch} to sev.{ch}
  8. 15 May, 2021 18 commits
    • Merge tag 'powerpc-5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 63d1cb53
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix a regression in the conversion of the 64-bit BookE interrupt
         entry to C.
      
       - Fix KVM hosts running with the hash MMU since the recent KVM gfn
         changes.
      
       - Fix a deadlock in our paravirt spinlocks when hcall tracing is
         enabled.
      
       - Several fixes for oopses in our runtime code patching for security
         mitigations.
      
       - A couple of minor fixes for the recent conversion of 32-bit interrupt
         entry/exit to C.
      
       - Fix __get_user() causing spurious crashes in sigreturn due to a bad
         inline asm constraint, spotted with GCC 11.
      
       - A fix for the way we track IRQ masking state vs NMI interrupts when
         using the new scv system call entry path.
      
       - A couple more minor fixes.
      
      Thanks to Cédric Le Goater, Christian Zigotzky, Christophe Leroy,
      Naveen N. Rao, Nicholas Piggin, Paul Menzel, and Sean Christopherson.
      
      * tag 'powerpc-5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64e/interrupt: Fix nvgprs being clobbered
        powerpc/64s: Make NMI record implicitly soft-masked code as irqs disabled
        powerpc/64s: Fix stf mitigation patching w/strict RWX & hash
        powerpc/64s: Fix entry flush patching w/strict RWX & hash
        powerpc/64s: Fix crashes when toggling entry flush barrier
        powerpc/64s: Fix crashes when toggling stf barrier
        KVM: PPC: Book3S HV: Fix kvm_unmap_gfn_range_hv() for Hash MMU
        powerpc/legacy_serial: Fix UBSAN: array-index-out-of-bounds
        powerpc/signal: Fix possible build failure with unsafe_copy_fpr_{to/from}_user
        powerpc/uaccess: Fix __get_user() with CONFIG_CC_HAS_ASM_GOTO_OUTPUT
        powerpc/pseries: warn if recursing into the hcall tracing code
        powerpc/pseries: use notrace hcall variant for H_CEDE idle
        powerpc/pseries: Don't trace hcall tracing wrapper
        powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks
        powerpc/syscall: Calling kuap_save_and_lock() is wrong
        powerpc/interrupts: Fix kuep_unlock() call
    • Merge tag 'sched-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c12a29ed
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "Fix an idle CPU selection bug, and an AMD Ryzen maximum frequency
        enumeration bug"
      
      * tag 'sched-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, sched: Fix the AMD CPPC maximum performance value on certain AMD Ryzen generations
        sched/fair: Fix clearing of has_idle_cores flag in select_idle_cpu()
    • Merge tag 'objtool-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e7c425b7
      Linus Torvalds authored
      Pull objtool fixes from Ingo Molnar:
       "Fix a couple of endianness bugs that crept in"
      
      * tag 'objtool-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool/x86: Fix elf_add_alternative() endianness
        objtool: Fix elf_create_undef_symbol() endianness
    • Merge tag 'irq-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 077fc644
      Linus Torvalds authored
      Pull irq fix from Ingo Molnar:
       "Fix build warning on SH"
      
      * tag 'irq-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sh: Remove unused variable
    • Merge tag 'core-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 91b7a0f0
      Linus Torvalds authored
      Pull x86 stack randomization fix from Ingo Molnar:
       "Fix an assembly constraint that affected LLVM up to version 12"
      
      * tag 'core-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        stack: Replace "o" output with "r" input constraint
    • Merge branch 'akpm' (patches from Andrew) · a4147415
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "13 patches.
      
        Subsystems affected by this patch series: resource, squashfs, hfsplus,
        modprobe, and mm (hugetlb, slub, userfaultfd, ksm, pagealloc, kasan,
        pagemap, and ioremap)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/ioremap: fix iomap_max_page_shift
        docs: admin-guide: update description for kernel.modprobe sysctl
        hfsplus: prevent corruption in shrinking truncate
        mm/filemap: fix readahead return types
        kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled
        mm: fix struct page layout on 32-bit systems
        ksm: revert "use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()"
        userfaultfd: release page in error path to avoid BUG_ON
        squashfs: fix divide error in calculate_skip()
        kernel/resource: fix return code check in __request_free_mem_region
        mm, slub: move slub_debug static key enabling outside slab_mutex
        mm/hugetlb: fix cow where page writtable in child
        mm/hugetlb: fix F_SEAL_FUTURE_WRITE
    • Merge tag 'arc-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · f36edc55
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
      
       - PAE fixes
      
       - syscall num check off-by-one bug
      
       - misc fixes
      
      * tag 'arc-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: mm: Use max_high_pfn as a HIGHMEM zone border
        ARC: mm: PAE: use 40-bit physical page mask
        ARC: entry: fix off-by-one error in syscall number validation
        ARC: kgdb: add 'fallthrough' to prevent a warning
        arc: Fix typos/spellos
    • Merge tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block · 8f4ae0f6
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Fix for shared tag set exit (Bart)
      
       - Correct ioctl range for zoned ioctls (Damien)
      
       - Removed dead/unused function (Lin)
      
       - Fix perf regression for shared tags (Ming)
      
       - Fix out-of-bounds issue with kyber and preemption (Omar)
      
       - BFQ merge fix (Paolo)
      
       - Two error handling fixes for nbd (Sun)
      
       - Fix weight update in blk-iocost (Tejun)
      
       - NVMe pull request (Christoph):
            - correct the check for using the inline bio in nvmet (Chaitanya
              Kulkarni)
            - demote unsupported command warnings (Chaitanya Kulkarni)
            - fix corruption due to double initializing ANA state (me, Hou Pu)
            - reset ns->file when open fails (Daniel Wagner)
            - fix a NULL deref when SEND is completed with error in nvmet-rdma
              (Michal Kalderon)
      
       - Fix kernel-doc warning (Bart)
      
      * tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block:
        block/partitions/efi.c: Fix the efi_partition() kernel-doc header
        blk-mq: Swap two calls in blk_mq_exit_queue()
        blk-mq: plug request for shared sbitmap
        nvmet: use new ana_log_size instead the old one
        nvmet: seset ns->file when open fails
        nbd: share nbd_put and return by goto put_nbd
        nbd: Fix NULL pointer in flush_workqueue
        blkdev.h: remove unused codes blk_account_rq
        block, bfq: avoid circular stable merges
        blk-iocost: fix weight updates of inner active iocgs
        nvmet: demote fabrics cmd parse err msg to debug
        nvmet: use helper to remove the duplicate code
        nvmet: demote discovery cmd parse err msg to debug
        nvmet-rdma: Fix NULL deref when SEND is completed with error
        nvmet: fix inline bio check for passthru
        nvmet: fix inline bio check for bdev-ns
        nvme-multipath: fix double initialization of ANA state
        kyber: fix out of bounds access when preempted
        block: uapi: fix comment about block device ioctl
    • Merge tag 'io_uring-5.13-2021-05-14' of git://git.kernel.dk/linux-block · 56015910
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Just a few minor fixes/changes:
      
         - Fix issue with double free race for linked timeout completions
      
         - Fix reference issue with timeouts
      
         - Remove last few places that make SQPOLL special, since it's just an
           io thread now.
      
         - Bump maximum allowed registered buffers, as we don't allocate as
           much anymore"
      
      * tag 'io_uring-5.13-2021-05-14' of git://git.kernel.dk/linux-block:
        io_uring: increase max number of reg buffers
        io_uring: further remove sqpoll limits on opcodes
        io_uring: fix ltout double free on completion race
        io_uring: fix link timeout refs
    • Merge tag 'erofs-for-5.13-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 41f035c0
      Linus Torvalds authored
      Pull erofs fixes from Gao Xiang:
       "This mainly fixes 1 lcluster-sized pclusters for the big pcluster
        feature, which can be forcibly generated by mkfs as a specific on-disk
        case for per-(sub)file compression strategies but were not handled
        properly at runtime.
      
        Also, documentation updates are included to fix the broken
        illustration due to the ReST conversion by accident and complete the
        big pcluster introduction.
      
        Summary:
      
         - update documentation to fix the broken illustration due to ReST
           conversion by accident at that time and complete the big pcluster
           introduction
      
         - fix 1 lcluster-sized pclusters for the big pcluster feature"
      
      * tag 'erofs-for-5.13-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: fix 1 lcluster-sized pcluster for big pcluster
        erofs: update documentation about data compression
        erofs: fix broken illustration in documentation
    • Merge tag 'libnvdimm-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · a5ce4296
      Linus Torvalds authored
      Pull libnvdimm fixes from Dan Williams:
       "A regression fix for a bootup crash condition introduced in this merge
        window and some other minor fixups:
      
         - Fix regression in ACPI NFIT table handling leading to crashes and
           driver load failures.
      
         - Move the nvdimm mailing list
      
         - Miscellaneous minor fixups"
      
      * tag 'libnvdimm-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        ACPI: NFIT: Fix support for variable 'SPA' structure size
        MAINTAINERS: Move nvdimm mailing list
        tools/testing/nvdimm: Make symbol '__nfit_test_ioremap' static
        libnvdimm: Remove duplicate struct declaration
    • Merge tag 'dax-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 393f42f1
      Linus Torvalds authored
      Pull dax fixes from Dan Williams:
       "A fix for a hang condition due to missed wakeups in the filesystem-dax
        core when exercised by virtiofs.
      
        This bug has been there from the beginning, but the condition has
        not triggered on other filesystems since they hold a lock over
        invalidation events"
      
      * tag 'dax-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        dax: Wake up all waiters after invalidating dax entry
        dax: Add a wakeup mode parameter to put_unlocked_entry()
        dax: Add an enum for specifying dax wakup mode
      393f42f1
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2021-05-15' of git://anongit.freedesktop.org/drm/drm · 33f85ca4
      Linus Torvalds authored
      Pull more drm fixes from Dave Airlie:
       "Looks like I wasn't the only one not fully switched on this week. The
        msm pull has a missing tag so I missed it, and i915 team were a bit
        late. In my defence I did have a day with the roof of my home office
        removed, so was sitting at my kids desk.
      
        msm:
         - dsi regression fix
         - dma-buf pinning fix
         - displayport fixes
         - llc fix
      
        i915:
         - Fix active callback alignment annotations and subsequent crashes
         - Retract link training strategy to slow and wide, again
         - Avoid division by zero on gen2
         - Use correct width reads for C0DRB3/C1DRB3 registers
         - Fix double free in pdp allocation failure path
         - Fix HDMI 2.1 PCON downstream caps check"
      
      * tag 'drm-fixes-2021-05-15' of git://anongit.freedesktop.org/drm/drm:
        drm/i915: Use correct downstream caps for check Src-Ctl mode for PCON
        drm/i915/overlay: Fix active retire callback alignment
        drm/i915: Fix crash in auto_retire
        drm/i915/gt: Fix a double free in gen8_preallocate_top_level_pdp
        drm/i915: Read C0DRB3/C1DRB3 as 16 bits again
        drm/i915: Avoid div-by-zero on gen2
        drm/i915/dp: Use slow and wide link training for everything
        drm/msm/dp: initialize audio_comp when audio starts
        drm/msm/dp: check sink_count before update is_connected status
        drm/msm: fix minor version to indicate MSM_PARAM_SUSPENDS support
        drm/msm/dsi: fix msm_dsi_phy_get_clk_provider return code
        drm/msm/dsi: dsi_phy_28nm_8960: fix uninitialized variable access
        drm/msm: fix LLC not being enabled for mmu500 targets
        drm/msm: Do not unpin/evict exported dma-buf's
      33f85ca4
    • Tetsuo Handa's avatar
      tty: vt: always invoke vc->vc_sw->con_resize callback · ffb324e6
      Tetsuo Handa authored
      syzbot is reporting an OOB write at vga16fb_imageblit() [1], because
      resize_screen() from ioctl(VT_RESIZE) returns 0 without checking whether
      the requested rows/columns fit the amount of memory reserved for the
      graphical screen if the current mode is KD_GRAPHICS.
      
      ----------
        #include <sys/types.h>
        #include <sys/stat.h>
        #include <fcntl.h>
        #include <sys/ioctl.h>
        #include <linux/kd.h>
        #include <linux/vt.h>
      
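        int main(int argc, char *argv[])
        {
              /* Character device with major 4, minor 1 (the first VT). */
              const int fd = open("/dev/char/4:1", O_RDWR);
              /* v_rows = 0x4100, v_cols = 2; v_scrollrows is left at zero. */
              struct vt_sizes vt = { 0x4100, 2 };

              if (fd < 0)
                    return 1;
              ioctl(fd, KDSETMODE, KD_GRAPHICS);  /* enter graphics mode */
              ioctl(fd, VT_RESIZE, &vt);          /* resize while in graphics mode */
              ioctl(fd, KDSETMODE, KD_TEXT);      /* switch back to text mode */
              return 0;
        }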
      ----------
      
      Allow framebuffer drivers to return -EINVAL, by moving the vc->vc_mode !=
      KD_GRAPHICS check from resize_screen() to fbcon_resize().
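
The change described above can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `toy_console`, `toy_driver_resize()`, `toy_resize_before()` and `toy_resize_after()` are hypothetical names, the "reserved memory" is counted in character cells, and the mode-specific handling that moves into the driver callback is left out. It shows only why skipping the callback in graphics mode lets an oversized request report success.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for KD_TEXT / KD_GRAPHICS. */
enum toy_mode { TOY_TEXT, TOY_GRAPHICS };

struct toy_console {
	enum toy_mode mode;
	unsigned long reserved_cells;	/* memory reserved for the screen, in cells */
	int (*con_resize)(struct toy_console *con,
			  unsigned int cols, unsigned int rows);
};

/* Driver-side callback: rejects sizes that do not fit the reserved memory. */
int toy_driver_resize(struct toy_console *con,
		      unsigned int cols, unsigned int rows)
{
	assert(con != NULL);
	if ((unsigned long)cols * rows > con->reserved_cells)
		return -EINVAL;
	return 0;
}

/* Old shape: the callback is skipped in graphics mode, so 0 is returned unchecked. */
int toy_resize_before(struct toy_console *con,
		      unsigned int cols, unsigned int rows)
{
	if (con->mode != TOY_GRAPHICS && con->con_resize)
		return con->con_resize(con, cols, rows);
	return 0;
}

/* New shape: the callback is always invoked when the driver provides one. */
int toy_resize_after(struct toy_console *con,
		     unsigned int cols, unsigned int rows)
{
	return con->con_resize ? con->con_resize(con, cols, rows) : 0;
}
```

With a console in `TOY_GRAPHICS` mode and room for 80 x 25 cells, a request for 2 columns by 0x4100 rows passes through `toy_resize_before()` with 0 but is rejected with -EINVAL by `toy_resize_after()`.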
      
      Link: https://syzkaller.appspot.com/bug?extid=1f29e126cf461c4de3b3 [1]
      Reported-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Tested-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ffb324e6
    • Christophe Leroy's avatar
      mm/ioremap: fix iomap_max_page_shift · 86d0c164
      Christophe Leroy authored
      iomap_max_page_shift is expected to contain a page shift, so it can't be a
      'bool'; it has to be an 'unsigned int'.

      Also fix the default values: P4D_SHIFT is the value to use when huge
      iomap is allowed.

      However, on some architectures (e.g. powerpc book3s/64), P4D_SHIFT is not
      a constant, so it can't be used to initialise a static variable.  So,
      initialise iomap_max_page_shift with a maximum shift supported by the
      architecture; it is gated by P4D_SHIFT in vmap_try_huge_p4d() anyway.
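
As a rough illustration of why the type matters, the following userspace sketch stores a shift in a `bool` and in an `unsigned int` and compares the results. The names (`stored_as_bool()`, `stored_as_uint()`, `toy_huge_allowed()`) and the shift values 12 and 21 are assumptions chosen for the example; they are not taken from the patch, and real shifts are architecture-specific.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12u	/* hypothetical: 4 KiB base pages */
#define TOY_PMD_SHIFT  21u	/* hypothetical: 2 MiB huge mappings */

/* Round-trips a shift through a bool: any non-zero value collapses to 1. */
unsigned int stored_as_bool(unsigned int shift)
{
	bool max_shift = shift;

	return max_shift;
}

/* Round-trips a shift through an unsigned int: the value is preserved. */
unsigned int stored_as_uint(unsigned int shift)
{
	unsigned int max_shift = shift;

	return max_shift;
}

/* A mapping of 2^want_shift bytes is allowed only up to the stored limit. */
bool toy_huge_allowed(unsigned int max_shift, unsigned int want_shift)
{
	assert(want_shift >= TOY_PAGE_SHIFT);
	return want_shift <= max_shift;
}
```

In this model `stored_as_bool(TOY_PMD_SHIFT)` yields 1, so `toy_huge_allowed()` refuses every huge mapping, while the `unsigned int` variant keeps 21 and allows it.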
      
      Link: https://lkml.kernel.org/r/ad2d366015794a9f21320dcbdd0a8eb98979e9df.1620898113.git.christophe.leroy@csgroup.eu
      Fixes: bbc180a5 ("mm: HUGE_VMAP arch support cleanup")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      86d0c164
    • Rasmus Villemoes's avatar
      docs: admin-guide: update description for kernel.modprobe sysctl · f4d3f25a
      Rasmus Villemoes authored
      When I added CONFIG_MODPROBE_PATH, I neglected to update Documentation/.
      It's still true that this defaults to /sbin/modprobe, but now via a level
      of indirection.  So document that the kernel might have been built with
      something other than /sbin/modprobe as the initial value.
      
      Link: https://lkml.kernel.org/r/20210420125324.1246826-1-linux@rasmusvillemoes.dk
      Fixes: 17652f42 ("modules: add CONFIG_MODPROBE_PATH")
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f4d3f25a
    • Jouni Roivas's avatar
      hfsplus: prevent corruption in shrinking truncate · c3187cf3
      Jouni Roivas authored
      I believe there are some issues introduced by commit 31651c60
      ("hfsplus: avoid deadlock on file truncation").
      
      HFS+ has extent records which always contain 8 extents.  When the
      first extent record in the catalog file gets full, new ones are
      allocated from the extents overflow file.
      
      When a shrinking truncate lands in the middle of an extent record
      located in the extents overflow file, the logic in
      hfsplus_file_truncate() was changed so that the call to
      hfs_brec_remove() is no longer guarded.
      
      The right action would be to free only the extents that exceed the new
      size inside the extent record by calling hfsplus_free_extents(), and
      then check whether the whole extent record should be removed.  However,
      since the guard (blk_cnt > start) is now after the call to
      hfs_brec_remove(), this has the unfortunate effect that the last
      matching extent record is removed unconditionally.
      
      To reproduce this issue, create a file which has at least 10 extents,
      and then perform a shrinking truncate into the middle of the last
      extent record, so that the number of remaining extents is not under or
      divisible by 8.  This causes the last extent record (8 extents) to be
      removed entirely instead of being truncated in the middle, which
      results in corruption and lost data.
      
      The fix is simply to check whether the new truncated end is below the
      start of this extent record, which makes it safe to remove the full
      extent record.  However, the call to hfs_brec_remove() can't be moved
      back to its previous place, since we're dropping ->tree_lock, which can
      cause a race condition and invalidation of the cached info, possibly
      corrupting the node data.
      
      Another issue is related to this one.  When entering the block
      (blk_cnt > start) we are not holding the ->tree_lock.  We break out of
      the loop without holding the lock, but hfs_find_exit() does unlock it.
      I'm not sure whether it's possible for someone else to take the lock
      under our feet, but it can cause hard-to-debug errors and premature
      unlocking.  Even if there's no real risk of it, the locking should
      always be kept in balance.  Thus the lock is now taken just before the
      check.
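
The removal condition discussed above can be sketched as a toy model. Everything here is an assumption made for illustration: each extent is taken to be exactly one block (so record n starts at block n * 8), `blk_cnt` is read as the new size in blocks and `start` as the first block covered by the record, and "below the start" is modelled as `blk_cnt <= start`. The function names are hypothetical and this is not the hfsplus code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXTENTS_PER_RECORD 8u	/* per the message: a record always holds 8 extents */

/* Toy layout: one block per extent, so record n starts at block n * 8. */
uint32_t toy_record_start(uint32_t record_index)
{
	return record_index * EXTENTS_PER_RECORD;
}

/* Index of the last extent record of a file with total_extents extents. */
uint32_t toy_last_record(uint32_t total_extents)
{
	assert(total_extents > 0);
	return (total_extents - 1) / EXTENTS_PER_RECORD;
}

/* Guarded removal: drop the record only if nothing in it survives the truncate. */
bool remove_record_guarded(uint32_t blk_cnt, uint32_t start)
{
	return blk_cnt <= start;
}

/* Behaviour described as the regression: the record is removed regardless. */
bool remove_record_unguarded(uint32_t blk_cnt, uint32_t start)
{
	(void)blk_cnt;
	(void)start;
	return true;
}
```

For a 10-extent file the last record starts at block 8 in this model. Truncating to 9 blocks leaves one extent in that record, so the guarded check keeps the record while the unguarded variant would discard it along with the surviving extent.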
      
      Link: https://lkml.kernel.org/r/20210429165139.3082828-1-jouni.roivas@tuxera.com
      Fixes: 31651c60 ("hfsplus: avoid deadlock on file truncation")
      Signed-off-by: Jouni Roivas <jouni.roivas@tuxera.com>
      Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
      Cc: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Cc: Viacheslav Dubeyko <slava@dubeyko.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c3187cf3
    • Matthew Wilcox (Oracle)'s avatar
      mm/filemap: fix readahead return types · 076171a6
      Matthew Wilcox (Oracle) authored
      A readahead request will not allocate more memory than can be represented
      by a size_t, even on systems that have HIGHMEM available.  Change the
      length functions from returning an loff_t to a size_t.
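
A minimal userspace sketch of the shape of such length helpers, assuming a hypothetical `toy_readahead` structure that tracks page counts and a hypothetical 4096-byte page size; the real kernel structures and helpers differ. The point is only that the byte length is computed and returned as a `size_t`.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u	/* hypothetical page size */

/* Hypothetical stand-in for a readahead request. */
struct toy_readahead {
	unsigned int nr_pages;		/* pages in the whole request */
	unsigned int batch_count;	/* pages in the current batch */
};

/* Length in bytes of the whole request, returned as a size_t. */
size_t toy_readahead_length(const struct toy_readahead *ra)
{
	assert(ra != NULL);
	return (size_t)ra->nr_pages * TOY_PAGE_SIZE;
}

/* Length in bytes of the current batch, returned as a size_t. */
size_t toy_readahead_batch_length(const struct toy_readahead *ra)
{
	assert(ra != NULL);
	return (size_t)ra->batch_count * TOY_PAGE_SIZE;
}
```

The cast to `size_t` before the multiplication keeps the arithmetic in the return type rather than in `unsigned int`.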
      
      Link: https://lkml.kernel.org/r/20210510201201.1558972-1-willy@infradead.org
      Fixes: 32c0a6bc ("btrfs: add and use readahead_batch_length")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      076171a6