1. 16 May, 2022 1 commit
    • iomap: don't invalidate folios after writeback errors · e9c3a8e8
      Darrick J. Wong authored
      XFS has the unique behavior (as compared to the other Linux filesystems)
      that on writeback errors it will completely invalidate the affected
      folio and force the page cache to reread the contents from disk.  All
      other filesystems leave the page mapped and up to date.
      
      This is a rude awakening for user programs, since (in the case where
      write fails but reread doesn't) file contents will appear to revert to
      old disk contents with no notification other than an EIO on fsync.  This
      might have been annoying back in the days when iomap dealt with one page
      at a time, but with multipage folios, we can now throw away *megabytes*
      worth of data for a single write error.
      
      On *most* Linux filesystems, a program can respond to an EIO on write by
      redirtying the entire file and scheduling it for writeback.  This isn't
      foolproof, since the page that failed writeback is no longer dirty and
      could be evicted, but programs that want to recover properly *also*
      have to detect XFS and regenerate every write they've made to the file.
      
      When running xfs/314 on arm64, I noticed a UAF when xfs_discard_folio
      invalidates multipage folios that could be undergoing writeback.  If,
      say, we have a 256K folio caching a mix of written and unwritten
      extents, it's possible that we could start writeback of the first (say)
      64K of the folio and then hit a writeback error on the next 64K.  We
      then free the iop attached to the folio, which is really bad because
      writeback completion on the first 64k will trip over the "blocks per
      folio > 1 && !iop" assertion.
      
      This can't be fixed by only invalidating the folio if writeback fails at
      the start of the folio, since the folio is marked !uptodate, which trips
      other assertions elsewhere.  Get rid of the whole behavior entirely.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
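
The recovery strategy the commit message describes (responding to an EIO by redirtying data and scheduling it for writeback again) can be sketched from the userspace side. This is a minimal illustration, not code from the patch: `write_durably` and `pwrite_all` are hypothetical helper names, and the sketch assumes the application still holds its own copy of the data and that rewriting it is enough to redirty the affected pages. How fsync reports errors differs between kernel versions and filesystems, so treat this as an outline of the idea only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer at the given offset, handling short writes. */
static int pwrite_all(int fd, const char *buf, size_t len, off_t off)
{
    while (len > 0) {
        ssize_t n = pwrite(fd, buf, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: retry the same range */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
        off += n;
    }
    return 0;
}

/*
 * Hypothetical helper: write buf at off and fsync. If fsync reports EIO,
 * rewrite the range from the application's own copy (which redirties the
 * page cache) and try again, up to max_attempts times.
 * Returns 0 on success, -1 with errno set on failure.
 */
int write_durably(int fd, const char *buf, size_t len, off_t off,
                  int max_attempts)
{
    assert(buf != NULL && max_attempts > 0);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (pwrite_all(fd, buf, len, off) < 0)
            return -1;
        if (fsync(fd) == 0)
            return 0;           /* data reached stable storage */
        if (errno != EIO)
            return -1;          /* only EIO is treated as retryable here */
    }
    errno = EIO;
    return -1;
}
```

A recovery loop like this depends on the page cache still holding the data the program wrote. If the filesystem invalidates the folio after a writeback error, as the message says XFS did, a later read can return old disk contents.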
  2. 08 May, 2022 26 commits
  3. 07 May, 2022 3 commits
  4. 06 May, 2022 10 commits
    • Merge tag 'for-5.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 4b97bac0
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Regression fixes in zone activation:
      
         - move a loop invariant out of the loop to avoid checking space
           status
      
         - properly handle unlimited activation
      
        Other fixes:
      
         - for subpage, force the free space v2 mount to avoid a warning and
           make it easy to switch a filesystem on different page size systems
      
         - export sysfs status of exclusive operation 'balance paused', so the
           user space tools can recognize it and allow adding a device with
           paused balance
      
         - fix assertion failure when logging directory key range item"
      
      * tag 'for-5.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: sysfs: export the balance paused state of exclusive operation
        btrfs: fix assertion failure when logging directory key range item
        btrfs: zoned: activate block group properly on unlimited active zone device
        btrfs: zoned: move non-changing condition check out of the loop
        btrfs: force v2 space cache usage for subpage mount
    • Merge tag 'nfs-for-5.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · adcffc17
      Linus Torvalds authored
      Pull NFS client fixes from Trond Myklebust:
       "Highlights include:
      
        Stable fixes:
      
         - Fix a socket leak when setting up an AF_LOCAL RPC client
      
         - Ensure that knfsd connects to the gss-proxy daemon on setup
      
        Bugfixes:
      
         - Fix a refcount leak when migrating a task off an offlined transport
      
         - Don't gratuitously invalidate inode attributes on delegation return
      
         - Don't leak sockets in xs_local_connect()
      
         - Ensure timely close of disconnected AF_LOCAL sockets"
      
      * tag 'nfs-for-5.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        Revert "SUNRPC: attempt AF_LOCAL connect on setup"
        SUNRPC: Ensure gss-proxy connects on setup
        SUNRPC: Ensure timely close of disconnected AF_LOCAL sockets
        SUNRPC: Don't leak sockets in xs_local_connect()
        NFSv4: Don't invalidate inode attributes on delegation return
        SUNRPC release the transport of a relocated task with an assigned transport
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · bce58da1
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "x86:
      
         - Account for family 17h event renumberings in AMD PMU emulation
      
         - Remove CPUID leaf 0xA on AMD processors
      
         - Fix lockdep issue with locking all vCPUs
      
         - Fix loss of A/D bits in SPTEs
      
         - Fix syzkaller issue with invalid guest state"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: VMX: Exit to userspace if vCPU has injected exception and invalid state
        KVM: SEV: Mark nested locking of vcpu->lock
        kvm: x86/cpuid: Only provide CPUID leaf 0xA if host has architectural PMU
        KVM: x86/svm: Account for family 17h event renumberings in amd_pmc_perf_hw_id
        KVM: x86/mmu: Use atomic XCHG to write TDP MMU SPTEs with volatile bits
        KVM: x86/mmu: Move shadow-present check out of spte_has_volatile_bits()
        KVM: x86/mmu: Don't treat fully writable SPTEs as volatile (modulo A/D)
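
The "loss of A/D bits" item in the KVM pull above, together with the shortlog entry about using an atomic XCHG for entries with volatile bits, can be illustrated with a small userspace sketch. This is not KVM code: `pte_write_atomic`, `pte_was_dirty`, and the bit positions are assumptions chosen for illustration. The point is that an atomic exchange hands back exactly the value it replaced, so an Accessed or Dirty bit set concurrently just before the write is still seen by the writer instead of being silently overwritten.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions only; real layouts depend on the format. */
#define PTE_ACCESSED (UINT64_C(1) << 5)
#define PTE_DIRTY    (UINT64_C(1) << 6)

/*
 * Replace the entry and return the value that was actually there at the
 * moment of replacement. A plain read followed by a plain store could miss
 * a bit set between the two steps; the exchange cannot.
 */
uint64_t pte_write_atomic(_Atomic uint64_t *ptep, uint64_t new_pte)
{
    assert(ptep != NULL);
    return atomic_exchange(ptep, new_pte);
}

/* Nonzero if the replaced entry carried a Dirty bit the caller must honor. */
int pte_was_dirty(uint64_t old_pte)
{
    return (old_pte & PTE_DIRTY) != 0;
}
```

A caller would inspect the returned old value and propagate any Accessed or Dirty state before discarding it.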
    • Merge tag 'riscv-for-linus-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 497fe3bb
      Linus Torvalds authored
      Pull RISC-V fix from Palmer Dabbelt:
      
       - A fix to relocate the DTB early in boot, in cases where the
         bootloader doesn't put the DTB in a region that will end up
         mapped by the kernel.
      
         This manifests as a crash early in boot on a handful of
         configurations.
      
      * tag 'riscv-for-linus-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: relocate DTB if it's outside memory region
    • KVM: VMX: Exit to userspace if vCPU has injected exception and invalid state · 053d2290
      Sean Christopherson authored
      Exit to userspace with an emulation error if KVM encounters an injected
      exception with invalid guest state, in addition to the existing check of
      bailing if there's a pending exception (KVM doesn't support emulating
      exceptions except when emulating real mode via vm86).
      
      In theory, KVM should never get to such a situation as KVM is supposed to
      exit to userspace before injecting an exception with invalid guest state.
      But in practice, userspace can intervene and manually inject an exception
      and/or stuff registers to force invalid guest state while a previously
      injected exception is awaiting reinjection.
      
      Fixes: fc4fad79 ("KVM: VMX: Reject KVM_RUN if emulation is required with pending exception")
      Reported-by: syzbot+cfafed3bb76d3e37581b@syzkaller.appspotmail.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220502221850.131873-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SEV: Mark nested locking of vcpu->lock · 0c2c7c06
      Peter Gonda authored
      svm_vm_migrate_from() uses sev_lock_vcpus_for_migration() to lock all
      source and target vcpu->locks. Unfortunately there is an 8 subclass
      limit, so a new subclass cannot be used for each vCPU. Instead maintain
      ownership of the first vcpu's mutex.dep_map using a role specific
      subclass: source vs target. Release the other vcpu's mutex.dep_maps.
      
      Fixes: b5663931 ("KVM: SEV: Add support for SEV intra host migration")
      Reported-by: John Sperbeck <jsperbeck@google.com>
      Suggested-by: David Rientjes <rientjes@google.com>
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Peter Gonda <pgonda@google.com>
      
      Message-Id: <20220502165807.529624-1-pgonda@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 4df22ca8
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "A few recent regressions in rxe's multicast code, and some old driver
        bugs:
      
         - Error case unwind bug in rxe for rkeys
      
         - Do not call netdev functions under a spinlock in rxe multicast
           code
      
         - Use the proper BH lock type in rxe multicast code
      
         - Fix irdma deadlock and crash
      
         - Add a missing flush to drain irdma QPs when in error
      
         - Fix high userspace latency in irdma during destroy due to
           synchronize_rcu()
      
         - Rare race in siw MPA processing"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/rxe: Change mcg_lock to a _bh lock
        RDMA/rxe: Do not call  dev_mc_add/del() under a spinlock
        RDMA/siw: Fix a condition race issue in MPA request processing
        RDMA/irdma: Fix possible crash due to NULL netdev in notifier
        RDMA/irdma: Reduce iWARP QP destroy time
        RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state
        RDMA/rxe: Recheck the MR in when generating a READ reply
        RDMA/irdma: Fix deadlock in irdma_cleanup_cm_core()
        RDMA/rxe: Fix "Replace mr by rkey in responder resources"
    • Merge tag 'mmc-v5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 64267926
      Linus Torvalds authored
      Pull mmc fixes from Ulf Hansson:
       "MMC core:
      
         - Fix initialization for eMMC's HS200/HS400 mode
      
        MMC host:
      
         - sdhci-msm: Reset GCC_SDCC_BCR register to prevent timeout issues
      
         - sunxi-mmc: Fix DMA descriptors allocated above 32 bits"
      
      * tag 'mmc-v5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: sdhci-msm: Reset GCC_SDCC_BCR register for SDHC
        mmc: sunxi-mmc: Fix DMA descriptors allocated above 32 bits
        mmc: core: Set HS clock speed before sending HS CMD13
    • Merge tag 'drm-fixes-2022-05-06' of git://anongit.freedesktop.org/drm/drm · 5fa576d7
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "A pretty quiet week, one fbdev, msm, kconfig, and two amdgpu fixes,
        about what I'd expect for rc6.
      
        fbdev:
      
         - hotunplugging fix
      
        amdgpu:
      
         - Fix a xen dom0 regression on APUs
      
         - Fix a potential array overflow if a receiver were to send an
           erroneous audio channel count
      
        msm:
      
         - lockdep fix.
      
        it6505:
      
         - kconfig fix"
      
      * tag 'drm-fixes-2022-05-06' of git://anongit.freedesktop.org/drm/drm:
        drm/amd/display: Avoid reading audio pattern past AUDIO_CHANNELS_COUNT
        drm/amdgpu: do not use passthrough mode in Xen dom0
        drm/bridge: ite-it6505: add missing Kconfig option select
        fbdev: Make fb_release() return -ENODEV if fbdev was unregistered
        drm/msm/dp: remove fail safe mode related code
    • gpio: pca953x: fix irq_stat not updated when irq is disabled (irq_mask not set) · dba78579
      Puyou Lu authored
      When a port's input state is inverted (e.g. from low to high) after
      pca953x_irq_setup but before irq_mask is set (by some other driver such as
      "gpio-keys"), the next inversion of this port (e.g. from high to low) no
      longer triggers an interrupt, because irq_stat was not updated the first
      time. This commit fixes the issue by keeping irq_stat updated in that case.
      
      Fixes: 89ea8bbe ("gpio: pca953x.c: add interrupt handling capability")
      Signed-off-by: Puyou Lu <puyou.lu@gmail.com>
      Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
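
The sequence described in the pca953x message above can be reproduced with a small simulation. This is a simplified sketch, not the driver's code: `struct expander_sim`, `pending_stale`, and `pending_tracked` are hypothetical names, and a single byte stands in for the driver's state. The first function leaves the cached input state untouched when no enabled line changed. The second always records the latest state, which is the behavior the commit title describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-byte model of the expander's interrupt bookkeeping. */
struct expander_sim {
    uint8_t irq_stat;   /* last input state the handler recorded */
    uint8_t irq_mask;   /* lines whose interrupts are enabled */
};

/* Stale variant: skips the state update when nothing enabled changed. */
uint8_t pending_stale(struct expander_sim *c, uint8_t cur)
{
    assert(c != NULL);
    uint8_t trigger = (uint8_t)((cur ^ c->irq_stat) & c->irq_mask);
    if (!trigger)
        return 0;           /* early return leaves irq_stat out of date */
    c->irq_stat = cur;
    return trigger;
}

/* Tracked variant: records the current state even for masked lines. */
uint8_t pending_tracked(struct expander_sim *c, uint8_t cur)
{
    assert(c != NULL);
    uint8_t trigger = (uint8_t)((cur ^ c->irq_stat) & c->irq_mask);
    c->irq_stat = cur;      /* always keep the cached state current */
    return trigger;
}
```

With both models starting low and masked, a low-to-high change followed by enabling the line and a high-to-low change is reported only by `pending_tracked`. `pending_stale` still believes the line was low and sees no edge.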