1. 10 Sep, 2021 6 commits
    • Merge tag 'iommu-fixes-v5.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 589e5cab
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - Intel VT-d:
           - PASID leakage in intel_svm_unbind_mm()
           - Deadlock in intel_svm_drain_prq()
      
       - AMD IOMMU: Fixes for an unhandled page-fault bug when AVIC is used
         for a KVM guest.
      
       - Make CONFIG_IOMMU_DEFAULT_DMA_LAZY depend on the architecture
         instead of on the IOMMU driver
      
      * tag 'iommu-fixes-v5.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu: Clarify default domain Kconfig
        iommu/vt-d: Fix a deadlock in intel_svm_drain_prq()
        iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm()
        iommu/amd: Remove iommu_init_ga()
        iommu/amd: Relocate GAMSup check to early_enable_iommus
      589e5cab
    • Merge tag 'char-misc-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 5ffc06eb
      Linus Torvalds authored
      Pull habanalabs updates from Greg KH:
       "Here is another round of misc driver patches for 5.15-rc1.
      
        In here is only updates for the Habanalabs driver. This request is
        late because the previously-objected-to dma-buf patches are all
        removed and some fixes that you and others found are now included in
        here as well.
      
        All of these have been in linux-next for well over a week with no
        reports of problems, and they are all self-contained to only this one
        driver. Full details are in the shortlog"
      
      * tag 'char-misc-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (61 commits)
        habanalabs/gaudi: hwmon default card name
        habanalabs: add support for f/w reset
        habanalabs/gaudi: block ICACHE_BASE_ADDERESS_HIGH in TPC
        habanalabs: cannot sleep while holding spinlock
        habanalabs: never copy_from_user inside spinlock
        habanalabs: remove unnecessary device status check
        habanalabs: disable IRQ in user interrupts spinlock
        habanalabs: add "in device creation" status
        habanalabs/gaudi: invalidate PMMU mem cache on init
        habanalabs/gaudi: size should be printed in decimal
        habanalabs/gaudi: define DC POWER for secured PMC
        habanalabs/gaudi: unmask out of bounds SLM access interrupt
        habanalabs: add userptr_lookup node in debugfs
        habanalabs/gaudi: fetch TPC/MME ECC errors from F/W
        habanalabs: modify multi-CS to wait on stream masters
        habanalabs/gaudi: add monitored SOBs to state dump
        habanalabs/gaudi: restore user registers when context opens
        habanalabs/gaudi: increase boot fit timeout
        habanalabs: update to latest firmware headers
        habanalabs/gaudi: minimize number of register reads
        ...
      5ffc06eb
    • Merge tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm · a668acb8
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Just an initial bunch of fixes for the merge window, amdgpu is most of
        them with a few ttm fixes and an fbdev avoid multiply overflow fix.
      
        core:
         - Make some dma-buf config options depend on DMA_SHARED_BUFFER
         - Handle multiplication overflow of fbdev xres/yres in the core
      
        ttm:
         - Fix ttm_bo_move_memcpy() when ttm_resource is subclassed
         - Fix ttm deadlock if target BO isn't idle
         - ttm build fix
         - ttm docs fix
      
        dma-buf:
         - config option fixes
      
        fbdev:
         - limit resolutions to avoid int overflow
      
        i915:
         - stddef change.
      
        amdgpu:
         - Misc cleanups, typo fixes
         - EEPROM fix
         - Add some new PCI IDs
         - Scatter/Gather display support for Yellow Carp
         - PCIe DPM fix for RKL platforms
         - RAS fix
      
        amdkfd:
         - SVM fix
      
        vc4:
         - static function fix
      
        mgag200:
         - fix uninit var
      
        panfrost:
         - lock_region fixes"
      
      * tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm: (36 commits)
        drm/ttm: Fix a deadlock if the target BO is not idle during swap
        fbmem: don't allow too huge resolutions
        dma-buf: DMABUF_SYSFS_STATS should depend on DMA_SHARED_BUFFER
        dma-buf: DMABUF_DEBUG should depend on DMA_SHARED_BUFFER
        drm/i915: use linux/stddef.h due to "isystem: trim/fixup stdarg.h and other headers"
        dma-buf: DMABUF_MOVE_NOTIFY should depend on DMA_SHARED_BUFFER
        drm/amdkfd: drop process ref count when xnack disable
        drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode
        drm/amdgpu: fix fdinfo race with process exit
        drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
        drm/amdgpu: stop scheduler when calling hw_fini (v2)
        drm/amdgpu: Clear RAS interrupt status on aldebaran
        drm/amd/display: Initialize lt_settings on instantiation
        drm/amd/display: cleanup idents after a revert
        drm/amd/display: Fix memory leak reported by coverity
        drm/ttm: Fix ttm_bo_move_memcpy() for subclassed struct ttm_resource
        drm/amdgpu/swsmu: fix spelling mistake "minimun" -> "minimum"
        drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform
        drm/amdgpu: show both cmd id and name when psp cmd failed
        drm/amd/display: setup system context for APUs
        ...
      a668acb8
    • fsnotify: fix sb_connectors leak · 4396a731
      Amir Goldstein authored
      Fix a leak of the s_fsnotify_connectors counter when two tasks race to
      add a new fsnotify mark to the same object.
      
      The task that loses the race fails to drop the counter before freeing
      its unused connector.
      
      A subsequent umount() then hangs in fsnotify_sb_delete()/wait_var_event(),
      because s_fsnotify_connectors never drops to zero.
      
      Fixes: ec44610f ("fsnotify: count all objects with attached connectors")
      Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
      Link: https://lore.kernel.org/linux-fsdevel/20210907063338.ycaw6wvhzrfsfdlp@xzhoux.usersys.redhat.com/
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4396a731
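
      The leak described in the fsnotify commit above can be illustrated with
      a small userspace sketch. This is not the kernel code: struct sb,
      struct connector, struct object and attach_connector() are hypothetical
      stand-ins, and C11 atomics take the place of the kernel primitives. It
      only shows the rule the fix restores, namely that the task losing the
      race must undo its increment of the counter before freeing its unused
      connector.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the superblock counter, a connector, and a watched object. */
struct sb { atomic_long connectors; };
struct connector { struct sb *sb; };
struct object { _Atomic(struct connector *) conn; };

/*
 * Attach a new connector to obj and count it in sb->connectors.
 * Returns the connector that ends up attached, or NULL on allocation failure.
 */
static struct connector *attach_connector(struct object *obj, struct sb *sb)
{
    struct connector *conn = malloc(sizeof(*conn));
    struct connector *expected = NULL;

    if (!conn)
        return NULL;
    conn->sb = sb;
    atomic_fetch_add(&sb->connectors, 1);   /* count it before publishing */

    if (!atomic_compare_exchange_strong(&obj->conn, &expected, conn)) {
        /* Lost the race: drop the count before freeing the unused connector. */
        atomic_fetch_sub(&sb->connectors, 1);
        free(conn);
        return expected;                     /* the winner's connector */
    }
    return conn;
}
```

      Without the atomic_fetch_sub() on the losing path, the counter would
      stay one too high forever, which is the condition that keeps a waiter
      for "counter == 0" blocked.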
    • drm/ttm: Fix a deadlock if the target BO is not idle during swap · 70982eef
      xinhui pan authored
      The return value might be -EBUSY, in which case the caller assumes the
      LRU lock is still held when it has actually been dropped. Return -ENOSPC
      instead; otherwise we hit list corruption.
      
      ttm_bo_cleanup_refs() might also fail if the BO is not idle. If we return
      0, the caller (ttm_tt_populate -> ttm_global_swapout ->
      ttm_device_swapout) gets stuck, because no BO memory was actually freed.
      This usually happens when the fence is not signaled for a long time.
      
      Signed-off-by: xinhui pan <xinhui.pan@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Fixes: ebd59851 ("drm/ttm: move swapout logic around v3")
      Link: https://patchwork.freedesktop.org/patch/msgid/20210907040832.1107747-1-xinhui.pan@amd.com
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      70982eef
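
      The error-code contract behind the drm/ttm fix above can be sketched in
      plain C. This is an illustration under assumptions, not the TTM code:
      struct lru, cleanup_refs() and swapout() are hypothetical, and a bool
      stands in for the real lock. The assumed convention is that the caller
      reads -EBUSY as "lock still held", so a helper that has already dropped
      the lock must not let -EBUSY escape.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical LRU state; 'locked' stands in for the real LRU lock. */
struct lru { bool locked; };

/* Hypothetical helper: always drops the lock, fails with -EBUSY if the BO is not idle. */
static int cleanup_refs(struct lru *lru, bool bo_idle)
{
    lru->locked = false;
    return bo_idle ? 0 : -EBUSY;
}

/*
 * Swap-out step. By the assumed convention, a caller treats -EBUSY as
 * "lock still held", so remap it once the lock has been dropped.
 */
static int swapout(struct lru *lru, bool bo_idle)
{
    int ret = cleanup_refs(lru, bo_idle);

    return ret == -EBUSY ? -ENOSPC : ret;
}
```

      With the remapping in place, a caller that sees -ENOSPC knows both that
      nothing was freed and that it no longer holds the lock.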
    • Merge tag 'drm-misc-next-fixes-2021-09-09' of... · b011522c
      Dave Airlie authored
      Merge tag 'drm-misc-next-fixes-2021-09-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
      
      drm-misc-next-fixes for v5.15:
      - Make some dma-buf config options depend on DMA_SHARED_BUFFER.
      - Handle multiplication overflow of fbdev xres/yres in the core.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/37c5fe2e-5be8-45c3-286b-d8d536a5cef2@linux.intel.com
      b011522c
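
      The "multiplication overflow of fbdev xres/yres" item above refers to a
      class of check that can be sketched as follows. This is a minimal
      userspace illustration, not the kernel's implementation: struct mode,
      mul_overflows_u32() and mode_is_sane() are hypothetical names, and only
      the xres/yres fields named in the message are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mode description; field names follow fbdev's xres/yres. */
struct mode { uint32_t xres, yres; };

/* True if a * b does not fit in 32 bits. */
static bool mul_overflows_u32(uint32_t a, uint32_t b)
{
    return b != 0 && a > UINT32_MAX / b;
}

/* Reject modes whose pixel count would overflow a 32-bit multiply. */
static bool mode_is_sane(const struct mode *m)
{
    return !mul_overflows_u32(m->xres, m->yres);
}
```

      Checking with a division before multiplying avoids relying on the
      wrapped result, so an oversized resolution is rejected up front instead
      of producing a small, wrong buffer size.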
  2. 09 Sep, 2021 30 commits
    • Merge tag '5.15-rc-ksmbd-part2' of git://git.samba.org/ksmbd · bf9f243f
      Linus Torvalds authored
      Pull ksmbd fixes from Steve French:
      
       - various fixes pointed out by coverity, and a minor cleanup patch
      
       - id mapping and ownership fixes
      
       - an smbdirect fix
      
      * tag '5.15-rc-ksmbd-part2' of git://git.samba.org/ksmbd:
        ksmbd: fix control flow issues in sid_to_id()
        ksmbd: fix read of uninitialized variable ret in set_file_basic_info
        ksmbd: add missing assignments to ret on ndr_read_int64 read calls
        ksmbd: add validation for ndr read/write functions
        ksmbd: remove unused ksmbd_file_table_flush function
        ksmbd: smbd: fix dma mapping error in smb_direct_post_send_data
        ksmbd: Reduce error log 'speed is unknown' to debug
        ksmbd: defer notify_change() call
        ksmbd: remove setattr preparations in set_file_basic_info()
        ksmbd: ensure error is surfaced in set_file_basic_info()
        ndr: fix translation in ndr_encode_posix_acl()
        ksmbd: fix translation in sid_to_id()
        ksmbd: fix subauth 0 handling in sid_to_id()
        ksmbd: fix translation in acl entries
        ksmbd: fix translation in ksmbd_acls_fattr()
        ksmbd: fix translation in create_posix_rsp_buf()
        ksmbd: fix translation in smb2_populate_readdir_entry()
        ksmbd: fix lookup on idmapped mounts
      bf9f243f
    • Merge tag 'for-5.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 8dde2086
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
      
       - fix max_inline mount option limit on 64k page system
      
       - lockdep fixes:
           - update bdev time in a safer way
           - move bdev put outside of sb write section when removing device
           - fix possible deadlock when mounting seed/sprout filesystem
      
       - zoned mode: fix split extent accounting
      
       - minor include fixup
      
      * tag 'for-5.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: zoned: fix double counting of split ordered extent
        btrfs: fix lockdep warning while mounting sprout fs
        btrfs: delay blkdev_put until after the device remove
        btrfs: update the bdev time directly when closing
        btrfs: use correct header for div_u64 in misc.h
        btrfs: fix upper limit for max_inline for page size 64K
      8dde2086
    • Merge tag 'sound-fix-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · ae79394a
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of small fixes that have been gathered before rc1,
        including a few regression fixes for the problem in the previous pull
        request"
      
      * tag 'sound-fix-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: gus: Fix repeated probe for ISA interwave card
        ALSA: gus: Fix repeated probes of snd_gus_create()
        ALSA: vx222: fix null-ptr-deref
        ASoC: rockchip: i2s: Fix concurrency between tx/rx
        ASoC: mt8195: correct the dts parsing logic about DPTX and HDMITX
        ASoC: Intel: boards: Fix CONFIG_SND_SOC_SDW_MOCKUP select
        ASoC: dt-bindings: fsl_rpmsg: Add compatible string for i.MX8ULP
        ALSA: usb-audio: Add registration quirk for JBL Quantum 800
        ASoC: rt5682: fix headset background noise when S3 state
        ASoC: dt-bindings: mt8195: remove dependent headers in the example
        ASoC: mediatek: SND_SOC_MT8195 should depend on ARCH_MEDIATEK
        ASoC: samsung: s3c24xx_simtec: fix spelling mistake "devicec" -> "device"
        ASoC: audio-graph: respawn Platform Support
        ASoC: mediatek: mt8195: add MTK_PMIC_WRAP dependency
      ae79394a
    • Merge tag 'for-linus-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · d6c338a7
      Linus Torvalds authored
      Pull UML updates from Richard Weinberger:
      
       - Support for VMAP_STACK
      
       - Support for splice_write in hostfs
      
       - Fixes for virt-pci
      
       - Fixes for virtio_uml
      
       - Various fixes
      
      * tag 'for-linus-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
        um: fix stub location calculation
        um: virt-pci: fix uapi documentation
        um: enable VMAP_STACK
        um: virt-pci: don't do DMA from stack
        hostfs: support splice_write
        um: virtio_uml: fix memory leak on init failures
        um: virtio_uml: include linux/virtio-uml.h
        lib/logic_iomem: fix sparse warnings
        um: make PCI emulation driver init/exit static
      d6c338a7
    • Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 35776f10
      Linus Torvalds authored
      Pull ARM development updates from Russell King:
      
       - Rename "mod_init" and "mod_exit" so that initcall debug output is
         actually useful (Randy Dunlap)
      
       - Update maintainers entries for linux-arm-kernel to indicate it is
         moderated for non-subscribers (Randy Dunlap)
      
       - Move install rules to arch/arm/Makefile (Masahiro Yamada)
      
       - Drop unnecessary ARCH_NR_GPIOS definition (Linus Walleij)
      
       - Don't warn about atags_to_fdt() stack size (David Heidelberg)
      
       - Speed up unaligned copy_{from,to}_kernel_nofault (Arnd Bergmann)
      
       - Get rid of set_fs() usage (Arnd Bergmann)
      
       - Remove checks for GCC prior to v4.6 (Geert Uytterhoeven)
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 9118/1: div64: Remove always-true __div64_const32_is_OK() duplicate
        ARM: 9117/1: asm-generic: div64: Remove always-true __div64_const32_is_OK()
        ARM: 9116/1: unified: Remove check for gcc < 4
        ARM: 9110/1: oabi-compat: fix oabi epoll sparse warning
        ARM: 9113/1: uaccess: remove set_fs() implementation
        ARM: 9112/1: uaccess: add __{get,put}_kernel_nofault
        ARM: 9111/1: oabi-compat: rework fcntl64() emulation
        ARM: 9114/1: oabi-compat: rework sys_semtimedop emulation
        ARM: 9108/1: oabi-compat: rework epoll_wait/epoll_pwait emulation
        ARM: 9107/1: syscall: always store thread_info->abi_syscall
        ARM: 9109/1: oabi-compat: add epoll_pwait handler
        ARM: 9106/1: traps: use get_kernel_nofault instead of set_fs()
        ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault
        ARM: 9105/1: atags_to_fdt: don't warn about stack size
        ARM: 9103/1: Drop ARCH_NR_GPIOS definition
        ARM: 9102/1: move theinstall rules to arch/arm/Makefile
        ARM: 9100/1: MAINTAINERS: mark all linux-arm-kernel@infradead list as moderated
        ARM: 9099/1: crypto: rename 'mod_init' & 'mod_exit' functions to be module-specific
      35776f10
    • Merge tag 'trace-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 43175623
      Linus Torvalds authored
      Pull more tracing updates from Steven Rostedt:
      
       - Add migrate-disable counter to tracing header
      
       - Fix error handling in event probes
      
       - Fix missed unlock in osnoise in error path
      
       - Fix merge issue with tools/bootconfig
      
       - Clean up bootconfig data when init memory is removed
      
       - Fix bootconfig to loop only on subkeys
      
       - Have kernel command lines override bootconfig options
      
       - Increase field counts for synthetic events
      
       - Have histograms dynamically allocate event elements to save space
      
       - Fixes in testing and documentation
      
      * tag 'trace-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing/boot: Fix to loop on only subkeys
        selftests/ftrace: Exclude "(fault)" in testing add/remove eprobe events
        tracing: Dynamically allocate the per-elt hist_elt_data array
        tracing: synth events: increase max fields count
        tools/bootconfig: Show whole test command for each test case
        bootconfig: Fix missing return check of xbc_node_compose_key function
        tools/bootconfig: Fix tracing_on option checking in ftrace2bconf.sh
        docs: bootconfig: Add how to use bootconfig for kernel parameters
        init/bootconfig: Reorder init parameter from bootconfig and cmdline
        init: bootconfig: Remove all bootconfig data when the init memory is removed
        tracing/osnoise: Fix missed cpus_read_unlock() in start_per_cpu_kthreads()
        tracing: Fix some alloc_event_probe() error handling bugs
        tracing: Add migrate-disabled counter to tracing output.
      43175623
    • Merge tag 's390-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · f154c806
      Linus Torvalds authored
      Pull more s390 updates from Heiko Carstens:
       "Except for the xpram device driver removal it is all about fixes and
        cleanups.
      
         - Fix topology update on cpu hotplug, so notifiers see expected
           masks. This bug was uncovered with SCHED_CORE support.
      
         - Fix stack unwinding so that the correct number of entries are
           omitted like expected by common code. This fixes KCSAN selftests.
      
         - Add kmemleak annotation to stack_alloc to avoid false positive
           kmemleak warnings.
      
         - Avoid layering violation in common I/O code and don't unregister
           subchannel from child-drivers.
      
         - Remove xpram device driver for which no real use case exists since
           the kernel is 64 bit only. Also all hypervisors got required
           support removed in the meantime, which means the xpram device
           driver is dead code.
      
         - Fix -ENODEV handling of clp_get_state in our PCI code.
      
         - Enable KFENCE in debug defconfig.
      
         - Cleanup hugetlbfs s390 specific Kconfig dependency.
      
         - Quite a lot of trivial fixes to get rid of "W=1" warnings, and
           other simple cleanups"
      
      * tag 's390-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        hugetlbfs: s390 is always 64bit
        s390/ftrace: remove incorrect __va usage
        s390/zcrypt: remove incorrect kernel doc indicators
        scsi: zfcp: fix kernel doc comments
        s390/sclp: add __nonstring annotation
        s390/hmcdrv_ftp: fix kernel doc comment
        s390: remove xpram device driver
        s390/pci: read clp_list_pci_req only once
        s390/pci: fix clp_get_state() handling of -ENODEV
        s390/cio: fix kernel doc comment
        s390/ctrlchar: fix kernel doc comment
        s390/con3270: use proper type for tasklet function
        s390/cpum_cf: move array from header to C file
        s390/mm: fix kernel doc comments
        s390/topology: fix topology information when calling cpu hotplug notifiers
        s390/unwind: use current_frame_address() to unwind current task
        s390/configs: enable CONFIG_KFENCE in debug_defconfig
        s390/entry: make oklabel within CHKSTG macro local
        s390: add kmemleak annotation in stack_alloc()
        s390/cio: dont unregister subchannel from child-drivers
      f154c806
    • Merge branch 'work.gfs2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 7b871c77
      Linus Torvalds authored
      Pull gfs2 setattr updates from Al Viro:
       "Make it possible for filesystems to use a generic 'may_setattr()' and
        switch gfs2 to using it"
      
      * 'work.gfs2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        gfs2: Switch to may_setattr in gfs2_setattr
        fs: Move notify_change permission checks into may_setattr
      7b871c77
    • Merge branch 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · e2e694b9
      Linus Torvalds authored
      Pull root filesystem type handling updates from Al Viro:
       "Teach init/do_mounts.c to handle non-block filesystems, hopefully
        preventing even more special-cased kludges (such as root=/dev/nfs,
        etc)"
      
      * 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: simplify get_filesystem_list / get_all_fs_names
        init: allow mounting arbitrary non-blockdevice filesystems as root
        init: split get_fs_names
      e2e694b9
    • Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 7b7699c0
      Linus Torvalds authored
      Pull iov_iter fixes from Al Viro:
       "Fixes for io-uring handling of iov_iter reexpands"
      
      * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        io_uring: reexpand under-reexpanded iters
        iov_iter: track truncated size
      7b7699c0
    • Merge tag 'cxl-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl · 70868a18
      Linus Torvalds authored
      Pull CXL (Compute Express Link) updates from Dan Williams:
      
       - Fix detection of CXL host bridges to filter out disabled ACPI0016
         devices in the ACPI DSDT.
      
       - Fix kernel lockdown integration to disable raw commands when raw PCI
         access is disabled.
      
       - Fix a broken debug message.
      
       - Add support for "Get Partition Info". I.e. enumerate the split
         between volatile and persistent capacity on bi-modal CXL memory
         expanders.
      
       - Re-factor the core by subject area. This is a work in progress.
      
       - Prepare libnvdimm to understand CXL labels in addition to EFI labels.
         This is a work in progress.
      
      * tag 'cxl-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (25 commits)
        cxl/registers: Fix Documentation warning
        cxl/pmem: Fix Documentation warning
        cxl/uapi: Fix defined but not used warnings
        cxl/pci: Fix debug message in cxl_probe_regs()
        cxl/pci: Fix lockdown level
        cxl/acpi: Do not add DSDT disabled ACPI0016 host bridge ports
        libnvdimm/labels: Add claim class helpers
        libnvdimm/labels: Add type-guid helpers
        libnvdimm/labels: Add blk special cases for nlabel and position helpers
        libnvdimm/labels: Add blk isetcookie set / validation helpers
        libnvdimm/labels: Add a checksum calculation helper
        libnvdimm/labels: Introduce label setter helpers
        libnvdimm/labels: Add isetcookie validation helper
        libnvdimm/labels: Introduce getters for namespace label fields
        cxl/mem: Adjust ram/pmem range to represent DPA ranges
        cxl/mem: Account for partitionable space in ram/pmem ranges
        cxl/pci: Store memory capacity values
        cxl/pci: Simplify register setup
        cxl/pci: Ignore unknown register block types
        cxl/core: Move memdev management to core
        ...
      70868a18
    • Merge tag 'libnvdimm-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 2e5fd489
      Linus Torvalds authored
      Pull libnvdimm updates from Dan Williams:
      
       - Fix a race condition in the teardown path of raw mode pmem
         namespaces.
      
       - Cleanup the code that filesystems use to detect filesystem-dax
         capabilities of their underlying block device.
      
      * tag 'libnvdimm-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        dax: remove bdev_dax_supported
        xfs: factor out a xfs_buftarg_is_dax helper
        dax: stub out dax_supported for !CONFIG_FS_DAX
        dax: remove __generic_fsdax_supported
        dax: move the dax_read_lock() locking into dax_supported
        dax: mark dax_get_by_host static
        dm: use fs_dax_get_by_bdev instead of dax_get_by_host
        dax: stop using bdevname
        fsdax: improve the FS_DAX Kconfig description and help text
        libnvdimm/pmem: Fix crash triggered when I/O in-flight during unbind
      2e5fd489
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 4b105f4a
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "I don't usually send a second PR in the merge window, but the fix to
        mlx5 is significant enough that it should start going through the
        process ASAP. Along with it comes some of the usual -rc stuff that
        would normally wait for a -rc2 or so.
      
        Summary:
      
        Important error case regression fixes in mlx5:
      
         - Wrong size used when computing the error path smaller allocation
           request leads to corruption
      
         - Confusing but ultimately harmless alignment mis-calculation
      
        Static checker warning fixes:
      
         - NULL pointer subtraction in qib
      
         - kcalloc in bnxt_re
      
         - Missing static on global variable in hfi1"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        IB/hfi1: make hist static
        RDMA/bnxt_re: Prefer kcalloc over open coded arithmetic
        IB/qib: Fix null pointer subtraction compiler warning
        RDMA/mlx5: Fix xlt_chunk_align calculation
        RDMA/mlx5: Fix number of allocated XLT entries
      4b105f4a
    • Merge tag 'dmaengine-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · 0aa25160
      Linus Torvalds authored
      Pull dmaengine updates from Vinod Koul:
       "New drivers/devices
         - Support for Renesas RZ/G2L dma controller
         - New driver for AMD PTDMA controller
      
        Updates:
         - Big pile of idxd updates
         - Updates for Altera driver, stm32-dma, dw etc"
      
      * tag 'dmaengine-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (83 commits)
        dmaengine: sh: fix some NULL dereferences
        dmaengine: sh: Fix unused initialization of pointer lmdesc
        MAINTAINERS: Fix AMD PTDMA DRIVER entry
        dmaengine: ptdma: remove PT_OFFSET to avoid redefnition
        dmaengine: ptdma: Add debugfs entries for PTDMA
        dmaengine: ptdma: register PTDMA controller as a DMA resource
        dmaengine: ptdma: Initial driver for the AMD PTDMA
        dmaengine: fsl-dpaa2-qdma: Fix spelling mistake "faile" -> "failed"
        dmaengine: idxd: remove interrupt disable for dev_lock
        dmaengine: idxd: remove interrupt disable for cmd_lock
        dmaengine: idxd: fix setting up priv mode for dwq
        dmaengine: xilinx_dma: Set DMA mask for coherent APIs
        dmaengine: ti: k3-psil-j721e: Add entry for CSI2RX
        dmaengine: sh: Add DMAC driver for RZ/G2L SoC
        dmaengine: Extend the dma_slave_width for 128 bytes
        dt-bindings: dma: Document RZ/G2L bindings
        dmaengine: ioat: depends on !UML
        dmaengine: idxd: set descriptor allocation size to threshold for swq
        dmaengine: idxd: make submit failure path consistent on desc freeing
        dmaengine: idxd: remove interrupt flag for completion list spinlock
        ...
      0aa25160
    • iommu: Clarify default domain Kconfig · 8cc63319
      Robin Murphy authored
      Although strictly it is the AMD and Intel drivers which have an existing
      expectation of lazy behaviour by default, it ends up being rather
      unintuitive to describe this literally in Kconfig. Express it instead as
      an architecture dependency, to clarify that it is a valid config-time
      decision. The end result is the same since virtio-iommu doesn't support
      lazy mode and thus falls back to strict at runtime regardless.
      
      The per-architecture disparity is a matter of historical expectations:
      the AMD and Intel drivers have been lazy by default since 2008, and
      changing that gets noticed by people asking where their I/O throughput
      has gone. Conversely, Arm-based systems with their wider assortment of
      IOMMU drivers mostly only support strict mode anyway; only the Arm SMMU
      drivers have later grown support for passthrough and lazy mode, for
      users who wanted to explicitly trade off isolation for performance.
      These days, reducing the default level of isolation in a way which may
      go unnoticed by users who expect otherwise hardly seems worth risking
      for the sake of one line of Kconfig, so here's where we are.
      
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Link: https://lore.kernel.org/r/69a0c6f17b000b54b8333ee42b3124c1d5a869e2.1631105737.git.robin.murphy@arm.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      8cc63319
    • iommu/vt-d: Fix a deadlock in intel_svm_drain_prq() · 6ef05051
      Fenghua Yu authored
      pasid_mutex and dev->iommu->param->lock are held while the mm unbind
      path flushes the I/O page fault workqueue and waits for all page fault
      work items to finish. But an in-flight page fault work item also needs
      to take the same two locks, which the unbind path is holding while it
      waits for that work to finish. This may cause an ABBA deadlock, as
      shown below:
      
      	idxd 0000:00:0a.0: unbind PASID 2
      	======================================================
      	WARNING: possible circular locking dependency detected
      	5.14.0-rc7+ #549 Not tainted
      	------------------------------------------------------
      	dsa_test/898 is trying to acquire lock:
      	ffff888100d854e8 (&param->lock){+.+.}-{3:3}, at:
      	iopf_queue_flush_dev+0x29/0x60
      	but task is already holding lock:
      	ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
      	intel_svm_unbind+0x34/0x1e0
      	which lock already depends on the new lock.
      
      	the existing dependency chain (in reverse order) is:
      
      	-> #2 (pasid_mutex){+.+.}-{3:3}:
      	       __mutex_lock+0x75/0x730
      	       mutex_lock_nested+0x1b/0x20
      	       intel_svm_page_response+0x8e/0x260
      	       iommu_page_response+0x122/0x200
      	       iopf_handle_group+0x1c2/0x240
      	       process_one_work+0x2a5/0x5a0
      	       worker_thread+0x55/0x400
      	       kthread+0x13b/0x160
      	       ret_from_fork+0x22/0x30
      
      	-> #1 (&param->fault_param->lock){+.+.}-{3:3}:
      	       __mutex_lock+0x75/0x730
      	       mutex_lock_nested+0x1b/0x20
      	       iommu_report_device_fault+0xc2/0x170
      	       prq_event_thread+0x28a/0x580
      	       irq_thread_fn+0x28/0x60
      	       irq_thread+0xcf/0x180
      	       kthread+0x13b/0x160
      	       ret_from_fork+0x22/0x30
      
      	-> #0 (&param->lock){+.+.}-{3:3}:
      	       __lock_acquire+0x1134/0x1d60
      	       lock_acquire+0xc6/0x2e0
      	       __mutex_lock+0x75/0x730
      	       mutex_lock_nested+0x1b/0x20
      	       iopf_queue_flush_dev+0x29/0x60
      	       intel_svm_drain_prq+0x127/0x210
      	       intel_svm_unbind+0xc5/0x1e0
      	       iommu_sva_unbind_device+0x62/0x80
      	       idxd_cdev_release+0x15a/0x200 [idxd]
      	       __fput+0x9c/0x250
      	       ____fput+0xe/0x10
      	       task_work_run+0x64/0xa0
      	       exit_to_user_mode_prepare+0x227/0x230
      	       syscall_exit_to_user_mode+0x2c/0x60
      	       do_syscall_64+0x48/0x90
      	       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      	other info that might help us debug this:
      
      	Chain exists of:
      	  &param->lock --> &param->fault_param->lock --> pasid_mutex
      
      	 Possible unsafe locking scenario:
      
      	       CPU0                    CPU1
      	       ----                    ----
      	  lock(pasid_mutex);
      				       lock(&param->fault_param->lock);
      				       lock(pasid_mutex);
      	  lock(&param->lock);
      
      	 *** DEADLOCK ***
      
      	2 locks held by dsa_test/898:
      	 #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at:
      	 iommu_sva_unbind_device+0x53/0x80
      	 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
      	 intel_svm_unbind+0x34/0x1e0
      
      	stack backtrace:
      	CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549
      	Hardware name: Intel Corporation Kabylake Client platform/KBL S
      	DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016
      	Call Trace:
      	 dump_stack_lvl+0x5b/0x74
      	 dump_stack+0x10/0x12
      	 print_circular_bug.cold+0x13d/0x142
      	 check_noncircular+0xf1/0x110
      	 __lock_acquire+0x1134/0x1d60
      	 lock_acquire+0xc6/0x2e0
      	 ? iopf_queue_flush_dev+0x29/0x60
      	 ? pci_mmcfg_read+0xde/0x240
      	 __mutex_lock+0x75/0x730
      	 ? iopf_queue_flush_dev+0x29/0x60
      	 ? pci_mmcfg_read+0xfd/0x240
      	 ? iopf_queue_flush_dev+0x29/0x60
      	 mutex_lock_nested+0x1b/0x20
      	 iopf_queue_flush_dev+0x29/0x60
      	 intel_svm_drain_prq+0x127/0x210
      	 ? intel_pasid_tear_down_entry+0x22e/0x240
      	 intel_svm_unbind+0xc5/0x1e0
      	 iommu_sva_unbind_device+0x62/0x80
      	 idxd_cdev_release+0x15a/0x200
      
      pasid_mutex protects the PASID and SVM mapping data. It is unnecessary
      to hold pasid_mutex while flushing the workqueue. To fix the deadlock,
      release pasid_mutex while flushing the workqueue so that the pending
      works can be handled.
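
The lock ordering problem described above can be illustrated with a single-threaded toy model. This is only a sketch under stated assumptions, not the driver code: `toy_lock`, `toy_queue`, `toy_flush()` and both `unbind_*()` helpers are hypothetical names, the lock is a plain flag rather than a mutex, and a "stuck" work stands in for the flush blocking forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for pasid_mutex: a flag instead of a real mutex. */
struct toy_lock { bool held; };

static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);	/* unlocking a free lock is a bug */
	l->held = false;
}

/* Toy fault queue: only counts the pending page fault works. */
struct toy_queue { int pending; };

/*
 * Each pending work needs the lock to send its page response.
 * Returns the number of works that could not run; in this model a
 * nonzero result plays the role of the flush never completing.
 */
static int toy_flush(struct toy_queue *q, struct toy_lock *l)
{
	while (q->pending > 0) {
		if (!toy_trylock(l))
			return q->pending;	/* stuck: caller holds the lock */
		q->pending--;			/* work handled */
		toy_unlock(l);
	}
	return 0;
}

/* Problematic ordering: flush while still holding the lock. */
static int unbind_holding_lock(struct toy_queue *q, struct toy_lock *l)
{
	int stuck;

	toy_trylock(l);
	stuck = toy_flush(q, l);
	toy_unlock(l);
	return stuck;
}

/* Ordering after the fix: drop the lock around the flush, then retake it. */
static int unbind_dropping_lock(struct toy_queue *q, struct toy_lock *l)
{
	int stuck;

	toy_trylock(l);
	toy_unlock(l);			/* let the pending works take the lock */
	stuck = toy_flush(q, l);
	toy_trylock(l);			/* remaining teardown would run here */
	toy_unlock(l);
	return stuck;
}
```

With two pending works, `unbind_holding_lock()` leaves both of them stuck, while `unbind_dropping_lock()` drains the queue and returns with the lock released.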
      
      Fixes: d5b9e4bf ("iommu/vt-d: Report prq to io-pgfault framework")
      Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com
      [joro: Removed timing information from kernel log messages]
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      6ef05051
    • Fenghua Yu's avatar
      iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm() · a21518cb
      Fenghua Yu authored
      The mm->pasid will be used in intel_svm_free_pasid() after load_pasid()
      during unbinding mm. Clearing it in load_pasid() prevents the PASID from
      being freed in intel_svm_free_pasid().
      
      Additionally, mm->pasid was already updated before load_pasid() during
      PASID allocation, so there is no need to update it again in load_pasid()
      when binding the mm. Don't update mm->pasid in load_pasid(), to avoid the
      issues in both binding and unbinding the mm.
      
      Fixes: 40483774 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers")
      Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com>
      Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Link: https://lore.kernel.org/r/20210828070622.2437559-2-baolu.lu@linux.intel.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      a21518cb
    • Suravee Suthikulpanit's avatar
      iommu/amd: Remove iommu_init_ga() · eb03f2d2
      Suravee Suthikulpanit authored
      Since the function has been simplified and now only calls
      iommu_init_ga_log(), remove it and call iommu_init_ga_log() directly
      instead.
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Link: https://lore.kernel.org/r/20210820202957.187572-4-suravee.suthikulpanit@amd.com
      Fixes: 8bda0cfb ("iommu/amd: Detect and initialize guest vAPIC log")
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      eb03f2d2
    • Wei Huang's avatar
      iommu/amd: Relocate GAMSup check to early_enable_iommus · c3811a50
      Wei Huang authored
      Currently, iommu_init_ga() checks and disables IOMMU VAPIC support
      (i.e. AMD AVIC support in IOMMU) when GAMSup feature bit is not set.
      However it forgets to clear IRQ_POSTING_CAP from the previously set
      amd_iommu_irq_ops.capability.
      
      This triggers an invalid page fault bug during guest VM warm reboot
      if AVIC is enabled, since irq_remapping_cap(IRQ_POSTING_CAP) is
      incorrectly set, and crashes the system with the following kernel trace.
      
          BUG: unable to handle page fault for address: 0000000000400dd8
          RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc
          Call Trace:
           svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd]
           ? kvm_make_all_cpus_request_except+0x50/0x70 [kvm]
           kvm_request_apicv_update+0x10c/0x150 [kvm]
           svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd]
           svm_enable_irq_window+0x26/0xa0 [kvm_amd]
           vcpu_enter_guest+0xbbe/0x1560 [kvm]
           ? avic_vcpu_load+0xd5/0x120 [kvm_amd]
           ? kvm_arch_vcpu_load+0x76/0x240 [kvm]
           ? svm_get_segment_base+0xa/0x10 [kvm_amd]
           kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm]
           kvm_vcpu_ioctl+0x22a/0x5d0 [kvm]
           __x64_sys_ioctl+0x84/0xc0
           do_syscall_64+0x33/0x40
           entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fix this by moving the initialization of the AMD IOMMU interrupt
      remapping mode (amd_iommu_guest_ir) earlier, before
      amd_iommu_irq_ops.capability is set up with the appropriate
      IRQ_POSTING_CAP flag.
      
      [joro:	Squashed the two patches and limited
      	check_features_on_all_iommus() to CONFIG_IRQ_REMAP
      	to fix a compile warning.]
      Signed-off-by: Wei Huang <wei.huang2@amd.com>
      Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Link: https://lore.kernel.org/r/20210820202957.187572-2-suravee.suthikulpanit@amd.com
      Link: https://lore.kernel.org/r/20210820202957.187572-3-suravee.suthikulpanit@amd.com
      Fixes: 8bda0cfb ("iommu/amd: Detect and initialize guest vAPIC log")
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      c3811a50
    • Dave Airlie's avatar
      Merge tag 'drm-misc-next-fixes-2021-09-03' of... · de04744d
      Dave Airlie authored
      Merge tag 'drm-misc-next-fixes-2021-09-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
      
      drm-misc-next-fixes for v5.15:
      - Fix ttm_bo_move_memcpy() when ttm_resource is subclassed.
      - Small fixes to panfrost, mgag200, vc4.
      - Small ttm compilation fixes.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/41ff5e54-0837-2226-a182-97ffd11ef01e@linux.intel.com
      de04744d
    • Dave Airlie's avatar
      Merge tag 'amd-drm-next-5.15-2021-09-01' of... · 06b224d5
      Dave Airlie authored
      Merge tag 'amd-drm-next-5.15-2021-09-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
      
      amd-drm-next-5.15-2021-09-01:
      
      amdgpu:
      - Misc cleanups, typo fixes
      - EEPROM fix
      - Add some new PCI IDs
      - Scatter/Gather display support for Yellow Carp
      - PCIe DPM fix for RKL platforms
      - RAS fix
      
      amdkfd:
      - SVM fix
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Alex Deucher <alexander.deucher@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210901214015.4488-1-alexander.deucher@amd.com
      06b224d5
    • Linus Torvalds's avatar
      Merge branches 'akpm' and 'akpm-hotfixes' (patches from Andrew) · a3fa7a10
      Linus Torvalds authored
      Merge yet more updates and hotfixes from Andrew Morton:
       "Post-linux-next material, based upon latest upstream to catch the
        now-merged dependencies:
      
         - 10 patches.
      
           Subsystems affected by this patch series: mm (vmstat and migration)
           and compat.
      
        And a bunch of hotfixes, mostly cc:stable:
      
         - 8 patches.
      
           Subsystems affected by this patch series: mm (hmm, hugetlb, vmscan,
           pagealloc, pagemap, kmemleak, mempolicy, and memblock)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        arch: remove compat_alloc_user_space
        compat: remove some compat entry points
        mm: simplify compat numa syscalls
        mm: simplify compat_sys_move_pages
        kexec: avoid compat_alloc_user_space
        kexec: move locking into do_kexec_load
        mm: migrate: change to use bool type for 'page_was_mapped'
        mm: migrate: fix the incorrect function name in comments
        mm: migrate: introduce a local variable to get the number of pages
        mm/vmstat: protect per cpu variables with preempt disable on RT
      
      * emailed hotfixes from Andrew Morton <akpm@linux-foundation.org>:
        nds32/setup: remove unused memblock_region variable in setup_memory()
        mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task
        mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp
        mmap_lock: change trace and locking order
        mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype
        mm,vmscan: fix divide by zero in get_scan_count
        mm/hugetlb: initialize hugetlb_usage in mm_init
        mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled
      a3fa7a10
    • Mike Rapoport's avatar
      nds32/setup: remove unused memblock_region variable in setup_memory() · ddb13122
      Mike Rapoport authored
      kernel test robot reports unused variable warning:
      
         arch/nds32/kernel/setup.c:247:26: warning: Unused variable: region
         [unusedVariable]
          struct memblock_region *region;
                                  ^
      
      Remove the unused variable.
      
      Link: https://lkml.kernel.org/r/20210712125218.28951-1-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ddb13122
    • yanghui's avatar
      mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task · 276aeee1
      yanghui authored
      Servers hit the panic below:
      
        Kernel version:5.4.56
        BUG: unable to handle page fault for address: 0000000000002c48
        RIP: 0010:__next_zones_zonelist+0x1d/0x40
        Call Trace:
          __alloc_pages_nodemask+0x277/0x310
          alloc_page_interleave+0x13/0x70
          handle_mm_fault+0xf99/0x1390
          __do_page_fault+0x288/0x500
          do_page_fault+0x30/0x110
          page_fault+0x3e/0x50
      
      The reason for the panic is that MAX_NUMNODES is passed in the third
      parameter in __alloc_pages_nodemask(preferred_nid).  So access to
      zonelist->zoneref->zone_idx in __next_zones_zonelist will cause a panic.
      
      In offset_il_node(), first_node() returns nid from pol->v.nodes, after
      this other threads may change pol->v.nodes before next_node().  This race
      condition will let next_node return MAX_NUMNODES.  So put pol->nodes in
      a local variable.
      
      The race condition is between offset_il_node and cpuset_change_task_nodemask:
      
        CPU0:                                     CPU1:
        alloc_pages_vma()
          interleave_nid(pol,)
            offset_il_node(pol,)
              first_node(pol->v.nodes)            cpuset_change_task_nodemask
                              //nodes==0xc          mpol_rebind_task
                                                      mpol_rebind_policy
                                                        mpol_rebind_nodemask(pol,nodes)
                              //nodes==0x3
              next_node(nid, pol->v.nodes)//return MAX_NUMNODES
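
The race in the diagram above can be replayed deterministically in a small userspace model. This is a sketch under stated assumptions and not the kernel code: the node mask is a plain `unsigned int`, `TOY_MAX_NODES` plays the role of MAX_NUMNODES, the `toy_*()` and `pick_*()` helpers are hypothetical, and the concurrent rebind is simulated by an ordinary store between the two lookups.

```c
#define TOY_MAX_NODES 4	/* plays the role of MAX_NUMNODES in this model */

/* Next set bit strictly after nid, or TOY_MAX_NODES if there is none. */
static int toy_next_node(int nid, unsigned int mask)
{
	for (int n = nid + 1; n < TOY_MAX_NODES; n++)
		if (mask & (1u << n))
			return n;
	return TOY_MAX_NODES;
}

/* First set bit in mask, or TOY_MAX_NODES if the mask is empty. */
static int toy_first_node(unsigned int mask)
{
	return toy_next_node(-1, mask);
}

/* Racy walk: re-reads the shared mask on every step. */
static int pick_racy(unsigned int *shared, unsigned int target,
		     unsigned int rebind_to)
{
	int nid = toy_first_node(*shared);

	*shared = rebind_to;	/* simulated rebind on the other CPU */
	for (unsigned int i = 0; i < target; i++)
		nid = toy_next_node(nid, *shared);
	return nid;
}

/* Snapshot walk: copies the mask once and only uses the local copy. */
static int pick_snapshot(unsigned int *shared, unsigned int target,
			 unsigned int rebind_to)
{
	unsigned int nodes = *shared;	/* local copy of the node mask */
	int nid = toy_first_node(nodes);

	*shared = rebind_to;	/* the same simulated rebind */
	for (unsigned int i = 0; i < target; i++)
		nid = toy_next_node(nid, nodes);
	return nid;
}
```

Starting from mask 0xc with a rebind to 0x3 and one step to walk, `pick_racy()` runs off the end and returns `TOY_MAX_NODES`, while `pick_snapshot()` stays inside the mask it started with and returns node 3.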
      
      Link: https://lkml.kernel.org/r/20210906034658.48721-1-yanghui.def@bytedance.com
      Signed-off-by: yanghui <yanghui.def@bytedance.com>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      276aeee1
    • Naohiro Aota's avatar
      mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp · 79d37050
      Naohiro Aota authored
      In a memory pressure situation, I'm seeing the lockdep WARNING below.
      Actually, this is similar to a known false positive which is already
      addressed by commit 6dcde60e ("xfs: more lockdep whackamole with
      kmem_alloc*").
      
      This warning still persists because it's not from kmalloc() itself but
      from an allocation for a kmemleak object.  While kmalloc() itself
      suppresses the warning with __GFP_NOLOCKDEP, gfp_kmemleak_mask() drops
      the flag for kmemleak's allocation.
      
      Allow __GFP_NOLOCKDEP to be passed to kmemleak's allocation, so that the
      warning for it is also suppressed.
      
        ======================================================
        WARNING: possible circular locking dependency detected
        5.14.0-rc7-BTRFS-ZNS+ #37 Not tainted
        ------------------------------------------------------
        kswapd0/288 is trying to acquire lock:
        ffff88825ab45df0 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_ilock+0x8a/0x250
      
        but task is already holding lock:
        ffffffff848cc1e0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #1 (fs_reclaim){+.+.}-{0:0}:
               fs_reclaim_acquire+0x112/0x160
               kmem_cache_alloc+0x48/0x400
               create_object.isra.0+0x42/0xb10
               kmemleak_alloc+0x48/0x80
               __kmalloc+0x228/0x440
               kmem_alloc+0xd3/0x2b0
               kmem_alloc_large+0x5a/0x1c0
               xfs_attr_copy_value+0x112/0x190
               xfs_attr_shortform_getvalue+0x1fc/0x300
               xfs_attr_get_ilocked+0x125/0x170
               xfs_attr_get+0x329/0x450
               xfs_get_acl+0x18d/0x430
               get_acl.part.0+0xb6/0x1e0
               posix_acl_xattr_get+0x13a/0x230
               vfs_getxattr+0x21d/0x270
               getxattr+0x126/0x310
               __x64_sys_fgetxattr+0x1a6/0x2a0
               do_syscall_64+0x3b/0x90
               entry_SYSCALL_64_after_hwframe+0x44/0xae
      
        -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
               __lock_acquire+0x2c0f/0x5a00
               lock_acquire+0x1a1/0x4b0
               down_read_nested+0x50/0x90
               xfs_ilock+0x8a/0x250
               xfs_can_free_eofblocks+0x34f/0x570
               xfs_inactive+0x411/0x520
               xfs_fs_destroy_inode+0x2c8/0x710
               destroy_inode+0xc5/0x1a0
               evict+0x444/0x620
               dispose_list+0xfe/0x1c0
               prune_icache_sb+0xdc/0x160
               super_cache_scan+0x31e/0x510
               do_shrink_slab+0x337/0x8e0
               shrink_slab+0x362/0x5c0
               shrink_node+0x7a7/0x1a40
               balance_pgdat+0x64e/0xfe0
               kswapd+0x590/0xa80
               kthread+0x38c/0x460
               ret_from_fork+0x22/0x30
      
        other info that might help us debug this:
         Possible unsafe locking scenario:
               CPU0                    CPU1
               ----                    ----
          lock(fs_reclaim);
                                       lock(&xfs_nondir_ilock_class);
                                       lock(fs_reclaim);
          lock(&xfs_nondir_ilock_class);
      
         *** DEADLOCK ***
        3 locks held by kswapd0/288:
         #0: ffffffff848cc1e0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
         #1: ffffffff848a08d8 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x269/0x5c0
         #2: ffff8881a7a820e8 (&type->s_umount_key#60){++++}-{3:3}, at: super_cache_scan+0x5a/0x510
      
      Link: https://lkml.kernel.org/r/20210907055659.3182992-1-naohiro.aota@wdc.com
      Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: "Darrick J . Wong" <djwong@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      79d37050
    • Liam Howlett's avatar
      mmap_lock: change trace and locking order · 10994316
      Liam Howlett authored
      Print to the trace log before releasing the lock to avoid racing with
      other trace log printers of the same lock type.
      
      Link: https://lkml.kernel.org/r/20210903022041.1843024-1-Liam.Howlett@oracle.com
      Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Michel Lespinasse <walken.cr@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      10994316
    • Miaohe Lin's avatar
      mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype · 053cfda1
      Miaohe Lin authored
      If the unref page was not successfully prepared for freeing, the pcp page
      migratetype is unset.  Thus we will get rubbish from
      get_pcppage_migratetype() and might list_del(&page->lru) again after it's
      already deleted from the list, leading to complaints about data corruption.
      
      Link: https://lkml.kernel.org/r/20210902115447.57050-1-linmiaohe@huawei.com
      Fixes: df1acc85 ("mm/page_alloc: avoid conflating IRQs disabled with zone->lock")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      053cfda1
    • Rik van Riel's avatar
      mm,vmscan: fix divide by zero in get_scan_count · 32d4f4b7
      Rik van Riel authored
      Commit f56ce412 ("mm: memcontrol: fix occasional OOMs due to
      proportional memory.low reclaim") introduced a divide by zero corner
      case when oomd is being used in combination with cgroup memory.low
      protection.
      
      When oomd decides to kill a cgroup, it will force the cgroup memory to
      be reclaimed after killing the tasks, by writing to the memory.max file
      for that cgroup, forcing the remaining page cache and reclaimable slab
      to be reclaimed down to zero.
      
      Previously, on cgroups with some memory.low protection that would result
      in the memory being reclaimed down to the memory.low limit, or likely
      not at all, having the page cache reclaimed asynchronously later.
      
      With f56ce412 the oomd write to memory.max tries to reclaim all the
      way down to zero, which may race with another reclaimer, to the point of
      ending up with the divide by zero below.
      
      This patch implements the obvious fix.
      
      Link: https://lkml.kernel.org/r/20210826220149.058089c6@imladris.surriel.com
      Fixes: f56ce412 ("mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim")
      Signed-off-by: Rik van Riel <riel@surriel.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Chris Down <chris@chrisdown.name>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      32d4f4b7
    • Liu Zixian's avatar
      mm/hugetlb: initialize hugetlb_usage in mm_init · 13db8c50
      Liu Zixian authored
      After fork, the child process will get incorrect (2x) hugetlb_usage.  If
      a process uses 5 2MB hugetlb pages in an anonymous mapping,
      
      	HugetlbPages:	   10240 kB
      
      and then forks, the child will show,
      
      	HugetlbPages:	   20480 kB
      
      The amount is doubled because hugetlb_usage is copied from the parent
      and then increased again when the page tables are copied from parent
      to child.  The child ends up with 2x its actual usage.
      
      Fix this by adding hugetlb_count_init in mm_init.
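
The double counting described above can be reproduced with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: `toy_mm`, `toy_fork()` and `toy_usage_kb()` are hypothetical, the counter is kept in units of 2MB pages, and the `reset_counter` argument stands in for the effect of calling hugetlb_count_init() on the child.

```c
/* Toy stand-in for the parts of mm_struct that matter here. */
struct toy_mm {
	long mapped_huge_pages;	/* what the page tables actually map */
	long hugetlb_usage;	/* the counter reported as HugetlbPages */
};

/* Models fork: copy the whole struct, then copy the page tables. */
static struct toy_mm toy_fork(const struct toy_mm *parent, int reset_counter)
{
	struct toy_mm child = *parent;	/* counter is copied along with the rest */

	if (reset_counter)
		child.hugetlb_usage = 0;	/* role of hugetlb_count_init() */

	/* Copying page tables accounts every mapped huge page again. */
	child.hugetlb_usage += parent->mapped_huge_pages;
	return child;
}

/* Usage in kB, assuming 2MB (2048 kB) huge pages as in the example above. */
static long toy_usage_kb(const struct toy_mm *mm)
{
	return mm->hugetlb_usage * 2048;
}
```

For a parent with 5 mapped pages, `toy_fork(&parent, 0)` yields a child reporting 20480 kB, while `toy_fork(&parent, 1)` yields the expected 10240 kB.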
      
      Link: https://lkml.kernel.org/r/20210826071742.877-1-liuzixian4@huawei.com
      Fixes: 5d317b2b ("mm: hugetlb: proc: add HugetlbPages field to /proc/PID/status")
      Signed-off-by: Liu Zixian <liuzixian4@huawei.com>
      Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      13db8c50
    • Li Zhijian's avatar
      mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled · 4b42fb21
      Li Zhijian authored
      Previously, we noticed that one rpma example failed [1] since commit
      36f30e48 ("IB/core: Improve ODP to use hmm_range_fault()"), where it
      uses the ODP feature to do RDMA WRITE between fsdax files.

      After digging into the code, we found that hmm_vma_handle_pte() will
      still return EFAULT even though all of its requested flags have been
      fulfilled.  That's because a DAX page will be marked as (_PAGE_SPECIAL |
      _PAGE_DEVMAP) by pte_mkdevmap().
      
      Link: https://github.com/pmem/rpma/issues/1142 [1]
      Link: https://lkml.kernel.org/r/20210830094232.203029-1-lizhijian@cn.fujitsu.com
      Fixes: 40550627 ("mm/hmm: add missing call to hmm_pte_need_fault in HMM_PFN_SPECIAL handling")
      Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4b42fb21
  3. 08 Sep, 2021 4 commits
    • Linus Torvalds's avatar
      Merge tag 'tag-chrome-platform-for-v5.15' of... · 730bf31b
      Linus Torvalds authored
      Merge tag 'tag-chrome-platform-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
      
      Pull chrome platform updates from Benson Leung:
       "cros_ec_typec:
      
         - make the cros_ec_typec driver use the pre-existing
           cros_ec_check_features() function
      
        sensorhub:
      
         - add trace events for sample
      
        misc:
      
         - cros_ec_proto - re-send commands in the event of a timeout (for the
           FPMCU)
      
         - fix warnings in cros_ec_trace related to format output"
      
      * tag 'tag-chrome-platform-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
        platform/chrome: cros_ec_trace: Fix format warnings
        platform/chrome: cros_ec_typec: Use existing feature check
        platform/chrome: cros_ec_proto: Send command again when timeout occurs
        platform/chrome: sensorhub: Add trace events for sample
      730bf31b
    • Linus Torvalds's avatar
      Merge tag 'pm-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 30f34909
      Linus Torvalds authored
      Pull more power management updates from Rafael Wysocki:
       "These are mostly ARM cpufreq driver updates, including one new
        MediaTek driver that has just passed all of the reviews, with the
        addition of a revert of a recent intel_pstate commit, some core
        cpufreq changes and a DT-related update of the operating performance
        points (OPP) support code.
      
        Specifics:
      
         - Add new cpufreq driver for the MediaTek MT6779 platform called
           mediatek-hw along with corresponding DT bindings (Hector.Yuan).
      
         - Add DCVS interrupt support to the qcom-cpufreq-hw driver (Thara
           Gopinath).
      
         - Make the qcom-cpufreq-hw driver set the dvfs_possible_from_any_cpu
           policy flag (Taniya Das).
      
         - Blocklist more Qualcomm platforms in cpufreq-dt-platdev (Bjorn
           Andersson).
      
         - Make the vexpress cpufreq driver set the CPUFREQ_IS_COOLING_DEV
           flag (Viresh Kumar).
      
         - Add new cpufreq driver callback to allow drivers to register with
           the Energy Model in a consistent way and make several drivers use
           it (Viresh Kumar).
      
         - Change the remaining users of the .ready() cpufreq driver callback
           to move the code from it elsewhere and drop it from the cpufreq
           core (Viresh Kumar).
      
         - Revert recent intel_pstate change adding HWP guaranteed performance
           change notification support to it that led to problems, because the
           notification in question is triggered prematurely on some systems
           (Rafael Wysocki).
      
         - Convert the OPP DT bindings to DT schema and clean them up while at
           it (Rob Herring)"
      
      * tag 'pm-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits)
        Revert "cpufreq: intel_pstate: Process HWP Guaranteed change notification"
        cpufreq: mediatek-hw: Add support for CPUFREQ HW
        cpufreq: Add of_perf_domain_get_sharing_cpumask
        dt-bindings: cpufreq: add bindings for MediaTek cpufreq HW
        cpufreq: Remove ready() callback
        cpufreq: sh: Remove sh_cpufreq_cpu_ready()
        cpufreq: acpi: Remove acpi_cpufreq_cpu_ready()
        cpufreq: qcom-hw: Set dvfs_possible_from_any_cpu cpufreq driver flag
        cpufreq: blocklist more Qualcomm platforms in cpufreq-dt-platdev
        cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support
        cpufreq: scmi: Use .register_em() to register with energy model
        cpufreq: vexpress: Use .register_em() to register with energy model
        cpufreq: scpi: Use .register_em() to register with energy model
        dt-bindings: opp: Convert to DT schema
        dt-bindings: Clean-up OPP binding node names in examples
        ARM: dts: omap: Drop references to opp.txt
        cpufreq: qcom-cpufreq-hw: Use .register_em() to register with energy model
        cpufreq: omap: Use .register_em() to register with energy model
        cpufreq: mediatek: Use .register_em() to register with energy model
        cpufreq: imx6q: Use .register_em() to register with energy model
        ...
      30f34909
    • Linus Torvalds's avatar
      Merge tag 'acpi-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 9c566611
      Linus Torvalds authored
      Pull more ACPI updates from Rafael Wysocki:
       "These add ACPI support to the PCI VMD driver, improve suspend-to-idle
        support for AMD platforms and update documentation.
      
        Specifics:
      
         - Add ACPI support to the PCI VMD driver (Rafael Wysocki)
      
         - Rearrange suspend-to-idle support code to reflect the platform
           firmware expectations on some AMD platforms (Mario Limonciello)
      
         - Make SSDT overlays documentation follow the code documented by it
           more closely (Andy Shevchenko)"
      
      * tag 'acpi-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: PM: s2idle: Run both AMD and Microsoft methods if both are supported
        Documentation: ACPI: Align the SSDT overlays file with the code
        PCI: VMD: ACPI: Make ACPI companion lookup work for VMD bus
      9c566611
    • Linus Torvalds's avatar
      Merge tag 'docs-5.15-2' of git://git.lwn.net/linux · 0f4b9289
      Linus Torvalds authored
      Pull more documentation updates from Jonathan Corbet:
       "Another collection of documentation patches, mostly fixes but also
        includes another set of traditional Chinese translations"
      
      * tag 'docs-5.15-2' of git://git.lwn.net/linux:
        docs: pdfdocs: Fix typo in CJK-language specific font settings
        docs: kernel-hacking: Remove inappropriate text
        docs/zh_TW: add translations for zh_TW/filesystems
        docs/zh_TW: add translations for zh_TW/cpu-freq
        docs/zh_TW: add translations for zh_TW/arm64
        docs/zh_CN: Modify the translator tag and fix the wrong word
        Documentation/features/vm: correct huge-vmap APIs
        Documentation: block: blk-mq: Fix small typo in multi-queue docs
        Documentation: in_irq() cleanup
        Documentation: arm: marvell: Add 88F6825 model into list
        Documentation/process/maintainer-pgp-guide: Replace broken link to PGP path finder
        Documentation: locking: fix references
        Documentation: Update details of The Linux Kernel Module Programming Guide
        docs: x86: Remove obsolete information about x86_64 vmalloc() faulting
        Documentation/process/applying-patches: Activate linux-next man hyperlink
      0f4b9289