1. 27 Nov, 2022 4 commits
    • Merge tag 'perf_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5afcab22
      Linus Torvalds authored
      Pull perf fixes from Borislav Petkov:
       "Two more fixes to the perf sigtrap handling:
      
         - output the address in the sample only when it has been requested
      
         - handle the case where user-only events can hit in kernel and thus
           upset the sigtrap sanity checking"
      
      * tag 'perf_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf: Consider OS filter fail
        perf: Fixup SIGTRAP and sample_flags interaction
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · bf82d38c
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "x86:
      
         - Fixes for Xen emulation. While nobody should be enabling it in the
           kernel (the only public users of the feature are the selftests),
           the bug effectively allows userspace to read arbitrary memory.
      
         - Correctness fixes for nested hypervisors that do not intercept INIT
           or SHUTDOWN on AMD; the subsequent CPU reset can cause a
           use-after-free when it disables virtualization extensions. While
           downgrading the panic to a WARN is quite easy, the full fix is a
           bit more laborious; there are also tests. This is the bulk of the
           pull request.
      
         - Fix race condition due to incorrect mmu_lock use around
           make_mmu_pages_available().
      
        Generic:
      
         - Obey changes to the kvm.halt_poll_ns module parameter in VMs not
           using KVM_CAP_HALT_POLL, restoring behavior from before the
           introduction of the capability"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: Update gfn_to_pfn_cache khva when it moves within the same page
        KVM: x86/xen: Only do in-kernel acceleration of hypercalls for guest CPL0
        KVM: x86/xen: Validate port number in SCHEDOP_poll
        KVM: x86/mmu: Fix race condition in direct_page_fault
        KVM: x86: remove exit_int_info warning in svm_handle_exit
        KVM: selftests: add svm part to triple_fault_test
        KVM: x86: allow L1 to not intercept triple fault
        kvm: selftests: add svm nested shutdown test
        KVM: selftests: move idt_entry to header
        KVM: x86: forcibly leave nested mode on vCPU reset
        KVM: x86: add kvm_leave_nested
        KVM: x86: nSVM: harden svm_free_nested against freeing vmcb02 while still in use
        KVM: x86: nSVM: leave nested mode on vCPU free
        KVM: Obey kvm.halt_poll_ns in VMs not using KVM_CAP_HALT_POLL
        KVM: Avoid re-reading kvm->max_halt_poll_ns during halt-polling
        KVM: Cap vcpu->halt_poll_ns before halting rather than after
    • Merge tag '6.1-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 30a853c1
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Two small cifs/smb3 client fixes:
      
         - an unlock missing in an error path in copychunk_range found by
           xfstest 476
      
         - a fix for a use after free in a debug code path"
      
      * tag '6.1-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: fix missing unlock in cifs_file_copychunk_range()
        cifs: Use after free in debug code
    • Merge tag 'kbuild-fixes-v6.1-4' of... · faf68e35
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix CC_HAS_ASM_GOTO_TIED_OUTPUT test in Kconfig
      
       - Fix noisy "No such file or directory" message when
         KBUILD_BUILD_VERSION is passed
      
       - Include rust/ in source tarballs
      
       - Fix missing FORCE for ARCH=nios2 builds
      
      * tag 'kbuild-fixes-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        nios2: add FORCE for vmlinuz.gz
        scripts: add rust in scripts/Makefile.package
        kbuild: fix "cat: .version: No such file or directory"
        init/Kconfig: fix CC_HAS_ASM_GOTO_TIED_OUTPUT test with dash
  2. 26 Nov, 2022 6 commits
  3. 25 Nov, 2022 16 commits
    • Merge tag 'regulator-fix-v6.1-rc6' of... · f10b4396
      Linus Torvalds authored
      Merge tag 'regulator-fix-v6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
      
      Pull regulator fixes from Mark Brown:
       "This is more changes than I'd like this late although the diffstat is
        still fairly small, I kept on holding off as new fixes came in to give
        things time to soak in -next but should probably have tagged and sent
        an additional pull request earlier.
      
        There's some relatively large fixes to the twl6030 driver to fix
        issues with the TWL6032 variant which resulted from some work on the
        core TWL6030 driver, a couple of fixes for error handling paths
        (mostly in the core), and a nice stability fix for the slg51000 driver
        that's been pulled out of a BSP"
      
      * tag 'regulator-fix-v6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: twl6030: fix get status of twl6032 regulators
        regulator: twl6030: re-add TWL6032_SUBCLASS
        regulator: slg51000: Wait after asserting CS pin
        regulator: core: fix UAF in destroy_regulator()
        regulator: rt5759: fix OOB in validate_desc()
        regulator: core: fix kobject release warning and memory leak in regulator_register()
    • Merge tag 'for-6.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 3eaea0db
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
      
       - fix a regression in nowait + buffered write
      
       - in zoned mode fix endianness when comparing super block generation
      
       - locking and lockdep fixes:
           - fix potential sleeping under spinlock when setting qgroup limit
           - lockdep warning fixes when btrfs_path is freed after copy_to_user
           - do not modify log tree while holding a leaf from fs tree locked
      
       - fix freeing of sysfs files of static features on error
      
       - use kvcalloc for zone map allocation as a fallback to avoid warnings
         due to high order allocation
      
       - send, avoid unaligned encoded writes when attempting to clone range
      
      * tag 'for-6.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: sysfs: normalize the error handling branch in btrfs_init_sysfs()
        btrfs: do not modify log tree while holding a leaf from fs tree locked
        btrfs: use kvcalloc in btrfs_get_dev_zone_info
        btrfs: qgroup: fix sleep from invalid context bug in btrfs_qgroup_inherit()
        btrfs: send: avoid unaligned encoded writes when attempting to clone range
        btrfs: zoned: fix missing endianness conversion in sb_write_pointer
        btrfs: free btrfs_path before copying subvol info to userspace
        btrfs: free btrfs_path before copying fspath to userspace
        btrfs: free btrfs_path before copying inodes to userspace
        btrfs: free btrfs_path before copying root refs to userspace
        btrfs: fix assertion failure and blocking during nowait buffered write
    • Merge tag 'pm-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 88817acb
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These revert a recent change in the schedutil cpufreq governor that
        had not been expected to make any functional difference, but turned
        out to introduce a performance regression, fix an initialization issue
        in the amd-pstate driver and make it actually replace the venerable
        ACPI cpufreq driver on the supported systems by default.
      
        Specifics:
      
         - Revert a recent schedutil cpufreq governor change that introduced a
           performance regression on Pixel 6 (Sam Wu)
      
         - Fix amd-pstate driver initialization after running the kernel via
           kexec (Wyes Karny)
      
         - Turn amd-pstate into a built-in driver which allows it to take
           precedence over acpi-cpufreq by default on supported systems and
           amend it with a mechanism to disable this behavior (Perry Yuan)
      
         - Update amd-pstate documentation in accordance with the other
           changes made to it (Perry Yuan)"
      
      * tag 'pm-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Documentation: add amd-pstate kernel command line options
        Documentation: amd-pstate: add driver working mode introduction
        cpufreq: amd-pstate: add amd-pstate driver parameter for mode selection
        cpufreq: amd-pstate: change amd-pstate driver to be built-in type
        cpufreq: amd-pstate: cpufreq: amd-pstate: reset MSR_AMD_PERF_CTL register at init
        Revert "cpufreq: schedutil: Move max CPU capacity to sugov_policy"
    • Merge tag 's390-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · e3ebac80
      Linus Torvalds authored
      Pull s390 updates from Alexander Gordeev:
      
       - Fix the size of the TOD programmable field in the crash dump save
         area, which was incorrectly increased from four to eight bytes. As
         a result, in the kdump case the NT_S390_TODPREG ELF notes section
         contains the correct value and the "detected read beyond size of
         field" compiler warning goes away.
      
       - Fix memory leak in cryptographic Adjunct Processors (AP) module on
         initialization failure path.
      
       - Add Gerald Schaefer <gerald.schaefer@linux.ibm.com> and Alexander
         Gordeev <agordeev@linux.ibm.com> as S390 memory management
         maintainers. Also rename the S390 section to S390 ARCHITECTURE to be
         a bit more precise.
      
      * tag 's390-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        MAINTAINERS: add S390 MM section
        s390/crashdump: fix TOD programmable field size
        s390/ap: fix memory leak in ap_init_qci_info()
    • Merge tag 'hyperv-fixes-signed-20221125' of... · 081f359e
      Linus Torvalds authored
      Merge tag 'hyperv-fixes-signed-20221125' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
      
      Pull hyperv fixes from Wei Liu:
      
       - Fix IRTE allocation in Hyper-V PCI controller (Dexuan Cui)
      
       - Fix handling of SCSI srb_status and capacity change events (Michael
         Kelley)
      
       - Restore VP assist page after CPU offlining and onlining (Vitaly
         Kuznetsov)
      
       - Fix some memory leak issues in VMBus (Yang Yingliang)
      
      * tag 'hyperv-fixes-signed-20221125' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
        Drivers: hv: vmbus: fix possible memory leak in vmbus_device_register()
        Drivers: hv: vmbus: fix double free in the error path of vmbus_add_channel_work()
        PCI: hv: Only reuse existing IRTE allocation for Multi-MSI
        scsi: storvsc: Fix handling of srb_status and capacity change events
        x86/hyperv: Restore VP assist page after cpu offlining/onlining
    • Merge tag 'mm-hotfixes-stable-2022-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 0b1dcc2c
      Linus Torvalds authored
      Pull hotfixes from Andrew Morton:
       "24 MM and non-MM hotfixes. 8 marked cc:stable and 16 for post-6.0
        issues.
      
        There have been a lot of hotfixes this cycle, and this is quite a
        large batch given how far we are into the -rc cycle. Presumably a
        reflection of the unusually large amount of MM material which went
        into 6.1-rc1"
      
      * tag 'mm-hotfixes-stable-2022-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (24 commits)
        test_kprobes: fix implicit declaration error of test_kprobes
        nilfs2: fix nilfs_sufile_mark_dirty() not set segment usage as dirty
        mm/cgroup/reclaim: fix dirty pages throttling on cgroup v1
        mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
        swapfile: fix soft lockup in scan_swap_map_slots
        hugetlb: fix __prep_compound_gigantic_page page flag setting
        kfence: fix stack trace pruning
        proc/meminfo: fix spacing in SecPageTables
        mm: multi-gen LRU: retry folios written back while isolated
        mailmap: update email address for Satya Priya
        mm/migrate_device: return number of migrating pages in args->cpages
        kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible
        MAINTAINERS: update Alex Hung's email address
        mailmap: update Alex Hung's email address
        mm: mmap: fix documentation for vma_mas_szero
        mm/damon/sysfs-schemes: skip stats update if the scheme directory is removed
        mm/memory: return vm_fault_t result from migrate_to_ram() callback
        mm: correctly charge compressed memory to its memcg
        ipc/shm: call underlying open/close vm_ops
        gcov: clang: fix the buffer overflow issue
        ...
    • Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · b3085709
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "A couple of fixes, one of them for this cycle regression..."
      
      * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        vfs: vfs_tmpfile: ensure O_EXCL flag is enforced
        fs: use acquire ordering in __fget_light()
    • io_uring: clear TIF_NOTIFY_SIGNAL if set and task_work not available · 7cfe7a09
      Jens Axboe authored
      With how task_work is added and signaled, we can have TIF_NOTIFY_SIGNAL
      set and no task_work pending as it got run in a previous loop. Treat
      TIF_NOTIFY_SIGNAL like get_signal(), always clear it if set regardless
      of whether or not task_work is pending to run.
      
      Cc: stable@vger.kernel.org
      Fixes: 46a525e1 ("io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge tag 'sound-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · ca66e580
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A few more last-minute fixes for 6.1 that have been gathered in the
        last week; nothing looks too worrisome, mostly device-specific small
        fixes, including the ABI fix for ASoC SOF"
      
      * tag 'sound-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ASoC: soc-pcm: Add NULL check in BE reparenting
        ALSA: seq: Fix function prototype mismatch in snd_seq_expand_var_event
        ASoC: SOF: dai: move AMD_HS to end of list to restore backwards-compatibility
        ASoC: max98373: Add checks for devm_kcalloc
        ASoC: rt711-sdca: fix the latency time of clock stop prepare state machine transitions
        ASoC: soc-pcm: Don't zero TDM masks in __soc_pcm_open()
        ASoC: sgtl5000: Reset the CHIP_CLK_CTRL reg on remove
        ASoC: hdac_hda: fix hda pcm buffer overflow issue
        ASoC: stm32: i2s: remove irqf_oneshot flag
        ASoC: wm8962: Wait for updated value of WM8962_CLOCKING1 register
    • Merge tag 'drm-fixes-2022-11-25' of git://anongit.freedesktop.org/drm/drm · 6fe0e074
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Weekly fixes, amdgpu has not quite settled down.
      
        Most of the changes are small, and the non-amdgpu ones are all fine.
        There are a bunch of DP MST DSC fixes that fix some issues introduced
        in a previous larger MST rework.
      
        The biggest one is mainly propagating some error values properly
        instead of bool returns, and I think it just looks large but doesn't
        really change anything too much, except propagating errors that are
        required to avoid deadlocks. I've gone over it and a few others and
        they've had some decent testing over the last few weeks.
      
        Summary:
      
        amdgpu:
         - amdgpu gang submit fix
         - DCN 3.1.4 fixes
         - DP MST DSC deadlock fixes
         - HMM userptr fixes
         - Fix Aldebaran CU occupancy reporting
         - GFX11 fixes
         - PSP suspend/resume fix
         - DCE12 KASAN fix
         - DCN 3.2.x fixes
         - Rotated cursor fix
         - SMU 13.x fix
         - DELL platform suspend/resume fixes
         - VCN4 SR-IOV fix
         - Display regression fix for polled connectors
      
        i915:
         - Fix GVT KVM reference count handling
         - Never purge busy TTM objects
         - Fix warn in intel_display_power_*_domain() functions
      
        dma-buf:
         - Use dma_fence_unwrap_for_each when importing sync files
         - Fix race in dma_heap_add()
      
        fbcon:
         - Fix use of uninitialized memory in logo"
      
      * tag 'drm-fixes-2022-11-25' of git://anongit.freedesktop.org/drm/drm: (30 commits)
        drm/amdgpu/vcn: re-use original vcn0 doorbell value
        drm/amdgpu: Partially revert "drm/amdgpu: update drm_display_info correctly when the edid is read"
        drm/amd/display: No display after resume from WB/CB
        drm/amdgpu: fix use-after-free during gpu recovery
        drm/amd/pm: update driver if header for smu_13_0_7
        drm/amd/display: Fix rotated cursor offset calculation
        drm/amd/display: Use new num clk levels struct for max mclk index
        drm/amd/display: Avoid setting pixel rate divider to N/A
        drm/amd/display: Use viewport height for subvp mall allocation size
        drm/amd/display: Update soc bounding box for dcn32/dcn321
        drm/amd/dc/dce120: Fix audio register mapping, stop triggering KASAN
        drm/amdgpu/psp: don't free PSP buffers on suspend
        fbcon: Use kzalloc() in fbcon_prepare_logo()
        dma-buf: fix racing conflict of dma_heap_add()
        drm/amd/amdgpu: reserve vm invalidation engine for firmware
        drm/amdgpu: Enable Aldebaran devices to report CU Occupancy
        drm/amdgpu: fix userptr HMM range handling v2
        drm/amdgpu: always register an MMU notifier for userptr
        drm/amdgpu/dm/mst: Fix uninitialized var in pre_compute_mst_dsc_configs_for_state()
        drm/amdgpu/dm/dp_mst: Don't grab mst_mgr->lock when computing DSC state
        ...
    • io_uring/poll: fix poll_refs race with cancelation · 12ad3d2d
      Lin Ma authored
      There is an interesting race condition of poll_refs which could result
      in a NULL pointer dereference. The crash trace is like:
      
      KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
      CPU: 0 PID: 30781 Comm: syz-executor.2 Not tainted 6.0.0-g493ffd66 #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      1.13.0-1ubuntu1.1 04/01/2014
      RIP: 0010:io_poll_remove_entry io_uring/poll.c:154 [inline]
      RIP: 0010:io_poll_remove_entries+0x171/0x5b4 io_uring/poll.c:190
      Code: ...
      RSP: 0018:ffff88810dfefba0 EFLAGS: 00010202
      RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000040000
      RDX: ffffc900030c4000 RSI: 000000000003ffff RDI: 0000000000040000
      RBP: 0000000000000008 R08: ffffffff9764d3dd R09: fffffbfff3836781
      R10: fffffbfff3836781 R11: 0000000000000000 R12: 1ffff11003422d60
      R13: ffff88801a116b04 R14: ffff88801a116ac0 R15: dffffc0000000000
      FS:  00007f9c07497700(0000) GS:ffff88811a600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffb5c00ea98 CR3: 0000000105680005 CR4: 0000000000770ef0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       <TASK>
       io_apoll_task_func+0x3f/0xa0 io_uring/poll.c:299
       handle_tw_list io_uring/io_uring.c:1037 [inline]
       tctx_task_work+0x37e/0x4f0 io_uring/io_uring.c:1090
       task_work_run+0x13a/0x1b0 kernel/task_work.c:177
       get_signal+0x2402/0x25a0 kernel/signal.c:2635
       arch_do_signal_or_restart+0x3b/0x660 arch/x86/kernel/signal.c:869
       exit_to_user_mode_loop kernel/entry/common.c:166 [inline]
       exit_to_user_mode_prepare+0xc2/0x160 kernel/entry/common.c:201
       __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
       syscall_exit_to_user_mode+0x58/0x160 kernel/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      The root cause is a small oversight in io_poll_check_events() when it
      runs concurrently with the poll cancel routine io_poll_cancel_req().
      
      The interleaving to trigger use-after-free:
      
      CPU0                                       |  CPU1
                                                 |
      io_apoll_task_func()                       |  io_poll_cancel_req()
       io_poll_check_events()                    |
        // do while first loop                   |
        v = atomic_read(...)                     |
        // v = poll_refs = 1                     |
        ...                                      |  io_poll_mark_cancelled()
                                                 |   atomic_or()
                                                 |   // poll_refs = IO_POLL_CANCEL_FLAG | 1
                                                 |
        atomic_sub_return(...)                   |
        // poll_refs = IO_POLL_CANCEL_FLAG       |
        // loop continue                         |
                                                 |
                                                 |  io_poll_execute()
                                                 |   io_poll_get_ownership()
                                                 |   // poll_refs = IO_POLL_CANCEL_FLAG | 1
                                                 |   // gets the ownership
        v = atomic_read(...)                     |
        // poll_refs not change                  |
                                                 |
        if (v & IO_POLL_CANCEL_FLAG)             |
         return -ECANCELED;                      |
        // io_poll_check_events return           |
        // will go into                          |
        // io_req_complete_failed() free req     |
                                                 |
                                                 |  io_apoll_task_func()
                                                 |  // also go into io_req_complete_failed()
      
      And the interleaving to trigger the kernel WARNING:
      
      CPU0                                       |  CPU1
                                                 |
      io_apoll_task_func()                       |  io_poll_cancel_req()
       io_poll_check_events()                    |
        // do while first loop                   |
        v = atomic_read(...)                     |
        // v = poll_refs = 1                     |
        ...                                      |  io_poll_mark_cancelled()
                                                 |   atomic_or()
                                                 |   // poll_refs = IO_POLL_CANCEL_FLAG | 1
                                                 |
        atomic_sub_return(...)                   |
        // poll_refs = IO_POLL_CANCEL_FLAG       |
        // loop continue                         |
                                                 |
        v = atomic_read(...)                     |
        // v = IO_POLL_CANCEL_FLAG               |
                                                 |  io_poll_execute()
                                                 |   io_poll_get_ownership()
                                                 |   // poll_refs = IO_POLL_CANCEL_FLAG | 1
                                                 |   // gets the ownership
                                                 |
        WARN_ON_ONCE(!(v & IO_POLL_REF_MASK)))   |
        // v & IO_POLL_REF_MASK = 0 WARN         |
                                                 |
                                                 |  io_apoll_task_func()
                                                 |  // also go into io_req_complete_failed()
      
      From reading the source code and discussing it with Pavel, the atomic
      poll_refs scheme keeps io_poll_check_events() looping only to prevent
      another context from grabbing ownership. Therefore, this patch adds
      another AND operation so that the loop stops if poll_refs is exactly
      equal to IO_POLL_CANCEL_FLAG. Since io_poll_cancel_req() grabs ownership
      and will eventually reach io_req_complete_failed(), the req will be
      reclaimed as expected.
      
      Fixes: aa43477b ("io_uring: poll rework")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
      [axboe: tweak description and code style]
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring/filetable: fix file reference underflow · 9d94c04c
      Lin Ma authored
      There is a reference counting bug when -ENOMEM occurs in a call to
      io_install_fixed_file(). The KASAN report looks like this:
      
      [   14.057131] ==================================================================
      [   14.059161] BUG: KASAN: use-after-free in unix_get_socket+0x10/0x90
      [   14.060975] Read of size 8 at addr ffff88800b09cf20 by task kworker/u8:2/45
      [   14.062684]
      [   14.062768] CPU: 2 PID: 45 Comm: kworker/u8:2 Not tainted 6.1.0-rc4 #1
      [   14.063099] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      [   14.063666] Workqueue: events_unbound io_ring_exit_work
      [   14.063936] Call Trace:
      [   14.064065]  <TASK>
      [   14.064175]  dump_stack_lvl+0x34/0x48
      [   14.064360]  print_report+0x172/0x475
      [   14.064547]  ? _raw_spin_lock_irq+0x83/0xe0
      [   14.064758]  ? __virt_addr_valid+0xef/0x170
      [   14.064975]  ? unix_get_socket+0x10/0x90
      [   14.065167]  kasan_report+0xad/0x130
      [   14.065353]  ? unix_get_socket+0x10/0x90
      [   14.065553]  unix_get_socket+0x10/0x90
      [   14.065744]  __io_sqe_files_unregister+0x87/0x1e0
      [   14.065989]  ? io_rsrc_refs_drop+0x1c/0xd0
      [   14.066199]  io_ring_exit_work+0x388/0x6a5
      [   14.066410]  ? io_uring_try_cancel_requests+0x5bf/0x5bf
      [   14.066674]  ? try_to_wake_up+0xdb/0x910
      [   14.066873]  ? virt_to_head_page+0xbe/0xbe
      [   14.067080]  ? __schedule+0x574/0xd20
      [   14.067273]  ? read_word_at_a_time+0xe/0x20
      [   14.067492]  ? strscpy+0xb5/0x190
      [   14.067665]  process_one_work+0x423/0x710
      [   14.067879]  worker_thread+0x2a2/0x6f0
      [   14.068073]  ? process_one_work+0x710/0x710
      [   14.068284]  kthread+0x163/0x1a0
      [   14.068454]  ? kthread_complete_and_exit+0x20/0x20
      [   14.068697]  ret_from_fork+0x22/0x30
      [   14.068886]  </TASK>
      [   14.069000]
      [   14.069088] Allocated by task 289:
      [   14.069269]  kasan_save_stack+0x1e/0x40
      [   14.069463]  kasan_set_track+0x21/0x30
      [   14.069652]  __kasan_slab_alloc+0x58/0x70
      [   14.069899]  kmem_cache_alloc+0xc5/0x200
      [   14.070100]  __alloc_file+0x20/0x160
      [   14.070283]  alloc_empty_file+0x3b/0xc0
      [   14.070479]  path_openat+0xc3/0x1770
      [   14.070689]  do_filp_open+0x150/0x270
      [   14.070888]  do_sys_openat2+0x113/0x270
      [   14.071081]  __x64_sys_openat+0xc8/0x140
      [   14.071283]  do_syscall_64+0x3b/0x90
      [   14.071466]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [   14.071791]
      [   14.071874] Freed by task 0:
      [   14.072027]  kasan_save_stack+0x1e/0x40
      [   14.072224]  kasan_set_track+0x21/0x30
      [   14.072415]  kasan_save_free_info+0x2a/0x50
      [   14.072627]  __kasan_slab_free+0x106/0x190
      [   14.072858]  kmem_cache_free+0x98/0x340
      [   14.073075]  rcu_core+0x427/0xe50
      [   14.073249]  __do_softirq+0x110/0x3cd
      [   14.073440]
      [   14.073523] Last potentially related work creation:
      [   14.073801]  kasan_save_stack+0x1e/0x40
      [   14.074017]  __kasan_record_aux_stack+0x97/0xb0
      [   14.074264]  call_rcu+0x41/0x550
      [   14.074436]  task_work_run+0xf4/0x170
      [   14.074619]  exit_to_user_mode_prepare+0x113/0x120
      [   14.074858]  syscall_exit_to_user_mode+0x1d/0x40
      [   14.075092]  do_syscall_64+0x48/0x90
      [   14.075272]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [   14.075529]
      [   14.075612] Second to last potentially related work creation:
      [   14.075900]  kasan_save_stack+0x1e/0x40
      [   14.076098]  __kasan_record_aux_stack+0x97/0xb0
      [   14.076325]  task_work_add+0x72/0x1b0
      [   14.076512]  fput+0x65/0xc0
      [   14.076657]  filp_close+0x8e/0xa0
      [   14.076825]  __x64_sys_close+0x15/0x50
      [   14.077019]  do_syscall_64+0x3b/0x90
      [   14.077199]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [   14.077448]
      [   14.077530] The buggy address belongs to the object at ffff88800b09cf00
      [   14.077530]  which belongs to the cache filp of size 232
      [   14.078105] The buggy address is located 32 bytes inside of
      [   14.078105]  232-byte region [ffff88800b09cf00, ffff88800b09cfe8)
      [   14.078685]
      [   14.078771] The buggy address belongs to the physical page:
      [   14.079046] page:000000001bd520e7 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88800b09de00 pfn:0xb09c
      [   14.079575] head:000000001bd520e7 order:1 compound_mapcount:0 compound_pincount:0
      [   14.079946] flags: 0x100000000010200(slab|head|node=0|zone=1)
      [   14.080244] raw: 0100000000010200 0000000000000000 dead000000000001 ffff88800493cc80
      [   14.080629] raw: ffff88800b09de00 0000000080190018 00000001ffffffff 0000000000000000
      [   14.081016] page dumped because: kasan: bad access detected
      [   14.081293]
      [   14.081376] Memory state around the buggy address:
      [   14.081618]  ffff88800b09ce00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   14.081974]  ffff88800b09ce80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
      [   14.082336] >ffff88800b09cf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   14.082690]                                ^
      [   14.082909]  ffff88800b09cf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
      [   14.083266]  ffff88800b09d000: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
      [   14.083622] ==================================================================
      
      The actual tracing of this bug is shown below:
      
      commit 8c71fe75 ("io_uring: ensure fput() called correspondingly
      when direct install fails") adds an additional fput() in
      io_fixed_fd_install() when io_file_bitmap_get() returns error values. In
      that case, the routine will never make it to io_install_fixed_file() due
      to an early return.
      
      static int io_fixed_fd_install(...)
      {
        if (alloc_slot) {
          ...
          ret = io_file_bitmap_get(ctx);
          if (unlikely(ret < 0)) {
            io_ring_submit_unlock(ctx, issue_flags);
            fput(file);
            return ret;
          }
          ...
        }
        ...
        ret = io_install_fixed_file(req, file, issue_flags, file_slot);
        ...
      }
      
      In the above scenario, the reference count stays balanced because
      io_fixed_fd_install() ensures fput() is called when something goes
      wrong, whether in the bitmap allocation or in the inner
      io_install_fixed_file().
      
      However, commit 61c1b44a ("io_uring: fix deadlock on iowq file
      slot alloc") breaks the balance because it places fput() into the common
      path for both io_file_bitmap_get() and io_install_fixed_file(). Since
      io_install_fixed_file() handles the fput() itself, a reference
      underflow results.
      
      Some later commits restructured the code, so the current call chain is
      io_fixed_fd_install() -> __io_fixed_fd_install() ->
      io_install_fixed_file().
      
      However, the problem remains: an extra fput() is called whenever
      io_install_fixed_file() itself calls fput(). Looking through the code, I
      find that the two existing callers of __io_fixed_fd_install(),
      io_fixed_fd_install() and io_msg_send_fd(), already call fput() when
      handling an error return, so this patch simply removes the fput() in
      io_install_fixed_file() to fix the bug.
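
      The ownership rule described above (the inner install step never
      drops the file reference, and the caller alone does so on error)
      can be modelled in userspace roughly as follows. This is only a
      sketch: struct fake_file, fake_fput(), fake_install() and
      fake_fd_install() are hypothetical stand-ins, not the kernel
      functions, and the reference count is a plain int.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct file: only the refcount. */
struct fake_file {
	int refs;
};

/* Drop one reference; an underflow models the double fput(). */
static void fake_fput(struct fake_file *f)
{
	assert(f->refs > 0);
	f->refs--;
}

/*
 * Model of the inner install step after the fix: on failure it only
 * reports the error and leaves the reference alone.
 */
static int fake_install(struct fake_file *f, bool fail)
{
	(void)f;
	return fail ? -1 : 0;
}

/* Model of a caller: it alone drops the reference on error. */
static int fake_fd_install(struct fake_file *f, bool fail)
{
	int ret = fake_install(f, fail);

	if (ret < 0)
		fake_fput(f);
	return ret;
}
```

      If fake_install() also called fake_fput() on failure, the assert
      in fake_fput() would fire on the second drop, which is the model's
      analogue of the underflow reported by KASAN.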
      
      Fixes: 61c1b44a ("io_uring: fix deadlock on iowq file slot alloc")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Link: https://lore.kernel.org/r/be4ba4b.5d44.184a0a406a4.Coremail.linma@zju.edu.cn
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      9d94c04c
    • Pavel Begunkov's avatar
      io_uring: make poll refs more robust · a26a35e9
      Pavel Begunkov authored
      poll_refs serves two purposes. The first is ownership over the request.
      The second is notifying io_poll_check_events() that there was an
      event but the wake up couldn't grab ownership, so
      io_poll_check_events() should retry.
      
      We want to make poll_refs more robust against overflows. Instead of
      always incrementing it, which covers both purposes with one atomic,
      check whether poll_refs is elevated enough and, if so, set a retry flag
      without attempting to grab ownership. The gap between the bias check
      and the following atomics may seem racy, but we don't need it to be
      strict. Moreover, there can be at most 4 parallel updates: by the first
      and the second poll entries, __io_arm_poll_handler() and cancellation.
      Of those four, only poll wake ups may be executed multiple times, but
      they're protected by a spinlock.
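
      The bias check and retry flag can be sketched in userspace with
      C11 atomics. This is an illustrative model, not the kernel code:
      struct fake_req, the function names, and the constants (flag bit,
      mask and bias value) are all assumptions chosen for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative constants; the values are assumptions of this sketch. */
#define REF_RETRY_FLAG (1u << 30)
#define REF_MASK       ((1u << 30) - 1)
#define REF_BIAS       128u

/* Hypothetical stand-in for a poll request. */
struct fake_req {
	atomic_uint poll_refs;
};

/* Slow path: flag a retry for the current owner instead of incrementing. */
static bool get_ownership_slowpath(struct fake_req *req)
{
	unsigned int v = atomic_fetch_or(&req->poll_refs, REF_RETRY_FLAG);

	if (v & REF_MASK)
		return false;	/* an owner exists and will see the flag */
	/* Nobody held a reference, so take ownership the normal way. */
	return !(atomic_fetch_add(&req->poll_refs, 1) & REF_MASK);
}

/* Returns true if the caller became the owner of the request. */
static bool get_ownership(struct fake_req *req)
{
	/* Elevated count: avoid pushing it further towards overflow. */
	if (atomic_load(&req->poll_refs) >= REF_BIAS)
		return get_ownership_slowpath(req);
	return !(atomic_fetch_add(&req->poll_refs, 1) & REF_MASK);
}
```

      In this model the counter stops growing once it reaches REF_BIAS,
      so the number of increments is bounded by the bias plus the small
      number of updaters that can race past the check.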
      
      Cc: stable@vger.kernel.org
      Reported-by: Lin Ma <linma@zju.edu.cn>
      Fixes: aa43477b ("io_uring: poll rework")
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Link: https://lore.kernel.org/r/c762bc31f8683b3270f3587691348a7119ef9c9d.1668963050.git.asml.silence@gmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      a26a35e9
    • Pavel Begunkov's avatar
      io_uring: cmpxchg for poll arm refs release · 2f389343
      Pavel Begunkov authored
      Replace atomically subtracting the ownership reference at the end of
      arming a poll with a cmpxchg. We try to release ownership by setting
      poll_refs to 0, assuming that it didn't change while we were arming. If
      it did change, we keep the ownership and use it to queue a tw (task
      work), which is fully capable of processing all events and even
      tolerates spurious wake ups.
      
      It's a bit more elegant as we reduce races b/w setting the cancellation
      flag and getting refs with this release, and with that we don't have to
      worry about any kinds of underflows. The performance difference b/w
      cmpxchg and atomic dec is usually negligible, and this is not the
      fastest path for polling anyway.
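
      A userspace sketch of the release-by-cmpxchg idea, using C11
      atomics. The names struct arm_req, fake_queue_tw() and
      arm_release() are hypothetical, and the queued counter merely
      stands in for queueing task work.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a poll request being armed. */
struct arm_req {
	atomic_int poll_refs;
	int queued;	/* how many times follow-up work was queued */
};

/* Hypothetical stand-in for queueing task work. */
static void fake_queue_tw(struct arm_req *req)
{
	req->queued++;
}

/*
 * End of arming: the armer holds the single ownership reference, so
 * poll_refs is 1 if nothing happened meanwhile. Returns true when
 * ownership was released, false when it was kept and work was queued.
 */
static bool arm_release(struct arm_req *req)
{
	int expected = 1;

	/* Succeeds only if poll_refs is still exactly 1. */
	if (atomic_compare_exchange_strong(&req->poll_refs, &expected, 0))
		return true;

	/* Something changed poll_refs: keep ownership, process later. */
	fake_queue_tw(req);
	return false;
}
```

      Because the release either stores 0 or leaves poll_refs untouched,
      there is no decrement that could underflow in this model.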
      
      Cc: stable@vger.kernel.org
      Fixes: aa43477b ("io_uring: poll rework")
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Link: https://lore.kernel.org/r/0c95251624397ea6def568ff040cad2d7926fd51.1668963050.git.asml.silence@gmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      2f389343
    • Damien Le Moal's avatar
      zonefs: Fix active zone accounting · db58653c
      Damien Le Moal authored
      If a file zone transitions to the offline or readonly state from an
      active state, we must clear the zone active flag and decrement the
      active seq file counter. Do so in zonefs_account_active() using the new
      zonefs inode flags ZONEFS_ZONE_OFFLINE and ZONEFS_ZONE_READONLY. These
      flags are set if necessary in zonefs_check_zone_condition() based on the
      result of the report zones operation after an IO error.
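
      The accounting rule can be modelled in userspace as below. This is
      a simplified sketch, not the zonefs code: the FAKE_* flags, struct
      fake_sb, struct fake_zone_file and the in_use field (standing in
      for whatever makes a zone count as active) are assumptions made
      for illustration.

```c
#include <stdbool.h>

/* Hypothetical flag bits modelling the inode flags. */
enum {
	FAKE_ZONE_ACTIVE   = 1 << 0,
	FAKE_ZONE_OFFLINE  = 1 << 1,
	FAKE_ZONE_READONLY = 1 << 2,
};

/* Hypothetical stand-in for the per-filesystem counter. */
struct fake_sb {
	int active_seq_files;
};

/* Hypothetical stand-in for a sequential zone file. */
struct fake_zone_file {
	unsigned int flags;
	bool in_use;	/* assumption: condition that makes a zone active */
};

static void fake_account_active(struct fake_sb *sb,
				struct fake_zone_file *zf)
{
	bool dead = zf->flags & (FAKE_ZONE_OFFLINE | FAKE_ZONE_READONLY);

	/* A usable zone in use becomes (or stays) active, counted once. */
	if (!dead && zf->in_use) {
		if (!(zf->flags & FAKE_ZONE_ACTIVE)) {
			zf->flags |= FAKE_ZONE_ACTIVE;
			sb->active_seq_files++;
		}
		return;
	}

	/* Offline, read-only, or idle: clear the flag and uncount once. */
	if (zf->flags & FAKE_ZONE_ACTIVE) {
		zf->flags &= ~(unsigned int)FAKE_ZONE_ACTIVE;
		sb->active_seq_files--;
	}
}
```

      Checking the active flag before every increment and decrement
      keeps the counter consistent even if the function is called
      repeatedly for the same file.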
      
      Fixes: 87c9ce3f ("zonefs: Add active seq file accounting")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      db58653c
    • Dave Airlie's avatar
      Merge tag 'amd-drm-fixes-6.1-2022-11-23' of... · e5770206
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-6.1-2022-11-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
      
      amd-drm-fixes-6.1-2022-11-23:
      
      amdgpu:
      - DCN 3.1.4 fixes
      - DP MST DSC deadlock fixes
      - HMM userptr fixes
      - Fix Aldebaran CU occupancy reporting
      - GFX11 fixes
      - PSP suspend/resume fix
      - DCE12 KASAN fix
      - DCN 3.2.x fixes
      - Rotated cursor fix
      - SMU 13.x fix
      - DELL platform suspend/resume fixes
      - VCN4 SR-IOV fix
      - Display regression fix for polled connectors
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Alex Deucher <alexander.deucher@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20221123143453.8977-1-alexander.deucher@amd.com
      e5770206
  4. 24 Nov, 2022 14 commits