1. 24 Sep, 2022 5 commits
    • наб's avatar
      Documentation: remove nonexistent magic numbers · 766c5a3e
      наб authored
      The entire file blames back to the start of git
      (minus whitespace from the RST translation and a typo fix):
        * there are changelog comments for March 1994 through to Linux 2.5.74
        * struct tty_ldisc is two pointers nowadays, so naturally no magic
        * GDA_MAGIC is defined but unused, and it's been this way
          since start-of-git
        * M3_CARD_MAGIC isn't defined, because
          commit d56b9b9c ("[PATCH] The scheduled removal of some OSS
          drivers") removed the entire driver in 2006
        * CS_CARD_MAGIC likewise since
          commit b5d425c9 ("more scheduled OSS driver removal") in 2007
        * KMALLOC_MAGIC and VMALLOC_MAGIC were removed in
          commit e38e0cfa ("[ALSA] Remove kmalloc wrappers"),
          six months after start of git
        * SLAB_C_MAGIC has never even appeared in git
          (removed in 2.4.0-test3pre6)
      
      magic-number.rst is a low-value historial relic at best and
      misleading cruft at worst, so start with cleaning out ones that only
      appear therein
      
      Automated:
      grep MAGIC Documentation/process/magic-number.rst | while read -r mag _;
      do git grep -wF "$mag" | grep -vq '^Documentation.*magic-number.rst:' ||
      sed -i "/^$mag/d" \
      Documentation/{,translations/{zh_CN,zh_TW,it_IT}/}process/magic-number.rst
      done
      Signed-off-by: default avatarAhelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
      Link: https://lore.kernel.org/r/8389a7b85b5c660c6891b1740b5dacc53491a41b.1663280877.git.nabijaczleweli@nabijaczleweli.xyzSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      766c5a3e
    • Mukesh Ojha's avatar
      devcoredump : Serialize devcd_del work · 01daccf7
      Mukesh Ojha authored
      In following scenario(diagram), when one thread X running dev_coredumpm()
      adds devcd device to the framework which sends uevent notification to
      userspace and another thread Y reads this uevent and call to
      devcd_data_write() which eventually try to delete the queued timer that
      is not initialized/queued yet.
      
      So, debug object reports some warning and in the meantime, timer is
      initialized and queued from X path. and from Y path, it gets reinitialized
      again and timer->entry.pprev=NULL and try_to_grab_pending() stucks.
      
      To fix this, introduce mutex and a boolean flag to serialize the behaviour.
      
       	cpu0(X)			                cpu1(Y)
      
          dev_coredump() uevent sent to user space
          device_add()  ======================> user space process Y reads the
                                                uevents writes to devcd fd
                                                which results into writes to
      
                                               devcd_data_write()
                                                 mod_delayed_work()
                                                   try_to_grab_pending()
                                                     del_timer()
                                                       debug_assert_init()
         INIT_DELAYED_WORK()
         schedule_delayed_work()
                                                         debug_object_fixup()
                                                           timer_fixup_assert_init()
                                                             timer_setup()
                                                               do_init_timer()
                                                             /*
                                                              Above call reinitializes
                                                              the timer to
                                                              timer->entry.pprev=NULL
                                                              and this will be checked
                                                              later in timer_pending() call.
                                                             */
                                                       timer_pending()
                                                        !hlist_unhashed_lockless(&timer->entry)
                                                          !h->pprev
                                                      /*
                                                        del_timer() checks h->pprev and finds
                                                        it to be NULL due to which
                                                        try_to_grab_pending() stucks.
                                                      */
      
      Link: https://lore.kernel.org/lkml/2e1f81e2-428c-f11f-ce92-eb11048cb271@quicinc.com/Signed-off-by: default avatarMukesh Ojha <quic_mojha@quicinc.com>
      Link: https://lore.kernel.org/r/1663073424-13663-1-git-send-email-quic_mojha@quicinc.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      01daccf7
    • Brian Norris's avatar
      debugfs: Only clobber mode/uid/gid on remount if asked · b8de524c
      Brian Norris authored
      Users may have explicitly configured their debugfs permissions; we
      shouldn't overwrite those just because a second mount appeared.
      
      Only clobber if the options were provided at mount time.
      
      Existing behavior:
      
        ## Pre-existing status: debugfs is 0755.
        # chmod 755 /sys/kernel/debug/
        # stat -c '%A' /sys/kernel/debug/
        drwxr-xr-x
      
        ## New mount sets kernel-default permissions:
        # mount -t debugfs none /mnt/foo
        # stat -c '%A' /mnt/foo
        drwx------
      
        ## Unexpected: the original mount changed permissions:
        # stat -c '%A' /sys/kernel/debug
        drwx------
      
      New behavior:
      
        ## Pre-existing status: debugfs is 0755.
        # chmod 755 /sys/kernel/debug/
        # stat -c '%A' /sys/kernel/debug/
        drwxr-xr-x
      
        ## New mount inherits existing permissions:
        # mount -t debugfs none /mnt/foo
        # stat -c '%A' /mnt/foo
        drwxr-xr-x
      
        ## Expected: old mount is unchanged:
        # stat -c '%A' /sys/kernel/debug
        drwxr-xr-x
      
      Full test cases are being submitted to LTP.
      Signed-off-by: default avatarBrian Norris <briannorris@chromium.org>
      Link: https://lore.kernel.org/r/20220912163042.v3.1.Icbd40fce59f55ad74b80e5d435ea233579348a78@changeidSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b8de524c
    • Christian A. Ehrhardt's avatar
      kernfs: fix use-after-free in __kernfs_remove · 4abc9965
      Christian A. Ehrhardt authored
      Syzkaller managed to trigger concurrent calls to
      kernfs_remove_by_name_ns() for the same file resulting in
      a KASAN detected use-after-free. The race occurs when the root
      node is freed during kernfs_drain().
      
      To prevent this acquire an additional reference for the root
      of the tree that is removed before calling __kernfs_remove().
      
      Found by syzkaller with the following reproducer (slab_nomerge is
      required):
      
      syz_mount_image$ext4(0x0, &(0x7f0000000100)='./file0\x00', 0x100000, 0x0, 0x0, 0x0, 0x0)
      r0 = openat(0xffffffffffffff9c, &(0x7f0000000080)='/proc/self/exe\x00', 0x0, 0x0)
      close(r0)
      pipe2(&(0x7f0000000140)={0xffffffffffffffff, <r1=>0xffffffffffffffff}, 0x800)
      mount$9p_fd(0x0, &(0x7f0000000040)='./file0\x00', &(0x7f00000000c0), 0x408, &(0x7f0000000280)={'trans=fd,', {'rfdno', 0x3d, r0}, 0x2c, {'wfdno', 0x3d, r1}, 0x2c, {[{@cache_loose}, {@mmap}, {@loose}, {@loose}, {@mmap}], [{@mask={'mask', 0x3d, '^MAY_EXEC'}}, {@fsmagic={'fsmagic', 0x3d, 0x10001}}, {@dont_hash}]}})
      
      Sample report:
      
      ==================================================================
      BUG: KASAN: use-after-free in kernfs_type include/linux/kernfs.h:335 [inline]
      BUG: KASAN: use-after-free in kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
      BUG: KASAN: use-after-free in __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
      Read of size 2 at addr ffff8880088807f0 by task syz-executor.2/857
      
      CPU: 0 PID: 857 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3 #5
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106
       print_address_description mm/kasan/report.c:317 [inline]
       print_report.cold+0x5e/0x5e5 mm/kasan/report.c:433
       kasan_report+0xa3/0x130 mm/kasan/report.c:495
       kernfs_type include/linux/kernfs.h:335 [inline]
       kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
       __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
       __kernfs_remove fs/kernfs/dir.c:1356 [inline]
       kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
       sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
       __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
       create_cache mm/slab_common.c:229 [inline]
       kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
       p9_client_create+0xd4d/0x1190 net/9p/client.c:993
       v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
       v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
       legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
       vfs_get_tree+0x85/0x2e0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x675/0x1d00 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f725f983aed
      Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f725f0f7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
      RAX: ffffffffffffffda RBX: 00007f725faa3f80 RCX: 00007f725f983aed
      RDX: 00000000200000c0 RSI: 0000000020000040 RDI: 0000000000000000
      RBP: 00007f725f9f419c R08: 0000000020000280 R09: 0000000000000000
      R10: 0000000000000408 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000006 R14: 00007f725faa3f80 R15: 00007f725f0d7000
       </TASK>
      
      Allocated by task 855:
       kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
       kasan_set_track mm/kasan/common.c:45 [inline]
       set_alloc_info mm/kasan/common.c:437 [inline]
       __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:470
       kasan_slab_alloc include/linux/kasan.h:224 [inline]
       slab_post_alloc_hook mm/slab.h:727 [inline]
       slab_alloc_node mm/slub.c:3243 [inline]
       slab_alloc mm/slub.c:3251 [inline]
       __kmem_cache_alloc_lru mm/slub.c:3258 [inline]
       kmem_cache_alloc+0xbf/0x200 mm/slub.c:3268
       kmem_cache_zalloc include/linux/slab.h:723 [inline]
       __kernfs_new_node+0xd4/0x680 fs/kernfs/dir.c:593
       kernfs_new_node fs/kernfs/dir.c:655 [inline]
       kernfs_create_dir_ns+0x9c/0x220 fs/kernfs/dir.c:1010
       sysfs_create_dir_ns+0x127/0x290 fs/sysfs/dir.c:59
       create_dir lib/kobject.c:63 [inline]
       kobject_add_internal+0x24a/0x8d0 lib/kobject.c:223
       kobject_add_varg lib/kobject.c:358 [inline]
       kobject_init_and_add+0x101/0x160 lib/kobject.c:441
       sysfs_slab_add+0x156/0x1e0 mm/slub.c:5954
       __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
       create_cache mm/slab_common.c:229 [inline]
       kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
       p9_client_create+0xd4d/0x1190 net/9p/client.c:993
       v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
       v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
       legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
       vfs_get_tree+0x85/0x2e0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x675/0x1d00 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Freed by task 857:
       kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
       kasan_set_track+0x21/0x30 mm/kasan/common.c:45
       kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:370
       ____kasan_slab_free mm/kasan/common.c:367 [inline]
       ____kasan_slab_free mm/kasan/common.c:329 [inline]
       __kasan_slab_free+0x108/0x190 mm/kasan/common.c:375
       kasan_slab_free include/linux/kasan.h:200 [inline]
       slab_free_hook mm/slub.c:1754 [inline]
       slab_free_freelist_hook mm/slub.c:1780 [inline]
       slab_free mm/slub.c:3534 [inline]
       kmem_cache_free+0x9c/0x340 mm/slub.c:3551
       kernfs_put.part.0+0x2b2/0x520 fs/kernfs/dir.c:547
       kernfs_put+0x42/0x50 fs/kernfs/dir.c:521
       __kernfs_remove.part.0+0x72d/0x960 fs/kernfs/dir.c:1407
       __kernfs_remove fs/kernfs/dir.c:1356 [inline]
       kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
       sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
       __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
       create_cache mm/slab_common.c:229 [inline]
       kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
       p9_client_create+0xd4d/0x1190 net/9p/client.c:993
       v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
       v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
       legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
       vfs_get_tree+0x85/0x2e0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x675/0x1d00 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      The buggy address belongs to the object at ffff888008880780
       which belongs to the cache kernfs_node_cache of size 128
      The buggy address is located 112 bytes inside of
       128-byte region [ffff888008880780, ffff888008880800)
      
      The buggy address belongs to the physical page:
      page:00000000732833f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8880
      flags: 0x100000000000200(slab|node=0|zone=1)
      raw: 0100000000000200 0000000000000000 dead000000000122 ffff888001147280
      raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
       ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      >ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                   ^
       ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
       ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      ==================================================================
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: stable <stable@kernel.org> # -rc3
      Signed-off-by: default avatarChristian A. Ehrhardt <lk@c--e.de>
      Link: https://lore.kernel.org/r/20220913121723.691454-1-lk@c--e.deSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4abc9965
    • Greg Kroah-Hartman's avatar
      Merge 1707c39a ("Merge tag 'driver-core-6.0-rc7' of... · ec9c8807
      Greg Kroah-Hartman authored
      Merge 1707c39a ("Merge tag 'driver-core-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core") driver-core-next
      
      This merges the driver core changes in 6.0-rc7 into driver-core-next as
      they are needed here as well for testing.
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ec9c8807
  2. 23 Sep, 2022 12 commits
  3. 22 Sep, 2022 23 commits
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · bf682942
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three small and pretty obvious fixes, all in drivers"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: mpt3sas: Fix return value check of dma_get_required_mask()
        scsi: qla2xxx: Fix memory leak in __qlt_24xx_handle_abts()
        scsi: qedf: Fix a UAF bug in __qedf_probe()
      bf682942
    • Linus Torvalds's avatar
      Merge tag 'slab-for-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab · 3c0f396a
      Linus Torvalds authored
      Pull slab fixes from Vlastimil Babka:
      
       - Fix a possible use-after-free in SLUB's kmem_cache removal,
         introduced in this cycle, by Feng Tang.
      
       - WQ_MEM_RECLAIM dependency fix for the workqueue-based cpu slab
         flushing introduced in 5.15, by Maurizio Lombardi.
      
       - Add missing KASAN hooks in two kmalloc entry paths, by Peter
         Collingbourne.
      
       - A BUG_ON() removal in SLUB's kmem_cache creation when allocation
         fails (too small to possibly happen in practice, syzbot used fault
         injection), by Chao Yu.
      
      * tag 'slab-for-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
        mm: slub: fix flush_cpu_slab()/__free_slab() invocations in task context.
        mm/slab_common: fix possible double free of kmem_cache
        kasan: call kasan_malloc() from __kmalloc_*track_caller()
        mm/slub: fix to return errno if kmalloc() fails
      3c0f396a
    • Sean Christopherson's avatar
      KVM: x86: Inject #UD on emulated XSETBV if XSAVES isn't enabled · 50b2d49b
      Sean Christopherson authored
      Inject #UD when emulating XSETBV if CR4.OSXSAVE is not set.  This also
      covers the "XSAVE not supported" check, as setting CR4.OSXSAVE=1 #GPs if
      XSAVE is not supported (and userspace gets to keep the pieces if it
      forces incoherent vCPU state).
      
      Add a comment to kvm_emulate_xsetbv() to call out that the CPU checks
      CR4.OSXSAVE before checking for intercepts.  AMD'S APM implies that #UD
      has priority (says that intercepts are checked before #GP exceptions),
      while Intel's SDM says nothing about interception priority.  However,
      testing on hardware shows that both AMD and Intel CPUs prioritize the #UD
      over interception.
      
      Fixes: 02d4160f ("x86: KVM: add xsetbv to the emulator")
      Cc: stable@vger.kernel.org
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20220824033057.3576315-4-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      50b2d49b
    • Dr. David Alan Gilbert's avatar
      KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURES · a1020a25
      Dr. David Alan Gilbert authored
      Allow FP and SSE state to be saved and restored via KVM_{G,SET}_XSAVE on
      XSAVE-capable hosts even if their bits are not exposed to the guest via
      XCR0.
      
      Failing to allow FP+SSE first showed up as a QEMU live migration failure,
      where migrating a VM from a pre-XSAVE host, e.g. Nehalem, to an XSAVE
      host failed due to KVM rejecting KVM_SET_XSAVE.  However, the bug also
      causes problems even when migrating between XSAVE-capable hosts as
      KVM_GET_SAVE won't set any bits in user_xfeatures if XSAVE isn't exposed
      to the guest, i.e. KVM will fail to actually migrate FP+SSE.
      
      Because KVM_{G,S}ET_XSAVE are designed to allowing migrating between
      hosts with and without XSAVE, KVM_GET_XSAVE on a non-XSAVE (by way of
      fpu_copy_guest_fpstate_to_uabi()) always sets the FP+SSE bits in the
      header so that KVM_SET_XSAVE will work even if the new host supports
      XSAVE.
      
      Fixes: ad856280 ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0")
      bz: https://bugzilla.redhat.com/show_bug.cgi?id=2079311
      Cc: stable@vger.kernel.org
      Cc: Leonardo Bras <leobras@redhat.com>
      Signed-off-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      [sean: add comment, massage changelog]
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20220824033057.3576315-3-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      a1020a25
    • Sean Christopherson's avatar
      KVM: x86: Reinstate kvm_vcpu_arch.guest_supported_xcr0 · ee519b3a
      Sean Christopherson authored
      Reinstate the per-vCPU guest_supported_xcr0 by partially reverting
      commit 988896bb; the implicit assessment that guest_supported_xcr0 is
      always the same as guest_fpu.fpstate->user_xfeatures was incorrect.
      
      kvm_vcpu_after_set_cpuid() isn't the only place that sets user_xfeatures,
      as user_xfeatures is set to fpu_user_cfg.default_features when guest_fpu
      is allocated via fpu_alloc_guest_fpstate() => __fpstate_reset().
      guest_supported_xcr0 on the other hand is zero-allocated.  If userspace
      never invokes KVM_SET_CPUID2, supported XCR0 will be '0', whereas the
      allowed user XFEATURES will be non-zero.
      
      Practically speaking, the edge case likely doesn't matter as no sane
      userspace will live migrate a VM without ever doing KVM_SET_CPUID2. The
      primary motivation is to prepare for KVM intentionally and explicitly
      setting bits in user_xfeatures that are not set in guest_supported_xcr0.
      
      Because KVM_{G,S}ET_XSAVE can be used to svae/restore FP+SSE state even
      if the host doesn't support XSAVE, KVM needs to set the FP+SSE bits in
      user_xfeatures even if they're not allowed in XCR0, e.g. because XCR0
      isn't exposed to the guest.  At that point, the simplest fix is to track
      the two things separately (allowed save/restore vs. allowed XCR0).
      
      Fixes: 988896bb ("x86/kvm/fpu: Remove kvm_vcpu_arch.guest_supported_xcr0")
      Cc: stable@vger.kernel.org
      Cc: Leonardo Bras <leobras@redhat.com>
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20220824033057.3576315-2-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      ee519b3a
    • Miaohe Lin's avatar
      KVM: x86/mmu: add missing update to max_mmu_rmap_size · 604f5332
      Miaohe Lin authored
      The update to statistic max_mmu_rmap_size is unintentionally removed by
      commit 4293ddb7 ("KVM: x86/mmu: Remove redundant spte present check
      in mmu_set_spte"). Add missing update to it or max_mmu_rmap_size will
      always be nonsensical 0.
      
      Fixes: 4293ddb7 ("KVM: x86/mmu: Remove redundant spte present check in mmu_set_spte")
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Message-Id: <20220907080657.42898-1-linmiaohe@huawei.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      604f5332
    • Jinrong Liang's avatar
      selftests: kvm: Fix a compile error in selftests/kvm/rseq_test.c · 561cafeb
      Jinrong Liang authored
      The following warning appears when executing:
      	make -C tools/testing/selftests/kvm
      
      rseq_test.c: In function ‘main’:
      rseq_test.c:237:33: warning: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Wimplicit-function-declaration]
                (void *)(unsigned long)gettid());
                                       ^~~~~~
                                       getgid
      /usr/bin/ld: /tmp/ccr5mMko.o: in function `main':
      ../kvm/tools/testing/selftests/kvm/rseq_test.c:237: undefined reference to `gettid'
      collect2: error: ld returned 1 exit status
      make: *** [../lib.mk:173: ../kvm/tools/testing/selftests/kvm/rseq_test] Error 1
      
      Use the more compatible syscall(SYS_gettid) instead of gettid() to fix it.
      More subsequent reuse may cause it to be wrapped in a lib file.
      Signed-off-by: default avatarJinrong Liang <cloudliang@tencent.com>
      Message-Id: <20220802071240.84626-1-cloudliang@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      561cafeb
    • Paolo Bonzini's avatar
      Merge tag 'kvmarm-fixes-6.0-2' of... · b4ac28a3
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 6.0, take #2
      
      - Fix kmemleak usage in Protected KVM (again)
      b4ac28a3
    • Maurizio Lombardi's avatar
      mm: slub: fix flush_cpu_slab()/__free_slab() invocations in task context. · e45cc288
      Maurizio Lombardi authored
      Commit 5a836bf6 ("mm: slub: move flush_cpu_slab() invocations
      __free_slab() invocations out of IRQ context") moved all flush_cpu_slab()
      invocations to the global workqueue to avoid a problem related
      with deactivate_slab()/__free_slab() being called from an IRQ context
      on PREEMPT_RT kernels.
      
      When the flush_all_cpu_locked() function is called from a task context
      it may happen that a workqueue with WQ_MEM_RECLAIM bit set ends up
      flushing the global workqueue, this will cause a dependency issue.
      
       workqueue: WQ_MEM_RECLAIM nvme-delete-wq:nvme_delete_ctrl_work [nvme_core]
         is flushing !WQ_MEM_RECLAIM events:flush_cpu_slab
       WARNING: CPU: 37 PID: 410 at kernel/workqueue.c:2637
         check_flush_dependency+0x10a/0x120
       Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core]
       RIP: 0010:check_flush_dependency+0x10a/0x120[  453.262125] Call Trace:
       __flush_work.isra.0+0xbf/0x220
       ? __queue_work+0x1dc/0x420
       flush_all_cpus_locked+0xfb/0x120
       __kmem_cache_shutdown+0x2b/0x320
       kmem_cache_destroy+0x49/0x100
       bioset_exit+0x143/0x190
       blk_release_queue+0xb9/0x100
       kobject_cleanup+0x37/0x130
       nvme_fc_ctrl_free+0xc6/0x150 [nvme_fc]
       nvme_free_ctrl+0x1ac/0x2b0 [nvme_core]
      
      Fix this bug by creating a workqueue for the flush operation with
      the WQ_MEM_RECLAIM bit set.
      
      Fixes: 5a836bf6 ("mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMaurizio Lombardi <mlombard@redhat.com>
      Reviewed-by: default avatarHyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      e45cc288
    • Linus Torvalds's avatar
      Merge tag 'soc-fixes-6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · c69cf88c
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "Another set of fixes for fixes for the soc tree:
      
         - A fix for the interrupt number on at91/lan966 ethernet PHYs
      
         - A second round of fixes for NXP i.MX series, including a couple of
           build issues, and board specific DT corrections on TQMa8MPQL,
           imx8mp-venice-gw74xx and imx8mm-verdin for reliability and
           partially broken functionality
      
         - Several fixes for Rockchip SoCs, addressing a USB issue on
           BPI-R2-Pro, wakeup on Gru-Bob and reliability of high-speed SD
           cards, among other minor issues
      
         - A fix for a long-running naming mistake that prevented the moxart
           mmc driver from working at all
      
         - Multiple Arm SCMI firmware fixes for hardening some corner cases"
      
      * tag 'soc-fixes-6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits)
        arm64: dts: imx8mp-venice-gw74xx: fix port/phy validation
        ARM: dts: lan966x: Fix the interrupt number for internal PHYs
        arm64: dts: imx8mp-venice-gw74xx: fix ksz9477 cpu port
        arm64: dts: imx8mp-venice-gw74xx: fix CAN STBY polarity
        dt-bindings: memory-controllers: fsl,imx8m-ddrc: drop Leonard Crestez
        arm64: dts: tqma8mqml: Include phy-imx8-pcie.h header
        arm64: defconfig: enable ARCH_NXP
        arm64: dts: imx8mp-tqma8mpql-mba8mpxl: add missing pinctrl for RTC alarm
        ARM: dts: fix Moxa SDIO 'compatible', remove 'sdhci' misnomer
        arm64: dts: imx8mm-verdin: extend pmic voltages
        arm64: dts: rockchip: Remove 'enable-active-low' from rk3566-quartz64-a
        arm64: dts: rockchip: Remove 'enable-active-low' from rk3399-puma
        arm64: dts: rockchip: fix property for usb2 phy supply on rk3568-evb1-v10
        arm64: dts: rockchip: fix property for usb2 phy supply on rock-3a
        arm64: dts: imx8ulp: add #reset-cells for pcc
        arm64: dts: tqma8mpxl-ba8mpxl: Fix button GPIOs
        arm64: dts: imx8mn: remove GPU power domain reset
        arm64: dts: rockchip: Set RK3399-Gru PCLK_EDP to 24 MHz
        arm64: dts: imx8mm: Reverse CPLD_Dn GPIO label mapping on MX8Menlo
        arm64: dts: rockchip: fix upper usb port on BPI-R2-Pro
        ...
      c69cf88c
    • Linus Torvalds's avatar
      Merge tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 504c25cb
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from wifi, netfilter and can.
      
        A handful of awaited fixes here - revert of the FEC changes, bluetooth
        fix, fixes for iwlwifi spew.
      
        We added a warning in PHY/MDIO code which is triggering on a couple of
        platforms in a false-positive-ish way. If we can't iron that out over
        the week we'll drop it and re-add for 6.1.
      
        I've added a new "follow up fixes" section for fixes to fixes in
        6.0-rcs but it may actually give the false impression that those are
        problematic or that more testing time would have caught them. So
        likely a one time thing.
      
        Follow up fixes:
      
         - nf_tables_addchain: fix nft_counters_enabled underflow
      
         - ebtables: fix memory leak when blob is malformed
      
         - nf_ct_ftp: fix deadlock when nat rewrite is needed
      
        Current release - regressions:
      
         - Revert "fec: Restart PPS after link state change" and the related
           "net: fec: Use a spinlock to guard `fep->ptp_clk_on`"
      
         - Bluetooth: fix HCIGETDEVINFO regression
      
         - wifi: mt76: fix 5 GHz connection regression on mt76x0/mt76x2
      
         - mptcp: fix fwd memory accounting on coalesce
      
         - rwlock removal fall out:
            - ipmr: always call ip{,6}_mr_forward() from RCU read-side
              critical section
            - ipv6: fix crash when IPv6 is administratively disabled
      
         - tcp: read multiple skbs in tcp_read_skb()
      
         - mdio_bus_phy_resume state warning fallout:
            - eth: ravb: fix PHY state warning splat during system resume
            - eth: sh_eth: fix PHY state warning splat during system resume
      
        Current release - new code bugs:
      
         - wifi: iwlwifi: don't spam logs with NSS>2 messages
      
         - eth: mtk_eth_soc: enable XDP support just for MT7986 SoC
      
        Previous releases - regressions:
      
         - bonding: fix NULL deref in bond_rr_gen_slave_id
      
         - wifi: iwlwifi: mark IWLMEI as broken
      
        Previous releases - always broken:
      
         - nf_conntrack helpers:
            - irc: tighten matching on DCC message
            - sip: fix ct_sip_walk_headers
            - osf: fix possible bogus match in nf_osf_find()
      
         - ipvlan: fix out-of-bound bugs caused by unset skb->mac_header
      
         - core: fix flow symmetric hash
      
         - bonding, team: unsync device addresses on ndo_stop
      
         - phy: micrel: fix shared interrupt on LAN8814"
      
      * tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
        selftests: forwarding: add shebang for sch_red.sh
        bnxt: prevent skb UAF after handing over to PTP worker
        net: marvell: Fix refcounting bugs in prestera_port_sfp_bind()
        net: sched: fix possible refcount leak in tc_new_tfilter()
        net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD
        udp: Use WARN_ON_ONCE() in udp_read_skb()
        selftests: bonding: cause oops in bond_rr_gen_slave_id
        bonding: fix NULL deref in bond_rr_gen_slave_id
        net: phy: micrel: fix shared interrupt on LAN8814
        net/smc: Stop the CLC flow if no link to map buffers on
        ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficient
        net: atlantic: fix potential memory leak in aq_ndev_close()
        can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
        can: gs_usb: gs_can_open(): fix race dev->can.state condition
        can: flexcan: flexcan_mailbox_read() fix return value for drop = true
        net: sh_eth: Fix PHY state warning splat during system resume
        net: ravb: Fix PHY state warning splat during system resume
        netfilter: nf_ct_ftp: fix deadlock when nat rewrite is needed
        netfilter: ebtables: fix memory leak when blob is malformed
        netfilter: nf_tables: fix percpu memory leak at nf_tables_addchain()
        ...
      504c25cb
    • Linus Torvalds's avatar
      Merge tag 'efi-urgent-for-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · 129e7152
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
      
       - Use the right variable to check for shim insecure mode
      
       - Wipe setup_data field when booting via EFI
      
       - Add missing error check to efibc driver
      
      * tag 'efi-urgent-for-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi: libstub: check Shim mode using MokSBStateRT
        efi: x86: Wipe setup_data on pure EFI boot
        efi: efibc: Guard against allocation failure
      129e7152
    • Linus Torvalds's avatar
      Merge tag 'gpio-fixes-for-v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 5e0a93e4
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
      
       - fix a NULL-pointer dereference at driver unbind and a potential
         resource leak in error path in gpio-mockup
      
       - make the irqchip immutable in gpio-ftgpio010
      
       - fix dereferencing a potentially uninitialized variable in gpio-tqmx86
      
       - fix interrupt registering in gpiolib's character device code
      
      * tag 'gpio-fixes-for-v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpiolib: cdev: Set lineevent_state::irq after IRQ register successfully
        gpio: tqmx86: fix uninitialized variable girq
        gpio: ftgpio010: Make irqchip immutable
        gpio: mockup: Fix potential resource leakage when register a chip
        gpio: mockup: fix NULL pointer dereference when removing debugfs
      5e0a93e4
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-for-v6.0-2022-09-21' of... · 9597f088
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.0-2022-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Fix polling of system-wide events related to mixing per-cpu and
         per-thread events.
      
       - Do not check if /proc/modules is unchanged when copying /proc/kcore,
         that doesn't get in the way of post processing analysis.
      
       - Include program header in ELF files generated for JIT files, so that
         they can be opened by tools using elfutils libraries.
      
       - Enter namespaces when synthesizing build-ids.
      
       - Fix some bugs related to a recent cpu_map overhaul where we should be
         using an index and not the cpu number.
      
       - Fix BPF program ELF section name, using the naming expected by libbpf
         when using BPF counters in 'perf stat'.
      
       - Add a new test for perf stat cgroup BPF counter.
      
       - Adjust check on 'perf test wp' for older kernels, where the
         PERF_EVENT_IOC_MODIFY_ATTRIBUTES ioctl isn't supported.
      
       - Sync x86 cpufeatures with the kernel sources, no changes in tooling.
      
      * tag 'perf-tools-fixes-for-v6.0-2022-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf tools: Honor namespace when synthesizing build-ids
        tools headers cpufeatures: Sync with the kernel sources
        perf kcore_copy: Do not check /proc/modules is unchanged
        libperf evlist: Fix polling of system-wide events
        perf record: Fix cpu mask bit setting for mixed mmaps
        perf test: Skip wp modify test on old kernels
        perf jit: Include program header in ELF files
        perf test: Add a new test for perf stat cgroup BPF counter
        perf stat: Use evsel->core.cpus to iterate cpus in BPF cgroup counters
        perf stat: Fix cpu map index in bperf cgroup code
        perf stat: Fix BPF program section name
      9597f088
    • Hangbin Liu's avatar
      selftests: forwarding: add shebang for sch_red.sh · 83e4b196
      Hangbin Liu authored
      RHEL/Fedora RPM build checks are stricter, and complain when executable
      files don't have a shebang line, e.g.
      
      *** WARNING: ./kselftests/net/forwarding/sch_red.sh is executable but has no shebang, removing executable bit
      
      Fix it by adding shebang line.
      
      Fixes: 6cf0291f ("selftests: forwarding: Add a RED test for SW datapath")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: default avatarPetr Machata <petrm@nvidia.com>
      Link: https://lore.kernel.org/r/20220922024453.437757-1-liuhangbin@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      83e4b196
    • Jakub Kicinski's avatar
      bnxt: prevent skb UAF after handing over to PTP worker · c31f26c8
      Jakub Kicinski authored
      When reading the timestamp is required bnxt_tx_int() hands
      over the ownership of the completed skb to the PTP worker.
      The skb should not be used afterwards, as the worker may
      run before the rest of our code and free the skb, leading
      to a use-after-free.
      
      Since dev_kfree_skb_any() accepts NULL make the loss of
      ownership more obvious and set skb to NULL.
      
      Fixes: 83bb623c ("bnxt_en: Transmit and retrieve packet timestamps")
      Reviewed-by: default avatarAndy Gospodarek <gospo@broadcom.com>
      Reviewed-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20220921201005.335390-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c31f26c8
    • Liang He's avatar
      net: marvell: Fix refcounting bugs in prestera_port_sfp_bind() · 3aac7ada
      Liang He authored
      In prestera_port_sfp_bind(), there are two refcounting bugs:
      (1) we should call of_node_get() before of_find_node_by_name() as
      it will automaitcally decrease the refcount of 'from' argument;
      (2) we should call of_node_put() for the break of the iteration
      for_each_child_of_node() as it will automatically increase and
      decrease the 'child'.
      
      Fixes: 52323ef7 ("net: marvell: prestera: add phylink support")
      Signed-off-by: default avatarLiang He <windhl@126.com>
      Reviewed-by: default avatarYevhen Orlov <yevhen.orlov@plvision.eu>
      Link: https://lore.kernel.org/r/20220921133245.4111672-1-windhl@126.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      3aac7ada
    • Hangyu Hua's avatar
      net: sched: fix possible refcount leak in tc_new_tfilter() · c2e1cfef
      Hangyu Hua authored
      tfilter_put need to be called to put the refount got by tp->ops->get to
      avoid possible refcount leak when chain->tmplt_ops != NULL and
      chain->tmplt_ops != tp->ops.
      
      Fixes: 7d5509fa ("net: sched: extend proto ops with 'put' callback")
      Signed-off-by: default avatarHangyu Hua <hbh25y@gmail.com>
      Reviewed-by: default avatarVlad Buslov <vladbu@nvidia.com>
      Link: https://lore.kernel.org/r/20220921092734.31700-1-hbh25y@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c2e1cfef
    • Sean Anderson's avatar
      net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD · 878e2405
      Sean Anderson authored
      There is a separate receive path for small packets (under 256 bytes).
      Instead of allocating a new dma-capable skb to be used for the next packet,
      this path allocates a skb and copies the data into it (reusing the existing
      sbk for the next packet). There are two bytes of junk data at the beginning
      of every packet. I believe these are inserted in order to allow aligned DMA
      and IP headers. We skip over them using skb_reserve. Before copying over
      the data, we must use a barrier to ensure we see the whole packet. The
      current code only synchronizes len bytes, starting from the beginning of
      the packet, including the junk bytes. However, this leaves off the final
      two bytes in the packet. Synchronize the whole packet.
      
      To reproduce this problem, ping a HME with a payload size between 17 and
      214
      
      	$ ping -s 17 <hme_address>
      
      which will complain rather loudly about the data mismatch. Small packets
      (below 60 bytes on the wire) do not have this issue. I suspect this is
      related to the padding added to increase the minimum packet size.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarSean Anderson <seanga2@gmail.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220920235018.1675956-1-seanga2@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      878e2405
    • Greg Kroah-Hartman's avatar
      Merge tag 'usb-serial-6.0-rc7' of... · 47af6c64
      Greg Kroah-Hartman authored
      Merge tag 'usb-serial-6.0-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
      
      Johan writes:
        "USB-serial fixes for 6.0-rc7
      
         Here are some new modem device ids.
      
         All have been in linux-next with no reported issues."
      
      * tag 'usb-serial-6.0-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
        USB: serial: option: add Quectel RM520N
        USB: serial: option: add Quectel BG95 0x0203 composition
      47af6c64
    • Peilin Ye's avatar
      udp: Use WARN_ON_ONCE() in udp_read_skb() · db39dfdc
      Peilin Ye authored
      Prevent udp_read_skb() from flooding the syslog.
      Suggested-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: default avatarPeilin Ye <peilin.ye@bytedance.com>
      Link: https://lore.kernel.org/r/20220921005915.2697-1-yepeilin.cs@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      db39dfdc
    • Jakub Kicinski's avatar
      Merge branch 'bonding-fix-null-deref-in-bond_rr_gen_slave_id' · c5da4b68
      Jakub Kicinski authored
      Jonathan Toppins says:
      
      ====================
      bonding: fix NULL deref in bond_rr_gen_slave_id
      
      Fix a NULL dereference of the struct bonding.rr_tx_counter member because
      if a bond is initially created with an initial mode != zero (Round Robin)
      the memory required for the counter is never created and when the mode is
      changed there is never any attempt to verify the memory is allocated upon
      switching modes.
      ====================
      
      Link: https://lore.kernel.org/r/cover.1663694476.git.jtoppins@redhat.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c5da4b68
    • Jonathan Toppins's avatar
      selftests: bonding: cause oops in bond_rr_gen_slave_id · 2ffd5732
      Jonathan Toppins authored
      This bonding selftest used to cause a kernel oops on aarch64
      and should be architectures agnostic.
      Signed-off-by: default avatarJonathan Toppins <jtoppins@redhat.com>
      Acked-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2ffd5732