1. 21 Feb, 2022 15 commits
  2. 14 Feb, 2022 1 commit
  3. 13 Feb, 2022 9 commits
  4. 12 Feb, 2022 15 commits
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · b81b1829
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two minor fixes in the lpfc driver. One changing the classification of
        trace messages and the other fixing a build issue when NVME_FC is
        disabled"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: lpfc: Reduce log messages seen after firmware download
        scsi: lpfc: Remove NVMe support if kernel has NVME_FC disabled
      b81b1829
    • Linus Torvalds's avatar
      Merge tag 'char-misc-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 080eba78
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are a small number of char/misc driver fixes for 5.17-rc4 for
        reported issues. They contain:
      
         - phy driver fixes
      
         - iio driver fix
      
         - eeprom driver fix
      
         - speakup regression fix
      
         - fastrpc fix
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'char-misc-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        iio: buffer: Fix file related error handling in IIO_BUFFER_GET_FD_IOCTL
        speakup-dectlk: Restore pitch setting
        bus: mhi: pci_generic: Add mru_default for Cinterion MV31-W
        bus: mhi: pci_generic: Add mru_default for Foxconn SDX55
        eeprom: ee1004: limit i2c reads to I2C_SMBUS_BLOCK_MAX
        misc: fastrpc: avoid double fput() on failed usercopy
        phy: dphy: Correct clk_pre parameter
        phy: phy-mtk-tphy: Fix duplicated argument in phy-mtk-tphy
        phy: stm32: fix a refcount leak in stm32_usbphyc_pll_enable()
        phy: xilinx: zynqmp: Fix bus width setting for SGMII
        phy: cadence: Sierra: fix error handling bugs in probe()
        phy: ti: Fix missing sentinel for clk_div_table
        phy: broadcom: Kconfig: Fix PHY_BRCM_USB config option
        phy: usb: Leave some clocks running during suspend
      080eba78
    • Linus Torvalds's avatar
      Merge tag 'staging-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · dcd72f54
      Linus Torvalds authored
      Pullstaging driver fixes from Greg KH:
       "Here are two staging driver fixes for 5.17-rc4.  These are:
      
         - fbtft error path fix
      
         - vc04_services rcu dereference fix
      
        Both of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'staging-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: fbtft: Fix error path in fbtft_driver_module_init()
        staging: vc04_services: Fix RCU dereference check
      dcd72f54
    • Linus Torvalds's avatar
      Merge tag 'tty-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 522e7d03
      Linus Torvalds authored
      Pull tty/serial fixes from Greg KH:
       "Here are four small tty/serial fixes for 5.17-rc4.  They are:
      
         - 8250_pericom change revert to fix a reported regression
      
         - two speculation fixes for vt_ioctl
      
         - n_tty regression fix for polling
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'tty-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        vt_ioctl: add array_index_nospec to VT_ACTIVATE
        vt_ioctl: fix array_index_nospec in vt_setactivate
        serial: 8250_pericom: Revert "Re-enable higher baud rates"
        n_tty: wake up poll(POLLRDNORM) on receiving data
      522e7d03
    • Linus Torvalds's avatar
      Merge tag 'usb-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 85187378
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some small USB driver fixes for 5.17-rc4 that resolve some
        reported issues and add new device ids:
      
         - usb-serial new device ids
      
         - ulpi cleanup fixes
      
         - f_fs use-after-free fix
      
         - dwc3 driver fixes
      
         - ax88179_178a usb network driver fix
      
         - usb gadget fixes
      
        There is a revert at the end of this series to resolve a build problem
        that 0-day found yesterday. Most of these have been in linux-next,
        except for the last few, and all have now passed 0-day tests"
      
      * tag 'usb-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        Revert "usb: dwc2: drd: fix soft connect when gadget is unconfigured"
        usb: dwc2: drd: fix soft connect when gadget is unconfigured
        usb: gadget: rndis: check size of RNDIS_MSG_SET command
        USB: gadget: validate interface OS descriptor requests
        usb: core: Unregister device on component_add() failure
        net: usb: ax88179_178a: Fix out-of-bounds accesses in RX fixup
        usb: dwc3: gadget: Prevent core from processing stale TRBs
        USB: serial: cp210x: add CPI Bulk Coin Recycler id
        USB: serial: cp210x: add NCR Retail IO box id
        USB: serial: ftdi_sio: add support for Brainboxes US-159/235/320
        usb: gadget: f_uac2: Define specific wTerminalType
        usb: gadget: udc: renesas_usb3: Fix host to USB_ROLE_NONE transition
        usb: raw-gadget: fix handling of dual-direction-capable endpoints
        usb: usb251xb: add boost-up property support
        usb: ulpi: Call of_node_put correctly
        usb: ulpi: Move of_node_put to ulpi_dev_release
        USB: serial: option: add ZTE MF286D modem
        USB: serial: ch341: add support for GW Instek USB2.0-Serial devices
        usb: f_fs: Fix use-after-free for epfile
        usb: dwc3: xilinx: fix uninitialized return value
      85187378
    • Linus Torvalds's avatar
      Merge tag 's390-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · a4fd49cd
      Linus Torvalds authored
      Pull s390 updates from Vasily Gorbik:
       "Maintainers and reviewers changes:
      
          - Add Alexander Gordeev as maintainer for s390.
      
          - Christian Borntraeger will focus on s390 KVM maintainership and
            stays as s390 reviewer.
      
        Fixes:
      
         - Fix clang build of modules loader KUnit test.
      
         - Fix kernel panic in CIO code on FCES path-event when no driver is
           attached to a device or the driver does not provide the path_event
           function"
      
      * tag 's390-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/cio: verify the driver availability for path_event call
        s390/module: fix building test_modules_helpers.o with clang
        MAINTAINERS: downgrade myself to Reviewer for s390
        MAINTAINERS: add Alexander Gordeev as maintainer for s390
      a4fd49cd
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.17a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 4a387c98
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - Two small cleanups
      
       - Another fix for addressing the EFI framebuffer above 4GB when running
         as Xen dom0
      
       - A patch to let Xen guests use reserved bits in MSI- and IO-APIC-
         registers for extended APIC-IDs the same way KVM guests are doing it
         already
      
      * tag 'for-linus-5.17a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/pci: Make use of the helper macro LIST_HEAD()
        xen/x2apic: Fix inconsistent indenting
        xen/x86: detect support for extended destination ID
        xen/x86: obtain full video frame buffer address for Dom0 also under EFI
      4a387c98
    • Linus Torvalds's avatar
      Merge tag 'seccomp-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · eef8cffc
      Linus Torvalds authored
      Pull seccomp fixes from Kees Cook:
       "This fixes a corner case of fatal SIGSYS being ignored since v5.15.
        Along with the signal fix is a change to seccomp so that seeing
        another syscall after a fatal filter result will cause seccomp to kill
        the process harder.
      
        Summary:
      
         - Force HANDLER_EXIT even for SIGNAL_UNKILLABLE
      
         - Make seccomp self-destruct after fatal filter results
      
         - Update seccomp samples for easier behavioral demonstration"
      
      * tag 'seccomp-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        samples/seccomp: Adjust sample to also provide kill option
        seccomp: Invalidate seccomp mode to catch death failures
        signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLE
      eef8cffc
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 9917ff5f
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "5 patches.
      
        Subsystems affected by this patch series: binfmt, procfs, and mm
        (vmscan, memcg, and kfence)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        kfence: make test case compatible with run time set sample interval
        mm: memcg: synchronize objcg lists with a dedicated spinlock
        mm: vmscan: remove deadlock due to throttling failing to make progress
        fs/proc: task_mmu.c: don't read mapcount for migration entry
        fs/binfmt_elf: fix PT_LOAD p_align values for loaders
      9917ff5f
    • Jing Leng's avatar
      kconfig: fix failing to generate auto.conf · 1b9e740a
      Jing Leng authored
      When the KCONFIG_AUTOCONFIG is specified (e.g. export \
      KCONFIG_AUTOCONFIG=output/config/auto.conf), the directory of
      include/config/ will not be created, so kconfig can't create deps
      files in it and auto.conf can't be generated.
      Signed-off-by: default avatarJing Leng <jleng@ambarella.com>
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      1b9e740a
    • Greg Kroah-Hartman's avatar
      Revert "usb: dwc2: drd: fix soft connect when gadget is unconfigured" · 736e8d89
      Greg Kroah-Hartman authored
      This reverts commit 269cbcf7.
      
      It causes build errors as reported by the kernel test robot.
      
      Link: https://lore.kernel.org/r/202202112236.AwoOTtHO-lkp@intel.comReported-by: default avatarkernel test robot <lkp@intel.com>
      Fixes: 269cbcf7 ("usb: dwc2: drd: fix soft connect when gadget is unconfigured")
      Cc: stable@kernel.org
      Cc: Amelie Delaunay <amelie.delaunay@foss.st.com>
      Cc: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com>
      Cc: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      736e8d89
    • Peng Liu's avatar
      kfence: make test case compatible with run time set sample interval · 8913c610
      Peng Liu authored
      The parameter kfence_sample_interval can be set via boot parameter and
      late shell command, which is convenient for automated tests and KFENCE
      parameter optimization.  However, KFENCE test case just uses
      compile-time CONFIG_KFENCE_SAMPLE_INTERVAL, which will make KFENCE test
      case not run as users desired.  Export kfence_sample_interval, so that
      KFENCE test case can use run-time-set sample interval.
      
      Link: https://lkml.kernel.org/r/20220207034432.185532-1-liupeng256@huawei.comSigned-off-by: default avatarPeng Liu <liupeng256@huawei.com>
      Reviewed-by: default avatarMarco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Christian Knig <christian.koenig@amd.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8913c610
    • Roman Gushchin's avatar
      mm: memcg: synchronize objcg lists with a dedicated spinlock · 0764db9b
      Roman Gushchin authored
      Alexander reported a circular lock dependency revealed by the mmap1 ltp
      test:
      
        LOCKDEP_CIRCULAR (suite: ltp, case: mtest06 (mmap1))
                WARNING: possible circular locking dependency detected
                5.17.0-20220113.rc0.git0.f2211f194038.300.fc35.s390x+debug #1 Not tainted
                ------------------------------------------------------
                mmap1/202299 is trying to acquire lock:
                00000001892c0188 (css_set_lock){..-.}-{2:2}, at: obj_cgroup_release+0x4a/0xe0
                but task is already holding lock:
                00000000ca3b3818 (&sighand->siglock){-.-.}-{2:2}, at: force_sig_info_to_task+0x38/0x180
                which lock already depends on the new lock.
                the existing dependency chain (in reverse order) is:
                -> #1 (&sighand->siglock){-.-.}-{2:2}:
                       __lock_acquire+0x604/0xbd8
                       lock_acquire.part.0+0xe2/0x238
                       lock_acquire+0xb0/0x200
                       _raw_spin_lock_irqsave+0x6a/0xd8
                       __lock_task_sighand+0x90/0x190
                       cgroup_freeze_task+0x2e/0x90
                       cgroup_migrate_execute+0x11c/0x608
                       cgroup_update_dfl_csses+0x246/0x270
                       cgroup_subtree_control_write+0x238/0x518
                       kernfs_fop_write_iter+0x13e/0x1e0
                       new_sync_write+0x100/0x190
                       vfs_write+0x22c/0x2d8
                       ksys_write+0x6c/0xf8
                       __do_syscall+0x1da/0x208
                       system_call+0x82/0xb0
                -> #0 (css_set_lock){..-.}-{2:2}:
                       check_prev_add+0xe0/0xed8
                       validate_chain+0x736/0xb20
                       __lock_acquire+0x604/0xbd8
                       lock_acquire.part.0+0xe2/0x238
                       lock_acquire+0xb0/0x200
                       _raw_spin_lock_irqsave+0x6a/0xd8
                       obj_cgroup_release+0x4a/0xe0
                       percpu_ref_put_many.constprop.0+0x150/0x168
                       drain_obj_stock+0x94/0xe8
                       refill_obj_stock+0x94/0x278
                       obj_cgroup_charge+0x164/0x1d8
                       kmem_cache_alloc+0xac/0x528
                       __sigqueue_alloc+0x150/0x308
                       __send_signal+0x260/0x550
                       send_signal+0x7e/0x348
                       force_sig_info_to_task+0x104/0x180
                       force_sig_fault+0x48/0x58
                       __do_pgm_check+0x120/0x1f0
                       pgm_check_handler+0x11e/0x180
                other info that might help us debug this:
                 Possible unsafe locking scenario:
                       CPU0                    CPU1
                       ----                    ----
                  lock(&sighand->siglock);
                                               lock(css_set_lock);
                                               lock(&sighand->siglock);
                  lock(css_set_lock);
                 *** DEADLOCK ***
                2 locks held by mmap1/202299:
                 #0: 00000000ca3b3818 (&sighand->siglock){-.-.}-{2:2}, at: force_sig_info_to_task+0x38/0x180
                 #1: 00000001892ad560 (rcu_read_lock){....}-{1:2}, at: percpu_ref_put_many.constprop.0+0x0/0x168
                stack backtrace:
                CPU: 15 PID: 202299 Comm: mmap1 Not tainted 5.17.0-20220113.rc0.git0.f2211f194038.300.fc35.s390x+debug #1
                Hardware name: IBM 3906 M04 704 (LPAR)
                Call Trace:
                  dump_stack_lvl+0x76/0x98
                  check_noncircular+0x136/0x158
                  check_prev_add+0xe0/0xed8
                  validate_chain+0x736/0xb20
                  __lock_acquire+0x604/0xbd8
                  lock_acquire.part.0+0xe2/0x238
                  lock_acquire+0xb0/0x200
                  _raw_spin_lock_irqsave+0x6a/0xd8
                  obj_cgroup_release+0x4a/0xe0
                  percpu_ref_put_many.constprop.0+0x150/0x168
                  drain_obj_stock+0x94/0xe8
                  refill_obj_stock+0x94/0x278
                  obj_cgroup_charge+0x164/0x1d8
                  kmem_cache_alloc+0xac/0x528
                  __sigqueue_alloc+0x150/0x308
                  __send_signal+0x260/0x550
                  send_signal+0x7e/0x348
                  force_sig_info_to_task+0x104/0x180
                  force_sig_fault+0x48/0x58
                  __do_pgm_check+0x120/0x1f0
                  pgm_check_handler+0x11e/0x180
                INFO: lockdep is turned off.
      
      In this example a slab allocation from __send_signal() caused a
      refilling and draining of a percpu objcg stock, resulted in a releasing
      of another non-related objcg.  Objcg release path requires taking the
      css_set_lock, which is used to synchronize objcg lists.
      
      This can create a circular dependency with the sighandler lock, which is
      taken with the locked css_set_lock by the freezer code (to freeze a
      task).
      
      In general it seems that using css_set_lock to synchronize objcg lists
      makes any slab allocations and deallocation with the locked css_set_lock
      and any intervened locks risky.
      
      To fix the problem and make the code more robust let's stop using
      css_set_lock to synchronize objcg lists and use a new dedicated spinlock
      instead.
      
      Link: https://lkml.kernel.org/r/Yfm1IHmoGdyUR81T@carbon.dhcp.thefacebook.com
      Fixes: bf4f0599 ("mm: memcg/slab: obj_cgroup API")
      Signed-off-by: default avatarRoman Gushchin <guro@fb.com>
      Reported-by: default avatarAlexander Egorenkov <egorenar@linux.ibm.com>
      Tested-by: default avatarAlexander Egorenkov <egorenar@linux.ibm.com>
      Reviewed-by: default avatarWaiman Long <longman@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Reviewed-by: default avatarShakeel Butt <shakeelb@google.com>
      Reviewed-by: default avatarJeremy Linton <jeremy.linton@arm.com>
      Tested-by: default avatarJeremy Linton <jeremy.linton@arm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0764db9b
    • Mel Gorman's avatar
      mm: vmscan: remove deadlock due to throttling failing to make progress · b485c6f1
      Mel Gorman authored
      A soft lockup bug in kcompactd was reported in a private bugzilla with
      the following visible in dmesg;
      
        watchdog: BUG: soft lockup - CPU#33 stuck for 26s! [kcompactd0:479]
        watchdog: BUG: soft lockup - CPU#33 stuck for 52s! [kcompactd0:479]
        watchdog: BUG: soft lockup - CPU#33 stuck for 78s! [kcompactd0:479]
        watchdog: BUG: soft lockup - CPU#33 stuck for 104s! [kcompactd0:479]
      
      The machine had 256G of RAM with no swap and an earlier failed
      allocation indicated that node 0 where kcompactd was run was potentially
      unreclaimable;
      
        Node 0 active_anon:29355112kB inactive_anon:2913528kB active_file:0kB
          inactive_file:0kB unevictable:64kB isolated(anon):0kB isolated(file):0kB
          mapped:8kB dirty:0kB writeback:0kB shmem:26780kB shmem_thp:
          0kB shmem_pmdmapped: 0kB anon_thp: 23480320kB writeback_tmp:0kB
          kernel_stack:2272kB pagetables:24500kB all_unreclaimable? yes
      
      Vlastimil Babka investigated a crash dump and found that a task
      migrating pages was trying to drain PCP lists;
      
        PID: 52922  TASK: ffff969f820e5000  CPU: 19  COMMAND: "kworker/u128:3"
        Call Trace:
           __schedule
           schedule
           schedule_timeout
           wait_for_completion
           __flush_work
           __drain_all_pages
           __alloc_pages_slowpath.constprop.114
           __alloc_pages
           alloc_migration_target
           migrate_pages
           migrate_to_node
           do_migrate_pages
           cpuset_migrate_mm_workfn
           process_one_work
           worker_thread
           kthread
           ret_from_fork
      
      This failure is specific to CONFIG_PREEMPT=n builds.  The root of the
      problem is that kcompact0 is not rescheduling on a CPU while a task that
      has isolated a large number of the pages from the LRU is waiting on
      kcompact0 to reschedule so the pages can be released.  While
      shrink_inactive_list() only loops once around too_many_isolated, reclaim
      can continue without rescheduling if sc->skipped_deactivate == 1 which
      could happen if there was no file LRU and the inactive anon list was not
      low.
      
      Link: https://lkml.kernel.org/r/20220203100326.GD3301@suse.de
      Fixes: d818fca1 ("mm/vmscan: throttle reclaim and compaction when too may pages are isolated")
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Debugged-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Reviewed-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b485c6f1
    • Yang Shi's avatar
      fs/proc: task_mmu.c: don't read mapcount for migration entry · 24d7275c
      Yang Shi authored
      The syzbot reported the below BUG:
      
        kernel BUG at include/linux/page-flags.h:785!
        invalid opcode: 0000 [#1] PREEMPT SMP KASAN
        CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
        RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
        Call Trace:
          page_mapcount include/linux/mm.h:837 [inline]
          smaps_account+0x470/0xb10 fs/proc/task_mmu.c:466
          smaps_pte_entry fs/proc/task_mmu.c:538 [inline]
          smaps_pte_range+0x611/0x1250 fs/proc/task_mmu.c:601
          walk_pmd_range mm/pagewalk.c:128 [inline]
          walk_pud_range mm/pagewalk.c:205 [inline]
          walk_p4d_range mm/pagewalk.c:240 [inline]
          walk_pgd_range mm/pagewalk.c:277 [inline]
          __walk_page_range+0xe23/0x1ea0 mm/pagewalk.c:379
          walk_page_vma+0x277/0x350 mm/pagewalk.c:530
          smap_gather_stats.part.0+0x148/0x260 fs/proc/task_mmu.c:768
          smap_gather_stats fs/proc/task_mmu.c:741 [inline]
          show_smap+0xc6/0x440 fs/proc/task_mmu.c:822
          seq_read_iter+0xbb0/0x1240 fs/seq_file.c:272
          seq_read+0x3e0/0x5b0 fs/seq_file.c:162
          vfs_read+0x1b5/0x600 fs/read_write.c:479
          ksys_read+0x12d/0x250 fs/read_write.c:619
          do_syscall_x64 arch/x86/entry/common.c:50 [inline]
          do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
          entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The reproducer was trying to read /proc/$PID/smaps when calling
      MADV_FREE at the mean time.  MADV_FREE may split THPs if it is called
      for partial THP.  It may trigger the below race:
      
                 CPU A                         CPU B
                 -----                         -----
        smaps walk:                      MADV_FREE:
        page_mapcount()
          PageCompound()
                                         split_huge_page()
          page = compound_head(page)
          PageDoubleMap(page)
      
      When calling PageDoubleMap() this page is not a tail page of THP anymore
      so the BUG is triggered.
      
      This could be fixed by elevated refcount of the page before calling
      mapcount, but that would prevent it from counting migration entries, and
      it seems overkilling because the race just could happen when PMD is
      split so all PTE entries of tail pages are actually migration entries,
      and smaps_account() does treat migration entries as mapcount == 1 as
      Kirill pointed out.
      
      Add a new parameter for smaps_account() to tell this entry is migration
      entry then skip calling page_mapcount().  Don't skip getting mapcount
      for device private entries since they do track references with mapcount.
      
      Pagemap also has the similar issue although it was not reported.  Fixed
      it as well.
      
      [shy828301@gmail.com: v4]
        Link: https://lkml.kernel.org/r/20220203182641.824731-1-shy828301@gmail.com
      [nathan@kernel.org: avoid unused variable warning in pagemap_pmd_range()]
        Link: https://lkml.kernel.org/r/20220207171049.1102239-1-nathan@kernel.org
      Link: https://lkml.kernel.org/r/20220120202805.3369-1-shy828301@gmail.com
      Fixes: e9b61f19 ("thp: reintroduce split_huge_page()")
      Signed-off-by: default avatarYang Shi <shy828301@gmail.com>
      Signed-off-by: default avatarNathan Chancellor <nathan@kernel.org>
      Reported-by: syzbot+1f52b3a18d5633fa7f82@syzkaller.appspotmail.com
      Acked-by: default avatarDavid Hildenbrand <david@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      24d7275c