1. 15 Jun, 2024 19 commits
    • kcov: don't lose track of remote references during softirqs · 01c8f980
      Aleksandr Nogikh authored
      In kcov_remote_start()/kcov_remote_stop(), we swap the previous KCOV
      metadata of the current task into a per-CPU variable.  However, the
      kcov_mode_enabled(mode) check is not sufficient in the case of remote KCOV
      coverage: current->kcov_mode always remains KCOV_MODE_DISABLED for remote
      KCOV objects.
      
      If the original task that has invoked the KCOV_REMOTE_ENABLE ioctl happens
      to get interrupted and kcov_remote_start() is called, it ultimately leads
      to kcov_remote_stop() NOT restoring the original KCOV reference.  So when
      the task exits, all registered remote KCOV handles remain active forever.
      
      The most uncomfortable effect (at least for syzkaller) is that the bug
      prevents the reuse of the same /sys/kernel/debug/kcov descriptor.  If
      we obtain it in the parent process and then e.g.  drop some
      capabilities and continuously fork to execute individual programs, at
      some point current->kcov of the forked process is lost,
      kcov_task_exit() takes no action, and all KCOV_REMOTE_ENABLE ioctl
      calls from subsequent forks fail.
      
      And, yes, the efficiency is also affected if we keep on losing remote
      kcov objects.
      a) kcov_remote_map keeps on growing forever.
      b) (If I'm not mistaken), we're also not freeing the memory referenced
      by kcov->area.
      
      Fix it by introducing a special kcov_mode that is assigned to the task
      that owns a KCOV remote object.  It makes kcov_mode_enabled() return true
      and yet does not trigger coverage collection in __sanitizer_cov_trace_pc()
      and write_comp_data().
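
      The save/restore logic described above can be modelled in user space.
      This is only a simplified sketch: every model_* name and the numeric
      mode values are assumptions made for illustration, not the kernel's
      actual definitions.  It shows that an owner task whose mode stays
      "disabled" loses its handle after a remote start/stop pair, while a
      dedicated "remote" mode gets the handle saved and restored without
      itself enabling PC tracing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode values; only "disabled is zero" matters for the model. */
enum model_mode { MODEL_DISABLED = 0, MODEL_TRACE_PC = 2, MODEL_REMOTE = 4 };

struct model_task  { unsigned int mode; int handle; };
struct model_saved { bool valid; unsigned int mode; int handle; };

/* Any mode other than "disabled" counts as enabled, including MODEL_REMOTE. */
static bool model_mode_enabled(unsigned int mode)
{
	return mode != MODEL_DISABLED;
}

/* PC tracing requires an exact match, so MODEL_REMOTE never collects. */
static bool model_traces_pc(unsigned int mode)
{
	return mode == MODEL_TRACE_PC;
}

/* Softirq-side start: stash the interrupted task's state if it is enabled. */
static void model_remote_start(struct model_task *t, struct model_saved *s,
			       int remote_handle)
{
	assert(!s->valid);	/* the model does not nest remote sections */
	if (model_mode_enabled(t->mode)) {
		s->valid = true;
		s->mode = t->mode;
		s->handle = t->handle;
	}
	t->mode = MODEL_TRACE_PC;
	t->handle = remote_handle;
}

/* Softirq-side stop: clear the task state, then restore what was stashed. */
static void model_remote_stop(struct model_task *t, struct model_saved *s)
{
	t->mode = MODEL_DISABLED;
	t->handle = 0;
	if (s->valid) {
		t->mode = s->mode;
		t->handle = s->handle;
		s->valid = false;
	}
}
```

      With the owner's mode left at MODEL_DISABLED, nothing is stashed and the
      handle is gone after model_remote_stop(); with MODEL_REMOTE it survives.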
      
      [nogikh@google.com: replace WRITE_ONCE() with an ordinary assignment]
        Link: https://lkml.kernel.org/r/20240614171221.2837584-1-nogikh@google.com
      Link: https://lkml.kernel.org/r/20240611133229.527822-1-nogikh@google.com
      Fixes: 5ff3b30a ("kcov: collect coverage from interrupts")
      Signed-off-by: Aleksandr Nogikh <nogikh@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
      Tested-by: Andrey Konovalov <andreyknvl@gmail.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Marco Elver <elver@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: shmem: fix getting incorrect lruvec when replacing a shmem folio · 9094b4a1
      Baolin Wang authored
      When testing shmem swapin, I encountered the warning below on my machine. 
      The reason is that replacing an old shmem folio with a new one causes
      mem_cgroup_migrate() to clear the old folio's memcg data.  As a result,
      the old folio cannot get the correct memcg's lruvec needed to remove
      itself from the LRU list when it is being freed.  This could lead to
      serious problems, such as LRU list crashes due to holding the
      wrong LRU lock, and incorrect LRU statistics.
      
      To fix this issue, fall back to using mem_cgroup_replace_folio()
      to replace the old shmem folio.
      
      [ 5241.100311] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5d9960
      [ 5241.100317] head: order:4 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      [ 5241.100319] flags: 0x17fffe0000040068(uptodate|lru|head|swapbacked|node=0|zone=2|lastcpupid=0x3ffff)
      [ 5241.100323] raw: 17fffe0000040068 fffffdffd6687948 fffffdffd69ae008 0000000000000000
      [ 5241.100325] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
      [ 5241.100326] head: 17fffe0000040068 fffffdffd6687948 fffffdffd69ae008 0000000000000000
      [ 5241.100327] head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
      [ 5241.100328] head: 17fffe0000000204 fffffdffd6665801 ffffffffffffffff 0000000000000000
      [ 5241.100329] head: 0000000a00000010 0000000000000000 00000000ffffffff 0000000000000000
      [ 5241.100330] page dumped because: VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled())
      [ 5241.100338] ------------[ cut here ]------------
      [ 5241.100339] WARNING: CPU: 19 PID: 78402 at include/linux/memcontrol.h:775 folio_lruvec_lock_irqsave+0x140/0x150
      [...]
      [ 5241.100374] pc : folio_lruvec_lock_irqsave+0x140/0x150
      [ 5241.100375] lr : folio_lruvec_lock_irqsave+0x138/0x150
      [ 5241.100376] sp : ffff80008b38b930
      [...]
      [ 5241.100398] Call trace:
      [ 5241.100399]  folio_lruvec_lock_irqsave+0x140/0x150
      [ 5241.100401]  __page_cache_release+0x90/0x300
      [ 5241.100404]  __folio_put+0x50/0x108
      [ 5241.100406]  shmem_replace_folio+0x1b4/0x240
      [ 5241.100409]  shmem_swapin_folio+0x314/0x528
      [ 5241.100411]  shmem_get_folio_gfp+0x3b4/0x930
      [ 5241.100412]  shmem_fault+0x74/0x160
      [ 5241.100414]  __do_fault+0x40/0x218
      [ 5241.100417]  do_shared_fault+0x34/0x1b0
      [ 5241.100419]  do_fault+0x40/0x168
      [ 5241.100420]  handle_pte_fault+0x80/0x228
      [ 5241.100422]  __handle_mm_fault+0x1c4/0x440
      [ 5241.100424]  handle_mm_fault+0x60/0x1f0
      [ 5241.100426]  do_page_fault+0x120/0x488
      [ 5241.100429]  do_translation_fault+0x4c/0x68
      [ 5241.100431]  do_mem_abort+0x48/0xa0
      [ 5241.100434]  el0_da+0x38/0xc0
      [ 5241.100436]  el0t_64_sync_handler+0x68/0xc0
      [ 5241.100437]  el0t_64_sync+0x14c/0x150
      [ 5241.100439] ---[ end trace 0000000000000000 ]---
      
      [baolin.wang@linux.alibaba.com: remove less helpful comments, per Matthew]
        Link: https://lkml.kernel.org/r/ccad3fe1375b468ebca3227b6b729f3eaf9d8046.1718423197.git.baolin.wang@linux.alibaba.com
      Link: https://lkml.kernel.org/r/3c11000dd6c1df83015a8321a859e9775ebbc23e.1718266112.git.baolin.wang@linux.alibaba.com
      Fixes: 85ce2c51 ("memcontrol: only transfer the memcg data for migration")
      Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
      Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Nhat Pham <nphamcs@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick · 0b1ef4fd
      Peter Xu authored
      Macro RANDOM_ORVALUE was used to make sure the pgtable entry will be
      populated with !none data in clear tests.
      
      RANDOM_ORVALUE tried to cover almost all the bits in a pgtable entry,
      even though it was never established that all of those bits are valid.  Both
      S390 and PPC64 have their own masks to avoid touching some bits.  Now it's
      the turn for x86_64.
      
      A recent report from Mikhail Gavrilov shows that this can cause a warning
      with the pte set check newly added in commit 8430557f, which compares the
      writable bit against the userfaultfd-wp bit.  The check itself is valid;
      the random pte is not.  We could choose to mask more bits out.
      
      However, the need to set such random bits is questionable, because the
      entries are already guaranteed to be !none in the following cases:
      
        - For pte level, the pgtable entry will be installed with value from
          pfn_pte(), where pfn points to a valid page.  Hence the pte will be
          !none already if populated with pfn_pte().
      
        - For upper-than-pte level, the pgtable entry should contain a directory
          entry always, which is also !none.
      
      These cases look good enough to test a pxx_clear() helper.  Instead
      of extending the bitmask, drop the "set random bits" trick completely.  Add
      some warning guards to make sure the entries will be !none before clear().
      
      Link: https://lkml.kernel.org/r/20240523132139.289719-1-peterx@redhat.com
      Fixes: 8430557f ("mm/page_table_check: support userfault wr-protect entries")
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
      Link: https://lore.kernel.org/r/CABXGCsMB9A8-X+Np_Q+fWLURYL_0t3Y-MdoNabDM-Lzk58-DGA@mail.gmail.com
      Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
      Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Gavin Shan <gshan@redhat.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: fix possible OOB in numa_rebuild_large_mapping() · cfdd12b4
      Kefeng Wang authored
      During a page fault, a large folio is mapped at a virtual address aligned
      to the folio size (which is no greater than PMD_SIZE), i.e. 'addr =
      ALIGN_DOWN(vmf->address, nr_pages * PAGE_SIZE)' in do_anonymous_page().
      After mremap(), however, the virtual address only requires PAGE_SIZE
      alignment, and the ptes are moved to the new location in
      move_page_tables().  Traversing the new ptes in
      numa_rebuild_large_mapping() can then hit the following issue:
      
         Unable to handle kernel paging request at virtual address 00000a80c021a788
         Mem abort info:
           ESR = 0x0000000096000004
           EC = 0x25: DABT (current EL), IL = 32 bits
           SET = 0, FnV = 0
           EA = 0, S1PTW = 0
           FSC = 0x04: level 0 translation fault
         Data abort info:
           ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
           CM = 0, WnR = 0, TnD = 0, TagAccess = 0
           GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
         user pgtable: 4k pages, 48-bit VAs, pgdp=00002040341a6000
         [00000a80c021a788] pgd=0000000000000000, p4d=0000000000000000
         Internal error: Oops: 0000000096000004 [#1] SMP
         ...
         CPU: 76 PID: 15187 Comm: git Kdump: loaded Tainted: G        W          6.10.0-rc2+ #209
         Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 1.79 08/21/2021
         pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
         pc : numa_rebuild_large_mapping+0x338/0x638
         lr : numa_rebuild_large_mapping+0x320/0x638
         sp : ffff8000b41c3b00
         x29: ffff8000b41c3b30 x28: ffff8000812a0000 x27: 00000000000a8000
         x26: 00000000000000a8 x25: 0010000000000001 x24: ffff20401c7170f0
         x23: 0000ffff33a1e000 x22: 0000ffff33a76000 x21: ffff20400869eca0
         x20: 0000ffff33976000 x19: 00000000000000a8 x18: ffffffffffffffff
         x17: 0000000000000000 x16: 0000000000000020 x15: ffff8000b41c36a8
         x14: 0000000000000000 x13: 205d373831353154 x12: 5b5d333331363732
         x11: 000000000011ff78 x10: 000000000011ff10 x9 : ffff800080273f30
         x8 : 000000320400869e x7 : c0000000ffffd87f x6 : 00000000001e6ba8
         x5 : ffff206f3fb5af88 x4 : 0000000000000000 x3 : 0000000000000000
         x2 : 0000000000000000 x1 : fffffdffc0000000 x0 : 00000a80c021a780
         Call trace:
          numa_rebuild_large_mapping+0x338/0x638
          do_numa_page+0x3e4/0x4e0
          handle_pte_fault+0x1bc/0x238
          __handle_mm_fault+0x20c/0x400
          handle_mm_fault+0xa8/0x288
          do_page_fault+0x124/0x498
          do_translation_fault+0x54/0x80
          do_mem_abort+0x4c/0xa8
          el0_da+0x40/0x110
          el0t_64_sync_handler+0xe4/0x158
          el0t_64_sync+0x188/0x190
      
      Fix it by making the start and end not only within the vma range, but also
      within the page table range.
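
      The clamping described above can be sketched as plain address
      arithmetic.  This is a user-space illustration, not the kernel code:
      the model_* names, the struct, and the assumed 2 MiB PMD size are
      hypothetical and chosen only to show how the traversal range is bounded
      by the folio, the VMA, and the page table that covers the faulting
      address.

```c
#include <assert.h>

#define MODEL_PMD_SIZE (1UL << 21)	/* assumed: one page table covers 2 MiB */

struct model_range { unsigned long start, end; };

static unsigned long model_max3(unsigned long a, unsigned long b, unsigned long c)
{
	unsigned long m = a > b ? a : b;
	return m > c ? m : c;
}

static unsigned long model_min3(unsigned long a, unsigned long b, unsigned long c)
{
	unsigned long m = a < b ? a : b;
	return m < c ? m : c;
}

/*
 * Bound the range to rebuild so that it stays inside the folio, inside the
 * VMA, and inside the single page table that contains fault_addr.
 */
static struct model_range model_clamp_range(unsigned long fault_addr,
					    unsigned long folio_start,
					    unsigned long folio_bytes,
					    unsigned long vma_start,
					    unsigned long vma_end)
{
	struct model_range r;
	unsigned long pt_start = fault_addr & ~(MODEL_PMD_SIZE - 1);
	unsigned long pt_end = pt_start + MODEL_PMD_SIZE;

	assert(vma_start < vma_end);
	r.start = model_max3(folio_start, pt_start, vma_start);
	r.end = model_min3(folio_start + folio_bytes, pt_end, vma_end);
	return r;
}
```

      For a 64 KiB folio that mremap() left starting one page below a 2 MiB
      boundary, a fault just above the boundary yields a range that begins at
      the boundary rather than in the previous page table.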
      
      Link: https://lkml.kernel.org/r/20240612122822.4033433-1-wangkefeng.wang@huawei.com
      Fixes: d2136d74 ("mm: support multi-size THP numa balancing")
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Liu Shixin <liushixin2@huawei.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Ryan Roberts <ryan.roberts@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/migrate: fix kernel BUG at mm/compaction.c:2761! · 8e279f97
      Hugh Dickins authored
      I hit the VM_BUG_ON(!list_empty(&cc->migratepages)) in compact_zone(); and
      if DEBUG_VM were off, then pages would be lost on a local list.
      
      Our convention is that if migrate_pages() reports complete success (0),
      then the migratepages list will be empty; but if it reports an error or
      some pages remaining, then its caller must putback_movable_pages().
      
      There's a new case in which migrate_pages() has been reporting complete
      success, but returning with pages left on the migratepages list: when
      migrate_pages_batch() successfully split a folio on the deferred list, but
      then the "Failure isn't counted" call does not dispose of them all.
      
      Since that block is expecting the large folio to have been counted as 1
      failure already, and since the return code is later adjusted to success
      whenever the returned list is found empty, the simple way to fix this
      safely is to count splitting the deferred folio as "a failure".
      
      Link: https://lkml.kernel.org/r/46c948b4-4dd8-6e03-4c7b-ce4e81cfa536@google.com
      Fixes: 7262f208 ("mm/migrate: split source folio if it is on deferred split list")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: mm: make map_fixed_noreplace test names stable · e7d2a28b
      Mark Brown authored
      KTAP parsers interpret the output of ksft_test_result_*() as being the
      name of the test.  The map_fixed_noreplace test uses a dynamically
      allocated base address for the mmap()s that it tests and currently
      includes this in the test names that it logs so the test names that are
      logged are not stable between runs.  It also uses multiples of PAGE_SIZE,
      which means that runs for kernels with different PAGE_SIZE configurations
      can't be directly compared.  Both these factors cause issues for CI
      systems when interpreting and displaying results.
      
      Fix this by replacing the current test names with fixed strings describing
      the intent of the mappings that are logged.  The existing messages with the
      actual addresses and sizes are retained as diagnostic prints to aid in
      debugging.
      
      Link: https://lkml.kernel.org/r/20240605-kselftest-mm-fixed-noreplace-v1-1-a235db8b9be9@kernel.org
      Fixes: 4838cf70 ("selftests/mm: map_fixed_noreplace: conform test to TAP format output")
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
      Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/memfd: add documentation for MFD_NOEXEC_SEAL MFD_EXEC · 653c5c75
      Jeff Xu authored
      When MFD_NOEXEC_SEAL was introduced, there was one big mistake: it didn't
      have proper documentation.  This led to a lot of confusion, especially
      about whether or not memfd created with the MFD_NOEXEC_SEAL flag is
      sealable.  Before MFD_NOEXEC_SEAL, memfd had to explicitly set
      MFD_ALLOW_SEALING to be sealable, so it's a fair question.
      
      As one might have noticed, unlike other flags in memfd_create,
      MFD_NOEXEC_SEAL is actually a combination of multiple flags.  The idea is
      to make it easier to use memfd in the most common way, which is NOEXEC +
      F_SEAL_EXEC + MFD_ALLOW_SEALING.  This works with sysctl vm.noexec to help
      existing applications move to a more secure way of using memfd.
      
      Proposals have been made to make MFD_NOEXEC_SEAL non-sealable unless
      MFD_ALLOW_SEALING is set, to be consistent with other flags [1].  Those
      are based on the viewpoint that each flag is an atomic unit, which is a
      reasonable assumption.  However, MFD_NOEXEC_SEAL was designed with the
      intent of promoting the most secure method of using memfd, and therefore
      combines multiple functionalities into one bit.
      
      Furthermore, MFD_NOEXEC_SEAL was added more than a year ago,
      and multiple applications and distributions have backported and utilized
      it.  Altering the ABI now presents a degree of risk and may lead to
      disruption.
      
      MFD_NOEXEC_SEAL is a new flag, and applications must change their code to
      use it.  There is no backward compatibility problem.
      
      When sysctl vm.noexec == 1 or 2, applications that don't set
      MFD_NOEXEC_SEAL or MFD_EXEC will get an MFD_NOEXEC_SEAL memfd.  An old
      application might break, but that is by design; on such a system
      vm.noexec = 0 should be used.  So there is no backward compatibility
      problem here either.
      
      I propose to include this documentation patch to assist in clarifying the
      semantics of MFD_NOEXEC_SEAL, thereby preventing any potential future
      confusion.
      
      Finally, I would like to express my gratitude to David Rheinsberg and
      Barnabás Pőcze for initiating the discussion on the topic of sealability.
      
      [1]
      https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
      
      [jeffxu@chromium.org: updates per Randy]
        Link: https://lkml.kernel.org/r/20240611034903.3456796-2-jeffxu@chromium.org
      [jeffxu@chromium.org: v3]
        Link: https://lkml.kernel.org/r/20240611231409.3899809-2-jeffxu@chromium.org
      Link: https://lkml.kernel.org/r/20240607203543.2151433-2-jeffxu@google.com
      Signed-off-by: Jeff Xu <jeffxu@chromium.org>
      Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Aleksa Sarai <cyphar@cyphar.com>
      Cc: Barnabás Pőcze <pobrn@protonmail.com>
      Cc: Daniel Verkamp <dverkamp@chromium.org>
      Cc: David Rheinsberg <david@readahead.eu>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Shuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: mmap: allow for the maximum number of bits for randomizing mmap_base by default · 3afb76a6
      Rafael Aquini authored
      An ASLR regression was noticed [1] and tracked down to file-mapped areas
      being backed by THP in recent kernels.  The 21-bit alignment constraint
      for such mappings reduces the entropy for randomizing the placement of
      64-bit library mappings and breaks ASLR completely for 32-bit libraries.
      
      The reported issue is easily addressed by increasing vm.mmap_rnd_bits and
      vm.mmap_rnd_compat_bits.  This patch just provides a simple way to set
      ARCH_MMAP_RND_BITS and ARCH_MMAP_RND_COMPAT_BITS to their maximum values
      allowed by the architecture at build time.
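
      The entropy loss can be illustrated with a rough back-of-the-envelope
      calculation.  The helper below is hypothetical, and the figures used
      with it are assumptions rather than values taken from this patch: it
      assumes the randomization bits are counted in units of pages (4 KiB,
      shift 12), so a 21-bit (2 MiB) alignment discards the low 9 of them.

```c
#include <assert.h>

/*
 * Randomization bits that still influence the final address once the
 * mapping is forced to a coarser alignment.  All parameters are shifts or
 * bit counts; nothing here queries a real kernel.
 */
static int model_effective_rnd_bits(int rnd_bits, int page_shift, int align_shift)
{
	int lost, left;

	assert(align_shift >= page_shift);
	lost = align_shift - page_shift;	/* bits cleared by the alignment */
	left = rnd_bits - lost;
	return left > 0 ? left : 0;
}
```

      Under these assumptions, 28 randomization bits leave 19 effective bits
      and 8 bits leave none, which matches the description of reduced entropy
      for 64-bit libraries and no entropy for 32-bit ones; raising the values
      to, say, 32 and 16 would leave 23 and 7.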
      
      [1] https://zolutal.github.io/aslrnt/
      
      [akpm@linux-foundation.org: default to `y' if 32-bit, per Rafael]
      Link: https://lkml.kernel.org/r/20240606180622.102099-1-aquini@redhat.com
      Fixes: 1854bc6e ("mm/readahead: Align file mappings for non-DAX")
      Signed-off-by: Rafael Aquini <aquini@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Samuel Holland <samuel.holland@sifive.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • gcov: add support for GCC 14 · c1558bc5
      Peter Oberparleiter authored
      Using gcov on kernels compiled with GCC 14 results in truncated 16-byte
      long .gcda files with no usable data.  To fix this, update GCOV_COUNTERS
      to match the value defined by GCC 14.
      
      Tested with GCC versions 14.1.0 and 13.2.0.
      
      Link: https://lkml.kernel.org/r/20240610092743.1609845-1-oberpar@linux.ibm.com
      Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Reported-by: Allison Henderson <allison.henderson@oracle.com>
      Reported-by: Chuck Lever III <chuck.lever@oracle.com>
      Tested-by: Chuck Lever <chuck.lever@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING · 7fea700e
      Oleg Nesterov authored
      kernel_wait4() doesn't sleep and returns -EINTR if there is no
      eligible child and signal_pending() is true.
      
      That is why zap_pid_ns_processes() clears TIF_SIGPENDING, but this is not
      enough: it should also clear TIF_NOTIFY_SIGNAL to make signal_pending()
      return false and avoid a busy-wait loop.
      
      Link: https://lkml.kernel.org/r/20240608120616.GB7947@redhat.com
      Fixes: 12db8b69 ("entry: Add support for TIF_NOTIFY_SIGNAL")
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reported-by: Rachel Menge <rachelmenge@linux.microsoft.com>
      Closes: https://lore.kernel.org/all/1386cd49-36d0-4a5c-85e9-bc42056a5a38@linux.microsoft.com/
      Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
      Tested-by: Wei Fu <fuweid89@gmail.com>
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Cc: Allen Pais <apais@linux.microsoft.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Joel Granados <j.granados@samsung.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Mateusz Guzik <mjguzik@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Mike Christie <michael.christie@oracle.com>
      Cc: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Cc: Zqiang <qiang.zhang1211@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: huge_memory: fix misused mapping_large_folio_support() for anon folios · 6a50c9b5
      Ran Xiaokai authored
      When I ran a large folio split test, the WARNING "[ 5059.122759][ T166]
      Cannot split file folio to non-0 order" was triggered.  But the test cases
      only cover anonymous folios, while mapping_large_folio_support() is only
      meaningful for page cache folios.
      
      In split_huge_page_to_list_to_order(), the folio passed to
      mapping_large_folio_support() may be an anonymous folio, and the
      folio_test_anon() check is missing, so splitting the anonymous THP fails.
      The same applies to shmem_mapping(), so add a check for both.  The
      shmem_mapping() call in __split_huge_page() is not affected: for anonymous
      folios the end parameter is set to -1, so (head[i].index >= end) is always
      false and shmem_mapping() is not called.
      
      Also add a VM_WARN_ON_ONCE() in mapping_large_folio_support() for anon
      mappings, so that incorrect use is easier to detect.
      
      THP folios may exist in the page cache even if the file system doesn't
      support large folios.  This is because, when CONFIG_TRANSPARENT_HUGEPAGE is
      enabled, khugepaged will try to collapse read-only file-backed pages into
      THPs, even though the mapping does not actually support multi-order large
      folios properly.
      
      Verified using /sys/kernel/debug/split_huge_pages: with this patch, large
      anon THPs are split successfully and the warning no longer appears.
      
      Link: https://lkml.kernel.org/r/202406071740485174hcFl7jRxncsHDtI-Pz-o@zte.com.cn
      Fixes: c010d47f ("mm: thp: split huge page to any lower order pages")
      Reviewed-by: Barry Song <baohua@kernel.org>
      Reviewed-by: Zi Yan <ziy@nvidia.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: xu xin <xu.xin16@zte.com.cn>
      Cc: Yang Yang <yang.yang29@zte.com.cn>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • lib/alloc_tag: fix RCU imbalance in pgalloc_tag_get() · a273559e
      Suren Baghdasaryan authored
      put_page_tag_ref() should be called only when get_page_tag_ref() returns a
      valid reference, because only in that case does get_page_tag_ref() enter
      an RCU read section, whereas put_page_tag_ref() calls rcu_read_unlock()
      even if the provided reference is NULL.  Fix pgalloc_tag_get(), which does
      not follow this rule and causes an RCU imbalance.  Add a warning in
      put_page_tag_ref() to catch any future mistakes.
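
      The pairing rule can be shown with a small user-space model.  All
      model_* names are hypothetical, and a plain counter stands in for the
      RCU read-side nesting depth; this is a sketch of the rule, not the
      kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

struct model_ref { int tag; };

static int model_rcu_depth;	/* stands in for RCU read-side nesting */

/* Enters the "read section" only when a valid reference is handed out. */
static struct model_ref *model_get_ref(struct model_ref *backing)
{
	if (!backing)
		return NULL;	/* nothing entered, so nothing to undo */
	model_rcu_depth++;
	return backing;
}

/* Always leaves the "read section", so it must only see valid references. */
static void model_put_ref(struct model_ref *ref)
{
	assert(ref != NULL);	/* plays the role of the added warning */
	model_rcu_depth--;
}

/* Correct caller: put is called only if get returned a reference. */
static int model_lookup_tag(struct model_ref *backing)
{
	struct model_ref *ref = model_get_ref(backing);
	int tag = -1;

	if (ref) {
		tag = ref->tag;
		model_put_ref(ref);
	}
	return tag;
}
```

      A caller that invoked model_put_ref() unconditionally would drive the
      counter negative on the NULL path, which is the imbalance described
      above.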
      
      Link: https://lkml.kernel.org/r/20240601233840.617458-1-surenb@google.com
      Fixes: cc92eba1 ("mm: fix non-compound multi-order memory accounting in __free_pages")
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202405271029.6d2f9c4c-lkp@intel.com
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Kent Overstreet <kent.overstreet@linux.dev>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • lib/alloc_tag: do not register sysctl interface when CONFIG_SYSCTL=n · c944bf60
      Suren Baghdasaryan authored
      Memory allocation profiling tries to register its sysctl interface even
      when CONFIG_SYSCTL=n, resulting in proc_do_static_key() being undefined.
      Prevent that by skipping sysctl registration for such configurations.
      
      Link: https://lkml.kernel.org/r/20240601233831.617124-1-surenb@google.com
      Fixes: 22d407b1 ("lib: add allocation tagging support for memory allocation profiling")
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202405280616.wcOGWJEj-lkp@intel.com/
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Kent Overstreet <kent.overstreet@linux.dev>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • MAINTAINERS: remove Lorenzo as vmalloc reviewer · 3ab85f40
      Lorenzo Stoakes authored
      I haven't had the bandwidth to review vmalloc patches recently and I
      suspect I won't be able to do so consistently moving forwards, so I think
      it's best if I remove myself as reviewer for the time being.
      
      Link: https://lkml.kernel.org/r/20240602205510.108807-1-lstoakes@gmail.com
      Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Revert "mm: init_mlocked_on_free_v3" · 384a746b
      David Hildenbrand authored
      There was insufficient review and no agreement that this is the right
      approach.
      
      There are serious flaws with the implementation that make processes using
      mlock() not even work with simple fork() [1] and we get reliable crashes
      when rebooting.
      
      Further, simply because we might be unmapping a single PTE of a large
      mlocked folio, we shouldn't zero out the whole folio.
      
      ... especially because the code can also *corrupt* unrelated memory because
      	kernel_init_pages(page, folio_nr_pages(folio));
      
      Could end up writing outside of the actual folio if we work with a tail
      page.
      
      Let's revert it.  Once there is agreement that this is the right approach,
      the issues have been fixed, and there has been reasonable review and proper
      testing, we can consider it again.
      
      [1] https://lkml.kernel.org/r/4da9da2f-73e4-45fd-b62f-a8a513314057@redhat.com
      
      Link: https://lkml.kernel.org/r/20240605091710.38961-1-david@redhat.com
      Fixes: ba42b524 ("mm: init_mlocked_on_free_v3")
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reported-by: David Wang <00107082@163.com>
      Closes: https://lore.kernel.org/lkml/20240528151340.4282-1-00107082@163.com/
      Reported-by: Lance Yang <ioworker0@gmail.com>
      Closes: https://lkml.kernel.org/r/20240601140917.43562-1-ioworker0@gmail.com
      Acked-by: Lance Yang <ioworker0@gmail.com>
      Cc: York Jasper Niebuhr <yjnworkstation@gmail.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/page_table_check: fix crash on ZONE_DEVICE · 8bb592c2
      Peter Xu authored
      Not all pages may apply to pgtable check.  One example is ZONE_DEVICE
      pages: they map PFNs directly, and they don't allocate page_ext at all
      even if there's struct page around.  One may reference
      devm_memremap_pages().
      
      When both ZONE_DEVICE and page-table-check enabled, then try to map some
      dax memories, one can trigger kernel bug constantly now when the kernel
      was trying to inject some pfn maps on the dax device:
      
       kernel BUG at mm/page_table_check.c:55!
      
      While it's perfectly legal to use set_pxx_at() on ZONE_DEVICE pages for
      page fault resolution, skip all the checks in the pgtable checker if
      page_ext doesn't exist, which applies to ZONE_DEVICE and possibly more.
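
      The intended behaviour can be sketched in user space as follows.  This
      is an illustration only, not the actual patch: the sketch_* types and
      sketch_check_set() are hypothetical stand-ins, with a NULL 'ext' pointer
      modelling a page that has no page_ext.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's real definitions. */
struct sketch_page_ext {
	int anon_map_count;
	int file_map_count;
};

struct sketch_page {
	struct sketch_page_ext *ext;	/* NULL models a page with no page_ext */
};

/*
 * Account one mapping of 'page'.  Returns false when the page carries no
 * extension data, in which case the check is skipped instead of crashing.
 */
static bool sketch_check_set(struct sketch_page *page, bool anon)
{
	struct sketch_page_ext *ext;

	assert(page != NULL);
	ext = page->ext;
	if (!ext)
		return false;	/* e.g. a ZONE_DEVICE-like page: nothing to check */

	if (anon)
		ext->anon_map_count++;
	else
		ext->file_map_count++;
	return true;
}
```

      The point of the early return is that a missing page_ext is treated as
      "not tracked" rather than as a fatal inconsistency.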
      
      Link: https://lkml.kernel.org/r/20240605212146.994486-1-peterx@redhat.com
      Fixes: df4e817b ("mm: page table check")
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Alistair Popple <apopple@nvidia.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      8bb592c2
    • Yury Norov's avatar
      gcc: disable '-Warray-bounds' for gcc-9 · 8e5bd4ea
      Yury Norov authored
      '-Warray-bounds' is already disabled for gcc-10+.  Now that we've merged
      bitmap_{read,write}(), I see the following error when building the kernel
      with gcc-9.4 (Ubuntu 20.04.4 LTS) for x86_64 allmodconfig:
      
      drivers/pinctrl/pinctrl-cy8c95x0.c: In function `cy8c95x0_read_regs_mask.isra.0':
      include/linux/bitmap.h:756:18: error: array subscript [1, 288230376151711744] is outside array bounds of `long unsigned int[1]' [-Werror=array-bounds]
        756 |  value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
            |               ~~~^~~~~~~~~~~
      
      The immediate reason is that the commit b4475970 ("bitmap: make
      bitmap_{get,set}_value8() use bitmap_{read,write}()") switched the
      bitmap_get_value8() to an alias of bitmap_read(); the same for 'set'.
      
      Now, the code that triggers -Warray-bounds calls the function like this:
      
        #define MAX_BANK 8
        #define BANK_SZ 8
        #define MAX_LINE        (MAX_BANK * BANK_SZ)
        DECLARE_BITMAP(tval, MAX_LINE); // 64-bit map: unsigned long tval[1]
      
        read_val |= bitmap_get_value8(tval, i * BANK_SZ) & ~bits;
      
      bitmap_read() is implemented such that it may conditionally dereference a
      pointer beyond the boundary like this:
      
              unsigned long offset = start % BITS_PER_LONG;
              unsigned long space = BITS_PER_LONG - offset;
      
              if (space >= nbits)
                      return (map[index] >> offset) & BITMAP_LAST_WORD_MASK(nbits);
      
              value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
              value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
              return (value_low >> offset) | (value_high << space);
      
      In case of bitmap_get_value8(), it's impossible to violate the boundary
      because 'space >= nbits' is always true for byte-aligned 8-bit access,
      so map[index + 1] is never read.  So, this is clearly a false-positive.
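
      The claim can be checked with a small user-space sketch.  The sketch_*
      names and SKETCH_BITS_PER_LONG are hypothetical stand-ins, not the
      kernel's definitions; the helper only reproduces the 'space >= nbits'
      test from the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Word size in bits; a user-space stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (8 * sizeof(unsigned long))

/*
 * Hypothetical helper: returns true if a bitmap_read()-style access of
 * 'nbits' bits starting at bit 'start' would take the slow path and
 * dereference map[index + 1].
 */
static bool sketch_read_touches_next_word(unsigned long start,
					  unsigned long nbits)
{
	unsigned long offset = start % SKETCH_BITS_PER_LONG;
	unsigned long space = SKETCH_BITS_PER_LONG - offset;

	assert(nbits > 0 && nbits <= SKETCH_BITS_PER_LONG);
	return space < nbits;
}

/* Checks every byte-aligned 8-bit read within a bitmap of 'nwords' words. */
static bool sketch_byte_reads_stay_in_word(size_t nwords)
{
	unsigned long total_bits = nwords * SKETCH_BITS_PER_LONG;

	for (unsigned long start = 0; start < total_bits; start += 8)
		if (sketch_read_touches_next_word(start, 8))
			return false;
	return true;
}
```

      A byte-aligned offset always leaves at least 8 bits of space in the
      current word, so only unaligned reads (for example 8 bits starting at
      bit 60) can reach the next word.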
      
      The same type of false positive breaks my allmodconfig build in many
      places.  gcc-8 is clear, however.
      
      Link: https://lkml.kernel.org/r/20240522225830.1201778-1-yury.norov@gmail.com
      Fixes: b4475970 ("bitmap: make bitmap_{get,set}_value8() use bitmap_{read,write}()")
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Nhat Pham <nphamcs@gmail.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Yoann Congal <yoann.congal@smile.fr>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      8e5bd4ea
    • Joseph Qi's avatar
      ocfs2: fix NULL pointer dereference in ocfs2_abort_trigger() · 685d03c3
      Joseph Qi authored
      bdev->bd_super has been removed and commit 8887b94d changed the usage
      from bdev->bd_super to b_assoc_map->host->i_sb.  Since ocfs2 hasn't set
      bh->b_assoc_map, it will trigger a NULL pointer dereference when calling
      into ocfs2_abort_trigger().
      
      Actually this was pointed out in the past, see commit 74e364ad.  But
      I made a mistake when reviewing commit 8887b94d and thereby
      re-introduced this regression.
      
      Since we cannot revive bdev in buffer head, fix this issue by
      initializing all types of ocfs2 triggers when filling the super block, and
      then getting the specific ocfs2 trigger from ocfs2_caching_info when
      accessing the journal.
      
      [joseph.qi@linux.alibaba.com: v2]
        Link: https://lkml.kernel.org/r/20240602112045.1112708-1-joseph.qi@linux.alibaba.com
      Link: https://lkml.kernel.org/r/20240530110630.3933832-2-joseph.qi@linux.alibaba.com
      Fixes: 8887b94d ("ocfs2: stop using bdev->bd_super for journal error logging")
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: Heming Zhao <heming.zhao@suse.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>	[6.6+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      685d03c3
    • Joseph Qi's avatar
      ocfs2: fix NULL pointer dereference in ocfs2_journal_dirty() · 58f7e1e2
      Joseph Qi authored
      bdev->bd_super has been removed and commit 8887b94d changed the usage
      from bdev->bd_super to b_assoc_map->host->i_sb.  This introduces the
      following NULL pointer dereference in ocfs2_journal_dirty() since
      b_assoc_map is still not initialized.  This can be easily reproduced by
      running xfstests generic/186, which simulates running out of credits.
      
      [  134.351592] BUG: kernel NULL pointer dereference, address: 0000000000000000
      ...
      [  134.355341] RIP: 0010:ocfs2_journal_dirty+0x14f/0x160 [ocfs2]
      ...
      [  134.365071] Call Trace:
      [  134.365312]  <TASK>
      [  134.365524]  ? __die_body+0x1e/0x60
      [  134.365868]  ? page_fault_oops+0x13d/0x4f0
      [  134.366265]  ? __pfx_bit_wait_io+0x10/0x10
      [  134.366659]  ? schedule+0x27/0xb0
      [  134.366981]  ? exc_page_fault+0x6a/0x140
      [  134.367356]  ? asm_exc_page_fault+0x26/0x30
      [  134.367762]  ? ocfs2_journal_dirty+0x14f/0x160 [ocfs2]
      [  134.368305]  ? ocfs2_journal_dirty+0x13d/0x160 [ocfs2]
      [  134.368837]  ocfs2_create_new_meta_bhs.isra.51+0x139/0x2e0 [ocfs2]
      [  134.369454]  ocfs2_grow_tree+0x688/0x8a0 [ocfs2]
      [  134.369927]  ocfs2_split_and_insert.isra.67+0x35c/0x4a0 [ocfs2]
      [  134.370521]  ocfs2_split_extent+0x314/0x4d0 [ocfs2]
      [  134.371019]  ocfs2_change_extent_flag+0x174/0x410 [ocfs2]
      [  134.371566]  ocfs2_add_refcount_flag+0x3fa/0x630 [ocfs2]
      [  134.372117]  ocfs2_reflink_remap_extent+0x21b/0x4c0 [ocfs2]
      [  134.372994]  ? inode_update_timestamps+0x4a/0x120
      [  134.373692]  ? __pfx_ocfs2_journal_access_di+0x10/0x10 [ocfs2]
      [  134.374545]  ? __pfx_ocfs2_journal_access_di+0x10/0x10 [ocfs2]
      [  134.375393]  ocfs2_reflink_remap_blocks+0xe4/0x4e0 [ocfs2]
      [  134.376197]  ocfs2_remap_file_range+0x1de/0x390 [ocfs2]
      [  134.376971]  ? security_file_permission+0x29/0x50
      [  134.377644]  vfs_clone_file_range+0xfe/0x320
      [  134.378268]  ioctl_file_clone+0x45/0xa0
      [  134.378853]  do_vfs_ioctl+0x457/0x990
      [  134.379422]  __x64_sys_ioctl+0x6e/0xd0
      [  134.379987]  do_syscall_64+0x5d/0x170
      [  134.380550]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
      [  134.381231] RIP: 0033:0x7fa4926397cb
      [  134.381786] Code: 73 01 c3 48 8b 0d bd 56 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8d 56 38 00 f7 d8 64 89 01 48
      [  134.383930] RSP: 002b:00007ffc2b39f7b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [  134.384854] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fa4926397cb
      [  134.385734] RDX: 00007ffc2b39f7f0 RSI: 000000004020940d RDI: 0000000000000003
      [  134.386606] RBP: 0000000000000000 R08: 00111a82a4f015bb R09: 00007fa494221000
      [  134.387476] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      [  134.388342] R13: 0000000000f10000 R14: 0000558e844e2ac8 R15: 0000000000f10000
      [  134.389207]  </TASK>
      
      Fix it by only aborting the transaction and journal in
      ocfs2_journal_dirty() for now, and leave ocfs2_abort() for later, when an
      aborted handle is detected, e.g. when starting the next transaction.
      Also log the handle details in this case.
      
      Link: https://lkml.kernel.org/r/20240530110630.3933832-1-joseph.qi@linux.alibaba.com
      Fixes: 8887b94d ("ocfs2: stop using bdev->bd_super for journal error logging")
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: Heming Zhao <heming.zhao@suse.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>	[6.6+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      58f7e1e2
  2. 09 Jun, 2024 5 commits
    • Linus Torvalds's avatar
      Linux 6.10-rc3 · 83a7eefe
      Linus Torvalds authored
      83a7eefe
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of... · b8481381
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Update copies of kernel headers, which resulted in support for the
         new 'mseal' syscall, SUBVOL statx return mask bit, RISC-V and PPC
         prctls, fcntl's DUPFD_QUERY, POSTED_MSI_NOTIFICATION IRQ vector,
         'map_shadow_stack' syscall for x86-32.
      
       - Revert perf.data record memory allocation optimization that ended up
         causing a regression, work is being done to re-introduce it in the
         next merge window.
      
       - Fix handling of minimal vmlinux.h file used with BPF's CO-RE when
         interrupting the build.
      
      * tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
        perf bpf: Fix handling of minimal vmlinux.h file when interrupting the build
        Revert "perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event"
        tools headers arm64: Sync arm64's cputype.h with the kernel sources
        tools headers uapi: Sync linux/stat.h with the kernel sources to pick STATX_SUBVOL
        tools headers UAPI: Update i915_drm.h with the kernel sources
        tools headers UAPI: Sync kvm headers with the kernel sources
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        tools headers: Update the syscall tables and unistd.h, mostly to support the new 'mseal' syscall
        perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources to pick POSTED_MSI_NOTIFICATION
        perf beauty: Update copy of linux/socket.h with the kernel sources
        tools headers UAPI: Sync fcntl.h with the kernel sources to pick F_DUPFD_QUERY
        tools headers UAPI: Sync linux/prctl.h with the kernel sources
        tools include UAPI: Sync linux/stat.h with the kernel sources
      b8481381
    • Linus Torvalds's avatar
      Merge tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · 637c2dfc
      Linus Torvalds authored
      Pull EDAC fixes from Borislav Petkov:
      
       - Convert PCI core error codes to proper error numbers since the latter
         get propagated all the way up to the module loading functions
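
      As a rough user-space illustration of such a conversion: the SKETCH_*
      values and the errno mapping below are stand-ins modelled from memory on
      the kernel's PCIBIOS_* codes and its pcibios_err_to_errno() helper, so
      treat them as assumptions and consult include/linux/pci.h for the
      authoritative definitions.

```c
#include <assert.h>
#include <errno.h>

/* Assumed stand-ins for the positive PCIBIOS_* status codes. */
#define SKETCH_PCIBIOS_SUCCESSFUL		0x00
#define SKETCH_PCIBIOS_FUNC_NOT_SUPPORTED	0x81
#define SKETCH_PCIBIOS_BAD_VENDOR_ID		0x83
#define SKETCH_PCIBIOS_DEVICE_NOT_FOUND		0x86
#define SKETCH_PCIBIOS_BAD_REGISTER_NUMBER	0x87
#define SKETCH_PCIBIOS_SET_FAILED		0x88
#define SKETCH_PCIBIOS_BUFFER_TOO_SMALL		0x89

/*
 * Map a positive PCIBIOS-style status to a negative errno.  Zero and
 * negative inputs are passed through, on the assumption that they are
 * already success or errno values.
 */
static int sketch_pcibios_err_to_errno(int err)
{
	if (err <= SKETCH_PCIBIOS_SUCCESSFUL)
		return err;

	switch (err) {
	case SKETCH_PCIBIOS_FUNC_NOT_SUPPORTED:
		return -ENOENT;
	case SKETCH_PCIBIOS_BAD_VENDOR_ID:
		return -ENOTTY;
	case SKETCH_PCIBIOS_DEVICE_NOT_FOUND:
		return -ENODEV;
	case SKETCH_PCIBIOS_BAD_REGISTER_NUMBER:
		return -EFAULT;
	case SKETCH_PCIBIOS_SET_FAILED:
		return -EIO;
	case SKETCH_PCIBIOS_BUFFER_TOO_SMALL:
		return -ENOSPC;
	}

	return -ERANGE;	/* unknown positive code */
}
```

      The design point is that callers further up the stack only ever see
      zero or a negative errno, never a raw positive status code.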
      
      * tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
        EDAC/igen6: Convert PCIBIOS_* return codes to errnos
        EDAC/amd64: Convert PCIBIOS_* return codes to errnos
      637c2dfc
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 771ed661
      Linus Torvalds authored
      Pull clk fix from Stephen Boyd:
       "One fix for the SiFive PRCI clocks so that the device boots again.
      
        This driver was registering clkdev lookups that were always going to
        be useless. This wasn't a problem until clkdev started returning an
        error in these cases, causing this driver to fail probe, and thus boot
        to fail because clks are essential for most drivers. The fix is
        simple, don't use clkdev because this is a DT based system where
        clkdev isn't used"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: sifive: Do not register clkdevs for PRCI clocks
      771ed661
    • Linus Torvalds's avatar
      Merge tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · c5dbc2ed
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Two small smb3 client fixes:
      
         - fix deadlock in umount
      
         - minor cleanup due to netfs change"
      
      * tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Don't advance the I/O iterator before terminating subrequest
        smb: client: fix deadlock in smb2_find_smb_tcon()
      c5dbc2ed
  3. 08 Jun, 2024 8 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · 061d1af7
      Linus Torvalds authored
      Pull HID fixes from Benjamin Tissoires:
      
       - fix potential read out of bounds in hid-asus (Andrew Ballance)
      
       - fix endian-conversion on little endian systems in intel-ish-hid (Arnd
         Bergmann)
      
       - A couple of new input event codes (Aseda Aboagye)
      
       - error handling fixes in hid-nvidia-shield (Chen Ni), hid-nintendo
         (Christophe JAILLET), hid-logitech-dj (José Expósito)
      
       - current leakage fix while the device is in suspend on an i2c-hid
         laptop (Johan Hovold)
      
       - other assorted smaller fixes and device ID / quirk entry additions
      
      * tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: Ignore battery for ELAN touchscreens 2F2C and 4116
        HID: i2c-hid: elan: fix reset suspend current leakage
        dt-bindings: HID: i2c-hid: elan: add 'no-reset-on-power-off' property
        dt-bindings: HID: i2c-hid: elan: add Elan eKTH5015M
        dt-bindings: HID: i2c-hid: add dedicated Ilitek ILI2901 schema
        input: Add support for "Do Not Disturb"
        input: Add event code for accessibility key
        hid: asus: asus_report_fixup: fix potential read out of bounds
        HID: logitech-hidpp: add missing MODULE_DESCRIPTION() macro
        HID: intel-ish-hid: fix endian-conversion
        HID: nintendo: Fix an error handling path in nintendo_hid_probe()
        HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
        HID: core: remove unnecessary WARN_ON() in implement()
        HID: nvidia-shield: Add missing check for input_ff_create_memless
        HID: intel-ish-hid: Fix build error for COMPILE_TEST
      061d1af7
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v6.10-2' of... · 329f70c5
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix the initial state of the save button in 'make gconfig'
      
       - Improve the Kconfig documentation
      
       - Fix a Kconfig bug regarding property visibility
      
       - Fix build breakage for systems where 'sed' is not installed in /bin
      
       - Fix a false warning about missing MODULE_DESCRIPTION()
      
      * tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.o
        kbuild: explicitly run mksysmap as sed script from link-vmlinux.sh
        kconfig: remove wrong expr_trans_bool()
        kconfig: doc: document behavior of 'select' and 'imply' followed by 'if'
        kconfig: doc: fix a typo in the note about 'imply'
        kconfig: gconf: give a proper initial state to the Save button
        kconfig: remove unneeded code for user-supplied values being out of range
      329f70c5
    • Linus Torvalds's avatar
      Merge tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 1e7ccdd3
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
      
       - fixes for the new ipu6 driver (and related fixes to mei csi driver)
      
       - fix a double debugfs removal in the mgb4 driver
      
       - a documentation fix
      
      * tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        media: intel/ipu6: add csi2 port sanity check in notifier bound
        media: intel/ipu6: update the maximum supported csi2 port number to 6
        media: mei: csi: Warn less verbosely of a missing device fwnode
        media: mei: csi: Put the IPU device reference
        media: intel/ipu6: fix the buffer flags caused by wrong parentheses
        media: intel/ipu6: Fix an error handling path in isys_probe()
        media: intel/ipu6: Move isys_remove() close to isys_probe()
        media: intel/ipu6: Fix some redundant resources freeing in ipu6_pci_remove()
        media: Documentation: v4l: Fix ACTIVE route flag
        media: mgb4: Fix double debugfs remove
      1e7ccdd3
    • Linus Torvalds's avatar
      Merge tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 36714d69
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
      
       - Fix possible memory leak on riscv-intc irqchip driver load failures
      
       - Fix boot crash in the sifive-plic irqchip driver caused by recently
         changed boot initialization order
      
       - Fix race condition in the gic-v3-its irqchip driver
      
      * tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()
        irqchip/sifive-plic: Chain to parent IRQ after handlers are ready
        irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails
      36714d69
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7cedb020
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Miscellaneous fixes:
      
         - Fix kexec() crash if call depth tracking is enabled
      
         - Fix SMN reads on inaccessible registers on certain AMD systems"
      
      * tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/amd_nb: Check for invalid SMN reads
        x86/kexec: Fix bug with call depth tracking
      7cedb020
    • Linus Torvalds's avatar
      Merge tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7cec2e16
      Linus Torvalds authored
      Pull perf event fix from Ingo Molnar:
       "Fix race between perf_event_free_task() and perf_event_release_kernel()
        that can result in missed wakeups and hung tasks"
      
      * tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/core: Fix missing wakeup when waiting for context reference
      7cec2e16
    • Linus Torvalds's avatar
      Merge tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bbc5332b
      Linus Torvalds authored
      Pull locking doc fix from Ingo Molnar:
       "Fix typos in the kerneldoc of some of the atomic APIs"
      
      * tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldoc
      bbc5332b
    • Linus Torvalds's avatar
      Merge tag 'mm-hotfixes-stable-2024-06-07-15-24' of... · dc772f82
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "14 hotfixes, 6 of which are cc:stable.
      
        All except the nilfs2 fix affect MM and all are singletons - see the
        changelogs for details"
      
      * tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
        mm: fix xyz_noprof functions calling profiled functions
        codetag: avoid race at alloc_slab_obj_exts
        mm/hugetlb: do not call vma_add_reservation upon ENOMEM
        mm/ksm: fix ksm_zero_pages accounting
        mm/ksm: fix ksm_pages_scanned accounting
        kmsan: do not wipe out origin when doing partial unpoisoning
        vmalloc: check CONFIG_EXECMEM in is_vmalloc_or_module_addr()
        mm: page_alloc: fix highatomic typing in multi-block buddies
        nilfs2: fix potential kernel bug due to lack of writeback flag waiting
        memcg: remove the lockdep assert from __mod_objcg_mlstate()
        mm: arm64: fix the out-of-bounds issue in contpte_clear_young_dirty_ptes
        mm: huge_mm: fix undefined reference to `mthp_stats' for CONFIG_SYSFS=n
        mm: drop the 'anon_' prefix for swap-out mTHP counters
      dc772f82
  4. 07 Jun, 2024 8 commits