1. 21 Apr, 2023 3 commits
    • Peter Xu's avatar
      mm/hugetlb: fix uffd-wp bit lost when unsharing happens · 0f230bc2
      Peter Xu authored
      When we try to unshare a pinned page for a private hugetlb, uffd-wp bit
      can get lost during unsharing.
      
      When above condition met, one can lose uffd-wp bit on the privately mapped
      hugetlb page.  It allows the page to be writable even if it should still be
      wr-protected.  I assume it can mean data loss.
      
      This should be very rare, only if an unsharing happened on a private
      hugetlb page with uffd-wp protected (e.g.  in a child which shares the
      same page with parent with UFFD_FEATURE_EVENT_FORK enabled).
      
      When I wrote the reproducer (provided in the last patch) I needed to
      use the newest gup_test cmd introduced by David to trigger it because I
      don't even know another way to do a proper RO longerm pin.
      
      Besides that, it needs a bunch of other conditions all met:
      
              (1) hugetlb being mapped privately,
              (2) userfaultfd registered with WP and EVENT_FORK,
              (3) the user app fork()s, then,
              (4) RO longterm pin onto a wr-protected anonymous page.
      
      If it's not impossible to hit in production I'd say extremely rare.
      
      Link: https://lkml.kernel.org/r/20230417195317.898696-3-peterx@redhat.com
      Fixes: 166f3ecc ("mm/hugetlb: hook page faults for uffd write protection")
      Signed-off-by: default avatarPeter Xu <peterx@redhat.com>
      Reported-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Reviewed-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: Mika Penttilä <mpenttil@redhat.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      0f230bc2
    • Peter Xu's avatar
      mm/hugetlb: fix uffd-wp during fork() · 5a2f8d22
      Peter Xu authored
      Patch series "mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins",
      v2.
      
      
      This patch (of 6):
      
      There're a bunch of things that were wrong:
      
        - Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
          rather than huge_pte_uffd_wp().
      
        - When copying over a pte, we should drop uffd-wp bit when
          !EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).
      
        - When doing early CoW for private hugetlb (e.g. when the parent page was
          pinned), uffd-wp bit should be properly carried over if necessary.
      
      No bug reported probably because most people do not even care about these
      corner cases, but they are still bugs and can be exposed by the recent unit
      tests introduced, so fix all of them in one shot.
      
      Link: https://lkml.kernel.org/r/20230417195317.898696-1-peterx@redhat.com
      Link: https://lkml.kernel.org/r/20230417195317.898696-2-peterx@redhat.com
      Fixes: bc70fbf2 ("mm/hugetlb: handle uffd-wp during fork()")
      Signed-off-by: default avatarPeter Xu <peterx@redhat.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: Mika Penttilä <mpenttil@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      5a2f8d22
    • Zqiang's avatar
      kasan: fix lockdep report invalid wait context · be41d814
      Zqiang authored
      For kernels built with the following options and booting
      
      CONFIG_SLUB=y
      CONFIG_DEBUG_LOCKDEP=y
      CONFIG_PROVE_LOCKING=y
      CONFIG_PROVE_RAW_LOCK_NESTING=y
      
      [    0.523115] [ BUG: Invalid wait context ]
      [    0.523315] 6.3.0-rc1-yocto-standard+ #739 Not tainted
      [    0.523649] -----------------------------
      [    0.523663] swapper/0/0 is trying to lock:
      [    0.523663] ffff888035611360 (&c->lock){....}-{3:3}, at: put_cpu_partial+0x2e/0x1e0
      [    0.523663] other info that might help us debug this:
      [    0.523663] context-{2:2}
      [    0.523663] no locks held by swapper/0/0.
      [    0.523663] stack backtrace:
      [    0.523663] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.3.0-rc1-yocto-standard+ #739
      [    0.523663] Call Trace:
      [    0.523663]  <IRQ>
      [    0.523663]  dump_stack_lvl+0x64/0xb0
      [    0.523663]  dump_stack+0x10/0x20
      [    0.523663]  __lock_acquire+0x6c4/0x3c10
      [    0.523663]  lock_acquire+0x188/0x460
      [    0.523663]  put_cpu_partial+0x5a/0x1e0
      [    0.523663]  __slab_free+0x39a/0x520
      [    0.523663]  ___cache_free+0xa9/0xc0
      [    0.523663]  qlist_free_all+0x7a/0x160
      [    0.523663]  per_cpu_remove_cache+0x5c/0x70
      [    0.523663]  __flush_smp_call_function_queue+0xfc/0x330
      [    0.523663]  generic_smp_call_function_single_interrupt+0x13/0x20
      [    0.523663]  __sysvec_call_function+0x86/0x2e0
      [    0.523663]  sysvec_call_function+0x73/0x90
      [    0.523663]  </IRQ>
      [    0.523663]  <TASK>
      [    0.523663]  asm_sysvec_call_function+0x1b/0x20
      [    0.523663] RIP: 0010:default_idle+0x13/0x20
      [    0.523663] RSP: 0000:ffffffff83e07dc0 EFLAGS: 00000246
      [    0.523663] RAX: 0000000000000000 RBX: ffffffff83e1e200 RCX: ffffffff82a83293
      [    0.523663] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8119a6b1
      [    0.523663] RBP: ffffffff83e07dc8 R08: 0000000000000001 R09: ffffed1006ac0d66
      [    0.523663] R10: ffff888035606b2b R11: ffffed1006ac0d65 R12: 0000000000000000
      [    0.523663] R13: ffffffff83e1e200 R14: ffffffff84a7d980 R15: 0000000000000000
      [    0.523663]  default_idle_call+0x6c/0xa0
      [    0.523663]  do_idle+0x2e1/0x330
      [    0.523663]  cpu_startup_entry+0x20/0x30
      [    0.523663]  rest_init+0x152/0x240
      [    0.523663]  arch_call_rest_init+0x13/0x40
      [    0.523663]  start_kernel+0x331/0x470
      [    0.523663]  x86_64_start_reservations+0x18/0x40
      [    0.523663]  x86_64_start_kernel+0xbb/0x120
      [    0.523663]  secondary_startup_64_no_verify+0xe0/0xeb
      [    0.523663]  </TASK>
      
      The local_lock_irqsave() is invoked in put_cpu_partial() and happens in
      IPI context, due to the CONFIG_PROVE_RAW_LOCK_NESTING=y (the
      LD_WAIT_CONFIG not equal to LD_WAIT_SPIN), so acquire local_lock in IPI
      context will trigger above calltrace.
      
      This commit therefore moves qlist_free_all() from hard-irq context to task
      context.
      
      Link: https://lkml.kernel.org/r/20230327120019.1027640-1-qiang1.zhang@intel.comSigned-off-by: default avatarZqiang <qiang1.zhang@intel.com>
      Acked-by: default avatarMarco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@gmail.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      be41d814
  2. 18 Apr, 2023 37 commits