1. 06 Jul, 2024 1 commit
    • hfsplus: fix uninit-value in copy_name · 0570730c
      Edward Adam Davis authored
      [syzbot reported]
      BUG: KMSAN: uninit-value in sized_strscpy+0xc4/0x160
       sized_strscpy+0xc4/0x160
       copy_name+0x2af/0x320 fs/hfsplus/xattr.c:411
       hfsplus_listxattr+0x11e9/0x1a50 fs/hfsplus/xattr.c:750
       vfs_listxattr fs/xattr.c:493 [inline]
       listxattr+0x1f3/0x6b0 fs/xattr.c:840
       path_listxattr fs/xattr.c:864 [inline]
       __do_sys_listxattr fs/xattr.c:876 [inline]
       __se_sys_listxattr fs/xattr.c:873 [inline]
       __x64_sys_listxattr+0x16b/0x2f0 fs/xattr.c:873
       x64_sys_call+0x2ba0/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:195
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Uninit was created at:
       slab_post_alloc_hook mm/slub.c:3877 [inline]
       slab_alloc_node mm/slub.c:3918 [inline]
       kmalloc_trace+0x57b/0xbe0 mm/slub.c:4065
       kmalloc include/linux/slab.h:628 [inline]
       hfsplus_listxattr+0x4cc/0x1a50 fs/hfsplus/xattr.c:699
       vfs_listxattr fs/xattr.c:493 [inline]
       listxattr+0x1f3/0x6b0 fs/xattr.c:840
       path_listxattr fs/xattr.c:864 [inline]
       __do_sys_listxattr fs/xattr.c:876 [inline]
       __se_sys_listxattr fs/xattr.c:873 [inline]
       __x64_sys_listxattr+0x16b/0x2f0 fs/xattr.c:873
       x64_sys_call+0x2ba0/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:195
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      [Fix]
      When allocating memory to strbuf, initialize memory to 0.
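
      A minimal userspace sketch of why zeroing the buffer helps, assuming calloc() as a stand-in for a zeroing kernel allocation. The names alloc_name_buf(), fill_name() and NAME_BUF_LEN are hypothetical and are not taken from fs/hfsplus/xattr.c. Because every byte starts as 0, a later reader never sees uninitialized bytes past the copied name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_BUF_LEN 64 /* hypothetical size, not the hfsplus constant */

/* Allocate a name buffer whose bytes are all zero. */
char *alloc_name_buf(void)
{
	return calloc(1, NAME_BUF_LEN);
}

/*
 * Copy at most NAME_BUF_LEN - 1 bytes of src without writing a terminator.
 * The untouched tail of dst is still zero, so dst remains a valid string.
 */
size_t fill_name(char *dst, const char *src)
{
	size_t n;

	assert(dst != NULL && src != NULL);
	n = strlen(src);
	if (n >= NAME_BUF_LEN)
		n = NAME_BUF_LEN - 1;
	memcpy(dst, src, n);
	return n;
}
```

      With malloc() in place of calloc(), the bytes after the copied name would be indeterminate, which is the class of problem KMSAN reports above.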
      
      Reported-and-tested-by: syzbot+efde959319469ff8d4d7@syzkaller.appspotmail.com
      Signed-off-by: Edward Adam Davis <eadavis@qq.com>
      Link: https://lore.kernel.org/r/tencent_8BBB6433BC9E1C1B7B4BDF1BF52574BA8808@qq.com
      Reported-and-tested-by: syzbot+01ade747b16e9c8030e0@syzkaller.appspotmail.com
      Signed-off-by: Christian Brauner <brauner@kernel.org>
  2. 05 Jul, 2024 3 commits
    • vfs: don't mod negative dentry count when on shrinker list · aabfe57e
      Brian Foster authored
      The nr_dentry_negative counter is intended to only account negative
      dentries that are present on the superblock LRU. Therefore, the LRU
      add, remove and isolate helpers modify the counter based on whether
      the dentry is negative, but the shrinker list related helpers do not
      modify the counter, and the paths that change a dentry between
      positive and negative only do so if DCACHE_LRU_LIST is set.
      
      The problem with this is that a dentry on a shrinker list still has
      DCACHE_LRU_LIST set to indicate ->d_lru is in use. The additional
      DCACHE_SHRINK_LIST flag denotes whether the dentry is on LRU or a
      shrink related list. Therefore if a relevant operation (i.e. unlink)
      occurs while a dentry is present on a shrinker list, and the
      associated codepath only checks for DCACHE_LRU_LIST, then it is
      technically possible to modify the negative dentry count for a
      dentry that is off the LRU. Since the shrinker list related helpers
      do not modify the negative dentry count (because non-LRU dentries
      should not be included in the count) when the dentry is ultimately
      removed from the shrinker list, this can cause the negative dentry
      count to become permanently inaccurate.
      
      This problem can be reproduced via a heavy file create/unlink vs.
      drop_caches workload. On an 80-CPU system, I start 80 tasks each
      running a 1k file create/delete loop, and one task spinning on
      drop_caches. After 10 minutes or so of runtime, the idle/clean cache
      negative dentry count increases from somewhere in the range of 5-10
      entries to several hundred (and increasingly grows beyond
      nr_dentry_unused).
      
      Tweak the logic in the paths that turn a dentry negative or positive
      to filter out the case where the dentry is present on a shrink
      related list. This allows the above workload to maintain an accurate
      negative dentry count.
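
      The filter described above can be modelled in plain C. This is a sketch under assumptions, not the kernel code: the flag values, struct model_dentry and the helper names are hypothetical, and a plain long stands in for the kernel's counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this model; the kernel's differ. */
#define MODEL_LRU_LIST    0x1u /* d_lru is in use */
#define MODEL_SHRINK_LIST 0x2u /* d_lru is linked on a shrink list */

struct model_dentry {
	unsigned int flags;
	bool negative;
};

long nr_negative; /* counts negative dentries that are on the LRU only */

/* True only when the dentry sits on the LRU itself, not a shrink list. */
bool on_lru_proper(const struct model_dentry *d)
{
	return (d->flags & (MODEL_LRU_LIST | MODEL_SHRINK_LIST)) == MODEL_LRU_LIST;
}

/* Turn a dentry negative (e.g. on unlink), counting it only if on the LRU. */
void make_negative(struct model_dentry *d)
{
	assert(!d->negative);
	d->negative = true;
	if (on_lru_proper(d))
		nr_negative++;
}

/* Turn a dentry positive again, applying the same rule to the counter. */
void make_positive(struct model_dentry *d)
{
	assert(d->negative);
	d->negative = false;
	if (on_lru_proper(d))
		nr_negative--;
}
```

      Checking both flags together is what keeps a dentry on a shrink list from being counted, even though it still has the LRU flag set.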
      
      Fixes: af0c9af1 ("fs/dcache: Track & report number of negative dentries")
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Link: https://lore.kernel.org/r/20240703121301.247680-1-bfoster@redhat.com
      Acked-by: Ian Kent <ikent@redhat.com>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • filelock: fix potential use-after-free in posix_lock_inode · 1b3ec4f7
      Jeff Layton authored
      Light Hsieh reported a KASAN UAF warning in trace_posix_lock_inode().
      The request pointer had been changed earlier to point to a lock entry
      that was added to the inode's list. However, before the tracepoint could
      fire, another task raced in and freed that lock.
      
      Fix this by moving the tracepoint inside the spinlock, which should
      ensure that this doesn't happen.
      
      Fixes: 74f6f591 ("locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock")
      Link: https://lore.kernel.org/linux-fsdevel/724ffb0a2962e912ea62bb0515deadf39c325112.camel@kernel.org/
      Reported-by: Light Hsieh (謝明燈) <Light.Hsieh@mediatek.com>
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20240702-filelock-6-10-v1-1-96e766aadc98@kernel.org
      Reviewed-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • Merge patch series "cachefiles: random bugfixes" · eeb17984
      Christian Brauner authored
      libaokun@huaweicloud.com <libaokun@huaweicloud.com> says:
      
      This is the third version of this patch series, in which another patch set
      is subsumed into this one to avoid confusing the two patch sets.
      (https://patchwork.kernel.org/project/linux-fsdevel/list/?series=854914)
      
      We've been testing ondemand mode for cachefiles since January, and we're
      almost done. We hit a lot of issues during the testing period, and this
      patch series fixes some of the issues. The patches have passed internal
      testing without regression.
      
      The following is a brief overview of the patches, see the patches for
      more details.
      
      Patch 1-2: Add fscache_try_get_volume() helper function to avoid
      fscache_volume use-after-free on cache withdrawal.
      
      Patch 3: Fix cachefiles_lookup_cookie() and cachefiles_withdraw_cache()
      concurrency causing cachefiles_volume use-after-free.
      
      Patch 4: Propagate error codes returned by vfs_getxattr() to avoid
      endless loops.
      
      Patch 5-7: A read request waiting for reopen could be closed maliciously
      before the reopen worker is executing or waiting to be scheduled. So
      ondemand_object_worker() may be called after the info and object and even
      the cache have been freed, triggering a use-after-free. So Patch 7 uses
      cancel_work_sync() in cachefiles_ondemand_clean_object() to cancel the
      reopen worker or wait for it to finish. Since it makes no sense to wait
      for the daemon to complete the reopen request, and to keep this pointless
      wait from blocking cancel_work_sync(), Patch 5 uses the DROPPING state to
      stop requests that have not been sent yet from being generated, and
      Patch 6 flushes the requests of the current object before
      cancel_work_sync().
      
      Patch 8: Allocate msg_id cyclically so that a reused msg_id cannot
      mislead the daemon and cause a hang.
      
      Patch 9: Hold xas_lock during polling to avoid dereferencing reqs causing
      use-after-free. This issue was triggered frequently in our tests, and we
      found that anolis 5.10 had fixed it. So to avoid failing the test, this
      patch is pushed upstream as well.
      
      Baokun Li (7):
        netfs, fscache: export fscache_put_volume() and add
          fscache_try_get_volume()
        cachefiles: fix slab-use-after-free in fscache_withdraw_volume()
        cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie()
        cachefiles: propagate errors from vfs_getxattr() to avoid infinite
          loop
        cachefiles: stop sending new request when dropping object
        cachefiles: cancel all requests for the object that is being dropped
        cachefiles: cyclic allocation of msg_id to avoid reuse
      
      Hou Tao (1):
        cachefiles: wait for ondemand_object_worker to finish when dropping
          object
      
      Jingbo Xu (1):
        cachefiles: add missing lock protection when polling
      
       fs/cachefiles/cache.c          | 45 ++++++++++++++++++++++++++++-
       fs/cachefiles/daemon.c         |  4 +--
       fs/cachefiles/internal.h       |  3 ++
       fs/cachefiles/ondemand.c       | 52 ++++++++++++++++++++++++++++++----
       fs/cachefiles/volume.c         |  1 -
       fs/cachefiles/xattr.c          |  5 +++-
       fs/netfs/fscache_volume.c      | 14 +++++++++
       fs/netfs/internal.h            |  2 --
       include/linux/fscache-cache.h  |  6 ++++
       include/trace/events/fscache.h |  4 +++
       10 files changed, 123 insertions(+), 13 deletions(-)
      
      Link: https://lore.kernel.org/r/20240628062930.2467993-1-libaokun@huaweicloud.com
      Signed-off-by: Christian Brauner <brauner@kernel.org>
  3. 03 Jul, 2024 9 commits
    • cachefiles: add missing lock protection when polling · cf5bb09e
      Jingbo Xu authored
      Add the missing lock protection in the poll routine when iterating the
      xarray.

      Even with the RCU read lock held, only the slot of the radix tree is
      guaranteed to stay pinned; the data structure (e.g. struct
      cachefiles_req) stored in the slot has no such guarantee. The poll
      routine iterates the radix tree and dereferences each cachefiles_req,
      so the RCU read lock is not adequate in this case and a spinlock is
      needed.
      
      Fixes: b817e22b ("cachefiles: narrow the scope of triggering EPOLLIN events in ondemand mode")
      Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-10-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: cyclic allocation of msg_id to avoid reuse · 19f4f399
      Baokun Li authored
      Reusing the msg_id after a maliciously completed reopen request may cause
      a read request to remain unprocessed and result in a hang, as shown below:
      
             t1       |      t2       |      t3
      -------------------------------------------------
      cachefiles_ondemand_select_req
       cachefiles_ondemand_object_is_close(A)
       cachefiles_ondemand_set_object_reopening(A)
       queue_work(fscache_object_wq, &info->work)
                      ondemand_object_worker
                       cachefiles_ondemand_init_object(A)
                        cachefiles_ondemand_send_req(OPEN)
                          // get msg_id 6
                          wait_for_completion(&req_A->done)
      cachefiles_ondemand_daemon_read
       // read msg_id 6 req_A
       cachefiles_ondemand_get_fd
       copy_to_user
                                      // Malicious completion msg_id 6
                                      copen 6,-1
                                      cachefiles_ondemand_copen
                                       complete(&req_A->done)
                                       // will not set the object to close
                                       // because ondemand_id && fd is valid.
      
                      // ondemand_object_worker() is done
                      // but the object is still reopening.
      
                                      // new open req_B
                                      cachefiles_ondemand_init_object(B)
                                       cachefiles_ondemand_send_req(OPEN)
                                       // reuse msg_id 6
      process_open_req
       copen 6,A.size
       // The expected failed copen was executed successfully
      
      The copen is expected to fail; when it does, it closes the fd, which sets
      the object to close, and the close then triggers a reopen. However,
      because msg_id reuse lets the copen succeed, the anonymous fd is not
      closed until the daemon exits. Read requests waiting for the reopen to
      complete may therefore trigger a hung task.
      
      To avoid this issue, allocate the msg_id cyclically to avoid reusing the
      msg_id for a very short duration of time.
      
      Fixes: c8383054 ("cachefiles: notify the user daemon when looking up cookie")
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-9-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: wait for ondemand_object_worker to finish when dropping object · 12e009d6
      Hou Tao authored
      When queuing ondemand_object_worker() to re-open the object,
      cachefiles_object is not pinned. The cachefiles_object may be freed when
      the pending read request is completed intentionally and the related
      erofs is umounted. If ondemand_object_worker() runs after the object is
      freed, it will cause a use-after-free, as shown below.
      
      process A  process B  process C  process D
      
      cachefiles_ondemand_send_req()
      // send a read req X
      // wait for its completion
      
                 // close ondemand fd
                 cachefiles_ondemand_fd_release()
                 // set object as CLOSE
      
                             cachefiles_ondemand_daemon_read()
                             // set object as REOPENING
                             queue_work(fscache_wq, &info->ondemand_work)
      
                                      // close /dev/cachefiles
                                      cachefiles_daemon_release
                                      cachefiles_flush_reqs
                                      complete(&req->done)
      
      // read req X is completed
      // umount the erofs fs
      cachefiles_put_object()
      // object will be freed
      cachefiles_ondemand_deinit_obj_info()
      kmem_cache_free(object)
                             // both info and object are freed
                             ondemand_object_worker()
      
      When dropping an object, it is no longer necessary to reopen the object,
      so use cancel_work_sync() to cancel or wait for ondemand_object_worker()
      to finish.
      
      Fixes: 0a7e54c1 ("cachefiles: resend an open request if the read request's object is closed")
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-8-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: cancel all requests for the object that is being dropped · 751f5246
      Baokun Li authored
      Requests for an object are useless once the object has been dropped, so
      cancel them to avoid causing other problems.
      
      This prepares for the later addition of cancel_work_sync(). After the
      reopen request is generated, cancel it so that cancel_work_sync() does
      not block waiting for the daemon to complete the reopen request.

      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-7-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: stop sending new request when dropping object · b2415d1f
      Baokun Li authored
      Add CACHEFILES_ONDEMAND_OBJSTATE_DROPPING, which indicates that the
      cachefiles object is being dropped. It is set after the close request for
      the dropped object completes, and no new requests may be sent once the
      object is in this state.
      
      This prepares for the later addition of cancel_work_sync(). It prevents
      leftover reopen requests from being sent, which avoids processing
      unnecessary requests and keeps cancel_work_sync() from blocking while it
      waits for the daemon to complete the reopen requests.

      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-6-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: propagate errors from vfs_getxattr() to avoid infinite loop · 0ece614a
      Baokun Li authored
      In cachefiles_check_volume_xattr(), the error returned by vfs_getxattr()
      is not passed to ret, so it ends up returning -ESTALE, which leads to an
      endless loop as follows:
      
      cachefiles_acquire_volume
      retry:
        ret = cachefiles_check_volume_xattr
          ret = -ESTALE
          xlen = vfs_getxattr // return -EIO
          // The ret is not updated when xlen < 0, so -ESTALE is returned.
          return ret
        // Supposed to leave the loop at this check.
        if (ret != -ESTALE)
            goto error_dir;
        cachefiles_bury_object
          //  EIO causes rename failure
        goto retry;
      
      Hence propagate the error returned by vfs_getxattr() to avoid the above
      issue. Do the same in cachefiles_check_auxdata().
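
      The control flow above can be reduced to a small C model. It is only a sketch: getxattr_fn is a hypothetical stand-in for vfs_getxattr(), and check_xattr() is a simplified analogue of cachefiles_check_volume_xattr() rather than its real body.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for vfs_getxattr(): returns a length or -errno. */
typedef long (*getxattr_fn)(void *buf, size_t len);

/*
 * Returns 0 if the xattr has the expected length, -ESTALE if it does not,
 * or the getter's own error code when the read itself failed.
 */
int check_xattr(getxattr_fn get, size_t expected_len)
{
	char buf[16];
	int ret = -ESTALE;
	long xlen;

	assert(get != NULL);
	xlen = get(buf, sizeof(buf));
	if (xlen < 0)
		ret = (int)xlen; /* propagate the error instead of -ESTALE */
	else if ((size_t)xlen == expected_len)
		ret = 0;
	return ret;
}

/* Test doubles: one getter that fails with -EIO, one that returns 4 bytes. */
long get_fails_eio(void *buf, size_t len) { (void)buf; (void)len; return -EIO; }
long get_short(void *buf, size_t len) { (void)buf; (void)len; return 4; }
```

      A caller that retries only on -ESTALE now stops when the getter reports -EIO, which is the exit condition the retry loop above relies on.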
      
      Fixes: 32e15003 ("fscache, cachefiles: Store the volume coherency data")
      Fixes: 72b95785 ("cachefiles: Implement metadata/coherency data storage in xattrs")
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-5-libaokun@huaweicloud.com
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie() · 5d8f8057
      Baokun Li authored
      We got the following issue in our fault injection stress test:
      
      ==================================================================
      BUG: KASAN: slab-use-after-free in cachefiles_withdraw_cookie+0x4d9/0x600
      Read of size 8 at addr ffff888118efc000 by task kworker/u78:0/109
      
      CPU: 13 PID: 109 Comm: kworker/u78:0 Not tainted 6.8.0-dirty #566
      Call Trace:
       <TASK>
       kasan_report+0x93/0xc0
       cachefiles_withdraw_cookie+0x4d9/0x600
       fscache_cookie_state_machine+0x5c8/0x1230
       fscache_cookie_worker+0x91/0x1c0
       process_one_work+0x7fa/0x1800
       [...]
      
      Allocated by task 117:
       kmalloc_trace+0x1b3/0x3c0
       cachefiles_acquire_volume+0xf3/0x9c0
       fscache_create_volume_work+0x97/0x150
       process_one_work+0x7fa/0x1800
       [...]
      
      Freed by task 120301:
       kfree+0xf1/0x2c0
       cachefiles_withdraw_cache+0x3fa/0x920
       cachefiles_put_unbind_pincount+0x1f6/0x250
       cachefiles_daemon_release+0x13b/0x290
       __fput+0x204/0xa00
       task_work_run+0x139/0x230
       do_exit+0x87a/0x29b0
       [...]
      ==================================================================
      
      Following is the process that triggers the issue:
      
                 p1                |             p2
      ------------------------------------------------------------
                                    fscache_begin_lookup
                                     fscache_begin_volume_access
                                      fscache_cache_is_live(fscache_cache)
      cachefiles_daemon_release
       cachefiles_put_unbind_pincount
        cachefiles_daemon_unbind
         cachefiles_withdraw_cache
          fscache_withdraw_cache
           fscache_set_cache_state(cache, FSCACHE_CACHE_IS_WITHDRAWN);
          cachefiles_withdraw_objects(cache)
          fscache_wait_for_objects(fscache)
            atomic_read(&fscache_cache->object_count) == 0
                                    fscache_perform_lookup
                                     cachefiles_lookup_cookie
                                      cachefiles_alloc_object
                                       refcount_set(&object->ref, 1);
                                       object->volume = volume
                                       fscache_count_object(vcookie->cache);
                                        atomic_inc(&fscache_cache->object_count)
          cachefiles_withdraw_volumes
           cachefiles_withdraw_volume
            fscache_withdraw_volume
            __cachefiles_free_volume
             kfree(cachefiles_volume)
                                    fscache_cookie_state_machine
                                     cachefiles_withdraw_cookie
                                      cache = object->volume->cache;
                                      // cachefiles_volume UAF !!!
      
      After setting FSCACHE_CACHE_IS_WITHDRAWN, wait for all the cookie lookups
      to complete first, and then wait for fscache_cache->object_count == 0 to
      avoid the cookie exiting after the volume has been freed and triggering
      the above issue. Therefore call fscache_withdraw_volume() before calling
      cachefiles_withdraw_objects().
      
      This way, after setting FSCACHE_CACHE_IS_WITHDRAWN, only the following two
      cases will occur:
      1) fscache_begin_lookup fails in fscache_begin_volume_access().
      2) fscache_withdraw_volume() will ensure that fscache_count_object() has
         been executed before calling fscache_wait_for_objects().
      
      Fixes: fe2140e2 ("cachefiles: Implement volume support")
      Suggested-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-4-libaokun@huaweicloud.com
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: fix slab-use-after-free in fscache_withdraw_volume() · 522018a0
      Baokun Li authored
      We got the following issue in our fault injection stress test:
      
      ==================================================================
      BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370
      Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798
      
      CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #565
      Call Trace:
       kasan_check_range+0xf6/0x1b0
       fscache_withdraw_volume+0x2e1/0x370
       cachefiles_withdraw_volume+0x31/0x50
       cachefiles_withdraw_cache+0x3ad/0x900
       cachefiles_put_unbind_pincount+0x1f6/0x250
       cachefiles_daemon_release+0x13b/0x290
       __fput+0x204/0xa00
       task_work_run+0x139/0x230
      
      Allocated by task 5820:
       __kmalloc+0x1df/0x4b0
       fscache_alloc_volume+0x70/0x600
       __fscache_acquire_volume+0x1c/0x610
       erofs_fscache_register_volume+0x96/0x1a0
       erofs_fscache_register_fs+0x49a/0x690
       erofs_fc_fill_super+0x6c0/0xcc0
       vfs_get_super+0xa9/0x140
       vfs_get_tree+0x8e/0x300
       do_new_mount+0x28c/0x580
       [...]
      
      Freed by task 5820:
       kfree+0xf1/0x2c0
       fscache_put_volume.part.0+0x5cb/0x9e0
       erofs_fscache_unregister_fs+0x157/0x1b0
       erofs_kill_sb+0xd9/0x1c0
       deactivate_locked_super+0xa3/0x100
       vfs_get_super+0x105/0x140
       vfs_get_tree+0x8e/0x300
       do_new_mount+0x28c/0x580
       [...]
      ==================================================================
      
      Following is the process that triggers the issue:
      
              mount failed         |         daemon exit
      ------------------------------------------------------------
       deactivate_locked_super        cachefiles_daemon_release
        erofs_kill_sb
         erofs_fscache_unregister_fs
          fscache_relinquish_volume
           __fscache_relinquish_volume
            fscache_put_volume(fscache_volume, fscache_volume_put_relinquish)
             zero = __refcount_dec_and_test(&fscache_volume->ref, &ref);
                                       cachefiles_put_unbind_pincount
                                        cachefiles_daemon_unbind
                                         cachefiles_withdraw_cache
                                          cachefiles_withdraw_volumes
                                           list_del_init(&volume->cache_link)
             fscache_free_volume(fscache_volume)
              cache->ops->free_volume
               cachefiles_free_volume
                list_del_init(&cachefiles_volume->cache_link);
              kfree(fscache_volume)
                                           cachefiles_withdraw_volume
                                            fscache_withdraw_volume
                                             fscache_volume->n_accesses
                                             // fscache_volume UAF !!!
      
      The fscache_volume in cache->volumes must not have been freed yet, but its
      reference count may be 0. So use the new fscache_try_get_volume() helper
      function to try to take a reference on it.
      
      If the reference count of fscache_volume is 0, fscache_put_volume() is
      freeing it, so wait for it to be removed from cache->volumes.
      
      If its reference count is not 0, call cachefiles_withdraw_volume() with
      reference count protection to avoid the above issue.
      
      Fixes: fe2140e2 ("cachefiles: Implement volume support")
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.com
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume() · 85b08b31
      Baokun Li authored
      Export fscache_put_volume() and add fscache_try_get_volume()
      helper function to allow cachefiles to get/put fscache_volume
      via linux/fscache-cache.h.

      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-2-libaokun@huaweicloud.com
      Signed-off-by: Christian Brauner <brauner@kernel.org>
  4. 30 Jun, 2024 16 commits
    • Linux 6.10-rc6 · 22a40d14
      Linus Torvalds authored
    • Merge tag 'ata-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux · aca7c377
      Linus Torvalds authored
      Pull ata fixes from Niklas Cassel:
      
       - Add NOLPM quirk for all Crucial BX SSD1 models.
      
         Considering that we now have had bug reports for 3 different BX SSD1
         variants from Crucial with the same product name, make the quirk more
         inclusive, to catch more device models from the same generation.
      
       - Fix a trivial NULL pointer dereference in the error path for
         ata_host_release().
      
       - Create an ata_port_free() helper, so that we don't miss freeing
         ata_port struct members when freeing a struct ata_port.
      
       - Fix a trivial double free in the error path for ata_host_alloc().
      
       - Ensure that we remove the libata "remapped NVMe device count" sysfs
         entry on .probe() error.
      
      * tag 'ata-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
        ata: ahci: Clean up sysfs file on error
        ata: libata-core: Fix double free on error
        ata,scsi: libata-core: Do not leak memory for ata_port struct members
        ata: libata-core: Fix null pointer dereference on error
        ata: libata-core: Add ATA_HORKAGE_NOLPM for all Crucial BX SSD1 models
    • ata: ahci: Clean up sysfs file on error · eeb25a09
      Niklas Cassel authored
      .probe() (ahci_init_one()) calls sysfs_add_file_to_group(), however,
      if probe() fails after this call, we currently never call
      sysfs_remove_file_from_group().
      
      (The sysfs_remove_file_from_group() call in .remove() (ahci_remove_one())
      does not help, as .remove() is not called on .probe() error.)
      
      Thus, if probe() fails after the sysfs_add_file_to_group() call, the next
      time we insmod the module we will get:
      
      sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:04.0/remapped_nvme'
      CPU: 11 PID: 954 Comm: modprobe Not tainted 6.10.0-rc5 #43
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x5d/0x80
       sysfs_warn_dup.cold+0x17/0x23
       sysfs_add_file_mode_ns+0x11a/0x130
       sysfs_add_file_to_group+0x7e/0xc0
       ahci_init_one+0x31f/0xd40 [ahci]
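
      The shape of the fix is the usual probe error unwinding: anything created during probe must be undone on a later failure, because remove is not called. A minimal model, assuming hypothetical helpers add_file(), remove_file() and model_probe() and a single boolean in place of sysfs:

```c
#include <assert.h>
#include <stdbool.h>

bool file_present; /* models whether the sysfs file currently exists */

/* Fails with -1 if the file already exists, like a duplicate sysfs name. */
int add_file(void)
{
	if (file_present)
		return -1;
	file_present = true;
	return 0;
}

void remove_file(void)
{
	assert(file_present);
	file_present = false;
}

/* later_step_ok models any probe step that runs after the file is added. */
int model_probe(bool later_step_ok)
{
	int rc = add_file();

	if (rc)
		return rc;
	if (!later_step_ok) {
		rc = -5; /* arbitrary error code for the model */
		goto err_remove_file;
	}
	return 0;

err_remove_file:
	remove_file(); /* undo add_file(); no remove callback will run */
	return rc;
}
```

      Without the err_remove_file step, a failed probe would leave file_present set and the next probe would fail in add_file(), which is the duplicate filename warning shown above.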
      
      Fixes: 894fba7f ("ata: ahci: Add sysfs attribute to show remapped NVMe device count")
      Cc: stable@vger.kernel.org
      Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Link: https://lore.kernel.org/r/20240629124210.181537-10-cassel@kernel.org
      Signed-off-by: Niklas Cassel <cassel@kernel.org>
    • ata: libata-core: Fix double free on error · ab9e0c52
      Niklas Cassel authored
      If e.g. the ata_port_alloc() call in ata_host_alloc() fails, we will jump
      to the err_out label, which will call devres_release_group().
      devres_release_group() will trigger a call to ata_host_release().
      ata_host_release() calls kfree(host), so executing the kfree(host) in
      ata_host_alloc() will lead to a double free:
      
      kernel BUG at mm/slub.c:553!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 11 PID: 599 Comm: (udev-worker) Not tainted 6.10.0-rc5 #47
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
      RIP: 0010:kfree+0x2cf/0x2f0
      Code: 5d 41 5e 41 5f 5d e9 80 d6 ff ff 4d 89 f1 41 b8 01 00 00 00 48 89 d9 48 89 da
      RSP: 0018:ffffc90000f377f0 EFLAGS: 00010246
      RAX: ffff888112b1f2c0 RBX: ffff888112b1f2c0 RCX: ffff888112b1f320
      RDX: 000000000000400b RSI: ffffffffc02c9de5 RDI: ffff888112b1f2c0
      RBP: ffffc90000f37830 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffc90000f37610 R11: 617461203a736b6e R12: ffffea00044ac780
      R13: ffff888100046400 R14: ffffffffc02c9de5 R15: 0000000000000006
      FS:  00007f2f1cabe980(0000) GS:ffff88813b380000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f2f1c3acf75 CR3: 0000000111724000 CR4: 0000000000750ef0
      PKRU: 55555554
      Call Trace:
       <TASK>
       ? __die_body.cold+0x19/0x27
       ? die+0x2e/0x50
       ? do_trap+0xca/0x110
       ? do_error_trap+0x6a/0x90
       ? kfree+0x2cf/0x2f0
       ? exc_invalid_op+0x50/0x70
       ? kfree+0x2cf/0x2f0
       ? asm_exc_invalid_op+0x1a/0x20
       ? ata_host_alloc+0xf5/0x120 [libata]
       ? ata_host_alloc+0xf5/0x120 [libata]
       ? kfree+0x2cf/0x2f0
       ata_host_alloc+0xf5/0x120 [libata]
       ata_host_alloc_pinfo+0x14/0xa0 [libata]
       ahci_init_one+0x6c9/0xd20 [ahci]
      
      Ensure that we will not call kfree(host) twice, by performing the kfree()
      only if the devres_open_group() call failed.
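
The ownership rule behind this fix can be sketched in userspace C. This is only an illustration under stated assumptions: dfree_host, dfree_release_group() and the two failure flags are hypothetical and merely mimic a managed resource group whose release callback frees the host. Each error path frees the host exactly once.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct dfree_host {
    int n_ports;
};

/* Counts frees so that a caller can check each path frees exactly once. */
static int dfree_free_calls;

static void dfree_host_free(struct dfree_host *host)
{
    dfree_free_calls++;
    free(host);
}

/* Stand-in for releasing a resource group whose release callback frees the host. */
static void dfree_release_group(struct dfree_host *host)
{
    dfree_host_free(host);
}

static struct dfree_host *dfree_host_alloc(bool group_open_fails,
                                           bool port_alloc_fails)
{
    struct dfree_host *host = calloc(1, sizeof(*host));

    if (!host)
        return NULL;

    /* No group exists yet, so this function still owns the host. */
    if (group_open_fails)
        goto err_free;

    /* From here on the group owns the host; do not free it directly. */
    if (port_alloc_fails)
        goto err_out;

    return host;

err_out:
    dfree_release_group(host);  /* the release callback frees the host */
    return NULL;
err_free:
    dfree_host_free(host);      /* only reached when no group was opened */
    return NULL;
}
```

Keeping the two error labels separate makes the owner of the allocation explicit at each exit.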
      
      Fixes: dafd6c49 ("libata: ensure host is free'd on error exit paths")
      Cc: stable@vger.kernel.org
      Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Link: https://lore.kernel.org/r/20240629124210.181537-9-cassel@kernel.org
      Signed-off-by: Niklas Cassel <cassel@kernel.org>
    • Niklas Cassel's avatar
      ata,scsi: libata-core: Do not leak memory for ata_port struct members · f6549f53
      Niklas Cassel authored
      libsas is currently not freeing all the struct ata_port struct members,
      e.g. ncq_sense_buf for a driver supporting Command Duration Limits (CDL).
      
      Add a function, ata_port_free(), that is used to free an ata_port,
      including its struct members. It makes sense to keep the code related to
      freeing an ata_port in its own function, which will also free all the
      struct members of struct ata_port.
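
The idea of a single free function that also releases the struct members can be sketched in userspace C. This is an assumption-laden illustration, not the libata code: pf_port, its two buffers and the 64-byte sizes are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical port structure that owns two separately allocated buffers. */
struct pf_port {
    char *sense_buf;
    char *stats_buf;
};

/* Frees the port and every member it owns; tolerates NULL like free(). */
static void pf_port_free(struct pf_port *ap)
{
    if (!ap)
        return;
    free(ap->sense_buf);
    free(ap->stats_buf);
    free(ap);
}

static struct pf_port *pf_port_alloc(void)
{
    struct pf_port *ap = calloc(1, sizeof(*ap));

    if (!ap)
        return NULL;

    ap->sense_buf = malloc(64);
    ap->stats_buf = malloc(64);
    if (!ap->sense_buf || !ap->stats_buf) {
        /* The same helper handles a partially constructed port. */
        pf_port_free(ap);
        return NULL;
    }
    return ap;
}
```

Because every caller goes through the one helper, a newly added member only needs to be freed in one place.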
      
      Fixes: 18bd7718 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
      Reviewed-by: John Garry <john.g.garry@oracle.com>
      Link: https://lore.kernel.org/r/20240629124210.181537-8-cassel@kernel.org
      Signed-off-by: Niklas Cassel <cassel@kernel.org>
    • Niklas Cassel's avatar
      ata: libata-core: Fix null pointer dereference on error · 5d92c7c5
      Niklas Cassel authored
      If the ata_port_alloc() call in ata_host_alloc() fails,
      ata_host_release() will get called.
      
      However, the code in ata_host_release() tries to free ata_port struct
      members unconditionally, which can lead to the following:
      
      BUG: unable to handle page fault for address: 0000000000003990
      PGD 0 P4D 0
      Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 10 PID: 594 Comm: (udev-worker) Not tainted 6.10.0-rc5 #44
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
      RIP: 0010:ata_host_release.cold+0x2f/0x6e [libata]
      Code: e4 4d 63 f4 44 89 e2 48 c7 c6 90 ad 32 c0 48 c7 c7 d0 70 33 c0 49 83 c6 0e 41
      RSP: 0018:ffffc90000ebb968 EFLAGS: 00010246
      RAX: 0000000000000041 RBX: ffff88810fb52e78 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff88813b3218c0 RDI: ffff88813b3218c0
      RBP: ffff88810fb52e40 R08: 0000000000000000 R09: 6c65725f74736f68
      R10: ffffc90000ebb738 R11: 73692033203a746e R12: 0000000000000004
      R13: 0000000000000000 R14: 0000000000000011 R15: 0000000000000006
      FS:  00007f6cc55b9980(0000) GS:ffff88813b300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000003990 CR3: 00000001122a2000 CR4: 0000000000750ef0
      PKRU: 55555554
      Call Trace:
       <TASK>
       ? __die_body.cold+0x19/0x27
       ? page_fault_oops+0x15a/0x2f0
       ? exc_page_fault+0x7e/0x180
       ? asm_exc_page_fault+0x26/0x30
       ? ata_host_release.cold+0x2f/0x6e [libata]
       ? ata_host_release.cold+0x2f/0x6e [libata]
       release_nodes+0x35/0xb0
       devres_release_group+0x113/0x140
       ata_host_alloc+0xed/0x120 [libata]
       ata_host_alloc_pinfo+0x14/0xa0 [libata]
       ahci_init_one+0x6c9/0xd20 [ahci]
      
      Do not access ata_port struct members unconditionally.
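
A release routine that tolerates a partially populated port array can be sketched in userspace C. This is a hedged illustration only: rel_host, rel_port, the pmp_link member and REL_MAX_PORTS are hypothetical names, and the return value exists just to make the behaviour observable.

```c
#include <stddef.h>
#include <stdlib.h>

#define REL_MAX_PORTS 4

struct rel_port {
    char *pmp_link;   /* hypothetical member owned by the port */
};

struct rel_host {
    int n_ports;
    struct rel_port *ports[REL_MAX_PORTS];
};

/*
 * Releases every allocated port and returns how many were released.
 * Slots that were never allocated are NULL and must be skipped before
 * any member is touched.
 */
static int rel_host_release(struct rel_host *host)
{
    int released = 0;

    for (int i = 0; i < host->n_ports; i++) {
        struct rel_port *ap = host->ports[i];

        if (!ap)
            continue;

        free(ap->pmp_link);
        free(ap);
        host->ports[i] = NULL;
        released++;
    }
    return released;
}
```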
      
      Fixes: 633273a3 ("libata-pmp: hook PMP support and enable it")
      Cc: stable@vger.kernel.org
      Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: John Garry <john.g.garry@oracle.com>
      Link: https://lore.kernel.org/r/20240629124210.181537-7-cassel@kernel.org
      Signed-off-by: Niklas Cassel <cassel@kernel.org>
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v6.10-3' of... · e0b668b0
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Remove the executable bit from installed DTB files
      
       - Escape $ in subshell execution in the debian-orig target
      
       - Fix RPM builds with CONFIG_MODULES=n
      
       - Fix xconfig with the O= option
      
       - Fix scripts_gdb with the O= option
      
      * tag 'kbuild-fixes-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: scripts/gdb: bring the "abspath" back
        kbuild: Use $(obj)/%.cc to fix host C++ module builds
        kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n
        kbuild: Fix build target deb-pkg: ln: failed to create hard link
        kbuild: doc: Update default INSTALL_MOD_DIR from extra to updates
        kbuild: Install dtb files as 0644 in Makefile.dtbinst
    • Linus Torvalds's avatar
      x86-32: fix cmpxchg8b_emu build error with clang · 76932725
      Linus Torvalds authored
      The kernel test robot reported that clang no longer compiles the 32-bit
      x86 kernel in some configurations due to commit 95ece481
      ("locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}()
      functions").
      
      The build fails with
      
        arch/x86/include/asm/cmpxchg_32.h:149:9: error: inline assembly requires more registers than available
      
      and the reason seems to be that not only does the cmpxchg8b instruction
      need four fixed registers (EDX:EAX and ECX:EBX), with the emulation
      fallback the inline asm also wants a fifth fixed register for the
      address (it uses %esi for that, but that's just a software convention
      with cmpxchg8b_emu).
      
      Avoiding using another pointer input to the asm (and just forcing it to
      use the "0(%esi)" addressing that we end up requiring for the sw
      fallback) seems to fix the issue.
      
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202406230912.F6XFIyA6-lkp@intel.com/
      Fixes: 95ece481 ("locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions")
      Link: https://lore.kernel.org/all/202406230912.F6XFIyA6-lkp@intel.com/
      Suggested-by: Uros Bizjak <ubizjak@gmail.com>
      Reviewed-and-Tested-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      Merge tag 'char-misc-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 84dd4373
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are some small driver fixes for 6.10-rc6. Included in here are:
      
         - IIO driver fixes for reported issues
      
         - Counter driver fix for a reported problem.
      
        All of these have been in linux-next this week with no reported
        issues"
      
      * tag 'char-misc-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        counter: ti-eqep: enable clock at probe
        iio: chemical: bme680: Fix sensor data read operation
        iio: chemical: bme680: Fix overflows in compensate() functions
        iio: chemical: bme680: Fix calibration data variable
        iio: chemical: bme680: Fix pressure value output
        iio: humidity: hdc3020: fix hysteresis representation
        iio: dac: fix ad9739a random config compile error
        iio: accel: fxls8962af: select IIO_BUFFER & IIO_KFIFO_BUF
        iio: adc: ad7266: Fix variable checking bug
        iio: xilinx-ams: Don't include ams_ctrl_channels in scan_mask
    • Linus Torvalds's avatar
      Merge tag 'staging-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 12529aa1
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are two small staging driver fixes for 6.10-rc6, both for the
        vc04_services drivers:
      
         - build fix if CONFIG_DEBUG_FS was not set
      
         - initialization check fix that was much reported.
      
        Both of these have been in linux-next this week with no reported
        issues"
      
      * tag 'staging-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: vchiq_debugfs: Fix build if CONFIG_DEBUG_FS is not set
        staging: vc04_services: vchiq_arm: Fix initialisation check
    • Linus Torvalds's avatar
      Merge tag 'tty-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 3e334486
      Linus Torvalds authored
      Pull tty / serial / console fixes from Greg KH:
       "Here are a bunch of fixes/reverts for 6.10-rc6.  Included in here are:
      
         - revert the bunch of tty/serial/console changes that landed in -rc1
           that didn't quite work properly yet.
      
           Everyone agreed to just revert them for now and will work on making
           them better for a future release instead of trying to quick fix the
           existing changes this late in the release cycle
      
         - 8250 driver port count bugfix
      
         - Other tiny serial port bugfixes for reported issues
      
        All of these have been in linux-next this week with no reported
        issues"
      
      * tag 'tty-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        Revert "printk: Save console options for add_preferred_console_match()"
        Revert "printk: Don't try to parse DEVNAME:0.0 console options"
        Revert "printk: Flag register_console() if console is set on command line"
        Revert "serial: core: Add support for DEVNAME:0.0 style naming for kernel console"
        Revert "serial: core: Handle serial console options"
        Revert "serial: 8250: Add preferred console in serial8250_isa_init_ports()"
        Revert "Documentation: kernel-parameters: Add DEVNAME:0.0 format for serial ports"
        Revert "serial: 8250: Fix add preferred console for serial8250_isa_init_ports()"
        Revert "serial: core: Fix ifdef for serial base console functions"
        serial: bcm63xx-uart: fix tx after conversion to uart_port_tx_limited()
        serial: core: introduce uart_port_tx_limited_flags()
        Revert "serial: core: only stop transmit when HW fifo is empty"
        serial: imx: set receiver level before starting uart
        tty: mcf: MCF54418 has 10 UARTS
        serial: 8250_omap: Implementation of Errata i2310
        tty: serial: 8250: Fix port count mismatch with the device
    • Linus Torvalds's avatar
      Merge tag 'usb-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 2c01c3d5
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a handful of small USB driver fixes for 6.10-rc6 to resolve
        some reported issues. Included in here are:
      
         - typec driver bugfixes
      
         - usb gadget driver reverts for commits that were reported to have
           problems
      
         - resource leak bugfix
      
         - gadget driver bugfixes
      
         - dwc3 driver bugfixes
      
         - usb atm driver bugfix for when syzbot got loose on it
      
        All of these have been in linux-next this week with no reported issues"
      
      * tag 'usb-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: dwc3: core: Workaround for CSR read timeout
        Revert "usb: gadget: u_ether: Replace netif_stop_queue with netif_device_detach"
        Revert "usb: gadget: u_ether: Re-attach netif device to mirror detachment"
        usb: gadget: aspeed_udc: fix device address configuration
        usb: dwc3: core: remove lock of otg mode during gadget suspend/resume to avoid deadlock
        usb: typec: ucsi: glink: fix child node release in probe function
        usb: musb: da8xx: fix a resource leak in probe()
        usb: typec: ucsi_acpi: Add LG Gram quirk
        usb: ucsi: stm32: fix command completion handling
        usb: atm: cxacru: fix endpoint checking in cxacru_bind()
        usb: gadget: printer: fix races against disable
        usb: gadget: printer: SS+ support
    • Linus Torvalds's avatar
      Merge tag 'smp_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3ffea9a7
      Linus Torvalds authored
      Pull smp fixes from Borislav Petkov:
      
       - Fix "nosmp" and "maxcpus=0" after the parallel CPU bringup work went
         in and broke them
      
       - Make sure CPU hotplug dynamic prepare states are actually executed
      
      * tag 'smp_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu: Fix broken cmdline "nosmp" and "maxcpus=0"
        cpu/hotplug: Fix dynstate assignment in __cpuhp_setup_state_cpuslocked()
    • Linus Torvalds's avatar
      Merge tag 'irq_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4e412160
      Linus Torvalds authored
      Pull irq fixes from Borislav Petkov:
      
       - Make sure multi-bridge machines get all eiointc interrupt controllers
         initialized even if the number of CPUs has been limited by a cmdline
         param
      
       - Make sure interrupt lines on liointc hw are configured properly even
         when interrupt routing changes
      
       - Avoid use-after-free in the error path of the MSI init code
      
      * tag 'irq_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        PCI/MSI: Fix UAF in msi_capability_init
        irqchip/loongson-liointc: Set different ISRs for different cores
        irqchip/loongson-eiointc: Use early_cpu_to_node() instead of cpu_to_node()
    • Linus Torvalds's avatar
      Merge tag 'timers_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 03c8b0bd
      Linus Torvalds authored
      Pull timer fix from Borislav Petkov:
      
       - Warn when an hrtimer doesn't get a callback supplied
      
      * tag 'timers_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        hrtimer: Prevent queuing of hrtimer without a function callback
    • Linus Torvalds's avatar
      Merge tag 'linux-watchdog-6.10-rc-fixes' of git://www.linux-watchdog.org/linux-watchdog · 327fceff
      Linus Torvalds authored
      Pull watchdog fixes from Wim Van Sebroeck:
      
       - lenovo_se10_wdt: add HAS_IOPORT dependency
      
       - add missing MODULE_DESCRIPTION() macros
      
      * tag 'linux-watchdog-6.10-rc-fixes' of git://www.linux-watchdog.org/linux-watchdog:
        watchdog: add missing MODULE_DESCRIPTION() macros
        watchdog: lenovo_se10_wdt: add HAS_IOPORT dependency
  5. 29 Jun, 2024 5 commits
  6. 28 Jun, 2024 6 commits