1. 29 May, 2024 9 commits
    • cachefiles: defer exposing anon_fd until after copy_to_user() succeeds · 4b4391e7
      Baokun Li authored
      Once the anonymous fd is installed, userland can see and close it.
      However, at that point we may not yet have taken a reference on the
      cache, while closing the fd will put one, so this may cause a cache
      UAF.
      
      So grab the cache reference count before fd_install(). In addition, by
      kernel convention, the fd is owned by userland after fd_install(), and
      the kernel should not call close_fd() on it after that; i.e.,
      fd_install() should only be called once everything is ready, so call it
      after copy_to_user() succeeds.
      
      Fixes: c8383054 ("cachefiles: notify the user daemon when looking up cookie")
      Suggested-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240522114308.2402121-10-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: never get a new anonymous fd if ondemand_id is valid · 4988e35e
      Baokun Li authored
      Now every time the daemon reads an open request, it gets a new anonymous fd
      and ondemand_id. With the introduction of "restore", it is possible to read
      the same open request more than once, and therefore an object can have more
      than one anonymous fd.
      
      If the anonymous fd is not unique, the following concurrencies will result
      in an fd leak:
      
           t1     |         t2         |          t3
      ------------------------------------------------------------
       cachefiles_ondemand_init_object
        cachefiles_ondemand_send_req
         REQ_A = kzalloc(sizeof(*req) + data_len)
         wait_for_completion(&REQ_A->done)
                  cachefiles_daemon_read
                   cachefiles_ondemand_daemon_read
                    REQ_A = cachefiles_ondemand_select_req
                    cachefiles_ondemand_get_fd
                      load->fd = fd0
                      ondemand_id = object_id0
                                        ------ restore ------
                                        cachefiles_ondemand_restore
                                         // restore REQ_A
                                        cachefiles_daemon_read
                                         cachefiles_ondemand_daemon_read
                                          REQ_A = cachefiles_ondemand_select_req
                                            cachefiles_ondemand_get_fd
                                              load->fd = fd1
                                              ondemand_id = object_id1
                   process_open_req(REQ_A)
                   write(devfd, ("copen %u,%llu", msg->msg_id, size))
                   cachefiles_ondemand_copen
                    xa_erase(&cache->reqs, id)
                    complete(&REQ_A->done)
         kfree(REQ_A)
                                        process_open_req(REQ_A)
                                        // copen fails due to no req
                                        // daemon close(fd1)
                                        cachefiles_ondemand_fd_release
                                         // set object closed
       -- umount --
       cachefiles_withdraw_cookie
        cachefiles_ondemand_clean_object
         cachefiles_ondemand_init_close_req
          if (!cachefiles_ondemand_object_is_open(object))
            return -ENOENT;
          // The fd0 is not closed until the daemon exits.
      
      However, the anonymous fd holds the reference count of the object and the
      object holds the reference count of the cookie. So even though the cookie
      has been relinquished, it will not be unhashed and freed until the daemon
      exits.
      
      In fscache_hash_cookie(), when the same cookie is found in the hash
      list and the old cookie has the FSCACHE_COOKIE_RELINQUISHED bit set,
      the new cookie waits for the old cookie to be unhashed, while the old
      cookie is waiting for the leaked fd to be closed; if the daemon does
      not exit in time, this triggers a hung task.
      
      To avoid this, allocate a new anonymous fd only if no anonymous fd has
      been allocated (ondemand_id == 0) or if the previously allocated anonymous
      fd has been closed (ondemand_id == -1). Moreover, return an error if
      ondemand_id is valid, to let the daemon know that its userland restore
      logic is abnormal and needs to be checked.
      
      Fixes: c8383054 ("cachefiles: notify the user daemon when looking up cookie")
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240522114308.2402121-9-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: add spin_lock for cachefiles_ondemand_info · 0a790040
      Baokun Li authored
      The following concurrency may cause a read request to fail to be
      completed and result in a hung task:
      
                 t1             |             t2
      ---------------------------------------------------------
                                  cachefiles_ondemand_copen
                                    req = xa_erase(&cache->reqs, id)
      // Anon fd is maliciously closed.
      cachefiles_ondemand_fd_release
        xa_lock(&cache->reqs)
        cachefiles_ondemand_set_object_close(object)
        xa_unlock(&cache->reqs)
                                    cachefiles_ondemand_set_object_open
                                    // No one will ever close it again.
      cachefiles_ondemand_daemon_read
        cachefiles_ondemand_select_req
        // Get a read req but its fd is already closed.
        // The daemon can't issue a cread ioctl with a closed fd, so it hangs.
      
      So add a spin_lock to cachefiles_ondemand_info to protect ondemand_id
      and state; then cachefiles_ondemand_copen() can use ondemand_id to
      determine whether the fd has been closed, avoiding the problem above.
      
      Fixes: c8383054 ("cachefiles: notify the user daemon when looking up cookie")
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240522114308.2402121-8-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: add consistency check for copen/cread · a26dc49d
      Baokun Li authored
      This prevents malicious processes from completing random copen/cread
      requests and crashing the system. The added checks are listed below:
      
        * Generic, copen can only complete open requests, and cread can only
          complete read requests.
        * For copen, ondemand_id must not be 0, because this indicates that the
          request has not been read by the daemon.
        * For cread, the object corresponding to fd and req should be the same.
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240522114308.2402121-7-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read() · 3e6d704f
      Baokun Li authored
      The err_put_fd label is only used once, so remove it to make the code
      more readable. In addition, the logic for deleting an error request and
      a CLOSE request is merged to simplify the code.
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240522114308.2402121-6-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read() · da4a8274
      Baokun Li authored
      We got the following issue in a fuzz test of randomly issuing the restore
      command:
      
      ==================================================================
      BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0xb41/0xb60
      Read of size 8 at addr ffff888122e84088 by task ondemand-04-dae/963
      
      CPU: 13 PID: 963 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #564
      Call Trace:
       kasan_report+0x93/0xc0
       cachefiles_ondemand_daemon_read+0xb41/0xb60
       vfs_read+0x169/0xb50
       ksys_read+0xf5/0x1e0
      
      Allocated by task 116:
       kmem_cache_alloc+0x140/0x3a0
       cachefiles_lookup_cookie+0x140/0xcd0
       fscache_cookie_state_machine+0x43c/0x1230
       [...]
      
      Freed by task 792:
       kmem_cache_free+0xfe/0x390
       cachefiles_put_object+0x241/0x480
       fscache_cookie_state_machine+0x5c8/0x1230
       [...]
      ==================================================================
      
      Following is the process that triggers the issue:
      
           mount  |   daemon_thread1    |    daemon_thread2
      ------------------------------------------------------------
      cachefiles_withdraw_cookie
       cachefiles_ondemand_clean_object(object)
        cachefiles_ondemand_send_req
         REQ_A = kzalloc(sizeof(*req) + data_len)
         wait_for_completion(&REQ_A->done)
      
                  cachefiles_daemon_read
                   cachefiles_ondemand_daemon_read
                    REQ_A = cachefiles_ondemand_select_req
                    msg->object_id = req->object->ondemand->ondemand_id
                                        ------ restore ------
                                        cachefiles_ondemand_restore
                                        xas_for_each(&xas, req, ULONG_MAX)
                                         xas_set_mark(&xas, CACHEFILES_REQ_NEW)
      
                                        cachefiles_daemon_read
                                         cachefiles_ondemand_daemon_read
                                          REQ_A = cachefiles_ondemand_select_req
                    copy_to_user(_buffer, msg, n)
                     xa_erase(&cache->reqs, id)
                     complete(&REQ_A->done)
                    ------ close(fd) ------
                    cachefiles_ondemand_fd_release
                     cachefiles_put_object
       cachefiles_put_object
        kmem_cache_free(cachefiles_object_jar, object)
                                          REQ_A->object->ondemand->ondemand_id
                                           // object UAF !!!
      
      While we hold xa_lock and can see the request, req->object cannot have
      been freed yet, so grab a reference to the object before xa_unlock to
      avoid the above issue.
      
      Fixes: 0a7e54c1 ("cachefiles: resend an open request if the read request's object is closed")
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240522114308.2402121-5-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
      Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd() · de3e26f9
      Baokun Li authored
      We got the following issue in a fuzz test of randomly issuing the restore
      command:
      
      ==================================================================
      BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0x609/0xab0
      Write of size 4 at addr ffff888109164a80 by task ondemand-04-dae/4962
      
      CPU: 11 PID: 4962 Comm: ondemand-04-dae Not tainted 6.8.0-rc7-dirty #542
      Call Trace:
       kasan_report+0x94/0xc0
       cachefiles_ondemand_daemon_read+0x609/0xab0
       vfs_read+0x169/0xb50
       ksys_read+0xf5/0x1e0
      
      Allocated by task 626:
       __kmalloc+0x1df/0x4b0
       cachefiles_ondemand_send_req+0x24d/0x690
       cachefiles_create_tmpfile+0x249/0xb30
       cachefiles_create_file+0x6f/0x140
       cachefiles_look_up_object+0x29c/0xa60
       cachefiles_lookup_cookie+0x37d/0xca0
       fscache_cookie_state_machine+0x43c/0x1230
       [...]
      
      Freed by task 626:
       kfree+0xf1/0x2c0
       cachefiles_ondemand_send_req+0x568/0x690
       cachefiles_create_tmpfile+0x249/0xb30
       cachefiles_create_file+0x6f/0x140
       cachefiles_look_up_object+0x29c/0xa60
       cachefiles_lookup_cookie+0x37d/0xca0
       fscache_cookie_state_machine+0x43c/0x1230
       [...]
      ==================================================================
      
      Following is the process that triggers the issue:
      
           mount  |   daemon_thread1    |    daemon_thread2
      ------------------------------------------------------------
       cachefiles_ondemand_init_object
        cachefiles_ondemand_send_req
         REQ_A = kzalloc(sizeof(*req) + data_len)
         wait_for_completion(&REQ_A->done)
      
                  cachefiles_daemon_read
                   cachefiles_ondemand_daemon_read
                    REQ_A = cachefiles_ondemand_select_req
                    cachefiles_ondemand_get_fd
                    copy_to_user(_buffer, msg, n)
                  process_open_req(REQ_A)
                                        ------ restore ------
                                        cachefiles_ondemand_restore
                                        xas_for_each(&xas, req, ULONG_MAX)
                                         xas_set_mark(&xas, CACHEFILES_REQ_NEW);
      
                                        cachefiles_daemon_read
                                         cachefiles_ondemand_daemon_read
                                          REQ_A = cachefiles_ondemand_select_req
      
                   write(devfd, ("copen %u,%llu", msg->msg_id, size));
                   cachefiles_ondemand_copen
                    xa_erase(&cache->reqs, id)
                    complete(&REQ_A->done)
         kfree(REQ_A)
                                          cachefiles_ondemand_get_fd(REQ_A)
                                           fd = get_unused_fd_flags
                                           file = anon_inode_getfile
                                           fd_install(fd, file)
                                           load = (void *)REQ_A->msg.data;
                                           load->fd = fd;
                                           // load UAF !!!
      
      This issue is caused by issuing a restore command while the daemon is
      still alive, which results in a request being processed multiple times,
      triggering a UAF. To avoid this, add a reference count to
      cachefiles_req that is held while waiting on and reading the request,
      and released once the waiting and reading are over.
      
      Note that since there is only one reference count for waiting, we need to
      avoid the same request being completed multiple times, so we can only
      complete the request if it is successfully removed from the xarray.
      
      Fixes: e73fa11a ("cachefiles: add restore command to recover inflight ondemand read requests")
      Suggested-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240522114308.2402121-4-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
      Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: remove requests from xarray during flushing requests · 0fc75c59
      Baokun Li authored
      Even with CACHEFILES_DEAD set, we can still read the requests, so in the
      following concurrency the request may be used after it has been freed:
      
           mount  |   daemon_thread1    |    daemon_thread2
      ------------------------------------------------------------
       cachefiles_ondemand_init_object
        cachefiles_ondemand_send_req
         REQ_A = kzalloc(sizeof(*req) + data_len)
         wait_for_completion(&REQ_A->done)
                  cachefiles_daemon_read
                   cachefiles_ondemand_daemon_read
                                        // close dev fd
                                        cachefiles_flush_reqs
                                         complete(&REQ_A->done)
         kfree(REQ_A)
                    xa_lock(&cache->reqs);
                    cachefiles_ondemand_select_req
                      req->msg.opcode != CACHEFILES_OP_READ
                      // req use-after-free !!!
                    xa_unlock(&cache->reqs);
                                         xa_destroy(&cache->reqs)
      
      Hence remove requests from cache->reqs when flushing them to avoid
      accessing freed requests.
      
      Fixes: c8383054 ("cachefiles: notify the user daemon when looking up cookie")
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240522114308.2402121-3-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd · cc5ac966
      Baokun Li authored
      This lets us see the correct trace output.
      
      Fixes: c8383054 ("cachefiles: notify the user daemon when looking up cookie")
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240522114308.2402121-2-libaokun@huaweicloud.com
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
  2. 22 May, 2024 1 commit
  3. 21 May, 2024 1 commit
    • arm64: asm-bug: Add .align 2 to the end of __BUG_ENTRY · ffbf4fb9
      Jiangfeng Xiao authored
      When CONFIG_DEBUG_BUGVERBOSE=n, we fail to add necessary padding bytes
      to bug_table entries, and as a result the last entry in a bug table will
      be ignored, potentially leading to an unexpected panic(). All prior
      entries in the table will be handled correctly.
      
      The arm64 ABI requires that struct fields of up to 8 bytes are
      naturally aligned, with padding added within a struct such that
      structs are suitably aligned within arrays.
      
      When CONFIG_DEBUG_BUGVERBOSE=y, the layout of a bug_entry is:
      
      	struct bug_entry {
      		signed int      bug_addr_disp;	// 4 bytes
      		signed int      file_disp;	// 4 bytes
      		unsigned short  line;		// 2 bytes
      		unsigned short  flags;		// 2 bytes
      	}
      
      ... with 12 bytes total, requiring 4-byte alignment.
      
      When CONFIG_DEBUG_BUGVERBOSE=n, the layout of a bug_entry is:
      
      	struct bug_entry {
      		signed int      bug_addr_disp;	// 4 bytes
      		unsigned short  flags;		// 2 bytes
      		< implicit padding >		// 2 bytes
      	}
      
      ... with 8 bytes total, with 6 bytes of data and 2 bytes of trailing
      padding, requiring 4-byte alignment.
      
      When we create a bug_entry in assembly, we align the start of the entry
      to 4 bytes, which implicitly handles padding for any prior entries.
      However, we do not align the end of the entry, and so when
      CONFIG_DEBUG_BUGVERBOSE=n, the final entry lacks the trailing padding
      bytes.
      
      For the main kernel image this is not a problem as find_bug() doesn't
      depend on the trailing padding bytes when searching for entries:
      
      	for (bug = __start___bug_table; bug < __stop___bug_table; ++bug)
      		if (bugaddr == bug_addr(bug))
      			return bug;
      
      However, for modules, module_bug_finalize() depends on the trailing
      padding bytes when calculating the number of entries:
      
      	mod->num_bugs = sechdrs[i].sh_size / sizeof(struct bug_entry);
      
      ... and as the last bug_entry lacks the necessary padding bytes, this entry
      will not be counted, e.g. in the case of a single entry:
      
      	sechdrs[i].sh_size == 6
      	sizeof(struct bug_entry) == 8;
      
      	sechdrs[i].sh_size / sizeof(struct bug_entry) == 0;
      
      Consequently module_find_bug() will miss the last bug_entry when it does:
      
      	for (i = 0; i < mod->num_bugs; ++i, ++bug)
      		if (bugaddr == bug_addr(bug))
      			goto out;
      
      ... which can lead to a kernel panic due to an unhandled bug.
      
      This can be demonstrated with the following module:
      
      	static int __init buginit(void)
      	{
      		WARN(1, "hello\n");
      		return 0;
      	}
      
      	static void __exit bugexit(void)
      	{
      	}
      
      	module_init(buginit);
      	module_exit(bugexit);
      	MODULE_LICENSE("GPL");
      
      ... which will trigger a kernel panic when loaded:
      
      	------------[ cut here ]------------
      	hello
      	Unexpected kernel BRK exception at EL1
      	Internal error: BRK handler: 00000000f2000800 [#1] PREEMPT SMP
      	Modules linked in: hello(O+)
      	CPU: 0 PID: 50 Comm: insmod Tainted: G           O       6.9.1 #8
      	Hardware name: linux,dummy-virt (DT)
      	pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      	pc : buginit+0x18/0x1000 [hello]
      	lr : buginit+0x18/0x1000 [hello]
      	sp : ffff800080533ae0
      	x29: ffff800080533ae0 x28: 0000000000000000 x27: 0000000000000000
      	x26: ffffaba8c4e70510 x25: ffff800080533c30 x24: ffffaba8c4a28a58
      	x23: 0000000000000000 x22: 0000000000000000 x21: ffff3947c0eab3c0
      	x20: ffffaba8c4e3f000 x19: ffffaba846464000 x18: 0000000000000006
      	x17: 0000000000000000 x16: ffffaba8c2492834 x15: 0720072007200720
      	x14: 0720072007200720 x13: ffffaba8c49b27c8 x12: 0000000000000312
      	x11: 0000000000000106 x10: ffffaba8c4a0a7c8 x9 : ffffaba8c49b27c8
      	x8 : 00000000ffffefff x7 : ffffaba8c4a0a7c8 x6 : 80000000fffff000
      	x5 : 0000000000000107 x4 : 0000000000000000 x3 : 0000000000000000
      	x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff3947c0eab3c0
      	Call trace:
      	 buginit+0x18/0x1000 [hello]
      	 do_one_initcall+0x80/0x1c8
      	 do_init_module+0x60/0x218
      	 load_module+0x1ba4/0x1d70
      	 __do_sys_init_module+0x198/0x1d0
      	 __arm64_sys_init_module+0x1c/0x28
      	 invoke_syscall+0x48/0x114
      	 el0_svc_common.constprop.0+0x40/0xe0
      	 do_el0_svc+0x1c/0x28
      	 el0_svc+0x34/0xd8
      	 el0t_64_sync_handler+0x120/0x12c
      	 el0t_64_sync+0x190/0x194
      	Code: d0ffffe0 910003fd 91000000 9400000b (d4210000)
      	---[ end trace 0000000000000000 ]---
      	Kernel panic - not syncing: BRK handler: Fatal exception
      
      Fix this by always aligning the end of a bug_entry to 4 bytes, which is
      correct regardless of CONFIG_DEBUG_BUGVERBOSE.
      
      Fixes: 9fb7410f ("arm64/BUG: Use BRK instruction for generic BUG traps")
      Signed-off-by: Yuanbin Xie <xieyuanbin1@huawei.com>
      Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Link: https://lore.kernel.org/r/1716212077-43826-1-git-send-email-xiaojiangfeng@huawei.com
      Signed-off-by: Will Deacon <will@kernel.org>
  4. 17 May, 2024 2 commits
  5. 10 May, 2024 5 commits
  6. 09 May, 2024 8 commits
    • Merge branch 'for-next/tlbi' into for-next/core · 54e1a2aa
      Will Deacon authored
      * for-next/tlbi:
        arm64: tlb: Allow range operation for MAX_TLBI_RANGE_PAGES
        arm64: tlb: Improve __TLBI_VADDR_RANGE()
        arm64: tlb: Fix TLBI RANGE operand
    • Merge branch 'for-next/selftests' into for-next/core · 46e336c7
      Will Deacon authored
      * for-next/selftests:
        kselftest: arm64: Add a null pointer check
        kselftest/arm64: Remove unused parameters in abi test
    • Merge branch 'for-next/perf' into for-next/core · 42e7ddba
      Will Deacon authored
      * for-next/perf: (41 commits)
        arm64: Add USER_STACKTRACE support
        drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset()
        drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group
        drivers/perf: hisi_pcie: Fix out-of-bound access when valid event group
        perf/arm-spe: Assign parents for event_source device
        perf/arm-smmuv3: Assign parents for event_source device
        perf/arm-dsu: Assign parents for event_source device
        perf/arm-dmc620: Assign parents for event_source device
        perf/arm-ccn: Assign parents for event_source device
        perf/arm-cci: Assign parents for event_source device
        perf/alibaba_uncore: Assign parents for event_source device
        perf/arm_pmu: Assign parents for event_source devices
        perf/imx_ddr: Assign parents for event_source devices
        perf/qcom: Assign parents for event_source devices
        Documentation: qcom-pmu: Use /sys/bus/event_source/devices paths
        perf/riscv: Assign parents for event_source devices
        perf/thunderx2: Assign parents for event_source devices
        Documentation: thunderx2-pmu: Use /sys/bus/event_source/devices paths
        perf/xgene: Assign parents for event_source devices
        Documentation: xgene-pmu: Use /sys/bus/event_source/devices paths
        ...
    • Merge branch 'for-next/mm' into for-next/core · a5a5ce57
      Will Deacon authored
      * for-next/mm:
        arm64/mm: Fix pud_user_accessible_page() for PGTABLE_LEVELS <= 2
        arm64/mm: Add uffd write-protect support
        arm64/mm: Move PTE_PRESENT_INVALID to overlay PTE_NG
        arm64/mm: Remove PTE_PROT_NONE bit
        arm64/mm: generalize PMD_PRESENT_INVALID for all levels
        arm64: mm: Don't remap pgtables for allocate vs populate
        arm64: mm: Batch dsb and isb when populating pgtables
        arm64: mm: Don't remap pgtables per-cont(pte|pmd) block
    • Merge branch 'for-next/misc' into for-next/core · 7a7f6045
      Will Deacon authored
      * for-next/misc:
        arm64: simplify arch_static_branch/_jump function
        arm64: Add the arm64.no32bit_el0 command line option
        arm64: defer clearing DAIF.D
        arm64: assembler: update stale comment for disable_step_tsk
        arm64/sysreg: Update PIE permission encodings
        arm64: Add Neoverse-V2 part
        arm64: Remove unnecessary irqflags alternative.h include
    • Merge branch 'for-next/kbuild' into for-next/core · d4ea881f
      Will Deacon authored
      * for-next/kbuild:
        arm64: boot: Support Flat Image Tree
        arm64: Add BOOT_TARGETS variable
    • Merge branch 'for-next/acpi' into for-next/core · b2b7cc6d
      Will Deacon authored
      * for-next/acpi:
        arm64: acpi: Honour firmware_signature field of FACS, if it exists
        ACPICA: Detect FACS even for hardware reduced platforms
    • arm64/mm: Fix pud_user_accessible_page() for PGTABLE_LEVELS <= 2 · cb67ea12
      Ryan Roberts authored
      The recent change to use pud_valid() as part of the implementation of
      pud_user_accessible_page() fails to build when PGTABLE_LEVELS <= 2
      because pud_valid() is not defined in that case.
      
      Fix this by defining pud_valid() to false for this case. This means that
      pud_user_accessible_page() will correctly always return false for this
      config.
      
      Fixes: f0f5863a ("arm64/mm: Remove PTE_PROT_NONE bit")
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202405082221.43rfWxz5-lkp@intel.com/
      Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
      Link: https://lore.kernel.org/r/20240509122844.563320-1-ryan.roberts@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
  7. 03 May, 2024 7 commits
  8. 28 Apr, 2024 7 commits
    • drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset() · 582c1aee
      Hao Chen authored
      pci_alloc_irq_vectors() allocates irq vectors. When devm_add_action()
      fails, the irq vectors are not freed, which leads to a memory leak.

      Replace devm_add_action() with devm_add_action_or_reset() to ensure
      the irq vectors are freed when registering the cleanup action fails.
      
      Fixes: 66637ab1 ("drivers/perf: hisi: add driver for HNS3 PMU")
      Signed-off-by: Hao Chen <chenhao418@huawei.com>
      Signed-off-by: Junhao He <hejunhao3@huawei.com>
      Reviewed-by: Jijie Shao <shaojijie@huawei.com>
      Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Link: https://lore.kernel.org/r/20240425124627.13764-4-hejunhao3@huawei.com
      Signed-off-by: Will Deacon <will@kernel.org>
    • drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group · 81bdd60a
      Junhao He authored
      The perf tool allows users to create event groups through the following
      cmd [1], but the driver does not check whether the array index is out
      of bounds when writing data to the event_group array. If the number of
      events in an event_group is greater than HNS3_PMU_MAX_HW_EVENTS, a
      memory write overflow of the event_group array occurs.
      
      Add an array index check to fix the possible out-of-bounds write, and
      return directly when new events would be written past the array bounds.
      
      There are 9 different events in an event_group.
      [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}'
      
      Fixes: 66637ab1 ("drivers/perf: hisi: add driver for HNS3 PMU")
      Signed-off-by: Junhao He <hejunhao3@huawei.com>
      Signed-off-by: Hao Chen <chenhao418@huawei.com>
      Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Reviewed-by: Jijie Shao <shaojijie@huawei.com>
      Link: https://lore.kernel.org/r/20240425124627.13764-3-hejunhao3@huawei.com
      Signed-off-by: Will Deacon <will@kernel.org>
    • Junhao He's avatar
      drivers/perf: hisi_pcie: Fix out-of-bound access when valid event group · 77fce826
      Junhao He authored
      The perf tool allows users to create event groups through following
      cmd [1], but the driver does not check whether the array index is out of
      bounds when writing data to the event_group array. If the number of events
      in an event_group is greater than HISI_PCIE_MAX_COUNTERS, the memory write
      overflow of event_group array occurs.
      
      Add an array index check to fix the possible out-of-bounds access, and
      return directly when new events would be written past the end of the
      event_group array.
      
      There are 9 different events in an event_group.
      [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}'
      
      Fixes: 8404b0fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
      Signed-off-by: Junhao He <hejunhao3@huawei.com>
      Reviewed-by: Jijie Shao <shaojijie@huawei.com>
      Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Link: https://lore.kernel.org/r/20240425124627.13764-2-hejunhao3@huawei.com
      Signed-off-by: Will Deacon <will@kernel.org>
      77fce826
    • Kunwu Chan's avatar
      kselftest: arm64: Add a null pointer check · 80164282
      Kunwu Chan authored
      There is a 'malloc' call, which can fail. Add a check for malloc
      failure to avoid a possible NULL pointer dereference and to give more
      information about why the test failed.
      Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
      Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Link: https://lore.kernel.org/r/20240423082102.2018886-1-chentao@kylinos.cn
      Signed-off-by: Will Deacon <will@kernel.org>
      80164282
    • Mark Rutland's avatar
      arm64: defer clearing DAIF.D · 080297be
      Mark Rutland authored
      For historical reasons we unmask debug exceptions in __cpu_setup(), but
      it's not necessary to unmask debug exceptions this early in the
      boot/idle entry paths. It would be better to unmask debug exceptions
      later in C code as this simplifies the current code and will make it
      easier to rework exception masking logic to handle non-DAIF bits in
      future (e.g. PSTATE.{ALLINT,PM}).
      
      We started clearing DAIF.D in __cpu_setup() in commit:
      
        2ce39ad1 ("arm64: debug: unmask PSTATE.D earlier")
      
      At the time, we needed to ensure that DAIF.D was clear on the primary
      CPU before scheduling and preemption were possible, and chose to do this
      in __cpu_setup() so that this occurred in the same place for primary and
      secondary CPUs. As we cannot handle debug exceptions this early, we
      placed an ISB between initializing MDSCR_EL1 and clearing DAIF.D so that
      no exceptions should be triggered.
      
      Subsequently we rewrote the return-from-{idle,suspend} paths to use
      __cpu_setup() in commit:
      
        cabe1c81 ("arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va")
      
      ... which allowed for earlier use of the MMU and had the desirable
      property of using the same code to reset the CPU in the cold and warm
      boot paths. This introduced a bug: DAIF.D was clear while
      cpu_do_resume() restored MDSCR_EL1 and other control registers (e.g.
      breakpoint/watchpoint control/value registers), and so we could
      unexpectedly take debug exceptions.
      
      We fixed that in commit:
      
        744c6c37 ("arm64: kernel: Fix unmasked debug exceptions when restoring mdscr_el1")
      
      ... by having cpu_do_resume() use the `disable_dbg` macro to set DAIF.D
      before restoring MDSCR_EL1 and other control registers. This relies on
      DAIF.D being subsequently cleared again in cpu_resume().
      
      Subsequently we reworked DAIF masking in commit:
      
        0fbeb318 ("arm64: explicitly mask all exceptions")
      
      ... where we began enforcing a policy that DAIF.D being set implies all
      other DAIF bits are set, and so e.g. we cannot take an IRQ while DAIF.D
      is set. As part of this the use of `disable_dbg` in cpu_resume() was
      replaced with `disable_daif` for consistency with the rest of the
      kernel.
      
      These days, there's no need to clear DAIF.D early within __cpu_setup():
      
      * setup_arch() clears DAIF.DA before scheduling and preemption are
        possible on the primary CPU, avoiding the problem we were originally
        trying to work around.
      
        Note: DAIF.I and DAIF.F are cleared later when interrupts are enabled
        for the first time.
      
      * secondary_start_kernel() clears all DAIF bits before scheduling and
        preemption are possible on secondary CPUs.
      
        Note: with pseudo-NMI, the PMR is initialized here before any DAIF
        bits are cleared. Similar will be necessary for the architectural NMI.
      
      * cpu_suspend() restores all DAIF bits when returning from idle,
        ensuring that we don't unexpectedly leave DAIF.D clear or set.
      
        Note: with pseudo-NMI, the PMR is initialized here before DAIF is
        cleared. Similar will be necessary for the architectural NMI.
      
      This patch removes the unmasking of debug exceptions from __cpu_setup(),
      relying on the above locations to initialize DAIF. This allows some
      other cleanups:
      
      * It is no longer necessary for cpu_resume() to explicitly mask debug
        (or other) exceptions, as it is always called with all DAIF bits set.
        Thus we drop the use of `disable_daif`.
      
      * The `enable_dbg` macro is no longer used, and so is dropped.
      
      * It is no longer necessary to have an ISB immediately after
        initializing MDSCR_EL1 in __cpu_setup(), and we can revert to relying
        on the context synchronization that occurs when the MMU is enabled
        between __cpu_setup() and code which clears DAIF.D.
      
      Comments are added to setup_arch() and secondary_start_kernel() to
      explain the initial unmasking of the DAIF bits.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20240422113523.4070414-3-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      080297be
    • Mark Rutland's avatar
      arm64: assembler: update stale comment for disable_step_tsk · 3a2d2ca4
      Mark Rutland authored
      A comment in the disable_step_tsk macro refers to synchronising with
      enable_dbg, as historically the entry used enable_dbg to unmask debug
      exceptions after disabling single-stepping.
      
      These days the unmasking happens in entry-common.c via
      local_daif_restore() or local_daif_inherit(), so the comment is stale.
      This logic is likely to change in future, so it would be best to avoid
      referring to those macros specifically.
      
      Update the comment to take this into account, and describe it in terms
      of clearing DAIF.D so that it doesn't matter where this logic lives nor
      what it is called.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Mark Brown <broonie@kernel.org>
      Link: https://lore.kernel.org/r/20240422113523.4070414-2-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      3a2d2ca4
    • Shiqi Liu's avatar
      arm64/sysreg: Update PIE permission encodings · 12d712dc
      Shiqi Liu authored
      Fix left shift overflow issue when the parameter idx is greater than or
      equal to 8 in the calculation of perm in PIRx_ELx_PERM macro.
      
      Fix this by modifying the encoding to use a long integer type.
      Signed-off-by: Shiqi Liu <shiqiliu@hust.edu.cn>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Link: https://lore.kernel.org/r/20240421063328.29710-1-shiqiliu@hust.edu.cn
      Signed-off-by: Will Deacon <will@kernel.org>
      12d712dc