    bpf: Use c->unit_size to select target cache during free · 7ac5c53e
    Hou Tao authored
    At present, the bpf memory allocator uses check_obj_size() to ensure that
    ksize() of the allocated pointer is equal to the unit_size of the
    bpf_mem_cache in use. Its purpose is to prevent bpf_mem_free() from
    selecting a bpf_mem_cache whose unit_size differs from that of the
    bpf_mem_cache used for allocation. But as reported by lkp, the return
    value of ksize() or kmalloc_size_roundup() may change due to slab merge,
    which leads to the warning report in check_obj_size().
    
    The reported warning happened as follows:
    (1) in bpf_mem_cache_adjust_size(), kmalloc_size_roundup(96) returns the
    object_size of kmalloc-96 instead of kmalloc-cg-96. The object_size of
    kmalloc-96 is 96, so size_index for 96 is not adjusted accordingly.
    (2) the object_size of kmalloc-cg-96 is adjusted from 96 to 128 due to
    slab merge in __kmem_cache_alias(). For SLAB, SLAB_HWCACHE_ALIGN is
    enabled by default for kmalloc slab, so align is 64 and size is 128 for
    kmalloc-cg-96. SLUB has a similar merge logic, but its object_size will
    not be changed, because its align is 8 under x86-64.
    (3) when unit_alloc() does kmalloc_node(96, __GFP_ACCOUNT, node),
    ksize() returns 128 instead of 96 for the returned pointer.
    (4) the warning in check_obj_size() is triggered.
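
The sequence above can be modelled in userspace. The following is a minimal sketch, not the kernel code: the size-class table, cache_index_for_size() and model_ksize() are hypothetical stand-ins, and the only behaviour taken from the report is that a 96-byte request is reported as 128 bytes once kmalloc-cg-96 has been merged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified size classes; not the kernel's actual table. */
static const size_t unit_sizes[] = {16, 32, 64, 96, 128, 192, 256, 512};
#define NR_CACHES (sizeof(unit_sizes) / sizeof(unit_sizes[0]))

/* Pick the first cache whose unit_size can hold `size`; -1 if none can. */
static int cache_index_for_size(size_t size)
{
	for (size_t i = 0; i < NR_CACHES; i++)
		if (size <= unit_sizes[i])
			return (int)i;
	return -1;
}

/*
 * Stand-in for ksize(): after the merge described in step (2), a
 * 96-byte object is reported as 128 bytes; otherwise the size is unchanged.
 */
static size_t model_ksize(size_t requested, int merged)
{
	if (requested == 96 && merged)
		return 128;
	return requested;
}

/* Returns 1 when a size-based free would pick a different cache than alloc. */
static int free_picks_wrong_cache(size_t requested, int merged)
{
	int alloc_idx = cache_index_for_size(requested);
	int free_idx = cache_index_for_size(model_ksize(requested, merged));

	assert(alloc_idx >= 0);
	return alloc_idx != free_idx;
}
```

In this model, free_picks_wrong_cache(96, 0) reports no mismatch, while free_picks_wrong_cache(96, 1) reports one, which corresponds to the condition check_obj_size() warns about.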
    
    Considering that slab merge can happen at any time (e.g., when a slab is
    created by a new module), the following case is also possible: during the
    initialization of bpf_global_ma, there is no slab merge and ksize() for
    a 96-byte object returns 96. But after that, a new slab created by a
    kernel module is merged into kmalloc-cg-96 and the object_size of
    kmalloc-cg-96 is adjusted from 96 to 128 (which is possible for x86-64 +
    CONFIG_SLAB, because its alignment requirement is 64 for a 96-byte slab).
    So sooner or later, when bpf_global_ma frees a 96-byte-sized pointer
    which is allocated from bpf_mem_cache with unit_size=96, bpf_mem_free()
    will free the pointer through a bpf_mem_cache in which unit_size is 128,
    because the return value of ksize() changes. The warning for the
    mismatch will be triggered again.
    
    One feasible fix is to introduce APIs similar to ksize() and
    kmalloc_size_roundup() that return the actually-allocated size instead
    of a size which may change due to slab merge, but that would introduce
    an unnecessary dependency on the implementation details of the mm
    subsystem.
    
    Since the pointer of bpf_mem_cache is already saved in the 8-byte area
    (or 4-byte area on a 32-bit host) above the returned pointer, use
    unit_size in the saved bpf_mem_cache to select the target cache instead
    of inferring the size from the pointer itself. Besides adding no extra
    dependency on the mm subsystem, the performance of bpf_mem_free_rcu() is
    also improved as shown below.
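
The cache-pointer lookup can be illustrated with a small userspace model before turning to the numbers. This is only a sketch under stated assumptions: struct model_cache, model_alloc() and model_free() are hypothetical names, malloc()/free() stand in for the kernel allocator, and the header is one pointer wide as the message describes. It is not the code in memalloc.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bpf_mem_cache. */
struct model_cache {
	size_t unit_size;	/* size class served by this cache */
	int nr_freed;		/* objects handed back to this cache */
};

/* One pointer-sized header: 8 bytes on 64-bit hosts, 4 on 32-bit hosts. */
#define HDR_SZ sizeof(void *)

/* Allocate an object and record the owning cache just above it. */
static void *model_alloc(struct model_cache *c)
{
	void **hdr;

	assert(c->unit_size > 0);
	hdr = malloc(HDR_SZ + c->unit_size);
	if (!hdr)
		return NULL;
	*hdr = c;
	return (char *)hdr + HDR_SZ;
}

/*
 * Free an object by reading the saved cache pointer, so the target cache
 * never depends on what the underlying allocator reports as the size.
 * Returns the cache the object was given back to.
 */
static struct model_cache *model_free(void *ptr)
{
	void **hdr;
	struct model_cache *c;

	if (!ptr)
		return NULL;
	hdr = (void **)((char *)ptr - HDR_SZ);
	c = *hdr;
	c->nr_freed++;
	free(hdr);
	return c;
}
```

Because the free path only dereferences the header, an object allocated from the unit_size=96 cache goes back to that same cache no matter how the backing slab was merged.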
    
    Before applying the patch, the performance of bpf_mem_alloc() and
    bpf_mem_free_rcu() on an 8-CPU VM with one producer is as follows:
    
    kmalloc : alloc 11.69 ± 0.28M/s free 29.58 ± 0.93M/s
    percpu  : alloc 14.11 ± 0.52M/s free 14.29 ± 0.99M/s
    
    After applying the patch, the performance of bpf_mem_free_rcu() increases
    by 9% and 146% for kmalloc memory and per-cpu memory respectively:
    
    kmalloc : alloc 11.01 ± 0.03M/s free 32.42 ± 0.48M/s
    percpu  : alloc 12.84 ± 0.12M/s free 35.24 ± 0.23M/s
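
The quoted gains can be rechecked from the free throughput means listed above. The helper below is a hypothetical convenience for that arithmetic, not part of the patch; truncating its result to an integer yields the 9% and 146% figures.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
static double pct_gain(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

For example, pct_gain(29.58, 32.42) covers the kmalloc case and pct_gain(14.29, 35.24) the per-cpu case.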
    
    After the fixes, there is no need to adjust size_index to fix the
    mismatch between allocation and free, so remove it as well. Also return
    NULL instead of ZERO_SIZE_PTR for zero-sized alloc in bpf_mem_alloc(),
    because there is no bpf_mem_cache pointer saved above ZERO_SIZE_PTR.
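
The zero-size case can be sketched in the same userspace style. The names struct zs_cache, zs_alloc() and zs_owner() are hypothetical and malloc() stands in for the kernel allocator. The point is only that a free path which reads a header above the pointer needs every non-NULL pointer to have such a header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ZS_HDR_SZ sizeof(void *)

/* Hypothetical stand-in for a bpf_mem_cache. */
struct zs_cache {
	size_t unit_size;
};

/*
 * Zero-sized requests return NULL: a sentinel pointer would have no
 * header above it for the free path to read.
 */
static void *zs_alloc(struct zs_cache *c, size_t size)
{
	void **hdr;

	if (!size)
		return NULL;
	assert(size <= c->unit_size);
	hdr = malloc(ZS_HDR_SZ + c->unit_size);
	if (!hdr)
		return NULL;
	*hdr = c;
	return (char *)hdr + ZS_HDR_SZ;
}

/* Look up the owning cache; only valid for NULL or zs_alloc() results. */
static struct zs_cache *zs_owner(const void *ptr)
{
	if (!ptr)
		return NULL;
	return *(void *const *)((const char *)ptr - ZS_HDR_SZ);
}
```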
    
    Fixes: 9077fc22 ("bpf: Use kmalloc_size_roundup() to adjust size_index")
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Closes: https://lore.kernel.org/bpf/202310302113.9f8fe705-oliver.sang@intel.com
    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Link: https://lore.kernel.org/r/20231216131052.27621-2-houtao@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
memalloc.c 24.9 KB