    Randomized slab caches for kmalloc() · 3c615294
    GONG, Ruiqi authored
    When exploiting memory vulnerabilities, "heap spraying" is a common
    technique for attacking bugs in dynamic memory allocation (i.e. the
    "heap"), and it plays an important role in successful exploitation.
    Basically, the attacker overwrites the memory area of a vulnerable
    object by triggering allocations in other subsystems or modules and
    thereby getting a reference to the targeted memory location. It is
    usable on various types of vulnerability, including use after free
    (UAF) and heap out-of-bounds write.
    
    There are (at least) two reasons why the heap can be sprayed: 1) generic
    slab caches are shared among different subsystems and modules, and
    2) dedicated slab caches could be merged with the generic ones.
    Currently these two factors cannot be prevented at a low cost: the first
    one is a widely used memory allocation mechanism, and shutting down slab
    merging completely via `slub_nomerge` would be overkill.
    
    To efficiently prevent heap spraying, we propose the following approach:
    create multiple copies of the generic slab caches that will never be
    merged, and use a randomly selected one of them at allocation. The
    random selection is based on the address of the code that calls
    `kmalloc()`, which means it is static at runtime (rather than
    determined dynamically at each allocation, which could be bypassed by
    brute-force repeated spraying). In other words, the randomness of cache
    selection is with respect to the code address rather than time:
    allocations in different code paths will most likely pick different
    caches, while `kmalloc()` at a given call site uses the same cache copy
    whenever it is executed. In this way, the vulnerable object and memory
    allocated in other subsystems and modules will (most probably) be in
    different slab caches, which prevents the object from being sprayed.
    
    Meanwhile, the static random selection is further hardened with a
    per-boot random seed, which prevents the attacker from analyzing the
    open source code to find a usable `kmalloc()` call that happens to pick
    the same cache as the vulnerable subsystem/module. In other words, with
    the per-boot seed, the random selection stays fixed for the duration of
    one boot, but differs across boots.
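
    The selection scheme described above can be sketched in userspace C.
    This is only an illustration under stated assumptions, not the code
    from this patch: the names `pick_cache_copy`, `hash64`,
    `NR_CACHE_COPIES` and `CACHE_INDEX_BITS`, the choice of 16 copies, and
    the multiplicative hash are all hypothetical. It shows how mixing the
    caller address with a per-boot seed gives an index that is fixed per
    call site within one boot:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical number of cache copies; assumed to be a power of two. */
#define NR_CACHE_COPIES  16u
#define CACHE_INDEX_BITS 4u /* log2(NR_CACHE_COPIES) */

/*
 * Assumed hash: 64-bit multiplicative hashing that keeps the top `bits`
 * bits of the product. Unsigned overflow wraps, which is intended here.
 */
static inline uint32_t hash64(uint64_t val, unsigned int bits)
{
    assert(bits > 0 && bits <= 32);
    return (uint32_t)((val * 0x61C8864680B583EBull) >> (64 - bits));
}

/*
 * Map a call-site address to one of the cache copies. The per-boot seed
 * is XORed in so that the mapping cannot be precomputed from the source
 * or binary alone, while staying constant for a given seed.
 */
static inline uint32_t pick_cache_copy(uint64_t caller_addr, uint64_t boot_seed)
{
    return hash64(caller_addr ^ boot_seed, CACHE_INDEX_BITS);
}
```

    With a fixed seed, calling `pick_cache_copy()` twice with the same
    caller address always returns the same index, so repeating an
    allocation cannot land it in a different copy; a different seed on the
    next boot may remap every call site.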
    
    The performance overhead has been tested on a 40-core x86 server by
    comparing the results of `perf bench all` between kernels with and
    without this patch, based on the latest linux-next kernel; the
    differences are minor. A subset of the benchmarks is listed below:
    
                    sched/  sched/  syscall/       mem/       mem/
                 messaging    pipe     basic     memcpy     memset
                     (sec)   (sec)     (sec)   (GB/sec)   (GB/sec)
    
    control1         0.019   5.459     0.733  15.258789  51.398026
    control2         0.019   5.439     0.730  16.009221  48.828125
    control3         0.019   5.282     0.735  16.009221  48.828125
    control_avg      0.019   5.393     0.733  15.759077  49.684759
    
    experiment1      0.019   5.374     0.741  15.500992  46.502976
    experiment2      0.019   5.440     0.746  16.276042  51.398026
    experiment3      0.019   5.242     0.752  15.258789  51.398026
    experiment_avg   0.019   5.352     0.746  15.678608  49.766343
    
    The memory usage overhead was measured by executing `free` after boot
    on a QEMU VM with 1GB total memory, and as expected, it is positively
    correlated with the number of cache copies:
    
               control  4 copies  8 copies  16 copies
    
    total       969.8M    968.2M    968.2M     968.2M
    used         20.0M     21.9M     24.1M      26.7M
    free        936.9M    933.6M    931.4M     928.6M
    available   932.2M    928.8M    926.6M     923.9M

    Co-developed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
    Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
    Signed-off-by: GONG, Ruiqi <gongruiqi@huaweicloud.com>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Acked-by: Dennis Zhou <dennis@kernel.org> # percpu
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>