    bpf: Fix the kernel crash caused by bpf_setsockopt(). · 5416c9ae
    Kui-Feng Lee authored
    The kernel crash was caused by a BPF program attached to the
    "lsm_cgroup/socket_sock_rcv_skb" hook that called `bpf_setsockopt()`
    to set a flag such as TCP_NODELAY. Flags like TCP_NODELAY can prompt
    the kernel to flush a socket's outgoing queue, and the
    "lsm_cgroup/socket_sock_rcv_skb" hook is frequently triggered by
    softirqs. The issue was that in certain circumstances, when
    `tcp_write_xmit()` was called to flush the queue, it would also allow
    BH (bottom-half) to run. This could lead to our program attempting to
    flush the same socket recursively, which caused a `skbuff` to be
    unlinked twice.
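
    The re-entrancy described above can be illustrated with a small
    user-space model. This is only a sketch: `toy_queue`, `toy_flush()`
    and `toy_bh_setsockopt()` are hypothetical names, not kernel APIs,
    and the point where the hook fires is an assumption standing in for
    the window in which `tcp_write_xmit()` lets BH run.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_LEN 4

/* Toy stand-in for a socket write queue; not the kernel's sk_buff list. */
struct toy_queue {
    int unlink_count[QUEUE_LEN]; /* times each buffer was unlinked */
    size_t head;                 /* index of the next buffer to send */
    int depth;                   /* current nesting of toy_flush() */
    int max_depth;               /* deepest nesting observed */
};

/* Hypothetical hook type: models a softirq that runs a BPF program. */
typedef void (*toy_bh_hook)(struct toy_queue *q);

/* Flush the queue; 'hook' (if any) fires once per buffer in the outer call. */
static void toy_flush(struct toy_queue *q, toy_bh_hook hook)
{
    assert(q != NULL);

    q->depth++;
    if (q->depth > q->max_depth)
        q->max_depth = q->depth;

    while (q->head < QUEUE_LEN) {
        size_t cur = q->head; /* buffer picked for transmission */

        /* Assumed window in which bottom halves may run. */
        if (hook && q->depth == 1)
            hook(q);

        /* The outer call unlinks 'cur' even if the hook already did. */
        q->unlink_count[cur]++;
        if (q->head == cur)
            q->head = cur + 1;
    }

    q->depth--;
}

/* Models the BPF program: setting the flag flushes the same queue again. */
static void toy_bh_setsockopt(struct toy_queue *q)
{
    toy_flush(q, NULL);
}

/* Largest number of times any single buffer was unlinked. */
static int toy_max_unlinks(const struct toy_queue *q)
{
    int max = 0;

    for (size_t i = 0; i < QUEUE_LEN; i++)
        if (q->unlink_count[i] > max)
            max = q->unlink_count[i];
    return max;
}
```

    In this model, a flush without the hook unlinks every buffer once,
    while a flush interrupted by `toy_bh_setsockopt()` unlinks the first
    buffer twice.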
    
    `security_sock_rcv_skb()` is triggered by `tcp_filter()`. This occurs
    before the sock ownership is checked in `tcp_v4_rcv()`. Consequently,
    a BPF program that runs on `security_sock_rcv_skb()` in softirq
    context may not hold the socket lock that `bpf_setsockopt()`
    requires.
    
    The patch fixes this issue by ensuring that a BPF program attached to
    the "lsm_cgroup/socket_sock_rcv_skb" hook is not allowed to call
    `bpf_setsockopt()`.
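
    The effect of the restriction can be sketched as an allow-list
    lookup. This is a user-space model under stated assumptions, not the
    code in bpf_lsm.c: `toy_locked_hooks` and `toy_may_call_setsockopt()`
    are hypothetical names, and the hook names in the list are only
    illustrative. The one detail taken from this commit is that
    "socket_sock_rcv_skb" is not on the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical allow-list of hooks assumed to run with the socket lock
 * held. The entries are illustrative; "socket_sock_rcv_skb" is left out
 * on purpose, mirroring the restriction this commit describes.
 */
static const char *const toy_locked_hooks[] = {
    "sock_graft",
    "inet_conn_established",
};

/* Return true if a program attached to 'hook' may call setsockopt. */
static bool toy_may_call_setsockopt(const char *hook)
{
    size_t n = sizeof(toy_locked_hooks) / sizeof(toy_locked_hooks[0]);

    assert(hook != NULL);

    for (size_t i = 0; i < n; i++)
        if (strcmp(hook, toy_locked_hooks[i]) == 0)
            return true;

    /* Unknown or removed hooks get no access to the helper. */
    return false;
}
```

    With this model, a lookup for "socket_sock_rcv_skb" returns false, so
    a program on that hook would be refused the helper up front rather
    than failing at run time.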
    
    The differences from v1 are
     - changing commit log to explain holding the lock of the sock,
     - emphasizing that TCP_NODELAY is not the only flag, and
     - adding the fixes tag.
    
    v1: https://lore.kernel.org/bpf/20230125000244.1109228-1-kuifeng@meta.com/

    Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
    Fixes: 9113d7e4 ("bpf: expose bpf_{g,s}etsockopt to lsm cgroup")
    Link: https://lore.kernel.org/r/20230127001732.4162630-1-kuifeng@meta.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
bpf_lsm.c 10.9 KB