    kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler · 110aadd4
    Muchun Song authored
    BugLink: https://bugs.launchpad.net/bugs/1892822
    
    commit 0cb2f137 upstream.
    
    We found a case of kernel panic on our server. The stack trace is as
    follows (some irrelevant information omitted):
    
      BUG: kernel NULL pointer dereference, address: 0000000000000080
      RIP: 0010:kprobe_ftrace_handler+0x5e/0xe0
      RSP: 0018:ffffb512c6550998 EFLAGS: 00010282
      RAX: 0000000000000000 RBX: ffff8e9d16eea018 RCX: 0000000000000000
      RDX: ffffffffbe1179c0 RSI: ffffffffc0535564 RDI: ffffffffc0534ec0
      RBP: ffffffffc0534ec1 R08: ffff8e9d1bbb0f00 R09: 0000000000000004
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: ffff8e9d1f797060 R14: 000000000000bacc R15: ffff8e9ce13eca00
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000080 CR3: 00000008453d0005 CR4: 00000000003606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <IRQ>
       ftrace_ops_assist_func+0x56/0xe0
       ftrace_call+0x5/0x34
       tcpa_statistic_send+0x5/0x130 [ttcp_engine]
    
    tcpa_statistic_send is the function being kprobed. After analysis,
    the root cause is that the fourth parameter, regs, of
    kprobe_ftrace_handler is NULL. Why is regs NULL? We used the crash
    tool to analyze the kdump.
    
      crash> dis tcpa_statistic_send -r
             <tcpa_statistic_send>: callq 0xffffffffbd8018c0 <ftrace_caller>
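
    The faulting address fits a NULL regs. As a minimal userspace sketch
    (the struct name pt_regs_sketch and the helper ip_field_address are
    hypothetical, and the field order is assumed to mirror the x86_64
    struct pt_regs), the ip field sits 0x80 bytes into the structure, so
    reading regs->ip through a NULL pointer would touch address 0x80,
    which matches CR2 in the trace above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace mirror of the x86_64 struct pt_regs field order.
 * Illustration only: this is an assumed copy, not the kernel header.
 */
struct pt_regs_sketch {
	uint64_t r15, r14, r13, r12, bp, bx;
	uint64_t r11, r10, r9, r8, ax, cx, dx, si, di;
	uint64_t orig_ax;
	uint64_t ip, cs, flags, sp, ss;
};

/* Address that a read of regs->ip would touch for the given pointer. */
static uintptr_t ip_field_address(const struct pt_regs_sketch *regs)
{
	return (uintptr_t)regs + offsetof(struct pt_regs_sketch, ip);
}
```

    Under these assumptions, ip_field_address(NULL) evaluates to 0x80.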
    
    tcpa_statistic_send calls ftrace_caller instead of ftrace_regs_caller,
    so it is expected that the fourth parameter, regs, of
    kprobe_ftrace_handler is NULL. In theory, the call site should call
    ftrace_regs_caller instead of ftrace_caller. After in-depth analysis,
    we found a reproducible path.
    
      Write a simple kernel module that starts a periodic timer. The
      timer's handler is named 'kprobe_test_timer_handler'. The module
      name is kprobe_test.ko.
    
      1) insmod kprobe_test.ko
      2) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}'
      3) echo 0 > /proc/sys/kernel/ftrace_enabled
      4) rmmod kprobe_test
      5) stop the kprobe started in step 2)
      6) insmod kprobe_test.ko
      7) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}'
    
    In step 4) we mark the kprobe as GONE but do not disarm it. Step 5)
    does not disarm the kprobe either when it unregisters it, so the ip
    is never removed from the filter. When the module loads again in
    step 6), ftrace_module_enable() therefore patches the call site to
    ftrace_caller. When we register the kprobe again, the call site is
    not changed from ftrace_caller to ftrace_regs_caller because ftrace
    was disabled in step 3). So step 7) triggers the kernel panic. Fix
    this problem by disarming the kprobe when the module is going away.
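
    The sequence above can be replayed in a toy userspace model. This is
    only a sketch: every name below (struct model, replay_steps, and so
    on) is hypothetical, and the model encodes nothing beyond the state
    transitions claimed in this message. It is not the kernel's ftrace
    or kprobes logic.

```c
#include <assert.h>
#include <stdbool.h>

/* What the first instruction of the probed function currently calls. */
enum call_site { SITE_NOP, SITE_FTRACE_CALLER, SITE_FTRACE_REGS_CALLER };

struct model {
	bool ftrace_enabled;        /* /proc/sys/kernel/ftrace_enabled */
	bool ip_in_filter;          /* probed ip still in the ftrace filter */
	bool kprobe_registered;
	bool kprobe_gone;           /* kprobe marked GONE by module unload */
	bool disarm_on_module_exit; /* true models the fix */
	enum call_site site;
};

static void insmod(struct model *m)
{
	/* Assumed from the text: a stale filter entry makes module load
	 * patch the call site to ftrace_caller. */
	m->site = m->ip_in_filter ? SITE_FTRACE_CALLER : SITE_NOP;
}

static void register_kprobe(struct model *m)
{
	m->kprobe_registered = true;
	m->kprobe_gone = false;
	m->ip_in_filter = true;
	/* The site is only upgraded while ftrace is enabled. */
	if (m->ftrace_enabled)
		m->site = SITE_FTRACE_REGS_CALLER;
}

static void rmmod(struct model *m)
{
	if (!m->kprobe_registered)
		return;
	m->kprobe_gone = true;
	if (m->disarm_on_module_exit)
		m->ip_in_filter = false; /* the fix: disarm on module exit */
}

static void unregister_kprobe(struct model *m)
{
	/* A GONE kprobe is not disarmed, so its filter entry survives. */
	if (!m->kprobe_gone)
		m->ip_in_filter = false;
	m->kprobe_registered = false;
}

/* Replays steps 1) to 7); returns true if the handler would see regs == NULL. */
static bool replay_steps(bool with_fix)
{
	struct model m = { .ftrace_enabled = true,
			   .disarm_on_module_exit = with_fix,
			   .site = SITE_NOP };

	insmod(&m);                /* 1) */
	register_kprobe(&m);       /* 2) */
	m.ftrace_enabled = false;  /* 3) */
	rmmod(&m);                 /* 4) */
	unregister_kprobe(&m);     /* 5) */
	insmod(&m);                /* 6) */
	register_kprobe(&m);       /* 7) */

	return m.kprobe_registered && m.site == SITE_FTRACE_CALLER;
}
```

    In this model, replay_steps(false) ends with a registered kprobe
    behind ftrace_caller (regs would be NULL), while replay_steps(true)
    does not, because the filter entry is dropped in step 4).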
    
    Link: https://lkml.kernel.org/r/20200728064536.24405-1-songmuchun@bytedance.com
    
    Cc: stable@vger.kernel.org
    Fixes: ae6aa16f ("kprobes: introduce ftrace based optimization")
    Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Muchun Song <songmuchun@bytedance.com>
    Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
    Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
    Signed-off-by: Ian May <ian.may@canonical.com>
    Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
kprobes.c 61.6 KB