    tracing: Add test for user space strings when filtering on string pointers · 77360f9b
    Steven Rostedt authored
    Pingfan reported that the following causes a fault:
    
      echo "filename ~ \"cpu\"" > events/syscalls/sys_enter_openat/filter
      echo 1 > events/syscalls/sys_enter_openat/enable
    
    The reason is that the trace event filter treats the user space pointer
    defined by "filename" as a normal kernel pointer and dereferences it
    directly to compare against the "cpu" string. The following bug happened:
    
     kvm-03-guest16 login: [72198.026181] BUG: unable to handle page fault for address: 00007fffaae8ef60
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0001) - permissions violation
     PGD 80000001008b7067 P4D 80000001008b7067 PUD 2393f1067 PMD 2393ec067 PTE 8000000108f47867
     Oops: 0001 [#1] PREEMPT SMP PTI
     CPU: 1 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-32.el9.x86_64 #1
     Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
     RIP: 0010:strlen+0x0/0x20
     Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11
           48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8
           48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
     RSP: 0018:ffffb5b900013e48 EFLAGS: 00010246
     RAX: 0000000000000018 RBX: ffff8fc1c49ede00 RCX: 0000000000000000
     RDX: 0000000000000020 RSI: ffff8fc1c02d601c RDI: 00007fffaae8ef60
     RBP: 00007fffaae8ef60 R08: 0005034f4ddb8ea4 R09: 0000000000000000
     R10: ffff8fc1c02d601c R11: 0000000000000000 R12: ffff8fc1c8a6e380
     R13: 0000000000000000 R14: ffff8fc1c02d6010 R15: ffff8fc1c00453c0
     FS:  00007fa86123db40(0000) GS:ffff8fc2ffd00000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00007fffaae8ef60 CR3: 0000000102880001 CR4: 00000000007706e0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     PKRU: 55555554
     Call Trace:
      filter_pred_pchar+0x18/0x40
      filter_match_preds+0x31/0x70
      ftrace_syscall_enter+0x27a/0x2c0
      syscall_trace_enter.constprop.0+0x1aa/0x1d0
      do_syscall_64+0x16/0x90
      entry_SYSCALL_64_after_hwframe+0x44/0xae
     RIP: 0033:0x7fa861d88664
    
    The above happened because the kernel tried to access user space directly
    and triggered a "supervisor read access in kernel mode" fault. Worse yet,
    the memory might not even be loaded yet, and a SEGFAULT could happen as
    well. The same could be true when accessing a kernel space pointer.
    
    To be even more robust, test both kernel and user space strings. If the
    string fails to read, then simply have the filter fail.
    
    Note, TASK_SIZE is used to determine if the pointer is user or kernel space
    and the appropriate strncpy_from_kernel/user_nofault() function is used to
    copy the memory. For some architectures, the comparison against TASK_SIZE
    may always pick user space or kernel space. If it picks wrong, the only
    consequence is that the filter will fail to match. In the future, this
    needs to be fixed to have the event denote which should be used. But
    failing a filter is much better than panicking the machine, and that can
    be solved later.
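
    The selection logic described above can be sketched in plain user space
    C. This is an illustration only, not the kernel code: FAKE_TASK_SIZE,
    struct sim_region, sim_strncpy_nofault(), fetch_string() and
    filter_matches() are all hypothetical names, and the "address space" is
    simulated with two small arrays. The copy helper only approximates the
    return convention of the strncpy_from_*_nofault() functions (length
    including the NUL on success, negative on a fault).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for TASK_SIZE: lower addresses count as "user". */
#define FAKE_TASK_SIZE ((uintptr_t)0x1000)
#define FAKE_BUF_SIZE  64

/* One mapped range of the simulated address space. */
struct sim_region {
	uintptr_t base;		/* first simulated address */
	size_t len;		/* number of mapped bytes */
	const char *bytes;	/* backing storage */
};

static const char user_bytes[] = "/sys/devices/system/cpu/online";
static const char kernel_bytes[] = "kworker/0:1";
static const struct sim_region sim_user = {
	0x0100, sizeof(user_bytes), user_bytes
};
static const struct sim_region sim_kernel = {
	0x2000, sizeof(kernel_bytes), kernel_bytes
};

/*
 * Copy a NUL-terminated string out of one region without ever touching
 * unmapped memory. Returns the length including the NUL, or -1 if any
 * byte that would be read lies outside the region (the simulated fault).
 * A string longer than n is truncated to n - 1 bytes plus a NUL.
 */
static long sim_strncpy_nofault(char *dst, uintptr_t addr, size_t n,
				const struct sim_region *r)
{
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		uintptr_t a = addr + i;

		if (a < r->base || a - r->base >= r->len)
			return -1;
		dst[i] = r->bytes[a - r->base];
		if (dst[i] == '\0')
			return (long)(i + 1);
	}
	dst[n - 1] = '\0';
	return (long)n;
}

/* Choose the copier by comparing the pointer against the boundary. */
static const char *fetch_string(uintptr_t addr, char *buf, size_t n)
{
	const struct sim_region *r =
		addr < FAKE_TASK_SIZE ? &sim_user : &sim_kernel;

	if (sim_strncpy_nofault(buf, addr, n, r) < 0)
		return NULL;
	return buf;
}

/* Substring filter: a string that cannot be read simply does not match. */
static int filter_matches(uintptr_t addr, const char *needle)
{
	char buf[FAKE_BUF_SIZE];
	const char *s = fetch_string(addr, buf, sizeof(buf));

	if (!s)
		return 0;
	return strstr(s, needle) != NULL;
}
```

    In this sketch, filter_matches(0x0100, "cpu") matches, while an
    unmapped address such as 0x0800 or 0x9000 makes the filter fail
    instead of faulting. The string is always compared from a private
    bounded buffer, never through the original pointer.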
    
    Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/
    Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home
    
    Cc: stable@vger.kernel.org
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Reported-by: Pingfan Liu <kernelfans@gmail.com>
    Tested-by: Pingfan Liu <kernelfans@gmail.com>
    Fixes: 87a342f5 ("tracing/filters: Support filtering for char * strings")
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>