Commit a35a76fa authored by Alexei Starovoitov's avatar Alexei Starovoitov

Merge branch 'fix_bpf_send_signal'

Yonghong Song says:

====================
Commit 8b401f9e ("bpf: implement bpf_send_signal() helper")
introduced bpf_send_signal() helper and Commit 8482941f
("bpf: Add bpf_send_signal_thread() helper") added bpf_send_signal_thread()
helper. Both helpers try to send a signel to current process or thread.

When bpf_prog is called with scheduler rq_lock held, a deadlock
could happen since bpf_send_signal() and bpf_send_signal_thread()
will call group_send_sig_info() which may ultimately want to acquire
rq_lock() again. This happens in 5.2 and 4.16 based kernels in our
production environment with perf_sw_event.

In a different scenario, the following is also possible in the last kernel:
  cpu 1:
     do_task_stat <- holding sighand->siglock
     ...
     task_sched_runtime <- trying to grab rq_lock

  cpu 2:
     __schedule <- holding rq_lock
     ...
     do_send_sig_info <- trying to grab sighand->siglock

Commit eac9153f ("bpf/stackmap: Fix deadlock with
rq_lock in bpf_get_stack()") has a similar issue with above
rq_lock() deadlock. This patch set addressed the issue
in a similar way. Patch #1 provided kernel solution and
Patch #2 added a selftest.

Changelogs:
  v2 -> v3:
    . simplify selftest send_signal_sched_switch().
      The previous version has mmap/munmap inherited
      from Song's reproducer. They are not necessary
      in this context.
  v1 -> v2:
    . previous fix using task_work in nmi() is incorrect.
      there is no nmi() involvement here. Using task_work
      in all cases might be a solution. But decided to
      use a similar fix as in Commit eac9153f.
====================
Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
parents 52e7c083 c4ef2f32
......@@ -732,7 +732,7 @@ static int bpf_send_signal_common(u32 sig, enum pid_type type)
if (unlikely(!nmi_uaccess_okay()))
return -EPERM;
if (in_nmi()) {
if (irqs_disabled()) {
/* Do an early check on signal validity. Otherwise,
* the error is lost in deferred irq_work.
*/
......
// SPDX-License-Identifier: GPL-2.0
#include <test_progs.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "test_send_signal_kern.skel.h"
static void sigusr1_handler(int signum)
{
}
#define THREAD_COUNT 100
static void *worker(void *p)
{
int i;
for ( i = 0; i < 1000; i++)
usleep(1);
return NULL;
}
void test_send_signal_sched_switch(void)
{
struct test_send_signal_kern *skel;
pthread_t threads[THREAD_COUNT];
u32 duration = 0;
int i, err;
signal(SIGUSR1, sigusr1_handler);
skel = test_send_signal_kern__open_and_load();
if (CHECK(!skel, "skel_open_and_load", "skeleton open_and_load failed\n"))
return;
skel->bss->pid = getpid();
skel->bss->sig = SIGUSR1;
err = test_send_signal_kern__attach(skel);
if (CHECK(err, "skel_attach", "skeleton attach failed\n"))
goto destroy_skel;
for (i = 0; i < THREAD_COUNT; i++) {
err = pthread_create(threads + i, NULL, worker, NULL);
if (CHECK(err, "pthread_create", "Error creating thread, %s\n",
strerror(errno)))
goto destroy_skel;
}
for (i = 0; i < THREAD_COUNT; i++)
pthread_join(threads[i], NULL);
destroy_skel:
test_send_signal_kern__destroy(skel);
}
......@@ -31,6 +31,12 @@ int send_signal_tp(void *ctx)
return bpf_send_signal_test(ctx);
}
SEC("tracepoint/sched/sched_switch")
int send_signal_tp_sched(void *ctx)
{
return bpf_send_signal_test(ctx);
}
SEC("perf_event")
int send_signal_perf(void *ctx)
{
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment