Commit 66519f54 authored by Roland McGrath's avatar Roland McGrath Committed by Linus Torvalds

[PATCH] fix ptracer death race yielding bogus BUG_ON

There is a BUG_ON in ptrace_stop that hits if the thread is not ptraced.
However, there is no synchronization between a thread deciding to do a
ptrace stop and so going here, and its ptracer dying and so detaching from
it and clearing its ->ptrace field. 

The RHEL3 2.4-based kernel has a backport of a slightly older version of
the 2.6 signals code, which has a different but equivalent BUG_ON.  This
actually bit users in practice (when the debugger dies), but was
exceedingly difficult to reproduce in contrived circumstances.  We moved
forward in RHEL3 just by removing the BUG_ON, and that fixed the real user
problems even though I was never able to reproduce the scenario myself. 
So, to my knowledge this scenario has never actually been seen in practice
under 2.6.  But it's plain to see from the code that it is indeed possible.

This patch removes that BUG_ON, but also goes further and tries to handle
this case more gracefully than simply avoiding the crash.  By removing the
BUG_ON alone, it becomes possible for the real parent of a process to see
spurious SIGCHLD notifications intended for the debugger that has just
died, and have its child wind up stopped unexpectedly.  This patch avoids
that possibility by detecting the case when we are about to do the ptrace
stop but our ptracer has gone away, and simply eliding that ptrace stop
altogether as if we hadn't been ptraced when we hit the interesting event
(signal or ptrace_notify call for syscall tracing or something like that).
Signed-off-by: default avatarRoland McGrath <roland@redhat.com>
Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
parent 183986a7
...@@ -1581,11 +1581,12 @@ do_notify_parent_cldstop(struct task_struct *tsk, struct task_struct *parent, ...@@ -1581,11 +1581,12 @@ do_notify_parent_cldstop(struct task_struct *tsk, struct task_struct *parent,
* We always set current->last_siginfo while stopped here. * We always set current->last_siginfo while stopped here.
* That makes it a way to test a stopped process for * That makes it a way to test a stopped process for
* being ptrace-stopped vs being job-control-stopped. * being ptrace-stopped vs being job-control-stopped.
*
* If we actually decide not to stop at all because the tracer is gone,
* we leave nostop_code in current->exit_code.
*/ */
static void ptrace_stop(int exit_code, siginfo_t *info) static void ptrace_stop(int exit_code, int nostop_code, siginfo_t *info)
{ {
BUG_ON(!(current->ptrace & PT_PTRACED));
/* /*
* If there is a group stop in progress, * If there is a group stop in progress,
* we must participate in the bookkeeping. * we must participate in the bookkeeping.
...@@ -1600,9 +1601,22 @@ static void ptrace_stop(int exit_code, siginfo_t *info) ...@@ -1600,9 +1601,22 @@ static void ptrace_stop(int exit_code, siginfo_t *info)
set_current_state(TASK_TRACED); set_current_state(TASK_TRACED);
spin_unlock_irq(&current->sighand->siglock); spin_unlock_irq(&current->sighand->siglock);
read_lock(&tasklist_lock); read_lock(&tasklist_lock);
do_notify_parent_cldstop(current, current->parent, CLD_TRAPPED); if (likely(current->ptrace & PT_PTRACED) &&
read_unlock(&tasklist_lock); likely(current->parent != current->real_parent ||
schedule(); !(current->ptrace & PT_ATTACHED))) {
do_notify_parent_cldstop(current, current->parent,
CLD_TRAPPED);
read_unlock(&tasklist_lock);
schedule();
} else {
/*
* By the time we got the lock, our tracer went away.
* Don't stop here.
*/
read_unlock(&tasklist_lock);
set_current_state(TASK_RUNNING);
current->exit_code = nostop_code;
}
/* /*
* We are back. Now reacquire the siglock before touching * We are back. Now reacquire the siglock before touching
...@@ -1633,7 +1647,7 @@ void ptrace_notify(int exit_code) ...@@ -1633,7 +1647,7 @@ void ptrace_notify(int exit_code)
/* Let the debugger run. */ /* Let the debugger run. */
spin_lock_irq(&current->sighand->siglock); spin_lock_irq(&current->sighand->siglock);
ptrace_stop(exit_code, &info); ptrace_stop(exit_code, 0, &info);
spin_unlock_irq(&current->sighand->siglock); spin_unlock_irq(&current->sighand->siglock);
} }
...@@ -1836,7 +1850,7 @@ int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka, ...@@ -1836,7 +1850,7 @@ int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
ptrace_signal_deliver(regs, cookie); ptrace_signal_deliver(regs, cookie);
/* Let the debugger run. */ /* Let the debugger run. */
ptrace_stop(signr, info); ptrace_stop(signr, signr, info);
/* We're back. Did the debugger cancel the sig? */ /* We're back. Did the debugger cancel the sig? */
signr = current->exit_code; signr = current->exit_code;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment