1. 17 Jul, 2011 1 commit
    • Oleg Nesterov's avatar
      has_stopped_jobs: s/task_is_stopped/SIGNAL_STOP_STOPPED/ · 961c4675
      Oleg Nesterov authored
      has_stopped_jobs() naively checks task_is_stopped(group_leader). This
      was always wrong even without ptrace, group_leader can be dead. And
      given that ptrace can change the state to TRACED this is wrong even
      in the single-threaded case.
      
      Change the code to check SIGNAL_STOP_STOPPED and simplify the code,
      retval + break/continue doesn't make this trivial code more readable.
      
      We could probably add the usual "|| signal->group_stop_count" check
      but I don't think this makes sense, the task can start the group-stop
      right after the check anyway.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      961c4675
  2. 01 Jul, 2011 1 commit
  3. 27 Jun, 2011 11 commits
    • Oleg Nesterov's avatar
      ptrace: wait_consider_task: s/same_thread_group/ptrace_reparented/ · 479bf98c
      Oleg Nesterov authored
      wait_consider_task() checks same_thread_group(parent, real_parent),
      this is the open-coded ptrace_reparented().
      
      __ptrace_detach() remains the only function which has to check this by
      hand, although we could reorganize the code to delay __ptrace_unlink.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      479bf98c
    • Oleg Nesterov's avatar
      ptrace: kill real_parent_is_ptracer() in in favor of ptrace_reparented() · bb3696da
      Oleg Nesterov authored
      Kill real_parent_is_ptracer() and update the callers to use
      ptrace_reparented(), after the previous patch they do the same.
      
      Remove the unnecessary ->ptrace != 0 check in get_signal_to_deliver(),
      if ptrace_reparented() == T then the task must be ptraced.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      bb3696da
    • Oleg Nesterov's avatar
      ptrace: ptrace_reparented() should check same_thread_group() · 0347e177
      Oleg Nesterov authored
      ptrace_reparented() naively does parent != real_parent, this means
      it returns true even if the tracer _is_ the real parent. This is per
      process thing, not per-thread. The only reason ->real_parent can
      point to the non-leader thread is that we have __WNOTHREAD.
      
      Change it to check !same_thread_group(parent, real_parent).
      
      It has two callers, and in both cases the current check does not
      look right.
      
      exit_notify: we should respect ->exit_signal if the exiting leader
      is traced by any thread from the parent thread group. It is the
      child of the whole group, and we are going to send the signal to
      the whole group.
      
      wait_task_zombie: without __WNOTHREAD do_wait() should do the same
      for any thread, only sys_ptrace() is "bound" to the single thread.
      However do_wait(WEXITED) succeeds but does not release a traced
      natural child unless the caller is the tracer.
      
      Test-case:
      
      	void *tfunc(void *arg)
      	{
      		assert(ptrace(PTRACE_ATTACH, (long)arg, 0,0) == 0);
      		pause();
      		return NULL;
      	}
      
      	int main(void)
      	{
      		pthread_t thr;
      		pid_t pid, stat, ret;
      
      		pid = fork();
      		if (!pid) {
      			pause();
      			assert(0);
      		}
      
      		assert(pthread_create(&thr, NULL, tfunc, (void*)(long)pid) == 0);
      
      		assert(waitpid(-1, &stat, 0) == pid);
      		assert(WIFSTOPPED(stat));
      
      		kill(pid, SIGKILL);
      
      		assert(waitpid(-1, &stat, 0) == pid);
      		assert(WIFSIGNALED(stat) && WTERMSIG(stat) == SIGKILL);
      
      		ret = waitpid(pid, &stat, 0);
      		if (ret < 0)
      			return 0;
      
      		printf("WTF? %d is dead, but: wait=%d stat=%x\n",
      				pid, ret, stat);
      
      		return 1;
      	}
      
      Note that the main thread simply does
      
      	pid = fork();
      	kill(pid, SIGKILL);
      
      and then without the patch wait4(WEXITED) succeeds twice and reports
      WTERMSIG(stat) == SIGKILL.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      0347e177
    • Oleg Nesterov's avatar
      redefine thread_group_leader() as exit_signal >= 0 · 087806b1
      Oleg Nesterov authored
      Change de_thread() to set old_leader->exit_signal = -1. This is
      good for the consistency, it is no longer the leader and all
      sub-threads have exit_signal = -1 set by copy_process(CLONE_THREAD).
      
      And this allows us to micro-optimize thread_group_leader(), it can
      simply check exit_signal >= 0. This also makes sense because we
      should move ->group_leader from task_struct to signal_struct.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reviewed-by: default avatarTejun Heo <tj@kernel.org>
      087806b1
    • Oleg Nesterov's avatar
      do not change dead_task->exit_signal · d4f7c511
      Oleg Nesterov authored
      __ptrace_detach() and do_notify_parent() set task->exit_signal = -1
      to mark the task dead. This is no longer needed, nobody checks
      exit_signal to detect the EXIT_DEAD task.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reviewed-by: default avatarTejun Heo <tj@kernel.org>
      d4f7c511
    • Oleg Nesterov's avatar
      kill task_detached() · e550f14d
      Oleg Nesterov authored
      Upadate the last user of task_detached(), wait_task_zombie(), to
      use thread_group_leader() and kill task_detached().
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reviewed-by: default avatarTejun Heo <tj@kernel.org>
      e550f14d
    • Oleg Nesterov's avatar
      reparent_leader: check EXIT_DEAD instead of task_detached() · 0976a03e
      Oleg Nesterov authored
      Change reparent_leader() to check ->exit_state instead of ->exit_signal,
      this matches the similar EXIT_DEAD check in wait_consider_task() and
      allows us to cleanup the do_notify_parent/task_detached logic.
      
      task_detached() was really needed during reparenting before 9cd80bbb
      "do_wait() optimization: do not place sub-threads on ->children list"
      to filter out the sub-threads. After this change task_detached(p) can
      only be true if p is the dead group_leader and its parent ignores
      SIGCHLD, in this case the caller of do_notify_parent() is going to
      reap this task and it should set EXIT_DEAD.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reviewed-by: default avatarTejun Heo <tj@kernel.org>
      0976a03e
    • Oleg Nesterov's avatar
      make do_notify_parent() __must_check, update the callers · 86773473
      Oleg Nesterov authored
      Change other callers of do_notify_parent() to check the value it
      returns, this makes the subsequent task_detached() unnecessary.
      Mark do_notify_parent() as __must_check.
      
      Use thread_group_leader() instead of !task_detached() to check
      if we need to notify the real parent in wait_task_zombie().
      
      Remove the stale comment in release_task(). "just for sanity" is
      no longer true, we have to set EXIT_DEAD to avoid the races with
      do_wait().
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      86773473
    • Oleg Nesterov's avatar
      __ptrace_detach: avoid task_detached(), check do_notify_parent() · 9843a1e9
      Oleg Nesterov authored
      __ptrace_detach() relies on the current obscure behaviour of
      do_notify_parent(tsk) which changes tsk->exit_signal if this child
      should be silently reaped. That is why we check task_detached(), it
      is true if the task is sub-thread, or it is the group_leader but
      its exit_signal was changed by do_notify_parent().
      
      This is confusing, change the code to rely on !thread_group_leader()
      or the value returned by do_notify_parent().
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      9843a1e9
    • Oleg Nesterov's avatar
      kill tracehook_notify_death() · 45cdf5cc
      Oleg Nesterov authored
      Kill tracehook_notify_death(), reimplement the logic in its caller,
      exit_notify().
      
      Also, change the exec_id's check to use thread_group_leader() instead
      of task_detached(), this is more clear. This logic only applies to
      the exiting leader, a sub-thread must never change its exit_signal.
      
      Note: when the traced group leader exits the exit_signal-or-SIGCHLD
      logic looks really strange:
      
      	- we notify the tracer even if !thread_group_empty() but
      	   do_wait(WEXITED) can't work until all threads exit
      
      	- if the tracer is real_parent, it is not clear why can't
      	  we use ->exit_signal event if !thread_group_empty()
      
      -v2: do not try to fix the 2nd oddity to avoid the subtle behavior
           change mixed with reorganization, suggested by Tejun.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reviewed-by: default avatarTejun Heo <tj@kernel.org>
      45cdf5cc
    • Oleg Nesterov's avatar
      make do_notify_parent() return bool · 53c8f9f1
      Oleg Nesterov authored
      - change do_notify_parent() to return a boolean, true if the task should
        be reaped because its parent ignores SIGCHLD.
      
      - update the only caller which checks the returned value, exit_notify().
      
      This temporary uglifies exit_notify() even more, will be cleanuped by
      the next change.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      53c8f9f1
  4. 22 Jun, 2011 6 commits
    • Tejun Heo's avatar
      ptrace: s/tracehook_tracer_task()/ptrace_parent()/ · 06d98473
      Tejun Heo authored
      tracehook.h is on the way out.  Rename tracehook_tracer_task() to
      ptrace_parent() and move it from tracehook.h to ptrace.h.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: John Johansen <john.johansen@canonical.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      06d98473
    • Tejun Heo's avatar
      ptrace: kill clone/exec tracehooks · 4b9d33e6
      Tejun Heo authored
      At this point, tracehooks aren't useful to mainline kernel and mostly
      just add an extra layer of obfuscation.  Although they have comments,
      without actual in-kernel users, it is difficult to tell what are their
      assumptions and they're actually trying to achieve.  To mainline
      kernel, they just aren't worth keeping around.
      
      This patch kills the following clone and exec related tracehooks.
      
      	tracehook_prepare_clone()
      	tracehook_finish_clone()
      	tracehook_report_clone()
      	tracehook_report_clone_complete()
      	tracehook_unsafe_exec()
      
      The changes are mostly trivial - logic is moved to the caller and
      comments are merged and adjusted appropriately.
      
      The only exception is in check_unsafe_exec() where LSM_UNSAFE_PTRACE*
      are OR'd to bprm->unsafe instead of setting it, which produces the
      same result as the field is always zero on entry.  It also tests
      p->ptrace instead of (p->ptrace & PT_PTRACED) for consistency, which
      also gives the same result.
      
      This doesn't introduce any behavior change.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      4b9d33e6
    • Tejun Heo's avatar
      ptrace: kill trivial tracehooks · a288eecc
      Tejun Heo authored
      At this point, tracehooks aren't useful to mainline kernel and mostly
      just add an extra layer of obfuscation.  Although they have comments,
      without actual in-kernel users, it is difficult to tell what are their
      assumptions and they're actually trying to achieve.  To mainline
      kernel, they just aren't worth keeping around.
      
      This patch kills the following trivial tracehooks.
      
      * Ones testing whether task is ptraced.  Replace with ->ptrace test.
      
      	tracehook_expect_breakpoints()
      	tracehook_consider_ignored_signal()
      	tracehook_consider_fatal_signal()
      
      * ptrace_event() wrappers.  Call directly.
      
      	tracehook_report_exec()
      	tracehook_report_exit()
      	tracehook_report_vfork_done()
      
      * ptrace_release_task() wrapper.  Call directly.
      
      	tracehook_finish_release_task()
      
      * noop
      
      	tracehook_prepare_release_task()
      	tracehook_report_death()
      
      This doesn't introduce any behavior change.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      a288eecc
    • Tejun Heo's avatar
      ptrace: move SIGTRAP on exec(2) logic to ptrace_event() · f3c04b93
      Tejun Heo authored
      Move SIGTRAP on exec(2) logic from tracehook_report_exec() to
      ptrace_event().  This is part of changes to make ptrace_event()
      smarter and handle ptrace event related details in one place.
      
      This doesn't introduce any behavior change.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      f3c04b93
    • Tejun Heo's avatar
      ptrace: introduce ptrace_event_enabled() and simplify ptrace_event() and tracehook_prepare_clone() · 643ad838
      Tejun Heo authored
      This patch implements ptrace_event_enabled() which tests whether a
      given PTRACE_EVENT_* is enabled and use it to simplify ptrace_event()
      and tracehook_prepare_clone().
      
      PT_EVENT_FLAG() macro is added which calculates PT_TRACE_* flag from
      PTRACE_EVENT_*.  This is used to define PT_TRACE_* flags and by
      ptrace_event_enabled() to find the matching flag.
      
      This is used to make ptrace_event() and tracehook_prepare_clone()
      simpler.
      
      * ptrace_event() callers were responsible for providing mask to test
        whether the event was enabled.  This patch implements
        ptrace_event_enabled() and make ptrace_event() drop @mask and
        determine whether the event is enabled from @event.  Note that
        @event is constant and this conversion doesn't add runtime overhead.
      
        All conversions except tracehook_report_clone_complete() are
        trivial.  tracehook_report_clone_complete() used to use 0 for @mask
        (always enabled) but now tests whether the specified event is
        enabled.  This doesn't cause any behavior difference as it's
        guaranteed that the event specified by @trace is enabled.
      
      * tracehook_prepare_clone() now only determines which event is
        applicable and use ptrace_event_enabled() for enable test.
      
      This doesn't introduce any behavior change.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      643ad838
    • Tejun Heo's avatar
      ptrace: kill task_ptrace() · d21142ec
      Tejun Heo authored
      task_ptrace(task) simply dereferences task->ptrace and isn't even used
      consistently only adding confusion.  Kill it and directly access
      ->ptrace instead.
      
      This doesn't introduce any behavior change.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      d21142ec
  5. 16 Jun, 2011 5 commits
    • Tejun Heo's avatar
      ptrace: implement PTRACE_LISTEN · 544b2c91
      Tejun Heo authored
      The previous patch implemented async notification for ptrace but it
      only worked while trace is running.  This patch introduces
      PTRACE_LISTEN which is suggested by Oleg Nestrov.
      
      It's allowed iff tracee is in STOP trap and puts tracee into
      quasi-running state - tracee never really runs but wait(2) and
      ptrace(2) consider it to be running.  While ptracer is listening,
      tracee is allowed to re-enter STOP to notify an async event.
      Listening state is cleared on the first notification.  Ptracer can
      also clear it by issuing INTERRUPT - tracee will re-trap into STOP
      with listening state cleared.
      
      This allows ptracer to monitor group stop state without running tracee
      - use INTERRUPT to put tracee into STOP trap, issue LISTEN and then
      wait(2) to wait for the next group stop event.  When it happens,
      PTRACE_GETSIGINFO provides information to determine the current state.
      
      Test program follows.
      
        #define PTRACE_SEIZE		0x4206
        #define PTRACE_INTERRUPT	0x4207
        #define PTRACE_LISTEN		0x4208
      
        #define PTRACE_SEIZE_DEVEL	0x80000000
      
        static const struct timespec ts1s = { .tv_sec = 1 };
      
        int main(int argc, char **argv)
        {
      	  pid_t tracee, tracer;
      	  int i;
      
      	  tracee = fork();
      	  if (!tracee)
      		  while (1)
      			  pause();
      
      	  tracer = fork();
      	  if (!tracer) {
      		  siginfo_t si;
      
      		  ptrace(PTRACE_SEIZE, tracee, NULL,
      			 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
      		  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
      	  repeat:
      		  waitid(P_PID, tracee, NULL, WSTOPPED);
      
      		  ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
      		  if (!si.si_code) {
      			  printf("tracer: SIG %d\n", si.si_signo);
      			  ptrace(PTRACE_CONT, tracee, NULL,
      				 (void *)(unsigned long)si.si_signo);
      			  goto repeat;
      		  }
      		  printf("tracer: stopped=%d signo=%d\n",
      			 si.si_signo != SIGTRAP, si.si_signo);
      		  if (si.si_signo != SIGTRAP)
      			  ptrace(PTRACE_LISTEN, tracee, NULL, NULL);
      		  else
      			  ptrace(PTRACE_CONT, tracee, NULL, NULL);
      		  goto repeat;
      	  }
      
      	  for (i = 0; i < 3; i++) {
      		  nanosleep(&ts1s, NULL);
      		  printf("mother: SIGSTOP\n");
      		  kill(tracee, SIGSTOP);
      		  nanosleep(&ts1s, NULL);
      		  printf("mother: SIGCONT\n");
      		  kill(tracee, SIGCONT);
      	  }
      	  nanosleep(&ts1s, NULL);
      
      	  kill(tracer, SIGKILL);
      	  kill(tracee, SIGKILL);
      	  return 0;
        }
      
      This is identical to the program to test TRAP_NOTIFY except that
      tracee is PTRACE_LISTEN'd instead of PTRACE_CONT'd when group stopped.
      This allows ptracer to monitor when group stop ends without running
      tracee.
      
        # ./test-listen
        tracer: stopped=0 signo=5
        mother: SIGSTOP
        tracer: SIG 19
        tracer: stopped=1 signo=19
        mother: SIGCONT
        tracer: stopped=0 signo=5
        tracer: SIG 18
        mother: SIGSTOP
        tracer: SIG 19
        tracer: stopped=1 signo=19
        mother: SIGCONT
        tracer: stopped=0 signo=5
        tracer: SIG 18
        mother: SIGSTOP
        tracer: SIG 19
        tracer: stopped=1 signo=19
        mother: SIGCONT
        tracer: stopped=0 signo=5
        tracer: SIG 18
      
      -v2: Moved JOBCTL_LISTENING check in wait_task_stopped() into
           task_stopped_code() as suggested by Oleg.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      544b2c91
    • Tejun Heo's avatar
      ptrace: implement TRAP_NOTIFY and use it for group stop events · fb1d910c
      Tejun Heo authored
      Currently there's no way for ptracer to find out whether group stop
      finished other than polling with INTERRUPT - GETSIGINFO - CONT
      sequence.  This patch implements group stop notification for ptracer
      using STOP traps.
      
      When group stop state of a seized tracee changes, JOBCTL_TRAP_NOTIFY
      is set, which schedules a STOP trap which is sticky - it isn't cleared
      by other traps and at least one STOP trap will happen eventually.
      STOP trap is synchronization point for event notification and the
      tracer can determine the current group stop state by looking at the
      signal number portion of exit code (si_status from waitid(2) or
      si_code from PTRACE_GETSIGINFO).
      
      Notifications are generated both on start and end of group stops but,
      because group stop participation always happens before STOP trap, this
      doesn't cause an extra trap while tracee is participating in group
      stop.  The symmetry will be useful later.
      
      Note that this notification works iff tracee is not trapped.
      Currently there is no way to be notified of group stop state changes
      while tracee is trapped.  This will be addressed by a later patch.
      
      An example program follows.
      
        #define PTRACE_SEIZE		0x4206
        #define PTRACE_INTERRUPT	0x4207
      
        #define PTRACE_SEIZE_DEVEL	0x80000000
      
        static const struct timespec ts1s = { .tv_sec = 1 };
      
        int main(int argc, char **argv)
        {
      	  pid_t tracee, tracer;
      	  int i;
      
      	  tracee = fork();
      	  if (!tracee)
      		  while (1)
      			  pause();
      
      	  tracer = fork();
      	  if (!tracer) {
      		  siginfo_t si;
      
      		  ptrace(PTRACE_SEIZE, tracee, NULL,
      			 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
      		  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
      	  repeat:
      		  waitid(P_PID, tracee, NULL, WSTOPPED);
      
      		  ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
      		  if (!si.si_code) {
      			  printf("tracer: SIG %d\n", si.si_signo);
      			  ptrace(PTRACE_CONT, tracee, NULL,
      				 (void *)(unsigned long)si.si_signo);
      			  goto repeat;
      		  }
      		  printf("tracer: stopped=%d signo=%d\n",
      			 si.si_signo != SIGTRAP, si.si_signo);
      		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
      		  goto repeat;
      	  }
      
      	  for (i = 0; i < 3; i++) {
      		  nanosleep(&ts1s, NULL);
      		  printf("mother: SIGSTOP\n");
      		  kill(tracee, SIGSTOP);
      		  nanosleep(&ts1s, NULL);
      		  printf("mother: SIGCONT\n");
      		  kill(tracee, SIGCONT);
      	  }
      	  nanosleep(&ts1s, NULL);
      
      	  kill(tracer, SIGKILL);
      	  kill(tracee, SIGKILL);
      	  return 0;
        }
      
      In the above program, tracer keeps tracee running and gets
      notification of each group stop state changes.
      
        # ./test-notify
        tracer: stopped=0 signo=5
        mother: SIGSTOP
        tracer: SIG 19
        tracer: stopped=1 signo=19
        mother: SIGCONT
        tracer: stopped=0 signo=5
        tracer: SIG 18
        mother: SIGSTOP
        tracer: SIG 19
        tracer: stopped=1 signo=19
        mother: SIGCONT
        tracer: stopped=0 signo=5
        tracer: SIG 18
        mother: SIGSTOP
        tracer: SIG 19
        tracer: stopped=1 signo=19
        mother: SIGCONT
        tracer: stopped=0 signo=5
        tracer: SIG 18
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      fb1d910c
    • Tejun Heo's avatar
      ptrace: implement PTRACE_INTERRUPT · fca26f26
      Tejun Heo authored
      Currently, there's no way to trap a running ptracee short of sending a
      signal which has various side effects.  This patch implements
      PTRACE_INTERRUPT which traps ptracee without any signal or job control
      related side effect.
      
      The implementation is almost trivial.  It uses the group stop trap -
      SIGTRAP | PTRACE_EVENT_STOP << 8.  A new trap flag
      JOBCTL_TRAP_INTERRUPT is added, which is set on PTRACE_INTERRUPT and
      cleared when any trap happens.  As INTERRUPT should be useable
      regardless of the current state of tracee, task_is_traced() test in
      ptrace_check_attach() is skipped for INTERRUPT.
      
      PTRACE_INTERRUPT is available iff tracee is attached with
      PTRACE_SEIZE.
      
      Test program follows.
      
        #define PTRACE_SEIZE		0x4206
        #define PTRACE_INTERRUPT	0x4207
      
        #define PTRACE_SEIZE_DEVEL	0x80000000
      
        static const struct timespec ts100ms = { .tv_nsec = 100000000 };
        static const struct timespec ts1s = { .tv_sec = 1 };
        static const struct timespec ts3s = { .tv_sec = 3 };
      
        int main(int argc, char **argv)
        {
      	  pid_t tracee;
      
      	  tracee = fork();
      	  if (tracee == 0) {
      		  nanosleep(&ts100ms, NULL);
      		  while (1) {
      			  printf("tracee: alive pid=%d\n", getpid());
      			  nanosleep(&ts1s, NULL);
      		  }
      	  }
      
      	  if (argc > 1)
      		  kill(tracee, SIGSTOP);
      
      	  nanosleep(&ts100ms, NULL);
      
      	  ptrace(PTRACE_SEIZE, tracee, NULL,
      		 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
      	  if (argc > 1) {
      		  waitid(P_PID, tracee, NULL, WSTOPPED);
      		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
      	  }
      	  nanosleep(&ts3s, NULL);
      
      	  printf("tracer: INTERRUPT and DETACH\n");
      	  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
      	  waitid(P_PID, tracee, NULL, WSTOPPED);
      	  ptrace(PTRACE_DETACH, tracee, NULL, NULL);
      	  nanosleep(&ts3s, NULL);
      
      	  printf("tracer: exiting\n");
      	  kill(tracee, SIGKILL);
      	  return 0;
        }
      
      When called without argument, tracee is seized from running state,
      interrupted and then detached back to running state.
      
        # ./test-interrupt
        tracee: alive pid=4546
        tracee: alive pid=4546
        tracee: alive pid=4546
        tracer: INTERRUPT and DETACH
        tracee: alive pid=4546
        tracee: alive pid=4546
        tracee: alive pid=4546
        tracer: exiting
      
      When called with argument, tracee is seized from stopped state,
      continued, interrupted and then detached back to stopped state.
      
        # ./test-interrupt  1
        tracee: alive pid=4548
        tracee: alive pid=4548
        tracee: alive pid=4548
        tracer: INTERRUPT and DETACH
        tracer: exiting
      
      Before PTRACE_INTERRUPT, once the tracee was running, there was no way
      to trap tracee and do PTRACE_DETACH without causing side effect.
      
      -v2: Updated to use task_set_jobctl_pending() so that it doesn't end
           up scheduling TRAP_STOP if child is dying which may make the
           child unkillable.  Spotted by Oleg.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      fca26f26
    • Tejun Heo's avatar
      ptrace: implement PTRACE_SEIZE · 3544d72a
      Tejun Heo authored
      PTRACE_ATTACH implicitly issues SIGSTOP on attach which has side
      effects on tracee signal and job control states.  This patch
      implements a new ptrace request PTRACE_SEIZE which attaches a tracee
      without trapping it or affecting its signal and job control states.
      
      The usage is the same with PTRACE_ATTACH but it takes PTRACE_SEIZE_*
      flags in @data.  Currently, the only defined flag is
      PTRACE_SEIZE_DEVEL which is a temporary flag to enable PTRACE_SEIZE.
      PTRACE_SEIZE will change ptrace behaviors outside of attach itself.
      The changes will be implemented gradually and the DEVEL flag is to
      prevent programs which expect full SEIZE behavior from using it before
      all the behavior modifications are complete while allowing unit
      testing.  The flag will be removed once SEIZE behaviors are completely
      implemented.
      
      * PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap.  After
        attaching tracee continues to run unless a trap condition occurs.
      
      * PTRACE_SEIZE doesn't affect signal or group stop state.
      
      * If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses
        exit_code of (signr | PTRACE_EVENT_STOP << 8) where signr is one of
        the stopping signals if group stop is in effect or SIGTRAP
        otherwise, and returns usual trap siginfo on PTRACE_GETSIGINFO
        instead of NULL.
      
      Seizing sets PT_SEIZED in ->ptrace of the tracee.  This flag will be
      used to determine whether new SEIZE behaviors should be enabled.
      
      Test program follows.
      
        #define PTRACE_SEIZE		0x4206
        #define PTRACE_SEIZE_DEVEL	0x80000000
      
        static const struct timespec ts100ms = { .tv_nsec = 100000000 };
        static const struct timespec ts1s = { .tv_sec = 1 };
        static const struct timespec ts3s = { .tv_sec = 3 };
      
        int main(int argc, char **argv)
        {
      	  pid_t tracee;
      
      	  tracee = fork();
      	  if (tracee == 0) {
      		  nanosleep(&ts100ms, NULL);
      		  while (1) {
      			  printf("tracee: alive\n");
      			  nanosleep(&ts1s, NULL);
      		  }
      	  }
      
      	  if (argc > 1)
      		  kill(tracee, SIGSTOP);
      
      	  nanosleep(&ts100ms, NULL);
      
      	  ptrace(PTRACE_SEIZE, tracee, NULL,
      		 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
      	  if (argc > 1) {
      		  waitid(P_PID, tracee, NULL, WSTOPPED);
      		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
      	  }
      	  nanosleep(&ts3s, NULL);
      	  printf("tracer: exiting\n");
      	  return 0;
        }
      
      When the above program is called w/o argument, tracee is seized while
      running and remains running.  When tracer exits, tracee continues to
      run and print out messages.
      
        # ./test-seize-simple
        tracee: alive
        tracee: alive
        tracee: alive
        tracer: exiting
        tracee: alive
        tracee: alive
      
      When called with an argument, tracee is seized from stopped state and
      continued, and returns to stopped state when tracer exits.
      
        # ./test-seize
        tracee: alive
        tracee: alive
        tracee: alive
        tracer: exiting
        # ps -el|grep test-seize
        1 T     0  4720     1  0  80   0 -   941 signal ttyS0    00:00:00 test-seize
      
      -v2: SEIZE doesn't schedule TRAP_STOP and leaves tracee running as Jan
           suggested.
      
      -v3: PTRACE_EVENT_STOP traps now report group stop state by signr.  If
           group stop is in effect the stop signal number is returned as
           part of exit_code; otherwise, SIGTRAP.  This was suggested by
           Denys and Oleg.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
      Cc: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      3544d72a
    • Tejun Heo's avatar
      job control: introduce JOBCTL_TRAP_STOP and use it for group stop trap · 73ddff2b
      Tejun Heo authored
      do_signal_stop() implemented both normal group stop and trap for group
      stop while ptraced.  This approach has been enough but scheduled
      changes require trap mechanism which can be used in more generic
      manner and using group stop trap for generic trap site simplifies both
      userland visible interface and implementation.
      
      This patch adds a new jobctl flag - JOBCTL_TRAP_STOP.  When set, it
      triggers a trap site, which behaves like group stop trap, in
      get_signal_to_deliver() after checking for pending signals.  While
      ptraced, do_signal_stop() doesn't stop itself.  It initiates group
      stop if requested and schedules JOBCTL_TRAP_STOP and returns.  The
      caller - get_signal_to_deliver() - is responsible for checking whether
      TRAP_STOP is pending afterwards and handling it.
      
      ptrace_attach() is updated to use JOBCTL_TRAP_STOP instead of
      JOBCTL_STOP_PENDING and __ptrace_unlink() to clear all pending trap
      bits and TRAPPING so that TRAP_STOP and future trap bits don't linger
      after detach.
      
      While at it, add proper function comment to do_signal_stop() and make
      it return bool.
      
      -v2: __ptrace_unlink() updated to clear JOBCTL_TRAP_MASK and TRAPPING
           instead of JOBCTL_PENDING_MASK.  This avoids accidentally
           clearing JOBCTL_STOP_CONSUME.  Spotted by Oleg.
      
      -v3: do_signal_stop() updated to return %false without dropping
           siglock while ptraced and TRAP_STOP check moved inside for(;;)
           loop after group stop participation.  This avoids unnecessary
           relocking and also will help avoiding unnecessary traps by
           consuming group stop before handling pending traps.
      
      -v4: Jobctl trap handling moved into a separate function -
           do_jobctl_trap().
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      73ddff2b
  6. 04 Jun, 2011 11 commits
    • Tejun Heo's avatar
      signal: remove three noop tracehooks · dd1d6772
      Tejun Heo authored
      Remove the following three noop tracehooks in signals.c.
      
      * tracehook_force_sigpending()
      * tracehook_get_signal()
      * tracehook_finish_jctl()
      
      The code area is about to be updated and these hooks don't do anything
      other than obfuscating the logic.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      dd1d6772
    • Tejun Heo's avatar
      ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit · 62c124ff
      Tejun Heo authored
      ptracer->signal->wait_chldexit was used to wait for TRAPPING; however,
      ->wait_chldexit was already complicated with waker-side filtering
      without adding TRAPPING wait on top of it.  Also, it unnecessarily
      made TRAPPING clearing depend on the current ptrace relationship - if
      the ptracee is detached, wakeup is lost.
      
      There is no reason to use signal->wait_chldexit here.  We're just
      waiting for JOBCTL_TRAPPING bit to clear and given the relatively
      infrequent use of ptrace, bit_waitqueue can serve it perfectly.
      
      This patch makes JOBCTL_TRAPPING wait use bit_waitqueue instead of
      signal->wait_chldexit.
      
      -v2: Use JOBCTL_*_BIT macros instead of ilog2() as suggested by Linus.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      62c124ff
    • Tejun Heo's avatar
      job control: introduce task_set_jobctl_pending() · 7dd3db54
      Tejun Heo authored
      task->jobctl currently hosts JOBCTL_STOP_PENDING and will host TRAP
      pending bits too.  Setting pending conditions on a dying task may make
      the task unkillable.  Currently, each setting site is responsible for
      checking for the condition but with to-be-added job control traps this
      becomes too fragile.
      
      This patch adds task_set_jobctl_pending() which should be used when
      setting task->jobctl bits to schedule a stop or trap.  The function
      performs the followings to ease setting pending bits.
      
      * Sanity checks.
      
      * If fatal signal is pending or PF_EXITING is set, no bit is set.
      
      * STOP_SIGMASK is automatically cleared if new value is being set.
      
      do_signal_stop() and ptrace_attach() are updated to use
      task_set_jobctl_pending() instead of setting STOP_PENDING explicitly.
      The surrounding structures around setting are changed to fit
      task_set_jobctl_pending() better but there should be no userland
      visible behavior difference.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      7dd3db54
    • Tejun Heo's avatar
      job control: make task_clear_jobctl_pending() clear TRAPPING automatically · 6dfca329
      Tejun Heo authored
      JOBCTL_TRAPPING indicates that ptracer is waiting for tracee to
      (re)transit into TRACED.  task_clear_jobctl_pending() must be called
      when either tracee enters TRACED or the transition is cancelled for
      some reason.  The former is achieved by explicitly calling
      task_clear_jobctl_pending() in ptrace_stop() and the latter by calling
      it at the end of do_signal_stop().
      
      Calling task_clear_jobctl_trapping() at the end of do_signal_stop()
      limits the scope TRAPPING can be used and is fragile in that seemingly
      unrelated changes to tracee's control flow can lead to stuck TRAPPING.
      
      We already have task_clear_jobctl_pending() calls on those cancelling
      events to clear JOBCTL_STOP_PENDING.  Cancellations can be handled by
      making those call sites use JOBCTL_PENDING_MASK instead and updating
      task_clear_jobctl_pending() such that task_clear_jobctl_trapping() is
      called automatically if no stop/trap is pending.
      
      This patch makes the above changes and removes the fallback
      task_clear_jobctl_trapping() call from do_signal_stop().
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      6dfca329
    • Tejun Heo's avatar
      job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending() · 3759a0d9
      Tejun Heo authored
      This patch introduces JOBCTL_PENDING_MASK and replaces
      task_clear_jobctl_stop_pending() with task_clear_jobctl_pending()
      which takes an extra @mask argument.
      
      JOBCTL_PENDING_MASK is currently equal to JOBCTL_STOP_PENDING but
      future patches will add more bits.  recalc_sigpending_tsk() is updated
      to use JOBCTL_PENDING_MASK instead.
      
      task_clear_jobctl_pending() takes @mask which in subset of
      JOBCTL_PENDING_MASK and clears the relevant jobctl bits.  If
      JOBCTL_STOP_PENDING is set, other STOP bits are cleared together.  All
      task_clear_jobctl_stop_pending() users are updated to call
      task_clear_jobctl_pending() with JOBCTL_STOP_PENDING which is
      functionally identical to task_clear_jobctl_stop_pending().
      
      This patch doesn't cause any functional change.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      3759a0d9
    • Tejun Heo's avatar
      ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop() · 81be24b8
      Tejun Heo authored
      In ptrace_stop(), after arch hook is done, the task state and jobctl
      bits are updated while holding siglock.  The ordering requirement
      there is that TASK_TRACED is set before JOBCTL_TRAPPING is cleared to
      prevent ptracer waiting on TRAPPING doesn't end up waking up TRACED is
      actually set and sees TASK_RUNNING in wait(2).
      
      Move set_current_state(TASK_TRACED) to the top of the block and
      reorganize comments.  This makes the ordering more obvious
      (TASK_TRACED before other updates) and helps future updates to group
      stop participation.
      
      This patch doesn't cause any functional change.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      81be24b8
    • Tejun Heo's avatar
      ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments · 755e276b
      Tejun Heo authored
      PTRACE_INTERRUPT is going to be added which should also skip
      task_is_traced() check in ptrace_check_attach().  Rename @kill to
      @ignore_state and make it bool.  Add function comment while at it.
      
      This patch doesn't introduce any behavior difference.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      755e276b
    • Tejun Heo's avatar
      job control: rename signal->group_stop and flags to jobctl and update them · a8f072c1
      Tejun Heo authored
      signal->group_stop currently hosts mostly group stop related flags;
      however, it's gonna be used for wider purposes and the GROUP_STOP_
      flag prefix becomes confusing.  Rename signal->group_stop to
      signal->jobctl and rename all GROUP_STOP_* flags to JOBCTL_*.
      
      Bit position macros JOBCTL_*_BIT are defined and JOBCTL_* flags are
      defined in terms of them to allow using bitops later.
      
      While at it, reassign JOBCTL_TRAPPING to bit 22 to better accomodate
      future additions.
      
      This doesn't cause any functional change.
      
      -v2: JOBCTL_*_BIT macros added as suggested by Linus.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      a8f072c1
    • Tejun Heo's avatar
      ptrace: remove silly wait_trap variable from ptrace_attach() · 0b1007c3
      Tejun Heo authored
      Remove local variable wait_trap which determines whether to wait for
      !TRAPPING or not and simply wait for it if attach was successful.
      
      -v2: Oleg pointed out wait should happen iff attach was successful.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      0b1007c3
    • Linus Torvalds's avatar
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · 0e833d8c
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits)
        tg3: Fix tg3_skb_error_unmap()
        net: tracepoint of net_dev_xmit sees freed skb and causes panic
        drivers/net/can/flexcan.c: add missing clk_put
        net: dm9000: Get the chip in a known good state before enabling interrupts
        drivers/net/davinci_emac.c: add missing clk_put
        af-packet: Add flag to distinguish VID 0 from no-vlan.
        caif: Fix race when conditionally taking rtnl lock
        usbnet/cdc_ncm: add missing .reset_resume hook
        vlan: fix typo in vlan_dev_hard_start_xmit()
        net/ipv4: Check for mistakenly passed in non-IPv4 address
        iwl4965: correctly validate temperature value
        bluetooth l2cap: fix locking in l2cap_global_chan_by_psm
        ath9k: fix two more bugs in tx power
        cfg80211: don't drop p2p probe responses
        Revert "net: fix section mismatches"
        drivers/net/usb/catc.c: Fix potential deadlock in catc_ctrl_run()
        sctp: stop pending timers and purge queues when peer restart asoc
        drivers/net: ks8842 Fix crash on received packet when in PIO mode.
        ip_options_compile: properly handle unaligned pointer
        iwlagn: fix incorrect PCI subsystem id for 6150 devices
        ...
      0e833d8c
  7. 03 Jun, 2011 5 commits