  1. 05 Aug, 2003 1 commit
    • [PATCH] spurious SIGCHLD from dying thread group leader · 2ed879ff
      Roland McGrath authored
      A dying initial thread (thread group leader) sends SIGCHLD when it exits,
      but it ought to wait until all other threads exit as well.  The cases of
      secondary threads exiting first were handled properly, but not this one.
      
      This exit.c patch fixes that test case, and I think catches the other
      potential bugs of this kind as well.  The signal.c change adds some bug
      catchers, the second of which will trip on the test case in the absence
      of the exit.c fix.
      2ed879ff
  2. 27 Jul, 2003 1 commit
  3. 07 Jul, 2003 1 commit
    • [PATCH] tgkill patch for safe inter-thread signals · 62133cb1
      Ulrich Drepper authored
      This is the updated version of the patch Ingo sent some time ago to
      implement a new tgkill() syscall which specifies the target thread
      without any possibility of ambiguity or thread ID wrap races, by passing
      in both the thread group _and_ the thread ID as the arguments.
      
      This is really needed since many/most people still run with limited PID
      ranges (maybe due to legacy apps breaking) and the PID reuse can cause
      problems.
      62133cb1
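
From user space, the call described above can be reached through the raw syscall interface. The following is a minimal sketch, assuming a Linux system whose headers define SYS_tgkill and SYS_gettid; the helper names send_to_thread and current_tid are illustrative and not part of the patch.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Send `sig` to thread `tid`, but only if it belongs to thread group `tgid`.
 * Passing both IDs is what closes the thread-ID reuse race. */
static int send_to_thread(pid_t tgid, pid_t tid, int sig)
{
    return (int)syscall(SYS_tgkill, tgid, tid, sig);
}

/* Kernel thread ID of the calling thread (hypothetical helper name). */
static pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

Signal number 0 performs only the existence and permission checks, so send_to_thread(getpid(), current_tid(), 0) is a harmless way to probe the call. A thread ID paired with the wrong thread group ID is expected to be rejected rather than signalled.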
  4. 04 Jul, 2003 2 commits
    • [PATCH] wrong pid in siginfo_t · c7aa953c
      Ulrich Drepper authored
      If a signal is sent via kill() or tkill() the kernel fills in the wrong
      PID value in the siginfo_t structure (obviously only if the handler has
      SA_SIGINFO set).
      
      POSIX specifies that the si_pid field is filled with the process ID, and
      in Linux parlance that's the "thread group" ID, not the thread ID.
      c7aa953c
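
The si_pid behavior can be observed from user space with an SA_SIGINFO handler. This is a small sketch, assuming a single-threaded caller and that SIGUSR1 is not blocked; the names on_usr1 and sender_pid_of_self_signal are made up for illustration.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* PID recorded by the handler; -1 means "no signal seen yet". */
static volatile sig_atomic_t seen_pid = -1;

/* SA_SIGINFO handler: remember who the kernel says sent the signal. */
static void on_usr1(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    seen_pid = (sig_atomic_t)info->si_pid;
}

/* Send SIGUSR1 to ourselves with kill() and return the si_pid we were given. */
static pid_t sender_pid_of_self_signal(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr1;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    seen_pid = -1;
    if (kill(getpid(), SIGUSR1) != 0)
        return -1;
    return (pid_t)seen_pid;
}
```

With the fix in place the returned value should equal getpid(), which is the thread group ID in Linux terms.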
    • When forcing through a signal for some thread-synchronous · 9e008c3c
      Linus Torvalds authored
      event (ie SIGSEGV, SIGFPE etc that happens as a result of a
      trap as opposed to an external event), if the signal is
      blocked we will not invoke a signal handler, we will just
      kill the thread with the signal.
      
      This is equivalent to what we do in the SIG_IGN case: you
      cannot ignore or block synchronous signals, and if you try,
      we'll just have to kill you.
      
      We don't want to handle endless recursive faults, which the
      old behaviour easily led to if the stack was bad, for example.
      9e008c3c
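
The rule for blocked synchronous signals can be probed from user space. The sketch below assumes Linux, fork(), and an anonymous PROT_NONE mapping as the source of a genuine fault; the function name fault_with_sigsegv_blocked is illustrative only.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that blocks SIGSEGV and then takes a real memory fault.
 * Returns the signal that terminated the child, 0 if it exited normally,
 * or -1 if the experiment could not be set up. */
static int fault_with_sigsegv_blocked(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        sigset_t set;
        void *page;

        setrlimit(RLIMIT_CORE, &no_core);      /* avoid leaving a core file */
        sigemptyset(&set);
        sigaddset(&set, SIGSEGV);
        sigprocmask(SIG_BLOCK, &set, NULL);    /* try to block the trap */

        page = mmap(NULL, 4096, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED)
            _exit(2);
        ((volatile char *)page)[0] = 1;        /* synchronous fault */
        _exit(0);                              /* reached only if no fault */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

If the kernel behaves as the commit describes, the child cannot hide behind the blocked mask and the parent sees it terminated by SIGSEGV.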
  5. 02 Jun, 2003 1 commit
    • [PATCH] preallocate signal queue resource - Posix timers · d1791d31
      Jim Houston authored
      This adds a new interface to kernel/signal.c which allows signals to be
      sent using preallocated sigqueue structures.  It also modifies
      kernel/posix-timers.c to use this interface.
      
      The current timer code may fail to deliver a timer expiry signal if
      there are no sigqueue structures available at the time of the expiry.
      The Posix specification is clear that the signal queuing resource should
      be allocated at timer_create time.  This allows the error to be returned
      to the application rather than silently losing the signal.
      
      This patch does not change the sigqueue structure allocation policy.  I
      hope to revisit that in another patch.
      
      Here is the definition for the new interface:
      
      struct sigqueue *sigqueue_alloc(void)
      	Preallocate a sigqueue structure for use with the functions
      	described below.
      
      void sigqueue_free(struct sigqueue *q)
      	Free a preallocated sigqueue structure.  If the sigqueue
      	structure being freed is still queued, it will be removed
      	from the queue.  I currently leave the signal pending.
      	It may be delivered without the siginfo structure.
      
      int send_sigqueue(int sig, struct sigqueue *q, struct task_struct *p)
      	This function is equivalent to send_sig_info().  It queues
      	a signal to the specified thread using  the supplied sigqueue
      	structure.  The caller is expected to fill in the siginfo_t
      	which is part of the sigqueue structure.
      
      int send_group_sigqueue(int sig, struct sigqueue *q, struct task_struct *p)
      	This function is equivalent to send_group_sig_info().  It queues
      	the signal to a process allowing the system to select which thread
      	will receive the signal in a multi-threaded process.
      	Again, the sigqueue structure is used to queue the signal.
      
      Both send_sigqueue() and send_group_sigqueue() return 0 if the signal
      is queued. They return 1 if the signal was not queued because the
      process is ignoring the signal.
      
      Both versions include code to increment the si_overrun count if the
      sigqueue entry is for a Posix timer and they are called while the
      sigqueue entry is still queued.  Yes, I know that the current code
      doesn't rearm the timer until the signal is delivered.  Having this
      extra bit of code doesn't do any harm, and I plan to use it.
      
      These routines do not check if there already is a legacy (non-realtime)
      signal pending.  They always queue the signal.  This requires that
      collect_signal() always checks if there is another matching siginfo
      before clearing the signal bit.
      d1791d31
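
The return values and overrun accounting described above can be modelled in a few lines. This is a user-space toy, not the kernel code: the toy_ types and functions are hypothetical, and returning 0 when an already-queued timer entry is sent again is an assumption, since the message does not state the return value for that case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a preallocated queue entry (not struct sigqueue). */
struct toy_sigqueue {
    bool queued;     /* currently sitting on a pending list */
    bool is_timer;   /* entry belongs to a POSIX timer */
    int  overrun;    /* models si_overrun */
};

/* Toy stand-in for the receiving task. */
struct toy_task {
    bool ignores_signal;
    struct toy_sigqueue *pending;   /* at most one entry in this model */
};

/* Returns 0 if the signal is (or already was) queued, 1 if it was dropped
 * because the receiver ignores it. */
static int toy_send_sigqueue(struct toy_sigqueue *q, struct toy_task *t)
{
    if (t->ignores_signal)
        return 1;
    if (q->queued) {
        if (q->is_timer)
            q->overrun++;   /* expiry while still queued: count it */
        return 0;           /* assumption: reported as queued */
    }
    q->queued = true;
    t->pending = q;
    return 0;
}

/* Dequeue the entry if it is still queued, as sigqueue_free() is said to do. */
static void toy_sigqueue_free(struct toy_sigqueue *q, struct toy_task *t)
{
    if (q->queued) {
        if (t->pending == q)
            t->pending = NULL;
        q->queued = false;
    }
}
```

Because the entry is allocated up front, sending never has to allocate, so the only "failure" left at expiry time is the ignored-signal case.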
  6. 25 May, 2003 2 commits
    • [PATCH] add notify_count for de_thread · 73accc3d
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      de_thread is called by exec to kill all threads in the thread group except
      the threads required for exec.
      
      The waiting is implemented by waiting for a wakeup from __exit_signal: If
      the reference count is less than or equal to 2, then the waiter is woken up.  If
      exec is called by a non-leader thread, then two threads are required for
      exec.
      
      But if a thread group leader calls exec, then only one thread is required
      for exec.  Thus the hardcoded "2" leads to a superfluous wakeup.  The patch
      fixes that by adding a "notify_count" field to the signal structure.
      73accc3d
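
The effect of replacing the hardcoded "2" can be illustrated with a toy model. The struct and function names below are hypothetical, and waking exactly when the count reaches notify_count is an assumption about how the field is used.

```c
#include <stdbool.h>

/* Toy model of the shared signal structure's thread accounting. */
struct toy_signal {
    int count;          /* threads still referencing the structure */
    int notify_count;   /* count at which the exec waiter wants a wakeup */
};

/* exec side: a leader needs only itself; a non-leader also needs the leader. */
static void toy_de_thread_begin(struct toy_signal *s, bool caller_is_leader)
{
    s->notify_count = caller_is_leader ? 1 : 2;
}

/* exit side: drop one reference, report whether the waiter should be woken. */
static bool toy_exit_signal(struct toy_signal *s)
{
    s->count--;
    return s->notify_count != 0 && s->count == s->notify_count;
}
```

With three threads and the leader calling exec, the first exit leaves a count of 2 and does not wake the waiter; only the second exit does. With a fixed threshold of 2 the first exit would already have produced a wakeup.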
    • [PATCH] HAVE_ARCH_GET_SIGNAL_TO_DELIVER warning · 8c3b1bca
      Geert Uytterhoeven authored
      Kill warning about unused static functions if HAVE_ARCH_GET_SIGNAL_TO_DELIVER
      is defined.
      8c3b1bca
  7. 19 May, 2003 1 commit
    • [PATCH] signal latency fixes · 79e4dd94
      Ingo Molnar authored
      This fixes an SMP window where the kernel could miss a pending signal,
      increasing signal delivery latency by up to 200 msecs.  Sun has reported
      to Ulrich that their JVM sees occasional unexpected signal delays under
      Linux.  The more CPUs, the more delays.
      
      The cause of the problem is that the current signal wakeup
      implementation is racy in kernel/signal.c:signal_wake_up():
      
              if (t->state == TASK_RUNNING)
                      kick_if_running(t);
      	...
              if (t->state & mask) {
                      wake_up_process(t);
                      return;
              }
      
      If thread (or process) 't' is woken up on another CPU right after the
      TASK_RUNNING check, and the thread starts to run, then the
      wake_up_process() here will do nothing, and the signal stays pending
      until the thread next calls into the kernel - which can be up to
      200 msecs later.
      
      The solution is to do the 'kicking' of a running thread on a remote CPU
      atomically with the wakeup.  For this i've added wake_up_process_kick().
      There is no slowdown for the other wakeup codepaths, the new flag to
      try_to_wake_up() is compiled off for them.  Some other subsystems might
      want to use this wakeup facility as well in the future (eg.  AIO).
      
      In fact this race triggers quite often under Volanomark runs; with this
      change added, Volanomark performance is up from 500-800 to 2000-3000, on
      a 4-way x86 box.
      79e4dd94
  8. 12 May, 2003 1 commit
  9. 09 May, 2003 1 commit
    • [PATCH] Fix potential runqueue deadlock · b36c92e7
      Petr Vandrovec authored
      send_sig_info() has been broken since 2.5.60.
      
      The function can be invoked from the timer interrupt (timer_interrupt ->
      do_timer -> update_process_times -> update_one_process -> (
      do_process_times, do_it_prof, do_it_virt ) -> send_sig ->
      send_sig_info) but it uses spin_unlock_irq instead of the correct
      spin_unlock_irqrestore.
      
      This enables interrupts, and later scheduler_tick() locks runqueue
      (without disabling interrupts).  And if we are unlucky, a new interrupt
      comes at this point.  And if this interrupt tries to do wake_up() (like
      RTC interrupt does), we will deadlock on runqueue lock :-(
      
      The bug was introduced by signal-fixes-2.5.59-A4, which split the
      original send_sig_info into two functions, and in one branch it started
      using these unsafe spinlock variants (while the "group" variant uses
      irqsave/restore correctly). 
      b36c92e7
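
The difference between the two unlock variants can be shown with a user-space toy that tracks only an "interrupts enabled" flag. Nothing here is kernel code; the toy_ names are invented and no real locking takes place.

```c
#include <stdbool.h>

/* Models the CPU's interrupt-enable state. */
static bool toy_irqs_enabled = true;

/* _irq variants: disable on lock, unconditionally enable on unlock. */
static void toy_lock_irq(void)   { toy_irqs_enabled = false; }
static void toy_unlock_irq(void) { toy_irqs_enabled = true; }

/* _irqsave/_irqrestore variants: remember and restore the prior state. */
static bool toy_lock_irqsave(void)
{
    bool saved = toy_irqs_enabled;
    toy_irqs_enabled = false;
    return saved;
}
static void toy_unlock_irqrestore(bool saved) { toy_irqs_enabled = saved; }

/* Caller is in interrupt context, so interrupts are already off on entry. */
static bool irqs_on_after_unsafe_send(void)
{
    toy_irqs_enabled = false;
    toy_lock_irq();
    toy_unlock_irq();            /* bug: turns interrupts back on */
    return toy_irqs_enabled;
}

static bool irqs_on_after_safe_send(void)
{
    bool flags;

    toy_irqs_enabled = false;
    flags = toy_lock_irqsave();
    toy_unlock_irqrestore(flags);   /* interrupts stay off, as on entry */
    return toy_irqs_enabled;
}
```

The unsafe path returns with interrupts enabled even though the caller entered with them disabled, which is the window that lets a second interrupt contend for the runqueue lock.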
  10. 08 Apr, 2003 1 commit
  11. 04 Apr, 2003 1 commit
    • [PATCH] linux-2.5.66-signal-cleanup.patch · da334d91
      Roland McGrath authored
      Here is the cleanup patch I promised back in February.  Sorry it took a
      while.
      
      The effects should be purely cosmetic in 2.5.66.  However, the new
      interface for the proper way to send thread-specific or process-global
      signals from inside the kernel is needed for correct implementation of
      some fixes to timer stuff that Ulrich told me about.
      
      This cleans up some obsolete comments and macros in kernel/signal.c,
      restores send_sig_info to its original behavior, and adds a global entry
      point send_group_sig_info.  I checked all the uses of send_sig and
      send_sig_info and changed a few to send_group_sig_info.
      
      I think it would be cleanest if the whole mess of *_sig* entry points were
      reduced to two or three, but I did the change that minimized the number of
      callers I had to fix up.
      
      There should be no discernible difference, since the 2.5.66 send_sig_info
      function did group semantics for those signals by number already.  The only
      exception to that is pdeath_signal, which I guess can be any signal number
      but I deemed ought to be process-wide.
      
      I did not change any of the calls using SIGKILL, though that does have
      process-wide semantics.  There is no need to change it since SIGKILL always
      kills the whole group, though the code path for send_sig(SIGKILL,...) calls
      in multithreaded processes will be different now.
      da334d91
  12. 22 Mar, 2003 1 commit
  13. 17 Mar, 2003 1 commit
    • [PATCH] signal fix for wedge on multithreaded core dump · 874f2e47
      Roland McGrath authored
      This is a fix made almost a month ago, during the flurry of signal changes.
      I didn't realize until today that this hadn't made it into 2.5.  Sorry
      about the delay.
      
      This fix is necessary to avoid sometimes wedging in uninterruptible sleep
      when doing a multithreaded core dump triggered by a process signal (kill)
      rather than a trap.  You can reproduce the problem by running your favorite
      multithreaded program (NPTL) and then using "kill -SEGV" on it.  It will
      often wedge.  The actual fix could be just a two line diff:
      
      +                       if (current->signal->group_exit)
      +                               goto dequeue;
      
      after the group_exit_task check.  That is the fix that has been used in
      Ingo's backport for weeks and tested heavily (well, as heavily as core
      dumping ever gets tested, but it's been in our production systems).
      
      But I broke the hair out into a separate function.  The patch below has the
      same effect as the two-liner, and no other difference.  I have tested
      2.5.64 with this patch and it works for me, though I haven't beat on it.
      
      The way the wedge happens is that for a core-dump signal group_send_sig_info
      does a group stop of other threads before the one thread handles the fatal
      signal.  If the fatal thread gets into do_coredump and coredump_wait first,
      then other threads see the group stop and suspend with SIGKILL pending.
      All other fatal cases clear group_stop_count, so this is the only way this
      ever happens.  Checking group_exit fixes it.  I didn't make do_coredump
      clear group_stop_count because doing it with the appropriate ordering and
      locking doesn't fit the organization of that code.
      874f2e47
  14. 24 Feb, 2003 1 commit
  15. 18 Feb, 2003 2 commits
    • Andrew Morton's avatar
      970f319a
    • [PATCH] POSIX clocks & timers · db8b50ba
      George Anzinger authored
      This is version 23 or so of the POSIX timer code.
      
      Internal changelog:
      
       - Changed the signals code to match the new order of things.  Also the
         new xtime_lock code needed to be picked up.  It made some things a lot
         simpler.
      
       - Fixed a spin lock hand off problem in locking timers (thanks
         to Randy).
      
       - Fixed nanosleep to test for out of bound nanoseconds
         (thanks to Julie).
      
       - Fixed a couple of id deallocation bugs that left old ids
         lying around (hey I get this one).
      
       - This version has a new timer id manager.  Andrew Morton
         suggested elimination of recursion (done) and I added code
         to allow it to release unused nodes.  The prior version only
         released the leaf nodes.  (The id manager uses radix tree
         type nodes.)  Also added is a reuse count so ids will not
         repeat for at least 256 alloc/free cycles.
      
       - The changes for the new sys_call restart now allow one
         restart function to handle both nanosleep and clock_nanosleep.
         Saves a bit of code, nice.
      
       - All the requested changes and Lindent too :).
      
       - I also broke clock_nanosleep() apart much the same way
         nanosleep() was with the 2.5.50-bk5 changes.
      
      TIMER STORMS
      
      The POSIX clocks and timers code prevents "timer storms" by
      not putting repeating timers back in the timer list until
      the signal is delivered for the prior expiry.  Timer events
      missed by this delay are accounted for in the timer overrun
      count.  The net result is MUCH lower system overhead while
      presenting the same info to the user as would be the case if
      an interrupt and timer processing were required for each
      increment in the overrun count.
      db8b50ba
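
The overrun accounting in the TIMER STORMS note can be worked through as arithmetic. The sketch below is a toy model under stated assumptions: times are plain integers in one unit, the interval is positive, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Toy periodic timer: not the kernel's timer structure. */
struct toy_timer {
    int64_t next_expiry;   /* the expiry whose signal is currently queued */
    int64_t interval;      /* period, assumed > 0 */
    int     overrun;       /* expiries missed while the signal was pending */
};

/* Called at time `now`, when the signal for the prior expiry is delivered.
 * Counts expiries that fell due in the meantime and re-arms the timer once,
 * instead of taking one interrupt per missed expiry. */
static void toy_rearm_on_delivery(struct toy_timer *t, int64_t now)
{
    t->overrun = 0;
    t->next_expiry += t->interval;          /* first expiry after the signalled one */
    if (now >= t->next_expiry) {
        int64_t missed = (now - t->next_expiry) / t->interval + 1;
        t->overrun = (int)missed;
        t->next_expiry += missed * t->interval;
    }
}
```

For a timer that expired at 100 with interval 10 and whose signal is delivered at 135, the expiries at 110, 120 and 130 are counted as three overruns and the timer is re-armed for 140.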
  16. 17 Feb, 2003 1 commit
  17. 15 Feb, 2003 1 commit
  18. 13 Feb, 2003 1 commit
  19. 12 Feb, 2003 1 commit
  20. 11 Feb, 2003 2 commits
  21. 10 Feb, 2003 1 commit
    • [SIGNAL]: Allow more platforms to use generic get_signal_to_deliver. · b6f7756d
      David S. Miller authored
      The few platforms that cannot use the generic
      get_signal_to_deliver implementation cannot do
      so because they do special things for ptraced
      children.  This can be easily avoided and thus
      all of the signal handling code duplication can
      be eliminated.
      
      This is the first part, which adds a platform hook
      right before the parent of the ptraced child is woken.
      Data can be passed in via a cookie argument.
      
      The next part will be dealing with platforms
      that need to muck with breakpoints in the child
      in this same code block.
      b6f7756d
  22. 09 Feb, 2003 5 commits
  23. 08 Feb, 2003 1 commit
  24. 07 Feb, 2003 4 commits
    • [PATCH] TASK_STOPPED wakeup cleanup · 03e21831
      Roland McGrath authored
      For handle_stop_signal to do the special case for SIGKILL and have it
      work right in all SMP cases (without changing all the existing ptrace
      stops), it needs to at least set TIF_SIGPENDING on each thread before
      resuming it.
      
      handle_stop_signal addresses a related race for SIGCONT by setting
      TIF_SIGPENDING already, so having SIGKILL handled the same way makes
      sense.
      
      Now it seems pretty clean to have handle_stop_signal resume threads for
      SIGKILL, and have no SIGKILL special case in group_send_sig_info.
      
      There is also an SMP race issue with cases like do_syscall_trace, i.e.
      TASK_STOPPED state set without holding the siglock.  So I think
      handle_stop_signal should call wake_up_process unconditionally.
      03e21831
    • Split up "struct signal_struct" into "signal" and "sighand" parts. · 8eae2998
      Linus Torvalds authored
      This is required to make the old LinuxThread semantics work
      together with the fixed-for-POSIX full signal sharing. A traditional
      CLONE_SIGHAND thread (LinuxThread) will not see any other shared
      signal state, while a new-style CLONE_THREAD thread will share all
      of it.
      
      This way the two methods don't confuse each other.
      8eae2998
    • Don't special-case SIGKILL/SIGSTOP - the blocking masks should · fef31b03
      Linus Torvalds authored
      already take care of it.
      
      This fixes kernel threads that _do_ block SIGKILL/STOP.
      fef31b03
    • [PATCH] do_sigaction locking cleanup · 530a7dbc
      Roland McGrath authored
      This changes do_sigaction to avoid read_lock(&tasklist_lock) on every
      call.  Only in the fairly uncommon cases where it's really needed will
      it take that lock (which requires unlocking and relocking the siglock
      for locking order).
      
      I also changed the ERESTARTSYS added in my earlier patch to ERESTARTNOINTR.
      That is an "instantaneous" case, and there is no reason to have it possibly
      return EINTR if !SA_RESTART (which AFAIK sigaction never could before, and
      it might not be kosher by POSIX); rollback is always better.
      530a7dbc
  25. 06 Feb, 2003 1 commit
    • [PATCH] signal-fixes-2.5.59-A4 · ebf5ebe3
      Ingo Molnar authored
      this is the current threading patchset, which accumulated over the
      past two weeks. It consists of a big set of changes from Roland, to
      make threaded signals work. There were still tons of testcases and
      boundary conditions (mostly in the signal/exit/ptrace area) that we did
      not handle correctly.
      
      Roland's thread-signal semantics/behavior/ptrace fixes:
      
       - fix signal delivery race with do_exit() => signals are re-queued to the
         'process' if do_exit() finds pending unhandled ones. This prevents
         signals getting lost upon thread-sys_exit().
      
       - a non-main thread has died on one processor and gone to TASK_ZOMBIE,
         but before it's gotten to release_task a sys_wait4 on the other
         processor reaps it.  It's only because it's ptraced that this gets
         through eligible_child.  Somewhere in there the main thread is also
         dying so it reparents the child thread to hit that case.  This means
         that there is a race where P might be totally invalid.
      
       - forget_original_parent is not doing the right thing when the group
         leader dies, i.e. reparenting threads to init when there is a zombie
         group leader.  Perhaps it doesn't matter for any practical purpose
         without ptrace, though it makes for ppid=1 for each thread in core
         dumps, which looks funny. Incidentally, SIGCHLD here really should be
         p->exit_signal.
      
       - one of the gdb tests makes a questionable assumption about what kill
         will do when it has some threads stopped by ptrace and others running.
      
      exit races:
      
      1. Processor A is in sys_wait4 case TASK_STOPPED considering task P.
         Processor B is about to resume P and then switch to it.
      
         While A is inside that case block, B starts running P and it clears
         P->exit_code, or takes a pending fatal signal and sets it to a new
         value. Depending on the interleaving, the possible failure modes are:
              a. A gets to its put_user after B has cleared P->exit_code
                 => returns with WIFSTOPPED, WSTOPSIG==0
              b. A gets to its put_user after B has set P->exit_code anew
                 => returns with e.g. WIFSTOPPED, WSTOPSIG==SIGKILL
      
         A can spend an arbitrarily long time in that case block, because
         there's getrusage and put_user that can take page faults, and
         write_lock'ing of the tasklist_lock that can block.  But even if it's
         short the race is there in principle.
      
      2. This is new with NPTL, i.e. CLONE_THREAD.
         Two processors A and B are both in sys_wait4 case TASK_STOPPED
         considering task P.
      
         Both get through their tests and fetches of P->exit_code before either
         gets to P->exit_code = 0.  => two threads return the same pid from
         waitpid.
      
         In other interleavings where one processor gets to its put_user after
         the other has cleared P->exit_code, it's like case 1(a).
      
      
      3. SMP races with stop/cont signals
      
         First, take:
      
              kill(pid, SIGSTOP);
              kill(pid, SIGCONT);
      
         or:
      
              kill(pid, SIGSTOP);
              kill(pid, SIGKILL);
      
         It's possible for this to leave the process stopped with a pending
         SIGCONT/SIGKILL.  That's a state that should never be possible.
         Moreover, kill(pid, SIGKILL) without any repetition should always be
         enough to kill a process.  (Likewise SIGCONT when you know it's
         sequenced after the last stop signal, must be sufficient to resume a
         process.)
      
      4. take:
      
              kill(pid, SIGKILL);     // or any fatal signal
              kill(pid, SIGCONT);     // or SIGKILL
      
         It's possible for this to cause pid to be reaped with status 0
         instead of its true termination status.  The equivalent scenario
         happens when the process being killed is in an _exit call or a
         trap-induced fatal signal before the kills.
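
The invariant in item 3, that a single SIGKILL must be enough even for a stopped process, can be checked from user space. This is a sketch assuming Linux, fork() and waitpid() with WUNTRACED; the function name stop_then_kill is illustrative.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stop a child, confirm it is stopped, then send one SIGKILL.
 * Returns the signal that terminated the child, 0 if it exited normally,
 * or -1 if any step failed. */
static int stop_then_kill(void)
{
    int status;
    int result = -1;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();            /* child just waits for signals */
    }

    if (kill(pid, SIGSTOP) == 0 &&
        waitpid(pid, &status, WUNTRACED) == pid &&
        WIFSTOPPED(status)) {
        /* One SIGKILL, no SIGCONT and no repetition. */
        if (kill(pid, SIGKILL) == 0 && waitpid(pid, &status, 0) == pid)
            result = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        return result;
    }

    /* Setup failed: make sure the child does not linger. */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return -1;
}
```

A result of SIGKILL means the stopped child was both resumed and killed by the one signal, and that the parent saw its true termination status rather than 0.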
      
      plus i've done stability fixes for bugs that popped up during
      beta-testing, and minor tidying of Roland's changes:
      
       - a rare tasklist corruption during exec, causing some very spurious and
         colorful crashes.
      
       - a copy_process()-related dereference of already freed thread structure
         if hit with a SIGKILL at the wrong moment.
      
       - SMP spinlock deadlocks in the signal code
      
      this patchset has been tested quite well in the 2.4 backport of the
      threading changes - and i've done some stresstesting on 2.5.59 SMP as
      well, and did an x86 UP testcompile + testboot as well.
      ebf5ebe3
  26. 18 Jan, 2003 1 commit
    • Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO · 1669ce53
      Daniel Jacobowitz authored
      These new ptrace commands allow a debugger to control signals more precisely;
      for instance, store a signal and deliver it later, as if it had come from the
      original outside process or in response to the same faulting memory access.
      1669ce53
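
A debugger-style use of PTRACE_GETSIGINFO might look like the sketch below. It assumes Linux with glibc's <sys/ptrace.h> and an environment that permits ptrace; the function name traced_signal_number is made up for illustration.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Trace a child that raises SIGUSR1 and read back the intercepted siginfo.
 * Returns the si_signo the tracer sees, or -1 on any failure. */
static int traced_signal_number(void)
{
    int status;
    int result = -1;
    siginfo_t si;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(1);
        raise(SIGUSR1);     /* child stops here; the tracer is notified */
        _exit(0);
    }

    if (waitpid(pid, &status, 0) == pid && WIFSTOPPED(status) &&
        ptrace(PTRACE_GETSIGINFO, pid, NULL, &si) == 0)
        result = si.si_signo;

    /* The example does not resume the child; just clean it up. */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return result;
}
```

A real debugger would keep the siginfo_t and later hand it back with PTRACE_SETSIGINFO before continuing the child, which is how a stored signal can be delivered as if it came from the original sender.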
  27. 15 Dec, 2002 2 commits
    • [PATCH] ptrace-sigfix-2.5.51-A1 · 08222038
      Ingo Molnar authored
      This fixes a threading/ptrace bug noticed by the gdb people: when a
      thread is ptraced but other threads in the thread group are not then a
      SIGTRAP (via int3 or any of the other debug traps) causes the child
      thread(s) to die unexpectedly.  This is because the default behavior for
      a no-handler SIGTRAP is to broadcast it.
      
      The solution is to make all such signals specific, then the ptracer (gdb)
      can filter the signal and upon continuation it's being handled properly
      (or put on the shared signal queue). SIGKILL and SIGSTOP are an exception.
      The patch only affects threaded and ptrace-d processes.
      08222038
    • [PATCH] threaded coredumps, tcore-fixes-2.5.51-A0 · b9daa006
      Ingo Molnar authored
      This fixes one more threaded-coredumps detail reported by the glibc
      people: all threads taken down by the coredump code should report the
      proper exit code.  We can do this rather easily via the group_exit
      mechanism.  'Other' threads used to report SIGKILL, which was highly
      confusing as the shell often displayed the 'Killed' message instead of a
      'Segmentation fault' message.
      
      Another missing bit was the 0x80 bit set in the exit status for all
      threads, if the coredump was successful.  (it's safe to set this bit in
      ->sig->group_exit_code in an unlocked way because all threads are
      artificially descheduled by the coredump code.)
      b9daa006
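
The exit-status bits discussed above are what the wait macros decode in user space. The following sketch assumes Linux with glibc, where WCOREDUMP is available and tests the 0x80 bit; struct exit_report and decode_status are illustrative names.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/* Decoded view of a wait status (illustrative type). */
struct exit_report {
    bool signaled;      /* terminated by a signal */
    int  sig;           /* which signal, if signaled */
    bool core_dumped;   /* 0x80 bit: a core file was written */
};

/* Decode a status value as returned by wait()/waitpid(). */
static struct exit_report decode_status(int status)
{
    struct exit_report r = { false, 0, false };

    if (WIFSIGNALED(status)) {
        r.signaled = true;
        r.sig = WTERMSIG(status);
        r.core_dumped = WCOREDUMP(status) != 0;
    }
    return r;
}
```

A shell uses the same information to choose between messages such as "Killed" and "Segmentation fault (core dumped)", which is why every thread needs to report the group's real exit code.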
  28. 06 Dec, 2002 1 commit