- 05 Aug, 2003 1 commit
-
-
Roland McGrath authored
A dying initial thread (thread group leader) sends SIGCHLD when it exits, but it ought to wait until all other threads exit as well. The cases of secondary threads exiting first were handled properly, but not this one. This exit.c patch fixes that test case, and I think catches the other potential bugs of this kind as well. The signal.c change adds some bug catchers, the second of which will trip on the test case in the absence of the exit.c fix.
-
- 27 Jul, 2003 1 commit
-
-
Linus Torvalds authored
also detach the threads. Otherwise we'll leave them around as zombies, waiting for them to be picked up by their parent. Which might be the execve() thread itself, causing a deadlock.
-
- 07 Jul, 2003 1 commit
-
-
Ulrich Drepper authored
This is the updated version of the patch Ingo sent some time ago to implement a new tgkill() syscall which specifies the target thread without any possibility of ambiguity or thread ID wrap races, by passing in both the thread group _and_ the thread ID as the arguments. This is really needed since many/most people still run with limited PID ranges (maybe due to legacy apps breaking) and the PID reuse can cause problems.
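As a rough user-space illustration of the interface described above, the sketch below calls tgkill through syscall(2) on Linux. The helper names (send_to_thread, probe_self, probe_self_wrong_group) are made up for this example, and it assumes SYS_tgkill and SYS_gettid are available from <sys/syscall.h>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send `sig` to thread `tid`, but only if that thread
 * belongs to thread group `tgid`. Uses syscall(2) directly so that no libc
 * tgkill() wrapper is assumed. */
static int send_to_thread(pid_t tgid, pid_t tid, int sig)
{
    assert(tgid > 0 && tid > 0);
    return (int)syscall(SYS_tgkill, tgid, tid, sig);
}

/* Probe the calling thread with signal 0: existence and permissions are
 * checked, but no signal is delivered. Returns 0 on success. */
static int probe_self(void)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);
    return send_to_thread(getpid(), tid, 0);
}

/* Same probe with a thread group ID that cannot match the caller's.
 * The kernel is expected to refuse it (return -1 with errno set). */
static int probe_self_wrong_group(void)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);
    return send_to_thread(getpid() + 1, tid, 0);
}
```

The second probe shows the point of passing both IDs: a thread ID that has been recycled into a different thread group no longer matches, so the signal is refused instead of hitting the wrong target.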
-
- 04 Jul, 2003 2 commits
-
-
Ulrich Drepper authored
If a signal is sent via kill() or tkill() the kernel fills in the wrong PID value in the siginfo_t structure (obviously only if the handler has SA_SIGINFO set). POSIX specifies that the si_pid field is filled with the process ID, and in Linux parlance that's the "thread group" ID, not the thread ID.
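The behaviour described above can be observed from user space with a small check, sketched here under the assumption of a single-threaded caller that has SIGUSR1 unblocked. The names on_usr1 and sender_pid_seen_by_handler are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Written by the handler, read by the caller afterwards. */
static volatile sig_atomic_t seen_pid = -1;

/* SA_SIGINFO-style handler: record who the kernel says sent the signal. */
static void on_usr1(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    seen_pid = (sig_atomic_t)info->si_pid;
}

/* Send SIGUSR1 to our own process and return the si_pid the handler saw,
 * or -1 if setup failed. Assumes a single-threaded process, so the signal
 * is delivered before kill() returns. */
static pid_t sender_pid_seen_by_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr1;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    assert(seen_pid == -1 || seen_pid > 0);

    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (kill(getpid(), SIGUSR1) != 0)
        return -1;
    return (pid_t)seen_pid;
}
```

With the fix in place, the value reported is the process (thread group) ID, which is what getpid() returns.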
-
Linus Torvalds authored
event (ie SIGSEGV, SIGFPE etc that happens as a result of a trap as opposed to an external event), if the signal is blocked we will not invoke a signal handler, we will just kill the thread with the signal. This is equivalent to what we do in the SIG_IGN case: you cannot ignore or block synchronous signals, and if you try, we'll just have to kill you. We don't want to handle endless recursive faults, which the old behaviour easily led to if the stack was bad, for example.
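A hedged way to see this from user space, assuming a Linux kernel with the behaviour described: fork a child, block SIGSEGV in it, then touch a page mapped with no access rights. The function name fault_with_sigsegv_blocked is hypothetical, and the PROT_NONE mapping is just one convenient way to provoke a synchronous fault.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that blocks SIGSEGV and then faults. Returns the signal that
 * terminated the child, 0 if it exited normally, or -1 on setup failure. */
static int fault_with_sigsegv_blocked(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, SIGSEGV);
        sigprocmask(SIG_BLOCK, &set, NULL);

        /* A page with no access rights: any store to it traps. */
        volatile char *page = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE),
                                   PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS,
                                   -1, 0);
        if (page == MAP_FAILED)
            _exit(2);
        page[0] = 1;  /* synchronous SIGSEGV while the signal is blocked */
        _exit(0);     /* reached only if the fault were somehow ignored */
    }

    int status = 0;
    if (waitpid(child, &status, 0) != child)
        return -1;
    /* Without WUNTRACED the child has either exited or been killed. */
    assert(WIFEXITED(status) || WIFSIGNALED(status));
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

If blocking were honoured for the trap, the child would loop on the faulting store forever; instead the parent should see it terminated by SIGSEGV.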
-
- 02 Jun, 2003 1 commit
-
-
Jim Houston authored
This adds a new interface to kernel/signal.c which allows signals to be sent using preallocated sigqueue structures. It also modifies kernel/posix-timers.c to use this interface. The current timer code may fail to deliver a timer expiry signal if there are no sigqueue structures available at the time of the expiry. The Posix specification is clear that the signal queuing resource should be allocated at timer_create time. This allows the error to be returned to the application rather than silently losing the signal. This patch does not change the sigqueue structure allocation policy. I hope to revisit that in another patch.

Here is the definition for the new interface:

struct sigqueue *sigqueue_alloc(void)
    Preallocate a sigqueue structure for use with the functions described below.

void sigqueue_free(struct sigqueue *q)
    Free a preallocated sigqueue structure. If the sigqueue structure being freed is still queued, it will be removed from the queue. I currently leave the signal pending. It may be delivered without the siginfo structure.

int send_sigqueue(int sig, struct sigqueue *q, struct task_struct *p)
    This function is equivalent to send_sig_info(). It queues a signal to the specified thread using the supplied sigqueue structure. The caller is expected to fill in the siginfo_t which is part of the sigqueue structure.

int send_group_sigqueue(int sig, struct sigqueue *q, struct task_struct *p)
    This function is equivalent to send_group_sig_info(). It queues the signal to a process allowing the system to select which thread will receive the signal in a multi-threaded process. Again, the sigqueue structure is used to queue the signal.

Both send_sigqueue() and send_group_sigqueue() return 0 if the signal is queued. They return 1 if the signal was not queued because the process is ignoring the signal. Both versions include code to increment the si_overrun count if the sigqueue entry is for a Posix timer and they are called while the sigqueue entry is still queued. Yes, I know that the current code doesn't rearm the timer until the signal is delivered. Having this extra bit of code doesn't do any harm, and I plan to use it.

These routines do not check if there already is a legacy (non-realtime) signal pending. They always queue the signal. This requires that collect_signal() always checks if there is another matching siginfo before clearing the signal bit.
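The return-value and overrun rules described above can be modelled in a few lines of user-space C. This is only a toy sketch, not the kernel code: every toy_* name and both structures are invented here, and the real functions take a signal number and a task_struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a preallocated queue entry. */
struct toy_sigqueue {
    bool queued;      /* currently sitting on a pending queue */
    int  si_overrun;  /* sends that arrived while still queued */
};

/* Toy stand-in for the receiving task. */
struct toy_task {
    bool ignores_signal;
};

/* Allocate up front, so a failure can be reported at creation time. */
static struct toy_sigqueue *toy_sigqueue_alloc(void)
{
    return calloc(1, sizeof(struct toy_sigqueue));
}

/* Returns 1 if the task ignores the signal, 0 if it is (or already was) queued. */
static int toy_send_sigqueue(struct toy_sigqueue *q, const struct toy_task *t)
{
    assert(q != NULL && t != NULL);
    if (t->ignores_signal)
        return 1;
    if (q->queued)
        q->si_overrun++;   /* still pending: count it instead of re-queuing */
    else
        q->queued = true;
    return 0;
}

/* Delivery: unlink the entry and hand back the accumulated overrun count. */
static int toy_collect(struct toy_sigqueue *q)
{
    int overrun = q->si_overrun;
    q->queued = false;
    q->si_overrun = 0;
    return overrun;
}

/* Three sends before one delivery: the last two are counted as overruns. */
static int toy_demo(void)
{
    struct toy_task task = { .ignores_signal = false };
    struct toy_sigqueue *q = toy_sigqueue_alloc();
    if (q == NULL)
        return -1;
    toy_send_sigqueue(q, &task);
    toy_send_sigqueue(q, &task);
    toy_send_sigqueue(q, &task);
    int overrun = toy_collect(q);
    free(q);
    return overrun;
}

/* A task that ignores the signal gets the "not queued" return value. */
static int toy_demo_ignored(void)
{
    struct toy_task task = { .ignores_signal = true };
    struct toy_sigqueue entry = { false, 0 };
    return toy_send_sigqueue(&entry, &task);
}
```

Because the entry is allocated before any send, the send path itself cannot fail for lack of memory, which is the property the patch is after.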
-
- 25 May, 2003 2 commits
-
-
Andrew Morton authored
From: Manfred Spraul <manfred@colorfullife.com> de_thread is called by exec to kill all threads in the thread group except the threads required for exec. The waiting is implemented by waiting for a wakeup from __exit_signal: If the reference count is less than or equal to 2, then the waiter is woken up. If exec is called by a non-leader thread, then two threads are required for exec. But if a thread group leader calls exec, then only one thread is required for exec. Thus the hardcoded "2" leads to a superfluous wakeup. The patch fixes that by adding a "notify_count" field to the signal structure.
-
Geert Uytterhoeven authored
Kill warning about unused static functions if HAVE_ARCH_GET_SIGNAL_TO_DELIVER is defined.
-
- 19 May, 2003 1 commit
-
-
Ingo Molnar authored
This fixes an SMP window where the kernel could fail to handle a signal, and increase signal delivery latency by up to 200 msecs. Sun has reported to Ulrich that their JVM sees occasional unexpected signal delays under Linux. The more CPUs, the more delays. The cause of the problem is that the current signal wakeup implementation is racy in kernel/signal.c:signal_wake_up():

    if (t->state == TASK_RUNNING)
        kick_if_running(t);
    ...
    if (t->state & mask) {
        wake_up_process(t);
        return;
    }

If thread (or process) 't' is woken up on another CPU right after the TASK_RUNNING check, and the thread starts to run, then the wake_up_process() here will do nothing, and the signal stays pending until the thread next calls into the kernel - which can be up to 200 msecs later. The solution is to do the 'kicking' of a running thread on a remote CPU atomically with the wakeup. For this i've added wake_up_process_kick(). There is no slowdown for the other wakeup codepaths, the new flag to try_to_wake_up() is compiled off for them. Some other subsystems might want to use this wakeup facility as well in the future (eg. AIO). In fact this race triggers quite often under Volanomark runs, with this change added, Volanomark performance is up from 500-800 to 2000-3000, on a 4-way x86 box.
-
- 12 May, 2003 1 commit
-
-
Steven Cole authored
Don't depend on undefined preprocessor symbols evaluating to zero.
-
- 09 May, 2003 1 commit
-
-
Petr Vandrovec authored
send_sig_info() has been broken since 2.5.60. The function can be invoked from the timer interrupt (timer_interrupt -> do_timer -> update_process_times -> update_one_process -> ( do_process_times, do_it_prof, do_it_virt ) -> send_sig -> send_sig_info) but it uses spin_unlock_irq instead of the correct spin_unlock_irqrestore. This enables interrupts, and later scheduler_tick() locks the runqueue (without disabling interrupts). And if we are unlucky, a new interrupt comes at this point. And if this interrupt tries to do wake_up() (like the RTC interrupt does), we will deadlock on the runqueue lock :-( The bug was introduced by signal-fixes-2.5.59-A4, which split the original send_sig_info into two functions, and in one branch it started using these unsafe spinlock variants (while the "group" variant uses irqsave/restore correctly).
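The difference between the two unlock variants can be shown with a toy model in which a single boolean stands in for the CPU's interrupt-enable flag. None of this is kernel code; the toy_* helpers are invented for the sketch and do no actual locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Models the CPU interrupt-enable flag. */
static bool irqs_enabled = true;

/* "_irq" pair: disable on lock, unconditionally enable on unlock. */
static void toy_lock_irq(void)   { irqs_enabled = false; }
static void toy_unlock_irq(void) { irqs_enabled = true; }

/* "_irqsave/_irqrestore" pair: remember the previous state, put it back. */
static bool toy_lock_irqsave(void)
{
    bool flags = irqs_enabled;
    irqs_enabled = false;
    return flags;
}

static void toy_unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;
}

/* Called from "interrupt context" (interrupts already off), using the
 * _irq pair. Returns the flag state the caller is left with. */
static bool after_irq_variant(void)
{
    irqs_enabled = false;          /* we are inside an interrupt handler */
    toy_lock_irq();
    toy_unlock_irq();              /* bug: interrupts are now on */
    return irqs_enabled;
}

/* Same situation with the save/restore pair. */
static bool after_irqsave_variant(void)
{
    irqs_enabled = false;          /* we are inside an interrupt handler */
    bool flags = toy_lock_irqsave();
    assert(!irqs_enabled);
    toy_unlock_irqrestore(flags);  /* interrupts stay off, as on entry */
    return irqs_enabled;
}
```

In the first case the caller continues with interrupts unexpectedly enabled, which is the window in which a second interrupt can arrive while the runqueue lock is held.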
-
- 08 Apr, 2003 1 commit
-
-
Linus Torvalds authored
was the first file tested with my type checker with the anal pointer attribute checking turned on.
-
- 04 Apr, 2003 1 commit
-
-
Roland McGrath authored
Here is the cleanup patch I promised back in February. Sorry it took a while. The effects should be purely cosmetic in 2.5.66. However, the new interface for the proper way to send thread-specific or process-global signals from inside the kernel is needed for correct implementation of some fixes to timer stuff that Ulrich told me about. This cleans up some obsolete comments and macros in kernel/signal.c, restores send_sig_info to its original behavior, and adds a global entry point send_group_sig_info. I checked all the uses of send_sig and send_sig_info and changed a few to send_group_sig_info. I think it would be cleanest if the whole mess of *_sig* entry points were reduced to two or three, but I did the change that minimized the number of callers I had to fix up. There should be no discernible difference, since the 2.5.66 send_sig_info function did group semantics for those signals by number already. The only exception to that is pdeath_signal, which I guess can be any signal number but I deemed ought to be process-wide. I did not change any of the calls using SIGKILL, though that does have process-wide semantics. There is no need to change it since SIGKILL always kills the whole group, though the code path for send_sig(SIGKILL,...) calls in multithreaded processes will be different now.
-
- 22 Mar, 2003 1 commit
-
-
Andrew Morton authored
From: "Randy.Dunlap" <randy.dunlap@verizon.net> Fix up various syscalls to return longs, as x86_64 and ia64 (at least) require.
-
- 17 Mar, 2003 1 commit
-
-
Roland McGrath authored
This is a fix made almost a month ago, during the flurry of signal changes. I didn't realize until today that this hadn't made it into 2.5. Sorry about the delay. This fix is necessary to avoid sometimes wedging in uninterruptible sleep when doing a multithreaded core dump triggered by a process signal (kill) rather than a trap. You can reproduce the problem by running your favorite multithreaded program (NPTL) and then using "kill -SEGV" on it. It will often wedge. The actual fix could be just a two line diff:

    + if (current->signal->group_exit)
    +         goto dequeue;

after the group_exit_task check. That is the fix that has been used in Ingo's backport for weeks and tested heavily (well, as heavily as core dumping ever gets tested, but it's been in our production systems). But I broke the hair out into a separate function. The patch below has the same effect as the two-liner, and no other difference. I have tested 2.5.64 with this patch and it works for me, though I haven't beat on it. The way the wedge happens is that for a core-dump signal group_send_sig_info does a group stop of other threads before the one thread handles the fatal signal. If the fatal thread gets into do_coredump and coredump_wait first, then other threads see the group stop and suspend with SIGKILL pending. All other fatal cases clear group_stop_count, so this is the only way this ever happens. Checking group_exit fixes it. I didn't make do_coredump clear group_stop_count because doing it with the appropriate ordering and locking doesn't fit the organization of that code.
-
- 24 Feb, 2003 1 commit
-
-
Linus Torvalds authored
Make kmod force default handlers before executing the user process.
-
- 18 Feb, 2003 2 commits
-
-
Andrew Morton authored
-
George Anzinger authored
This is version 23 or so of the POSIX timer code.

Internal changelog:
  - Changed the signals code to match the new order of things. Also the new xtime_lock code needed to be picked up. It made some things a lot simpler.
  - Fixed a spin lock hand off problem in locking timers (thanks to Randy).
  - Fixed nanosleep to test for out of bound nanoseconds (thanks to Julie).
  - Fixed a couple of id deallocation bugs that left old ids laying around (hey I get this one).
  - This version has a new timer id manager. Andrew Morton suggested elimination of recursion (done) and I added code to allow it to release unused nodes. The prior version only released the leaf nodes. (The id manager uses radix tree type nodes.) Also added is a reuse count so ids will not repeat for at least 256 alloc/free cycles.
  - The changes for the new sys_call restart now allow one restart function to handle both nanosleep and clock_nanosleep. Saves a bit of code, nice.
  - All the requested changes and Lindent too :).
  - I also broke clock_nanosleep() apart much the same way nanosleep() was with the 2.5.50-bk5 changes.

TIMER STORMS

The POSIX clocks and timers code prevents "timer storms" by not putting repeating timers back in the timer list until the signal is delivered for the prior expiry. Timer events missed by this delay are accounted for in the timer overrun count. The net result is MUCH lower system overhead while presenting the same info to the user as would be the case if an interrupt and timer processing were required for each increment in the overrun count.
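The overrun accounting in the TIMER STORMS note amounts to a small piece of arithmetic, sketched below. This is an illustration only, not the posix-timers code: struct toy_timer, toy_rearm_on_delivery and the tick values are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Toy periodic timer; all times are in arbitrary ticks. */
struct toy_timer {
    uint64_t expires;   /* expiry whose signal is currently pending */
    uint64_t interval;  /* period of the repeating timer */
    uint64_t overrun;   /* expiries missed while the signal was pending */
};

/* Called when the pending signal is finally delivered at time `now`.
 * The timer was not re-armed in the meantime, so every whole period that
 * elapsed since `expires` is counted as an overrun instead of having
 * fired as a separate timer event. */
static void toy_rearm_on_delivery(struct toy_timer *t, uint64_t now)
{
    assert(t->interval > 0 && now >= t->expires);
    uint64_t missed = (now - t->expires) / t->interval;
    t->overrun = missed;
    t->expires += (missed + 1) * t->interval;  /* next expiry after `now` */
}

/* Expiry at tick 100, period 10, signal delivered at tick 135:
 * ticks 110, 120 and 130 were missed, and the timer is re-armed for 140. */
static struct toy_timer toy_timer_demo(void)
{
    struct toy_timer t = { .expires = 100, .interval = 10, .overrun = 0 };
    toy_rearm_on_delivery(&t, 135);
    return t;
}
```

One division at delivery time replaces three separate interrupt-and-queue cycles, while the application still learns that three expiries were missed.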
-
- 17 Feb, 2003 1 commit
-
-
Linus Torvalds authored
state changes due to execve() and exit(). We need to hold the tasklist lock to guarantee stability of "task->sighand".
-
- 15 Feb, 2003 1 commit
-
-
Daniel Jacobowitz authored
-
- 13 Feb, 2003 1 commit
-
-
Linus Torvalds authored
This simplifies it and makes it more generic.
-
- 12 Feb, 2003 1 commit
-
-
Linus Torvalds authored
-
- 11 Feb, 2003 2 commits
-
-
Linus Torvalds authored
Add a name argument to daemonize() (va_arg) to avoid all the kernel threads having to duplicate the name setting over and over again. Make daemonize() disable all signals by default, and add a "allow_signal()" function to let daemons say they explicitly want to support a signal. Make flush_signal() take the signal lock, so that callers do not need to.
-
Linus Torvalds authored
tasks (even if we don't otherwise need to wake anything up), since otherwise later signals would see that signals are already pending and wouldn't cause wakeups.
-
- 10 Feb, 2003 1 commit
-
-
David S. Miller authored
The few platforms that cannot use the generic get_signal_to_deliver implementation cannot do so because they do special things for ptraced children. This can be easily avoided and thus all of the signal handling code duplication can be eliminated. This is the first part, which adds a platform hook right before the parent of the ptraced child is woken. Data can be passed in via a cookie argument. The next part will be dealing with platforms that need to muck with breakpoints in the child in this same code block.
-
- 09 Feb, 2003 5 commits
-
-
Linus Torvalds authored
so that there isn't any window for running before the signal handler has been invoked.
-
Linus Torvalds authored
from certain states. This simplifies "default_wake_function()", and makes it possible for signal handling to wake up only the processes it _should_ wake up without races.
-
Linus Torvalds authored
return an error. Interestingly, nobody much seems to care. Apparently few programs check the error value.
-
Linus Torvalds authored
them do want to temporarily block signals. Kernel users can also block signals that are normally unblockable to user space, ie SIGKILL and SIGSTOP. Make nfsd and autofs use the new interface, as an example to others.
-
Ingo Molnar authored
- a read_lock(&tasklist_lock) is missing around the group_send_sig_info() in send_sig_info().
-
- 08 Feb, 2003 1 commit
-
-
Linus Torvalds authored
This fixes the signal code to not wake up threads with blocked signals, especially noticeable with kernel threads that may not be able to handle signals at all. We also don't unnecessarily wake processes in TASK_UNINTERRUPTIBLE.
-
- 07 Feb, 2003 4 commits
-
-
Roland McGrath authored
For handle_stop_signal to do the special case for SIGKILL and have it work right in all SMP cases (without changing all the existing ptrace stops), it needs to at least set TIF_SIGPENDING on each thread before resuming it. handle_stop_signal addresses a related race for SIGCONT by setting TIF_SIGPENDING already, so having SIGKILL handled the same way makes sense. Now it seems pretty clean to have handle_stop_signal resume threads for SIGKILL, and have no SIGKILL special case in group_send_sig_info. There is also an SMP race issue with cases like do_syscall_trace, i.e. TASK_STOPPED state set without holding the siglock. So I think handle_stop_signal should call wake_up_process unconditionally.
-
Linus Torvalds authored
This is required to make the old LinuxThread semantics work together with the fixed-for-POSIX full signal sharing. A traditional CLONE_SIGHAND thread (LinuxThread) will not see any other shared signal state, while a new-style CLONE_THREAD thread will share all of it. This way the two methods don't confuse each other.
-
Linus Torvalds authored
already take care of it. This fixes kernel threads that _do_ block SIGKILL/STOP.
-
Roland McGrath authored
This changes do_sigaction to avoid read_lock(&tasklist_lock) on every call. Only in the fairly uncommon cases where it's really needed will it take that lock (which requires unlocking and relocking the siglock for locking order). I also changed the ERESTARTSYS added in my earlier patch to ERESTARTNOINTR. That is an "instantaneous" case, and there is no reason to have it possibly return EINTR if !SA_RESTART (which AFAIK sigaction never could before, and it might not be kosher by POSIX); rollback is always better.
-
- 06 Feb, 2003 1 commit
-
-
Ingo Molnar authored
this is the current threading patchset, which accumulated during the past two weeks. It consists of a big set of changes from Roland, to make threaded signals work. There were still tons of testcases and boundary conditions (mostly in the signal/exit/ptrace area) that we did not handle correctly.

Roland's thread-signal semantics/behavior/ptrace fixes:
  - fix signal delivery race with do_exit() => signals are re-queued to the 'process' if do_exit() finds pending unhandled ones. This prevents signals getting lost upon thread-sys_exit().
  - a non-main thread has died on one processor and gone to TASK_ZOMBIE, but before it's gotten to release_task a sys_wait4 on the other processor reaps it. It's only because it's ptraced that this gets through eligible_child. Somewhere in there the main thread is also dying so it reparents the child thread to hit that case. This means that there is a race where P might be totally invalid.
  - forget_original_parent is not doing the right thing when the group leader dies, i.e. reparenting threads to init when there is a zombie group leader. Perhaps it doesn't matter for any practical purpose without ptrace, though it makes for ppid=1 for each thread in core dumps, which looks funny. Incidentally, SIGCHLD here really should be p->exit_signal.
  - one of the gdb tests makes a questionable assumption about what kill will do when it has some threads stopped by ptrace and others running.

exit races:
  1. Processor A is in sys_wait4 case TASK_STOPPED considering task P. Processor B is about to resume P and then switch to it. While A is inside that case block, B starts running P and it clears P->exit_code, or takes a pending fatal signal and sets it to a new value. Depending on the interleaving, the possible failure modes are:
     a. A gets to its put_user after B has cleared P->exit_code => returns with WIFSTOPPED, WSTOPSIG==0
     b. A gets to its put_user after B has set P->exit_code anew => returns with e.g. WIFSTOPPED, WSTOPSIG==SIGKILL
     A can spend an arbitrarily long time in that case block, because there's getrusage and put_user that can take page faults, and write_lock'ing of the tasklist_lock that can block. But even if it's short the race is there in principle.
  2. This is new with NPTL, i.e. CLONE_THREAD. Two processors A and B are both in sys_wait4 case TASK_STOPPED considering task P. Both get through their tests and fetches of P->exit_code before either gets to P->exit_code = 0. => two threads return the same pid from waitpid. In other interleavings where one processor gets to its put_user after the other has cleared P->exit_code, it's like case 1(a).
  3. SMP races with stop/cont signals. First, take:
         kill(pid, SIGSTOP);
         kill(pid, SIGCONT);
     or:
         kill(pid, SIGSTOP);
         kill(pid, SIGKILL);
     It's possible for this to leave the process stopped with a pending SIGCONT/SIGKILL. That's a state that should never be possible. Moreover, kill(pid, SIGKILL) without any repetition should always be enough to kill a process. (Likewise SIGCONT when you know it's sequenced after the last stop signal, must be sufficient to resume a process.)
  4. take:
         kill(pid, SIGKILL); // or any fatal signal
         kill(pid, SIGCONT); // or SIGKILL
     it's possible for this to cause pid to be reaped with status 0 instead of its true termination status. The equivalent scenario happens when the process being killed is in an _exit call or a trap-induced fatal signal before the kills.

plus i've done stability fixes for bugs that popped up during beta-testing, and minor tidying of Roland's changes:
  - a rare tasklist corruption during exec, causing some very spurious and colorful crashes.
  - a copy_process()-related dereference of already freed thread structure if hit with a SIGKILL in the wrong moment.
  - SMP spinlock deadlocks in the signal code

this patchset has been tested quite well in the 2.4 backport of the threading changes - and i've done some stresstesting on 2.5.59 SMP as well, and did an x86 UP testcompile + testboot as well.
-
- 18 Jan, 2003 1 commit
-
-
Daniel Jacobowitz authored
These new ptrace commands allow a debugger to control signals more precisely; for instance, store a signal and deliver it later, as if it had come from the original outside process or in response to the same faulting memory access.
-
- 15 Dec, 2002 2 commits
-
-
Ingo Molnar authored
This fixes a threading/ptrace bug noticed by the gdb people: when a thread is ptraced but other threads in the thread group are not then a SIGTRAP (via int3 or any of the other debug traps) causes the child thread(s) to die unexpectedly. This is because the default behavior for a no-handler SIGTRAP is to broadcast it. The solution is to make all such signals specific, then the ptracer (gdb) can filter the signal and upon continuation it's being handled properly (or put on the shared signal queue). SIGKILL and SIGSTOP are an exception. The patch only affects threaded and ptrace-d processes.
-
Ingo Molnar authored
This fixes one more threaded-coredumps detail reported by the glibc people: all threads taken down by the coredump code should report the proper exit code. We can do this rather easily via the group_exit mechanism. 'Other' threads used to report SIGKILL, which was highly confusing as the shell often displayed the 'Killed' message instead of a 'Segmentation fault' message. Another missing bit was the 0x80 bit set in the exit status for all threads, if the coredump was successful. (it's safe to set this bit in ->sig->group_exit_code in an unlocked way because all threads are artificially descheduled by the coredump code.)
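For readers unfamiliar with the 0x80 bit mentioned above, the sketch below decodes a raw wait status the way a shell might before printing 'Killed' or 'Segmentation fault'. It assumes the Linux status encoding (terminating signal in the low 7 bits, core-dump flag in 0x80); struct toy_exit and decode_status are names invented for the example.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/* What a parent learns about a child that was killed by a signal. */
struct toy_exit {
    int  signal;       /* terminating signal, or 0 if not killed by one */
    bool core_dumped;  /* the 0x80 bit of the raw status */
};

/* Decode a raw status as returned by wait()/waitpid() on Linux. */
static struct toy_exit decode_status(int status)
{
    struct toy_exit result = { 0, false };

    if (WIFSIGNALED(status)) {
        result.signal = WTERMSIG(status);
        result.core_dumped = (status & 0x80) != 0;
        assert(result.signal > 0);
    }
    return result;
}
```

A status of SIGSEGV with 0x80 set decodes as a segmentation fault with a core file, whereas a plain SIGKILL status is what the 'other' threads used to report before this fix.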
-
- 06 Dec, 2002 1 commit
-
-
Linus Torvalds authored
-