    [PATCH] fix rusage semantics · 497c9d68
    Roland McGrath authored
    This patch changes the rusage bookkeeping and the semantics of the
    getrusage and times calls in a couple of ways.
    
    The first change is in the c* fields counting dead child processes.  POSIX
    requires that children that have died be counted in these fields when they
    are reaped by a wait* call, and that if they are never reaped (e.g.
    because of ignoring SIGCHLD, or exiting yourself first) then they are
    never counted.  These were counted in release_task for all threads.  I've
    changed it so they are counted in wait_task_zombie, i.e.  exactly when
    being reaped.
    
    POSIX also specifies for RUSAGE_CHILDREN that the report include the reaped
    child processes of the calling process, i.e.  whole thread group in Linux,
    not just ones forked by the calling thread.  POSIX specifies tms_c[us]time
    fields in the times call the same way.  I've moved the c* fields that
    contain this information into signal_struct, where the single set of
    counters accumulates data from any thread in the group that calls wait*.
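
    As a rough illustration of the reap-time accounting described above,
    here is a minimal userspace model.  It is not the kernel code: struct
    proc_model and reap_child are hypothetical names, and a single utime
    counter stands in for the full set of rusage fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one process (thread group); not a kernel structure. */
struct proc_model {
    unsigned long utime;   /* CPU time used by the process itself, in ticks */
    unsigned long cutime;  /* time of children it has reaped, plus what they reaped */
};

/*
 * Models the accounting done at the moment a wait* call reaps a child.
 * A child that is never reaped never passes through here, so it is
 * never added to the parent's c* counter.
 */
static void reap_child(struct proc_model *parent, const struct proc_model *child)
{
    assert(parent != NULL && child != NULL);
    parent->cutime += child->utime + child->cutime;
}
```

    Because the counter lives in one per-group record, it does not matter
    which thread of the parent ends up calling wait*.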
    
    Finally, POSIX specifies getrusage and times as returning cumulative totals
    for the whole process (aka thread group), not just the calling thread.
    I've added fields in signal_struct to accumulate the stats of detached
    threads as they die.  The process stats are the sums of these records plus
    the stats of each remaining live/zombie thread.  The times and getrusage
    calls, and the internal uses for filling in wait4 results and siginfo_t,
    now iterate over the threads in the thread group and sum up their stats
    along with the stats recorded for threads already dead and gone.
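
    The summation described above can be sketched in plain C as follows.
    This is a simplified model under stated assumptions, not the patch
    itself: struct thread_stats, struct group_stats, account_dead_thread
    and group_total are hypothetical names, and an array stands in for the
    kernel's thread list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread counters; not the kernel's task_struct fields. */
struct thread_stats {
    unsigned long utime;  /* user time, in ticks */
    unsigned long stime;  /* system time, in ticks */
};

/* Hypothetical per-group record, standing in for the new signal_struct fields. */
struct group_stats {
    struct thread_stats dead;         /* accumulated from threads already dead and gone */
    const struct thread_stats *live;  /* remaining live/zombie threads */
    size_t nlive;
};

/* Fold a dying detached thread's counters into the group record. */
static void account_dead_thread(struct group_stats *g, const struct thread_stats *t)
{
    assert(g != NULL && t != NULL);
    g->dead.utime += t->utime;
    g->dead.stime += t->stime;
}

/* Process total = dead-thread record + every remaining thread. */
static struct thread_stats group_total(const struct group_stats *g)
{
    struct thread_stats total = g->dead;
    for (size_t i = 0; i < g->nlive; i++) {
        total.utime += g->live[i].utime;
        total.stime += g->live[i].stime;
    }
    return total;
}
```

    In this model a thread's time is counted exactly once: either in the
    live array while it exists, or in the dead record after it exits.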
    
    I added a new value RUSAGE_GROUP (-3) for the getrusage system call rather
    than changing the behavior of the old RUSAGE_SELF (0).  POSIX specifies
    RUSAGE_SELF to mean all threads, so the glibc getrusage call will just
    translate it to RUSAGE_GROUP for new kernels.  I did this thinking that
    someone somewhere might want the old behavior with an old glibc and a new
    kernel (it is only different if they are using CLONE_THREAD anyway). 
    However, I've changed the times system call to conform to POSIX as well and
    did not provide any backward compatibility there.  In that case there is
    nothing easy like a parameter value to use; it would have to be a new
    system call number.  That seems pretty pointless.  Given that, I wonder if
    it is worth bothering to preserve the compatible RUSAGE_SELF behavior by
    introducing RUSAGE_GROUP instead of just changing RUSAGE_SELF's meaning.
    Comments?
    
    I've done some basic testing on x86 and x86-64, and all the numbers come
    out right after these fixes.  (I have a test program that shows a few
    Signed-off-by: Roland McGrath <roland@redhat.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>