• Michal Hocko's avatar
    kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd · 95bacfe6
    Michal Hocko authored
    commit 735f2770 upstream.
    
    Commit fec1d011 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal
    exit") has caused a subtle regression in nscd which uses
    CLONE_CHILD_CLEARTID to clear the nscd_certainly_running flag in the
    shared databases, so that the clients are notified when nscd is
    restarted.  Now, when nscd uses a non-persistent database, clients that
    have it mapped keep thinking the database is being updated by nscd, when
    in fact nscd has created a new (anonymous) one (for non-persistent
    databases it uses an unlinked file as backend).
    
    The original proposal for the CLONE_CHILD_CLEARTID change claimed
    (https://lkml.org/lkml/2006/10/25/233):
    
    : The NPTL library uses the CLONE_CHILD_CLEARTID flag on clone() syscalls
    : on behalf of pthread_create() library calls.  This feature is used to
    : request that the kernel clear the thread-id in user space (at an address
    : provided in the syscall) when the thread disassociates itself from the
    : address space, which is done in mm_release().
    :
    : Unfortunately, when a multi-threaded process incurs a core dump (such as
    : from a SIGSEGV), the core-dumping thread sends SIGKILL signals to all of
    : the other threads, which then proceed to clear their user-space tids
    : before synchronizing in exit_mm() with the start of core dumping.  This
    : misrepresents the state of process's address space at the time of the
    : SIGSEGV and makes it more difficult for someone to debug NPTL and glibc
    : problems (misleading him/her to conclude that the threads had gone away
    : before the fault).
    :
    : The fix below is to simply avoid the CLONE_CHILD_CLEARTID action if a
    : core dump has been initiated.
    
    The resulting patch from Roland (https://lkml.org/lkml/2006/10/26/269)
    seems to have a larger scope than the original patch asked for.  It
    seems that limitting the scope of the check to core dumping should work
    for SIGSEGV issue describe above.
    
    [Changelog partly based on Andreas' description]
    Fixes: fec1d011 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
    Link: http://lkml.kernel.org/r/1471968749-26173-1-git-send-email-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
    Tested-by: default avatarWilliam Preston <wpreston@suse.com>
    Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
    Cc: Roland McGrath <roland@hack.frob.com>
    Cc: Andreas Schwab <schwab@suse.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
    95bacfe6
fork.c 46.3 KB