    proc, coredump: add CoreDumping flag to /proc/pid/status · c6434012
    Roman Gushchin authored
    There is currently no convenient way to check whether a process is
    being coredumped.
    
    It might be necessary to recognize such a state to avoid killing the
    process and getting a broken coredump.  Writing a large core might take
    significant time, and the process is unresponsive while it is written,
    so it might be killed on timeout if another process is monitoring and
    killing/restarting hanging tasks.
    
    We're getting a significant number of corrupted coredump files on
    machines in our fleet, just because processes are being killed by
    timeout in the middle of the core writing process.
    
    We do have a process health check, and an agent is responsible for
    restarting processes which are not responding to health check requests.
    Writing a large coredump to disk can easily exceed a reasonable
    timeout (especially on an overloaded machine).
    
    This flag will allow the agent to distinguish processes which are being
    coredumped, extend the timeout for them, and let them produce a full
    coredump file.
    
    To make it possible to detect whether a process is being coredumped,
    expose a boolean CoreDumping flag in /proc/pid/status.
    
    Example:
    $ cat core.sh
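      #!/bin/sh
    
      # Writing core_pattern requires root.  Pipe cores to a slow handler
      # so that the dump stays in progress for about 10 seconds.
      echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
      sleep 1000 &
      PID=$!
    
      # Before the signal is sent, the flag is 0.
      grep CoreDumping /proc/$PID/status
      kill -ABRT $PID
      sleep 1
      # While the core handler is still running, the flag is 1.
      grep CoreDumping /proc/$PID/status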
    
    $ ./core.sh
      CoreDumping:	0
      CoreDumping:	1
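
    Building on the example above, a monitoring agent could read the flag
    and pick a longer timeout while a dump is in progress.  This is only a
    minimal sketch: the function names coredumping_flag and health_timeout
    and the timeout arguments are hypothetical and not part of this patch.

```shell
#!/bin/sh

# Hypothetical helper: print the CoreDumping value (0 or 1) found in a
# status file.  Prints nothing if the file is unreadable or the field is
# missing (for example on a kernel without this patch).
# $1: path to a status file, normally /proc/<pid>/status
coredumping_flag() {
    awk '$1 == "CoreDumping:" { print $2; exit }' "$1" 2>/dev/null
}

# Hypothetical policy: print the health-check timeout (in seconds) to use.
# $1: path to a status file
# $2: normal timeout
# $3: extended timeout, used only while the process is being coredumped
health_timeout() {
    if [ "$(coredumping_flag "$1")" = "1" ]; then
        echo "$3"
    else
        echo "$2"
    fi
}
```

    An agent might call it as health_timeout "/proc/$PID/status" 30 600,
    where 30 and 600 are arbitrary illustrative values.  Falling back to the
    normal timeout when the field is absent keeps the behaviour unchanged
    on kernels that do not expose the flag.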
    
    [guro@fb.com: document CoreDumping flag in /proc/<pid>/status]
      Link: http://lkml.kernel.org/r/20170928135357.GA8470@castle.DHCP.thefacebook.com
    Link: http://lkml.kernel.org/r/20170920230634.31572-1-guro@fb.com
    Signed-off-by: Roman Gushchin <guro@fb.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Konstantin Khlebnikov <koct9i@gmail.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>