    revert "rlimit: permit setting RLIMIT_NOFILE to RLIM_INFINITY" · 60fd760f
    Andrew Morton authored
    Revert commit 0c2d64fb because it causes
    (arguably poorly designed) existing userspace to spend interminable
    periods closing billions of not-open file descriptors.
    
    We could bring this back, with some sort of opt-in tunable in /proc, which
    defaults to "off".
    
    Peter's analysis follows:
    
    : I spent several hours trying to get to the bottom of a serious
    : performance issue that appeared on one of our servers after upgrading to
    : 2.6.28.  In the end it's what could be considered a userspace bug that
    : was triggered by a change in 2.6.28.  Since this might also affect other
    : people I figured I'd at least document what I found here, and maybe we
    : can even do something about it:
    :
    :
    : So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately
    : the team maintaining our ftp archive complained that one of their
    : scripts that previously ran in a few minutes still hadn't even come
    : close to being done after an hour or so.  Downgrading to 2.6.27 fixed
    : that.
    :
    : Turns out that script is forking a lot and something in it or python or
    : wherever closes all the file descriptors it doesn't want to pass on.
    : That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
    : closes them all with a few exceptions.
    :
    : Turns out that takes a long time when your ulimit -n is now 2^20 (1048576).
    :
    : With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is
    : now a thousand times that.
    :
    : 2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to
    : RLIM_INFINITY" (0c2d64fb)[1] that,
    : as the title implies, allows setting the limit on the number of open
    : files to infinity.
    :
    : Closer investigation showed that the broken default ulimit did not apply
    : to "system" processes (like stuff started from init).  In the end I
    : could establish that all processes that passed through pam_limits at one
    : point had the bad resource limit.
    :
    : Apparently the pam library in Debian etch (4.0) initializes the limits
    : to some default values when it doesn't have any settings in limits.conf
    : to override them.  Turns out that for nofiles this is RLIM_INFINITY.
    : Commenting out "case RLIMIT_NOFILE" in pam_limits.c:267 of our pam
    : package version 0.79-5 fixes that - tho I'm not sure what side effects
    : that has.
    :
    : Debian lenny (the upcoming 5.0 version) doesn't have this issue as it
    : uses a different pam (version).
    
    Reported-by: Peter Palfrader <weasel@debian.org>
    Cc: Adam Tkac <vonsch@gmail.com>
    Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
    Cc: <stable@kernel.org>		[2.6.28.x]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>