Commit f491bd71 authored by Michael Kerrisk (man-pages), committed by Linus Torvalds

pipe: relocate round_pipe_size() above pipe_set_size()

Patch series "pipe: fix limit handling", v2.

When changing a pipe's capacity with fcntl(F_SETPIPE_SZ), various limits
defined by /proc/sys/fs/pipe-* files are checked to see if unprivileged
users are exceeding limits on memory consumption.
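For readers unfamiliar with the interface, here is a minimal userspace sketch of how a pipe's capacity is queried and changed. The helper names `get_pipe_capacity()` and `set_pipe_capacity()` are made up for illustration; only `fcntl()`, `F_GETPIPE_SZ` and `F_SETPIPE_SZ` are the real Linux API.

```c
#define _GNU_SOURCE             /* exposes F_GETPIPE_SZ / F_SETPIPE_SZ */
#include <fcntl.h>
#include <unistd.h>

/* Illustrative helper: current capacity of the pipe in bytes, or -1 on error. */
static int get_pipe_capacity(int fd)
{
	return fcntl(fd, F_GETPIPE_SZ);
}

/*
 * Illustrative helper: ask the kernel for a capacity of at least `bytes`.
 * Returns the capacity actually set (the kernel may round the request up),
 * or -1 with errno set, for example when an unprivileged caller would
 * exceed one of the /proc/sys/fs/pipe-* limits.
 */
static int set_pipe_capacity(int fd, int bytes)
{
	return fcntl(fd, F_SETPIPE_SZ, bytes);
}
```

Either end of the pipe can be passed as `fd`; the capacity is a property of the pipe, not of the file descriptor.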

While documenting and testing the operation of these limits I noticed
that, as currently implemented, these checks have a number of problems:

(1) When increasing the pipe capacity, the checks against the limits
    in /proc/sys/fs/pipe-user-pages-{soft,hard} are made against
    existing consumption, and exclude the memory required for the
    increased pipe capacity. The new increase in pipe capacity can then
    push the total memory used by the user for pipes (possibly far) over
    a limit. This can also trigger the problem described next.

(2) The limit checks are performed even when the new pipe capacity
    is less than the existing pipe capacity. This can lead to problems
    if a user sets a large pipe capacity, and then the limits are
    lowered, with the result that the user will no longer be able to
    decrease the pipe capacity.

(3) As currently implemented, accounting and checking against the
    limits is done as follows:

    (a) Test whether the user has exceeded the limit.
    (b) Make new pipe buffer allocation.
    (c) Account new allocation against the limits.

    This is racy. Multiple processes may pass point (a) simultaneously,
    and then allocate pipe buffers that are accounted for only in step
    (c).  The race means that the user's pipe buffer allocation could be
    pushed over the limit (by an arbitrary amount, depending on how
    unlucky we were in the race). [Thanks to Vegard Nossum for spotting
    this point, which I had missed.]

This patch series addresses these three problems.
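As a rough illustration of the three problems, the sketch below shows one way an accounting routine could avoid them: it never refuses a decrease, it charges the increase before checking, and it checks a total that already includes the increase. This is a simplified userspace model using C11 atomics, not the code from this series; `struct user_acct`, `charge_pipe_resize()` and the single `hard_limit` field are hypothetical, and the soft limit and privilege checks are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-user state; these names are illustrative, not kernel API. */
struct user_acct {
	atomic_ulong pipe_pages;   /* pages currently charged to this user */
	unsigned long hard_limit;  /* 0 means "no limit" */
};

/*
 * Charge a resize from old_pages to new_pages against the user.
 * Returns true if the resize may proceed, false if it must be refused.
 */
static bool charge_pipe_resize(struct user_acct *u,
			       unsigned long old_pages,
			       unsigned long new_pages)
{
	if (new_pages <= old_pages) {
		/* Problem (2): a decrease is never refused; just release pages. */
		atomic_fetch_sub(&u->pipe_pages, old_pages - new_pages);
		return true;
	}

	unsigned long delta = new_pages - old_pages;

	/* Problem (3): account first, so concurrent callers see this charge. */
	unsigned long total = atomic_fetch_add(&u->pipe_pages, delta) + delta;

	/* Problem (1): the total checked already includes the increase. */
	if (u->hard_limit && total > u->hard_limit) {
		atomic_fetch_sub(&u->pipe_pages, delta);  /* roll the charge back */
		return false;
	}
	return true;
}
```

Because the charge happens before the check, two racing callers can at worst both be refused; they cannot both slip past the limit.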

This patch (of 8):

This is a minor preparatory patch.  After subsequent patches,
round_pipe_size() will be called from pipe_set_size(), so place
round_pipe_size() above pipe_set_size().
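To show what the relocated helper computes, here is a userspace approximation of the same arithmetic. It assumes 4 KiB pages and substitutes a simple loop for the kernel's roundup_pow_of_two(); the `demo_` names are hypothetical and exist only for this sketch.

```c
#define DEMO_PAGE_SHIFT 12                       /* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

/* Userspace stand-in for the kernel's roundup_pow_of_two(); expects n >= 1. */
static unsigned long demo_roundup_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Same arithmetic as round_pipe_size(): whole pages, then a power of two. */
static unsigned int demo_round_pipe_size(unsigned int size)
{
	unsigned long nr_pages;

	/* Round the byte count up to a whole number of pages. */
	nr_pages = (size + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
	/* Round the page count up to a power of two, convert back to bytes. */
	return demo_roundup_pow_of_two(nr_pages) << DEMO_PAGE_SHIFT;
}
```

Under these assumptions a request for 4097 bytes needs two pages and becomes 8192 bytes, while a request for three pages (12288 bytes) is rounded up to four pages (16384 bytes).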

Link: http://lkml.kernel.org/r/91a91fdb-a959-ba7f-b551-b62477cc98a1@gmail.com
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: <socketpair@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Jens Axboe <axboe@fb.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent fcc24534
@@ -1007,6 +1007,18 @@ const struct file_operations pipefifo_fops = {
 	.fasync = pipe_fasync,
 };
 
+/*
+ * Currently we rely on the pipe array holding a power-of-2 number
+ * of pages.
+ */
+static inline unsigned int round_pipe_size(unsigned int size)
+{
+	unsigned long nr_pages;
+
+	nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	return roundup_pow_of_two(nr_pages) << PAGE_SHIFT;
+}
+
 /*
  * Allocate a new array of pipe buffers and copy the info over. Returns the
  * pipe size if successful, or return -ERROR on error.
@@ -1058,18 +1070,6 @@ static long pipe_set_size(struct pipe_inode_info *pipe, unsigned long nr_pages)
 	return nr_pages * PAGE_SIZE;
 }
 
-/*
- * Currently we rely on the pipe array holding a power-of-2 number
- * of pages.
- */
-static inline unsigned int round_pipe_size(unsigned int size)
-{
-	unsigned long nr_pages;
-
-	nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
-	return roundup_pow_of_two(nr_pages) << PAGE_SHIFT;
-}
-
 /*
  * This should work even if CONFIG_PROC_FS isn't set, as proc_dointvec_minmax
  * will return an error.