- 18 Nov, 2017 40 commits
-
-
Ryusuke Konishi authored
Fix the following checkpatch warning: WARNING: Block comments should align the * on each line #633: FILE: sufile.c:633: +/** + * nilfs_sufile_truncate_range - truncate range of segment array Link: http://lkml.kernel.org/r/1509367935-3086-4-git-send-email-konishi.ryusuke@lab.ntt.co.jpSigned-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Elena Reshetova authored
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nilfs_root.count is used as pure reference counter. Convert it to refcount_t and fix up the operations. Link: http://lkml.kernel.org/r/1509367935-3086-3-git-send-email-konishi.ryusuke@lab.ntt.co.jpSigned-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andreas Rohner authored
There is a race condition between nilfs_dirty_inode() and nilfs_set_file_dirty(). When a file is opened, nilfs_dirty_inode() is called to update the access timestamp in the inode. It calls __nilfs_mark_inode_dirty() in a separate transaction. __nilfs_mark_inode_dirty() caches the ifile buffer_head in the i_bh field of the inode info structure and marks it as dirty. After some data was written to the file in another transaction, the function nilfs_set_file_dirty() is called, which adds the inode to the ns_dirty_files list. Then the segment construction calls nilfs_segctor_collect_dirty_files(), which goes through the ns_dirty_files list and checks the i_bh field. If there is a cached buffer_head in i_bh it is not marked as dirty again. Since nilfs_dirty_inode() and nilfs_set_file_dirty() use separate transactions, it is possible that a segment construction that writes out the ifile occurs in-between the two. If this happens the inode is not on the ns_dirty_files list, but its ifile block is still marked as dirty and written out. In the next segment construction, the data for the file is written out and nilfs_bmap_propagate() updates the b-tree. Eventually the bmap root is written into the i_bh block, which is not dirty, because it was written out in another segment construction. As a result the bmap update can be lost, which leads to file system corruption. Either the virtual block address points to an unallocated DAT block, or the DAT entry will be reused for something different. The error can remain undetected for a long time. A typical error message would be one of the "bad btree" errors or a warning that a DAT entry could not be found. This bug can be reproduced reliably by a simple benchmark that creates and overwrites millions of 4k files. Link: http://lkml.kernel.org/r/1509367935-3086-2-git-send-email-konishi.ryusuke@lab.ntt.co.jpSigned-off-by: Andreas Rohner <andreas.rohner@gmx.net> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Andreas Rohner <andreas.rohner@gmx.net> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kees Cook authored
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. This requires adding a pointer to hold the timer's target task, as the lifetime of sc_task doesn't appear to match the timer's task. Link: http://lkml.kernel.org/r/20171016235900.GA102729@beastSigned-off-by: Kees Cook <keescook@chromium.org> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Lawrence authored
Mikulas noticed in the existing do_proc_douintvec_minmax_conv() and do_proc_dopipe_max_size_conv() introduced in this patchset, that they inconsistently handle overflow and min/max range inputs: For example: 0 ... param->min - 1 ---> ERANGE param->min ... param->max ---> the value is accepted param->max + 1 ... 0x100000000L + param->min - 1 ---> ERANGE 0x100000000L + param->min ... 0x100000000L + param->max ---> EINVAL 0x100000000L + param->max + 1, 0x200000000L + param->min - 1 ---> ERANGE 0x200000000L + param->min ... 0x200000000L + param->max ---> EINVAL 0x200000000L + param->max + 1, 0x300000000L + param->min - 1 ---> ERANGE In do_proc_do*() routines which store values into unsigned int variables (4 bytes wide for 64-bit builds), first validate that the input unsigned long value (8 bytes wide for 64-bit builds) will fit inside the smaller unsigned int variable. Then check that the unsigned int value falls inside the specified parameter min, max range. Otherwise the unsigned long -> unsigned int conversion drops leading bits from the input value, leading to the inconsistent pattern Mikulas documented above. Link: http://lkml.kernel.org/r/1507658689-11669-5-git-send-email-joe.lawrence@redhat.comSigned-off-by: Joe Lawrence <joe.lawrence@redhat.com> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Lawrence authored
pipe_max_size is assigned directly via procfs sysctl: static struct ctl_table fs_table[] = { ... { .procname = "pipe-max-size", .data = &pipe_max_size, .maxlen = sizeof(int), .mode = 0644, .proc_handler = &pipe_proc_fn, .extra1 = &pipe_min_size, }, ... int pipe_proc_fn(struct ctl_table *table, int write, void __user *buf, size_t *lenp, loff_t *ppos) { ... ret = proc_dointvec_minmax(table, write, buf, lenp, ppos) ... and then later rounded in-place a few statements later: ... pipe_max_size = round_pipe_size(pipe_max_size); ... This leaves a window of time between initial assignment and rounding that may be visible to other threads. (For example, one thread sets a non-rounded value to pipe_max_size while another reads its value.) Similar reads of pipe_max_size are potentially racy: pipe.c :: alloc_pipe_info() pipe.c :: pipe_set_size() Add a new proc_dopipe_max_size() that consolidates reading the new value from the user buffer, verifying bounds, and calling round_pipe_size() with a single assignment to pipe_max_size. Link: http://lkml.kernel.org/r/1507658689-11669-4-git-send-email-joe.lawrence@redhat.comSigned-off-by: Joe Lawrence <joe.lawrence@redhat.com> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Lawrence authored
round_pipe_size() contains a right-bit-shift expression which may overflow, which would cause undefined results in a subsequent roundup_pow_of_two() call. static inline unsigned int round_pipe_size(unsigned int size) { unsigned long nr_pages; nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT; return roundup_pow_of_two(nr_pages) << PAGE_SHIFT; } PAGE_SIZE is defined as (1UL << PAGE_SHIFT), so: - 4 bytes wide on 32-bit (0 to 0xffffffff) - 8 bytes wide on 64-bit (0 to 0xffffffffffffffff) That means that 32-bit round_pipe_size(), nr_pages may overflow to 0: size=0x00000000 nr_pages=0x0 size=0x00000001 nr_pages=0x1 size=0xfffff000 nr_pages=0xfffff size=0xfffff001 nr_pages=0x0 << ! size=0xffffffff nr_pages=0x0 << ! This is bad because roundup_pow_of_two(n) is undefined when n == 0! 64-bit is not a problem as the unsigned int size is 4 bytes wide (similar to 32-bit) and the larger, 8 byte wide unsigned long, is sufficient to handle the largest value of the bit shift expression: size=0xffffffff nr_pages=100000 Modify round_pipe_size() to return 0 if n == 0 and updates its callers to handle accordingly. Link: http://lkml.kernel.org/r/1507658689-11669-3-git-send-email-joe.lawrence@redhat.comSigned-off-by: Joe Lawrence <joe.lawrence@redhat.com> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Lawrence authored
Patch series "A few round_pipe_size() and pipe-max-size fixups", v3. While backporting Michael's "pipe: fix limit handling" patchset to a distro-kernel, Mikulas noticed that current upstream pipe limit handling contains a few problems: 1 - procfs signed wrap: echo'ing a large number into /proc/sys/fs/pipe-max-size and then cat'ing it back out shows a negative value. 2 - round_pipe_size() nr_pages overflow on 32bit: this would subsequently try roundup_pow_of_two(0), which is undefined. 3 - visible non-rounded pipe-max-size value: there is no mutual exclusion or protection between the time pipe_max_size is assigned a raw value from proc_dointvec_minmax() and when it is rounded. 4 - unsigned long -> unsigned int conversion makes for potential odd return errors from do_proc_douintvec_minmax_conv() and do_proc_dopipe_max_size_conv(). This version underwent the same testing as v1: https://marc.info/?l=linux-kernel&m=150643571406022&w=2 This patch (of 4): pipe_max_size is defined as an unsigned int: unsigned int pipe_max_size = 1048576; but its procfs/sysctl representation is an integer: static struct ctl_table fs_table[] = { ... { .procname = "pipe-max-size", .data = &pipe_max_size, .maxlen = sizeof(int), .mode = 0644, .proc_handler = &pipe_proc_fn, .extra1 = &pipe_min_size, }, ... that is signed: int pipe_proc_fn(struct ctl_table *table, int write, void __user *buf, size_t *lenp, loff_t *ppos) { ... ret = proc_dointvec_minmax(table, write, buf, lenp, ppos) This leads to signed results via procfs for large values of pipe_max_size: % echo 2147483647 >/proc/sys/fs/pipe-max-size % cat /proc/sys/fs/pipe-max-size -2147483648 Use unsigned operations on this variable to avoid such negative values. Link: http://lkml.kernel.org/r/1507658689-11669-2-git-send-email-joe.lawrence@redhat.comSigned-off-by: Joe Lawrence <joe.lawrence@redhat.com> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
NeilBrown authored
Currently if the autofs kernel module gets an error when writing to the pipe which links to the daemon, then it marks the whole moutpoint as catatonic, and it will stop working. It is possible that the error is transient. This can happen if the daemon is slow and more than 16 requests queue up. If a subsequent process tries to queue a request, and is then signalled, the write to the pipe will return -ERESTARTSYS and autofs will take that as total failure. So change the code to assess -ERESTARTSYS and -ENOMEM as transient failures which only abort the current request, not the whole mountpoint. It isn't a crash or a data corruption, but having autofs mountpoints suddenly stop working is rather inconvenient. Ian said: : And given the problems with a half dozen (or so) user space applications : consuming large amounts of CPU under heavy mount and umount activity this : could happen more easily than we expect. Link: http://lkml.kernel.org/r/87y3norvgp.fsf@notabene.neil.brown.nameSigned-off-by: NeilBrown <neilb@suse.com> Acked-by: Ian Kent <raven@themaw.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Masahiro Yamada authored
init/version.c has nothing to do with modules, so remove the <linux/modude.h>. Instead, include <linux/export.h> for EXPORT_SYMBOL_GPL. This cuts off a lot of unnecessary header parsing. Link: http://lkml.kernel.org/r/1505920984-8523-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jason Baron authored
The use of ep_call_nested() in ep_eventpoll_poll(), which is the .poll routine for an epoll fd, is used to prevent excessively deep epoll nesting, and to prevent circular paths. However, we are already preventing these conditions during EPOLL_CTL_ADD. In terms of too deep epoll chains, we do in fact allow deep nesting of the epoll fds themselves (deeper than EP_MAX_NESTS), however we don't allow more than EP_MAX_NESTS when an epoll file descriptor is actually connected to a wakeup source. Thus, we do not require the use of ep_call_nested(), since ep_eventpoll_poll(), which is called via ep_scan_ready_list() only continues nesting if there are events available. Since ep_call_nested() is implemented using a global lock, applications that make use of nested epoll can see large performance improvements with this change. Davidlohr said: : Improvements are quite obscene actually, such as for the following : epoll_wait() benchmark with 2 level nesting on a 80 core IvyBridge: : : ncpus vanilla dirty delta : 1 2447092 3028315 +23.75% : 4 231265 2986954 +1191.57% : 8 121631 2898796 +2283.27% : 16 59749 2902056 +4757.07% : 32 26837 2326314 +8568.30% : 64 12926 1341281 +10276.61% : : (http://linux-scalability.org/epoll/epoll-test.c) Link: http://lkml.kernel.org/r/1509430214-5599-1-git-send-email-jbaron@akamai.comSigned-off-by: Jason Baron <jbaron@akamai.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Salman Qazi <sqazi@google.com> Cc: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jason Baron authored
ep_poll_safewake() is used to wakeup potentially nested epoll file descriptors. The function uses ep_call_nested() to prevent entering the same wake up queue more than once, and to prevent excessively deep wakeup paths (deeper than EP_MAX_NESTS). However, this is not necessary since we are already preventing these conditions during EPOLL_CTL_ADD. This saves extra function calls, and avoids taking a global lock during the ep_call_nested() calls. I have, however, left ep_call_nested() for the CONFIG_DEBUG_LOCK_ALLOC case, since ep_call_nested() keeps track of the nesting level, and this is required by the call to spin_lock_irqsave_nested(). It would be nice to remove the ep_call_nested() calls for the CONFIG_DEBUG_LOCK_ALLOC case as well, however its not clear how to simply pass the nesting level through multiple wake_up() levels without more surgery. In any case, I don't think CONFIG_DEBUG_LOCK_ALLOC is generally used for production. This patch, also apparently fixes a workload at Google that Salman Qazi reported by completely removing the poll_safewake_ncalls->lock from wakeup paths. Link: http://lkml.kernel.org/r/1507920533-8812-1-git-send-email-jbaron@akamai.comSigned-off-by: Jason Baron <jbaron@akamai.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Salman Qazi <sqazi@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Shakeel Butt authored
A userspace application can directly trigger the allocations from eventpoll_epi and eventpoll_pwq slabs. A buggy or malicious application can consume a significant amount of system memory by triggering such allocations. Indeed we have seen in production where a buggy application was leaking the epoll references and causing a burst of eventpoll_epi and eventpoll_pwq slab allocations. This patch opt-in the charging of eventpoll_epi and eventpoll_pwq slabs. There is a per-user limit (~4% of total memory if no highmem) on these caches. I think it is too generous particularly in the scenario where jobs of multiple users are running on the system and the administrator is reducing cost by overcomitting the memory. This is unaccounted kernel memory and will not be considered by the oom-killer. I think by accounting it to kmemcg, for systems with kmem accounting enabled, we can provide better isolation between jobs of different users. Link: http://lkml.kernel.org/r/20171003021519.23907-1-shakeelb@google.comSigned-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Greg Thelen <gthelen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Masahiro Yamada authored
checkpatch.pl does not check missing blank line before module_*_driver. I want it to behave likewise for builtin_*_driver. Link: http://lkml.kernel.org/r/1505700081-12854-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
Lines that end in an open bracket or open parenthesis are generally hard to follow. Lines following those ending with open parenthesis are also rarely aligned to that open parenthesis. Suggest not ending lines with '[' or '(' Link: http://lkml.kernel.org/r/8fd0b2b4a7482064254e37931eb9302a81d5aa2f.1508340786.git.joe@perches.comSigned-off-by: Joe Perches <joe@perches.com> Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
So the line length check can be bypassed by its callers. Link: http://lkml.kernel.org/r/7de542c08a6e79f2ebe7c1416c9f403c23fdcc09.1508282823.git.joe@perches.comSigned-off-by: Joe Perches <joe@perches.com> Reported-by: Song Liu <songliubraving@fb.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
Some of the definitions are very long and can't be split into multiple lines because ctags is limited. Exempt these lines from the line length checks. See commit 25528213 ("tags: Fix DEFINE_PER_CPU expansions") for more details. Link: http://lkml.kernel.org/r/1508170320.6530.15.camel@perches.comSigned-off-by: Joe Perches <joe@perches.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
There was code in checkpatch that allowed continuation printks to be used without KERN_CONT. Remove the continuation check and always require a KERN_<LEVEL>. Link: http://lkml.kernel.org/r/61980ef41d5b9b6543da1c49055042e0ab74d308.1507047008.git.joe@perches.comSigned-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Heinrich Schuchardt authored
void foo(int a) switch (a) { case 'h': fun1(); exit(1); default: } creates a warning "Possible switch case/default not preceded by break or fallthrough comment". exit( should be treated like return. Link: http://lkml.kernel.org/r/20170910154618.25819-1-xypron.glpk@gmx.deSigned-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Miles Chen authored
Current unnamed function definition argument does not include function pointer cases and it reports something like: WARNING: function definition argument 'void' should also have an identifier name +unsigned int (*dummy)(void); Support function pointers for unnamed function arguments Link: http://lkml.kernel.org/r/1505389925-31087-1-git-send-email-miles.chen@mediatek.comSigned-off-by: Miles Chen <miles.chen@mediatek.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Yury Norov authored
find_bit functions are widely used in the kernel, including hot paths. This module tests performance of those functions in 2 typical scenarios: randomly filled bitmap with relatively equal distribution of set and cleared bits, and sparse bitmap which has 1 set bit for 500 cleared bits. On ThunderX machine: Start testing find_bit() with random-filled bitmap find_next_bit: 240043 cycles, 164062 iterations find_next_zero_bit: 312848 cycles, 163619 iterations find_last_bit: 193748 cycles, 164062 iterations find_first_bit: 177720874 cycles, 164062 iterations Start testing find_bit() with sparse bitmap find_next_bit: 3633 cycles, 656 iterations find_next_zero_bit: 620399 cycles, 327025 iterations find_last_bit: 3038 cycles, 656 iterations find_first_bit: 691407 cycles, 656 iterations [arnd@arndb.de: use correct format string for find-bit tests] Link: http://lkml.kernel.org/r/20171113135605.3166307-1-arnd@arndb.de Link: http://lkml.kernel.org/r/20171109140714.13168-1-ynorov@caviumnetworks.comSigned-off-by: Yury Norov <ynorov@caviumnetworks.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Clement Courbet <courbet@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Davidlohr Bueso authored
Fengguang reported soft lockups while running the rbtree and interval tree test modules. The logic for these tests all occur in init phase, and we currently are pounding with the default values for number of nodes and number of iterations of each test. Reduce the latter by two orders of magnitude. This does not influence the value of the tests in that one thousand times by default is enough to get the picture. Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Cheng Jian authored
The uniform structure filter_arg sets its union based on the difference of enum filter_arg_type, However, some functions use implicit type conversion obviously. warning: implicit conversion from enumeration type 'enum filter_exp_type' to different enumeration type 'enum filter_op_type' warning: implicit conversion from enumeration type 'enum filter_cmp_type' to different enumeration type 'enum filter_exp_type' Link: http://lkml.kernel.org/r/1509938415-113825-1-git-send-email-cj.chengjian@huawei.comSigned-off-by: Cheng Jian <cj.chengjian@huawei.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Xie XiuQi <xiexiuqi@huawei.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Steven Rostedt (Red Hat) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Liu, Changcheng authored
Don't leak idle function address in NMI backtrace. Link: http://lkml.kernel.org/r/20171106165648.GA95243@sofiaSigned-off-by: Liu Changcheng <changcheng.liu@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Stephen Bates authored
If the amount of resources allocated to a gen_pool exceeds 2^32 then the avail atomic overflows and this causes problems when clients try and borrow resources from the pool. This is only expected to be an issue on 64 bit systems. Add the <linux/atomic.h> header to pull in atomic_long* operations. So that 32 bit systems continue to use atomic32_t but 64 bit systems can use atomic64_t. Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.comSigned-off-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Daniel Mentz <danielmentz@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Zijlstra authored
Our current int_sqrt() is not rough nor any approximation; it calculates the exact value of: floor(sqrt()). Document this. Link: http://lkml.kernel.org/r/20171020164645.001652117@infradead.orgSigned-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Anshul Garg <aksgarg1989@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Michael Davidson <md@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Zijlstra authored
The initial value (@M) compute is: m = 1UL << (BITS_PER_LONG - 2); while (m > x) m >>= 2; Which is a linear search for the highest even bit smaller or equal to @x We can implement this using a binary search using __fls() (or better when its hardware implemented). m = 1UL << (__fls(x) & ~1UL); Especially for small values of @x; which are the more common arguments when doing a CDF on idle times; the linear search is near to worst case, while the binary search of __fls() is a constant 6 (or 5 on 32bit) branches. cycles: branches: branch-misses: PRE: hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681 cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219 SOFTWARE FLS: hot: 29.576176 +- 0.028850 26.666730 +- 0.004511 0.019463 +- 0.000663 cold: 165.947136 +- 0.188406 26.666746 +- 0.004511 6.133897 +- 0.004386 HARDWARE FLS: hot: 24.720922 +- 0.025161 20.666784 +- 0.004509 0.020836 +- 0.000677 cold: 132.777197 +- 0.127471 20.666776 +- 0.004509 5.080285 +- 0.003874 Averages computed over all values <128k using a LFSR to generate order. Cold numbers have a LFSR based branch trace buffer 'confuser' ran between each int_sqrt() invocation. Link: http://lkml.kernel.org/r/20171020164644.936577234@infradead.orgSigned-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Suggested-by: Joe Perches <joe@perches.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Anshul Garg <aksgarg1989@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Michael Davidson <md@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Zijlstra authored
The current int_sqrt() computation is sub-optimal for the case of small @x. Which is the interesting case when we're going to do cumulative distribution functions on idle times, which we assume to be a random variable, where the target residency of the deepest idle state gives an upper bound on the variable (5e6ns on recent Intel chips). In the case of small @x, the compute loop: while (m != 0) { b = y + m; y >>= 1; if (x >= b) { x -= b; y += m; } m >>= 2; } can be reduced to: while (m > x) m >>= 2; Because y==0, b==m and until x>=m y will remain 0. And while this is computationally equivalent, it runs much faster because there's less code, in particular less branches. cycles: branches: branch-misses: OLD: hot: 45.109444 +- 0.044117 44.333392 +- 0.002254 0.018723 +- 0.000593 cold: 187.737379 +- 0.156678 44.333407 +- 0.002254 6.272844 +- 0.004305 PRE: hot: 67.937492 +- 0.064124 66.999535 +- 0.000488 0.066720 +- 0.001113 cold: 232.004379 +- 0.332811 66.999527 +- 0.000488 6.914634 +- 0.006568 POST: hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681 cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219 Averages computed over all values <128k using a LFSR to generate order. Cold numbers have a LFSR based branch trace buffer 'confuser' ran between each int_sqrt() invocation. Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org Fixes: 30493cc9 ("lib/int_sqrt.c: optimize square root algorithm") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Suggested-by: Anshul Garg <aksgarg1989@gmail.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Joe Perches <joe@perches.com> Cc: David Miller <davem@davemloft.net> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Davidson <md@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Markus Elfring authored
Omit extra messages for a memory allocation failure in these functions. This issue was detected by using the Coccinelle software. Link: http://lkml.kernel.org/r/410a4c5a-4ee0-6fcc-969c-103d8e496b78@users.sourceforge.netSigned-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Geert Uytterhoeven authored
Extract the string test code into its own source file, to allow compiling it either to a loadable module, or built into the kernel. Fixes: 03270c13 ("lib/string.c: add testcases for memset16/32/64") Link: http://lkml.kernel.org/r/1505397744-3387-1-git-send-email-geert@linux-m68k.orgSigned-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Masahiro Yamada authored
This include was added by commit 187f1882 ("BUG: headers with BUG/BUG_ON etc. need linux/bug.h") because BUG_ON() was used in this header at that time. Some time later, commit 6d75f366 ("lib: radix-tree: check accounting of existing slot replacement users") removed the use of BUG_ON() from this header. Since then, there is no reason to include <linux/bug.h>. Link: http://lkml.kernel.org/r/1505660151-4383-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Chris Mi <chrism@mellanox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Masahiro Yamada authored
Since commit bc6245e5 ("bug: split BUILD_BUG stuff out into <linux/build_bug.h>"), #include <linux/build_bug.h> is better to pull minimal headers needed for BUILG_BUG() family. Link: http://lkml.kernel.org/r/1505700775-19826-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Dinan Gunawardena <dinan.gunawardena@netronome.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
Add tests for duplicate section headers, missing section content, link and scm reachability. Miscellanea: o Add --self-test=<foo> options (a comma separated list of any of sections, patterns, links or scm) where the default without options is all tests o Rename check_maintainers_patterns to self_test o Rename self_test_pattern_info to self_test_info [tom.saeger@oracle.com: improvements] Link: http://lkml.kernel.org/r/13e3986c374902fcf08ae947e36c5c608bbe3b79.1510075301.git.joe@perches.comSigned-off-by: Joe Perches <joe@perches.com> Reviewed-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Tom Saeger authored
Add "--self-test" option to get_maintainer.pl to show potential issues in MAINTAINERS file(s) content. Pattern check warnings are shown for "F" and "X" patterns found in MAINTAINERS file(s) which do not match any files known by git. Link: http://lkml.kernel.org/r/64994f911b3510d0f4c8ac2e113501dfcec1f3c9.1509559540.git.tom.saeger@oracle.comSigned-off-by: Tom Saeger <tom.saeger@oracle.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Randy Dunlap authored
Fix minor typo. Fix missing words in explaining parsing of last line number. Link: http://lkml.kernel.org/r/ebb7ff42-4945-103f-d5b4-f07a6f3343a7@infradead.orgSigned-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Randy Dunlap authored
line-range is supposed to treat "1-" as "1-endoffile", so handle the special case by setting last_lineno to UINT_MAX. Fixes this error: dynamic_debug:ddebug_parse_query: last-line:0 < 1st-line:1 dynamic_debug:ddebug_exec_query: query parse failed Link: http://lkml.kernel.org/r/10a6a101-e2be-209f-1f41-54637824788e@infradead.orgSigned-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Christophe JAILLET authored
If 'write' is 0, we can avoid a call to spin_lock/spin_unlock. Link: http://lkml.kernel.org/r/20171020193331.7233-1-christophe.jaillet@wanadoo.frSigned-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sandipan Das authored
The GCC randomize layout plugin can randomize the member offsets of sensitive kernel data structures. To use this feature, certain annotations and members are added to the structures which affect the member offsets even if this plugin is not used. All of these structures are completely randomized, except for task_struct which leaves out some of its members. All the other members are wrapped within an anonymous struct with the __randomize_layout attribute. This is done using the randomized_struct_fields_start and randomized_struct_fields_end defines. When the plugin is disabled, the behaviour of this attribute can vary based on the GCC version. For GCC 5.1+, this attribute maps to __designated_init otherwise it is just an empty define but the anonymous structure is still present. For other compilers, both randomized_struct_fields_start and randomized_struct_fields_end default to empty defines meaning the anonymous structure is not introduced at all. So, if a module compiled with Clang, such as a BPF program, needs to access task_struct fields such as pid and comm, the offsets of these members as recognized by Clang are different from those recognized by modules compiled with GCC. If GCC 4.6+ is used to build the kernel, this can be solved by introducing appropriate defines for Clang so that the anonymous structure is seen when determining the offsets for the members. Link: http://lkml.kernel.org/r/20171109064645.25581-1-sandipan@linux.vnet.ibm.comSigned-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Alexei Starovoitov <ast@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kees Cook authored
Prior to v4.11, x86 used warn_slowpath_fmt() for handling WARN()s. After WARN() was moved to using UD0 on x86, the warning text started appearing _before_ the "cut here" line. This appears to have been a long-standing bug on architectures that used __WARN_TAINT, but it didn't get fixed. v4.11 and earlier on x86: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 2956 at drivers/misc/lkdtm_bugs.c:65 lkdtm_WARNING+0x21/0x30 This is a warning message Modules linked in: v4.12 and later on x86: This is a warning message ------------[ cut here ]------------ WARNING: CPU: 1 PID: 2982 at drivers/misc/lkdtm_bugs.c:68 lkdtm_WARNING+0x15/0x20 Modules linked in: With this fix: ------------[ cut here ]------------ This is a warning message WARNING: CPU: 3 PID: 3009 at drivers/misc/lkdtm_bugs.c:67 lkdtm_WARNING+0x15/0x20 Since the __FILE__ reporting happens as part of the UD0 handler, it isn't trivial to move the message to after the WARNING line, but at least we can fix the position of the "cut here" line so all the various logging tools will start including the actual runtime warning message again, when they follow the instruction and "cut here". Link: http://lkml.kernel.org/r/1510100869-73751-4-git-send-email-keescook@chromium.org Fixes: 9a93848f ("x86/debug: Implement __WARN() using UD0") Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kees Cook authored
The "cut here" string is used in a few paths. Define it in a single place. Link: http://lkml.kernel.org/r/1510100869-73751-3-git-send-email-keescook@chromium.orgSigned-off-by: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-