An error occurred fetching the project authors.
- 29 Oct, 2014 2 commits
-
-
Pranith Kumar authored
PREEMPT_RCU and TREE_PREEMPT_RCU serve the same function after TINY_PREEMPT_RCU has been removed. This patch removes TREE_PREEMPT_RCU and uses PREEMPT_RCU config option in its place. Signed-off-by:
Pranith Kumar <bobby.prani@gmail.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Clark Williams authored
Rename CONFIG_RCU_BOOST_PRIO to CONFIG_RCU_KTHREAD_PRIO and use this value for both the per-CPU kthreads (rcuc/N) and the rcu boosting threads (rcub/n). Also, create the module_parameter rcutree.kthread_prio to be used on the kernel command line at boot to set a new value (rcutree.kthread_prio=N). Signed-off-by:
Clark Williams <clark.williams@gmail.com> [ paulmck: Ported to rcu/dev, applied Paul Bolle and Peter Zijlstra feedback. ] Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 28 Oct, 2014 1 commit
-
-
Stefan Hengelein authored
Every choice item of the "Build-forced no-CBs CPUs" choice had a dependency to RCU_NOCB_CPU. It's more comprehensible if the choice itself has the dependency instead of every choice item. The choice itself doesn't need to be visible if there are no items selectable (i.e. on arch/frv) or RCU_NOCB_CPU is not defined. Signed-off-by:
Stefan Hengelein <stefan.hengelein@fau.de> Signed-off-by:
Andreas Ruprecht <rupran@einserver.de> Reviewed-by:
Josh Triplett <josh@joshtriplett.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 27 Oct, 2014 1 commit
-
-
Alexei Starovoitov authored
introduce two configs: - hidden CONFIG_BPF to select eBPF interpreter that classic socket filters depend on - visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use that solves several problems: - tracing and others that wish to use eBPF don't need to depend on NET. They can use BPF_SYSCALL to allow loading from userspace or select BPF to use it directly from kernel in NET-less configs. - in 3.18 programs cannot be attached to events yet, so don't force it on - when the rest of eBPF infra is there in 3.19+, it's still useful to switch it off to minimize kernel size bloat-o-meter on x64 shows: add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601) tested with many different config combinations. Hopefully didn't miss anything. Signed-off-by:
Alexei Starovoitov <ast@plumgrid.com> Acked-by:
Daniel Borkmann <dborkman@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 14 Oct, 2014 1 commit
-
-
Geert Uytterhoeven authored
When configuring a uniprocessor kernel, don't bother the user with an irrelevant LOG_CPU_MAX_BUF_SHIFT question, and don't build the unused code. Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Acked-by:
Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 10 Oct, 2014 1 commit
-
-
Mel Gorman authored
ARCH_USES_NUMA_PROT_NONE was defined for architectures that implemented _PAGE_NUMA using _PROT_NONE. This saved using an additional PTE bit and relied on the fact that PROT_NONE vmas were skipped by the NUMA hinting fault scanner. This was found to be conceptually confusing with a lot of implicit assumptions and it was asked that an alternative be found. Commit c46a7c81 "x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels" redefined _PAGE_NUMA on x86 to be one of the swap PTE bits and shrunk the maximum possible swap size but it did not go far enough. There are no architectures that reuse _PROT_NONE as _PROT_NUMA but the relics still exist. This patch removes ARCH_USES_NUMA_PROT_NONE and removes some unnecessary duplication in powerpc vs the generic implementation by defining the types the core NUMA helpers expected to exist from x86 with their ppc64 equivalent. This necessitated that a PTE bit mask be created that identified the bits that distinguish present from NUMA pte entries but it is expected this will only differ between arches based on _PAGE_PROTNONE. The naming for the generic helpers was taken from x86 originally but ppc64 has types that are equivalent for the purposes of the helper so they are mapped instead of duplicating code. Signed-off-by:
Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 03 Oct, 2014 2 commits
-
-
Josh Triplett authored
commit 03b8c7b6 ("futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test") added the HAVE_FUTEX_CMPXCHG symbol right below FUTEX. This placed it right in the middle of the options for the EXPERT menu. However, HAVE_FUTEX_CMPXCHG does not depend on EXPERT or FUTEX, so Kconfig stops placing items in the EXPERT menu, and displays the remaining several EXPERT items (starting with EPOLL) directly in the General Setup menu. Since both users of HAVE_FUTEX_CMPXCHG only select it "if FUTEX", make HAVE_FUTEX_CMPXCHG itself depend on FUTEX. With this change, the subsequent items display as part of the EXPERT menu again; the EMBEDDED menu now appears as the next top-level item in the General Setup menu, which makes General Setup much shorter and more usable. Signed-off-by:
Josh Triplett <josh@joshtriplett.org> Acked-by:
Randy Dunlap <rdunlap@infradead.org> Cc: stable <stable@vger.kernel.org>
-
Josh Triplett authored
The buffers sized by CONFIG_LOG_BUF_SHIFT and CONFIG_LOG_CPU_MAX_BUF_SHIFT do not exist if CONFIG_PRINTK=n, so don't ask about their size at all. Signed-off-by:
Josh Triplett <josh@joshtriplett.org> Acked-by:
Randy Dunlap <rdunlap@infradead.org> Cc: stable <stable@vger.kernel.org>
-
- 16 Sep, 2014 1 commit
-
-
Paul E. McKenney authored
Commit b58cc46c (rcu: Don't offload callbacks unless specifically requested) failed to adjust the callback lists of the CPUs that are known to be no-CBs CPUs only because they are also nohz_full= CPUs. This failure can result in callbacks that are posted during early boot getting stranded on nxtlist for CPUs whose no-CBs property becomes apparent late, and there can also be spurious warnings about offline CPUs posting callbacks. This commit fixes these problems by adding an early-boot rcu_init_nohz() that properly initializes the no-CBs CPUs. Note that kernels built with CONFIG_RCU_NOCB_CPU_ALL=y or with CONFIG_RCU_NOCB_CPU=n do not exhibit this bug. Neither do kernels booted without the nohz_full= boot parameter. Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Pranith Kumar <bobby.prani@gmail.com> Tested-by:
Paul Gortmaker <paul.gortmaker@windriver.com>
-
- 07 Sep, 2014 1 commit
-
-
Paul E. McKenney authored
This commit adds a new RCU-tasks flavor of RCU, which provides call_rcu_tasks(). This RCU flavor's quiescent states are voluntary context switch (not preemption!) and userspace execution (not the idle loop -- use some sort of schedule_on_each_cpu() if you need to handle the idle tasks. Note that unlike other RCU flavors, these quiescent states occur in tasks, not necessarily CPUs. Includes fixes from Steven Rostedt. This RCU flavor is assumed to have very infrequent latency-tolerant updaters. This assumption permits significant simplifications, including a single global callback list protected by a single global lock, along with a single task-private linked list containing all tasks that have not yet passed through a quiescent state. If experience shows this assumption to be incorrect, the required additional complexity will be added. Suggested-by:
Steven Rostedt <rostedt@goodmis.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 27 Aug, 2014 1 commit
-
-
Bertrand Jacquin authored
Since module-init-tools (gzip) and kmod (gzip and xz) support compressed modules, it could be useful to include a support for compressing modules right after having them installed. Doing this in kbuild instead of per distro can permit to make this kind of usage more generic. This patch add a Kconfig entry to "Enable loadable module support" menu and let you choose to compress using gzip (default) or xz. Both gzip and xz does not used any extra -[1-9] option since Andi Kleen and Rusty Russell prove no gain is made using them. gzip is called with -n argument to avoid storing original filename inside compressed file, that way we can save some more bytes. On a v3.16 kernel, 'make allmodconfig' generated 4680 modules for a total of 378MB (no strip, no sign, no compress), the following table shows observed disk space gain based on the allmodconfig .config : | time | +-------------+-----------------+ | manual .ko | make | size | percent | compression | modules_install | | gain +-------------+-----------------+------+-------- - | | 18.61s | 378M | GZIP | 3m16s | 3m37s | 102M | 73.41% XZ | 5m22s | 5m39s | 77M | 79.83% The gain for restricted environnement seems to be interesting while uncompress can be time consuming but happens only while loading a module, that is generally done only once. This is fully compatible with signed modules while the signed module is compressed. module-init-tools or kmod handles decompression and provide to other layer the uncompressed but signed payload. Reviewed-by:
Willy Tarreau <w@1wt.eu> Signed-off-by:
Bertrand Jacquin <beber@meleeweb.net> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
- 26 Aug, 2014 1 commit
-
-
Geert Uytterhoeven authored
Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- 18 Aug, 2014 1 commit
-
-
Josh Triplett authored
Many embedded systems will not need these syscalls, and omitting them saves space. Add a new EXPERT config option CONFIG_ADVISE_SYSCALLS (default y) to support compiling them out. bloat-o-meter: add/remove: 0/3 grow/shrink: 0/0 up/down: 0/-2250 (-2250) function old new delta sys_fadvise64 57 - -57 sys_fadvise64_64 691 - -691 sys_madvise 1502 - -1502 Signed-off-by:
Josh Triplett <josh@joshtriplett.org>
-
- 14 Aug, 2014 1 commit
-
-
Geert Uytterhoeven authored
Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 08 Aug, 2014 1 commit
-
-
Vivek Goyal authored
currently bin2c builds only if CONFIG_IKCONFIG=y. But bin2c will now be used by kexec too. So make it compilation dependent on CONFIG_BUILD_BIN2C and this config option can be selected by CONFIG_KEXEC and CONFIG_IKCONFIG. Signed-off-by:
Vivek Goyal <vgoyal@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Dave Young <dyoung@redhat.com> Cc: WANG Chao <chaowang@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 07 Aug, 2014 1 commit
-
-
Luis R. Rodriguez authored
The default size of the ring buffer is too small for machines with a large amount of CPUs under heavy load. What ends up happening when debugging is the ring buffer overlaps and chews up old messages making debugging impossible unless the size is passed as a kernel parameter. An idle system upon boot up will on average spew out only about one or two extra lines but where this really matters is on heavy load and that will vary widely depending on the system and environment. There are mechanisms to help increase the kernel ring buffer for tracing through debugfs, and those interfaces even allow growing the kernel ring buffer per CPU. We also have a static value which can be passed upon boot. Relying on debugfs however is not ideal for production, and relying on the value passed upon bootup is can only used *after* an issue has creeped up. Instead of being reactive this adds a proactive measure which lets you scale the amount of contributions you'd expect to the kernel ring buffer under load by each CPU in the worst case scenario. We use num_possible_cpus() to avoid complexities which could be introduced by dynamically changing the ring buffer size at run time, num_possible_cpus() lets us use the upper limit on possible number of CPUs therefore avoiding having to deal with hotplugging CPUs on and off. This introduces the kernel configuration option LOG_CPU_MAX_BUF_SHIFT which is used to specify the maximum amount of contributions to the kernel ring buffer in the worst case before the kernel ring buffer flips over, the size is specified as a power of 2. The total amount of contributions made by each CPU must be greater than half of the default kernel ring buffer size (1 << LOG_BUF_SHIFT bytes) in order to trigger an increase upon bootup. The kernel ring buffer is increased to the next power of two that would fit the required minimum kernel ring buffer size plus the additional CPU contribution. For example if LOG_BUF_SHIFT is 18 (256 KB) you'd require at least 128 KB contributions by other CPUs in order to trigger an increase of the kernel ring buffer. With a LOG_CPU_BUF_SHIFT of 12 (4 KB) you'd require at least anything over > 64 possible CPUs to trigger an increase. If you had 128 possible CPUs the amount of minimum required kernel ring buffer bumps to: ((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB Since we require the ring buffer to be a power of two the new required size would be 1024 KB. This CPU contributions are ignored when the "log_buf_len" kernel parameter is used as it forces the exact size of the ring buffer to an expected power of two value. [pmladek@suse.cz: fix build] Signed-off-by:
Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by:
Petr Mladek <pmladek@suse.cz> Tested-by:
Davidlohr Bueso <davidlohr@hp.com> Tested-by:
Petr Mladek <pmladek@suse.cz> Reviewed-by:
Davidlohr Bueso <davidlohr@hp.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Petr Mladek <pmladek@suse.cz> Cc: Joe Perches <joe@perches.com> Cc: Arun KS <arunks.linux@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 09 Jul, 2014 1 commit
-
-
Paul E. McKenney authored
Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Lai Jiangshan <laijs@cn.fujitsu.com>
-
- 07 Jul, 2014 1 commit
-
-
Paul E. McKenney authored
Enabling NO_HZ_FULL currently has the side effect of enabling callback offloading on all CPUs. This results in lots of additional rcuo kthreads, and can also increase context switching and wakeups, even in cases where callback offloading is neither needed nor particularly desirable. This commit therefore enables callback offloading on a given CPU only if specifically requested at build time or boot time, or if that CPU has been specifically designated (again, either at build time or boot time) as a nohz_full CPU. Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 04 Jun, 2014 4 commits
-
-
Fabian Frederick authored
sys_sgetmask and sys_ssetmask are obsolete system calls no longer supported in libc. This patch replaces architecture related __ARCH_WANT_SYS_SGETMAX by expert mode configuration.That option is enabled by default for those architectures. Signed-off-by:
Fabian Frederick <fabf@skynet.be> Cc: Steven Miao <realmz6@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Konstantin Khlebnikov authored
CONFIG_CROSS_MEMORY_ATTACH adds couple syscalls: process_vm_readv and process_vm_writev, it's a kind of IPC for copying data between processes. Currently this option is placed inside "Processor type and features". This patch moves it into "General setup" (where all other arch-independed syscalls and ipc features are placed) and changes prompt string to less cryptic. Signed-off-by:
Konstantin Khlebnikov <koct9i@gmail.com> Cc: Christopher Yeoh <cyeoh@au1.ibm.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Hugh Dickins <hughd@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
CONFIG_MM_OWNER makes no sense. It is not user-selectable, it is only selected by CONFIG_MEMCG automatically. So we can kill this option in init/Kconfig and do s/CONFIG_MM_OWNER/CONFIG_MEMCG/ globally. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Acked-by:
Michal Hocko <mhocko@suse.cz> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Vladimir Davydov authored
Kmemcg is currently under development and lacks some important features. In particular, it does not have support of kmem reclaim on memory pressure inside cgroup, which practically makes it unusable in real life. Let's warn about it in both Kconfig and Documentation to prevent complaints arising. Signed-off-by:
Vladimir Davydov <vdavydov@parallels.com> Acked-by:
Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 18 Apr, 2014 1 commit
-
-
Peter Foley authored
The SYSTEM_TRUSTED_KEYRING config option is not in any menu, causing it to show up in the toplevel of the kernel configuration. Fix this by moving it under the General Setup menu. Signed-off-by:
Peter Foley <pefoley2@pefoley.com> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 07 Apr, 2014 1 commit
-
-
Josh Triplett authored
"make allnoconfig" exists to ease testing of minimal configurations. Documentation/SubmitChecklist includes a note to test with allnoconfig. This helps catch missing dependencies on common-but-not-required functionality, which might otherwise go unnoticed. However, allnoconfig still leaves many symbols enabled, because they're hidden behind CONFIG_EMBEDDED or CONFIG_EXPERT. For instance, allnoconfig still has CONFIG_PRINTK and CONFIG_BLOCK enabled, so drivers don't typically get build-tested with those disabled. To address this, introduce a new Kconfig option "allnoconfig_y", used on symbols which only exist to hide other symbols. Set it on CONFIG_EMBEDDED (which then selects CONFIG_EXPERT). allnoconfig will then disable all the symbols hidden behind those. Signed-off-by:
Josh Triplett <josh@joshtriplett.org> Tested-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 03 Apr, 2014 2 commits
-
-
Josh Triplett authored
uselib hasn't been used since libc5; glibc does not use it. Support turning it off. When disabled, also omit the load_elf_library implementation from binfmt_elf.c, which only uselib invokes. bloat-o-meter: add/remove: 0/4 grow/shrink: 0/1 up/down: 0/-785 (-785) function old new delta padzero 39 36 -3 uselib_flags 20 - -20 sys_uselib 168 - -168 SyS_uselib 168 - -168 load_elf_library 426 - -426 The new CONFIG_USELIB defaults to `y'. Signed-off-by:
Josh Triplett <josh@joshtriplett.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Fabian Frederick authored
sys_sysfs is an obsolete system call no longer supported by libc. - This patch adds a default CONFIG_SYSFS_SYSCALL=y - Option can be turned off in expert mode. - cond_syscall added to kernel/sys_ni.c [akpm@linux-foundation.org: tweak Kconfig help text] Signed-off-by:
Fabian Frederick <fabf@skynet.be> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 20 Mar, 2014 2 commits
-
-
AKASHI Takahiro authored
Currently AUDITSYSCALL has a long list of architecture depencency: depends on AUDIT && (X86 || PARISC || PPC || S390 || IA64 || UML || SPARC64 || SUPERH || (ARM && AEABI && !OABI_COMPAT) || ALPHA) The purpose of this patch is to replace it with HAVE_ARCH_AUDITSYSCALL for simplicity. Signed-off-by:
AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> (arm) Acked-by: Richard Guy Briggs <rgb@redhat.com> (audit) Acked-by: Matt Turner <mattst88@gmail.com> (alpha) Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by:
Eric Paris <eparis@redhat.com>
-
蔡正龙 authored
Signed-off-by:
Zhenglong.cai <zhenglong.cai@cs2c.com.cn> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
- 03 Mar, 2014 1 commit
-
-
Heiko Carstens authored
If an architecture has futex_atomic_cmpxchg_inatomic() implemented and there is no runtime check necessary, allow to skip the test within futex_init(). This allows to get rid of some code which would always give the same result, and also allows the compiler to optimize a couple of if statements away. Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Finn Thain <fthain@telegraphics.com.au> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20140302120947.GA3641@osirisSigned-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
- 11 Feb, 2014 1 commit
-
-
Tejun Heo authored
cgroup filesystem code was derived from the original sysfs implementation which was heavily intertwined with vfs objects and locking with the goal of re-using the existing vfs infrastructure. That experiment turned out rather disastrous and sysfs switched, a long time ago, to distributed filesystem model where a separate representation is maintained which is queried by vfs. Unfortunately, cgroup stuck with the failed experiment all these years and accumulated even more problems over time. Locking and object lifetime management being entangled with vfs is probably the most egregious. vfs is never designed to be misused like this and cgroup ends up jumping through various convoluted dancing to make things work. Even then, operations across multiple cgroups can't be done safely as it'll deadlock with rename locking. Recently, kernfs is separated out from sysfs so that it can be used by users other than sysfs. This patch converts cgroup to use kernfs, which will bring the following benefits. * Separation from vfs internals. Locking and object lifetime management is contained in cgroup proper making things a lot simpler. This removes significant amount of locking convolutions, hairy object lifetime rules and the restriction on multi-cgroup operations. * Can drop a lot of code to implement filesystem interface as most are provided by kernfs. * Proper "severing" semantics, which allows controllers to not worry about lingering file accesses after offline. While the preceding patches did as much as possible to make the transition less painful, large part of the conversion has to be one discrete step making this patch rather large. The rest of the commit message lists notable changes in different areas. Overall ------- * vfs constructs replaced with kernfs ones. cgroup->dentry w/ ->kn, cgroupfs_root->sb w/ ->kf_root. * All dentry accessors are removed. Helpers to map from kernfs constructs are added. * All vfs plumbing around dentry, inode and bdi removed. * cgroup_mount() now directly looks for matching root and then proceeds to create a new one if not found. Synchronization and object lifetime ----------------------------------- * vfs inode locking removed. Among other things, this removes the need for the convolution in cgroup_cfts_commit(). Future patches will further simplify it. * vfs refcnting replaced with cgroup internal ones. cgroup->refcnt, cgroupfs_root->refcnt added. cgroup_put_root() now directly puts root->refcnt and when it reaches zero proceeds to destroy it thus merging cgroup_put_root() and the former cgroup_kill_sb(). Simliarly, cgroup_put() now directly schedules cgroup_free_rcu() when refcnt reaches zero. * Unlike before, kernfs objects don't hold onto cgroup objects. When cgroup destroys a kernfs node, all existing operations are drained and the association is broken immediately. The same for cgroupfs_roots and mounts. * All operations which come through kernfs guarantee that the associated cgroup is and stays valid for the duration of operation; however, there are two paths which need to find out the associated cgroup from dentry without going through kernfs - css_tryget_from_dir() and cgroupstats_build(). For these two, kernfs_node->priv is RCU managed so that they can dereference it under RCU read lock. File and directory handling --------------------------- * File and directory operations converted to kernfs_ops and kernfs_syscall_ops. * xattrs is implicitly supported by kernfs. No need to worry about it from cgroup. This means that "xattr" mount option is no longer necessary. A future patch will add a deprecated warning message when sane_behavior. * When cftype->max_write_len > PAGE_SIZE, it's necessary to make a private copy of one of the kernfs_ops to set its atomic_write_len. cftype->kf_ops is added and cgroup_init/exit_cftypes() are updated to handle it. * cftype->lockdep_key added so that kernfs lockdep annotation can be per cftype. * Inidividual file entries and open states are now managed by kernfs. No need to worry about them from cgroup. cfent, cgroup_open_file and their friends are removed. * kernfs_nodes are created deactivated and kernfs_activate() invocations added to places where creation of new nodes are committed. * cgroup_rmdir() uses kernfs_[un]break_active_protection() for self-removal. v2: - Li pointed out in an earlier patch that specifying "name=" during mount without subsystem specification should succeed if there's an existing hierarchy with a matching name although it should fail with -EINVAL if a new hierarchy should be created. Prior to the conversion, this used by handled by deferring failure from NULL return from cgroup_root_from_opts(), which was necessary because root was being created before checking for existing ones. Note that cgroup_root_from_opts() returned an ERR_PTR() value for error conditions which require immediate mount failure. As we now have separate search and creation steps, deferring failure from cgroup_root_from_opts() is no longer necessary. cgroup_root_from_opts() is updated to always return ERR_PTR() value on failure. - The logic to match existing roots is updated so that a mount attempt with a matching name but different subsys_mask are rejected. This was handled by a separate matching loop under the comment "Check for name clashes with existing mounts" but got lost during conversion. Merge the check into the main search loop. - Add __rcu __force casting in RCU_INIT_POINTER() in cgroup_destroy_locked() to avoid the sparse address space warning reported by kbuild test bot. Maybe we want an explicit interface to use kn->priv as RCU protected pointer? v3: Make CONFIG_CGROUPS select CONFIG_KERNFS. v4: Rebased on top of 0ab02ca8 ("cgroup: protect modifications to cgroup_idr with cgroup_mutex"). Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com> Cc: kbuild test robot fengguang.wu@intel.com>
-
- 31 Jan, 2014 1 commit
-
-
蔡正龙 authored
Signed-off-by:
Zhenglong.cai <zhenglong.cai@cs2c.com.cn> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
- 11 Dec, 2013 1 commit
-
-
Peter Zijlstra authored
Introduce mul_u64_u32_shr() as proposed by Andy a while back; it allows using 64x64->128 muls on 64bit archs and recent GCC which defines __SIZEOF_INT128__ and __int128. (This new method will be used by the scheduler.) Signed-off-by:
Peter Zijlstra <peterz@infradead.org> Cc: fweisbec@gmail.com Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-hxjoeuzmrcaumR0uZwjpe2pv@git.kernel.orgSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 02 Dec, 2013 2 commits
-
-
Paul Gortmaker authored
Signed-off-by:
Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by:
Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org>
-
Paul Gortmaker authored
Signed-off-by:
Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- 27 Nov, 2013 1 commit
-
-
Eric W. Biederman authored
Removing UIDGID_STRICT_TYPE_CHECKS simplifies the code and always generates a compile error if the uids and kuids or gids and kgids are mixed by accident. Now that the appropriate conversions have been placed throughout the kernel there is no longer a need for a mode where we don't detect them as compile errors. Acked-by:
Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com>
-
- 22 Nov, 2013 1 commit
-
-
Tejun Heo authored
cgroup_event is way over-designed and tries to build a generic flexible event mechanism into cgroup - fully customizable event specification for each user of the interface. This is utterly unnecessary and overboard especially in the light of the planned unified hierarchy as there's gonna be single agent. Simply generating events at fixed points, or if that's too restrictive, configureable cadence or single set of configureable points should be enough. Thankfully, memcg is the only user and gets to keep it. Replacing it with something simpler on sane_behavior is strongly recommended. This patch moves cgroup_event and "cgroup.event_control" implementation to mm/memcontrol.c. Clearing of events on cgroup destruction is moved from cgroup_destroy_locked() to mem_cgroup_css_offline(), which shouldn't make any noticeable difference. cgroup_css() and __file_cft() are exported to enable the move; however, this will soon be reverted once the event code is updated to be memcg specific. Note that "cgroup.event_control" will now exist only on the hierarchy with memcg attached to it. While this change is visible to userland, it is unlikely to be noticeable as the file has never been meaningful outside memcg. Aside from the above change, this is pure code relocation. v2: Per Li Zefan's comments, init/Kconfig updated accordingly and poll.h inclusion moved from cgroup.c to memcontrol.c. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com> Acked-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by:
Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com>
-
- 17 Nov, 2013 1 commit
-
-
H. Peter Anvin authored
This reverts commit 69f0554e. This patch breaks randconfig on at least the x86-64 architecture, and most likely on others. There is work underway to support uncompressed kernels in a generic way, but it looks like it will amount to rewriting the support from scratch; see the LKML thread in the Link: for info. Therefore, revert this change and wait for the fix. Reported-by:
Pavel Roskin <proski@gnu.org> Cc: Christian Ruppert <christian.ruppert@abilis.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20131113113418.167b8ffd@IRBT4585Signed-off-by:
H. Peter Anvin <hpa@zytor.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 13 Nov, 2013 1 commit
-
-
Christian Ruppert authored
Some ARC users say they can boot faster with without kernel compression. This probably depends on things like the FLASH chip they use etc. Until now, kernel compression can only be disabled by removing "select HAVE_<compression>" lines from the architecture Kconfig. So add the Kconfig logic to permit disabling of kernel compression. Signed-off-by:
Christian Ruppert <christian.ruppert@abilis.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 07 Nov, 2013 1 commit
-
-
Helge Deller authored
Implement missing functions for parisc to provide kernel audit feature. Signed-off-by:
Helge Deller <deller@gmx.de>
-
- 05 Nov, 2013 1 commit
-
-
Eric Paris authored
After trying to use this feature in Fedora we found the hard coding policy like this into the kernel was a bad idea. Surprise surprise. We ran into these problems because it was impossible to launch a container as a logged in user and run a login daemon inside that container. This reverts back to the old behavior before this option was added. The option will be re-added in a userspace selectable manor such that userspace can choose when it is and when it is not appropriate. Signed-off-by:
Eric Paris <eparis@redhat.com> Signed-off-by:
Richard Guy Briggs <rgb@redhat.com> Signed-off-by:
Eric Paris <eparis@redhat.com>
-