- 28 May, 2013 5 commits
-
-
Frederic Weisbecker authored
Because we may update the execution time in sched_group_set_shares()->update_cfs_shares()->reweight_entity()->update_curr() before reweighting the entity while setting the group shares and this requires an uptodate version of the runqueue clock. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Turner <pjt@google.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1365724262-20142-3-git-send-email-fweisbec@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Frederic Weisbecker authored
Because the sched_class::put_prev_task() callback of rt and fair classes are referring to the rq clock to update their runtime statistics. There is a missing rq clock update from the CPU hotplug notifier's entry point of the scheduler. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Turner <pjt@google.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1365724262-20142-2-git-send-email-fweisbec@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Neil Zhang authored
migration_call() will do all the things that update_runtime() does. So let's remove it. Furthermore, there is potential risk that the current code will catch BUG_ON at line 689 of rt.c when do cpu hotplug while there are realtime threads running because of enabling runtime twice while the rt_runtime may already changed. Signed-off-by: Neil Zhang <zhangwm@marvell.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1365685499-26515-1-git-send-email-zhangwm@marvell.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Gerald Schaefer authored
In autogroup_create(), a tg is allocated and added to the task_groups list. If CONFIG_RT_GROUP_SCHED is set, this tg is then modified while on the list, without locking. This can race with someone walking the list, like __enable_runtime() during CPU unplug, and result in a use-after-free bug. To fix this, move sched_online_group(), which adds the tg to the list, to the end of the autogroup_create() function after the modification. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1369411669-46971-2-git-send-email-gerald.schaefer@de.ibm.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
Merge reason: these bits, formerly in sched/urgent, are too late for v3.10. Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 10 May, 2013 1 commit
-
-
Nathan Zimmer authored
It is a few instructions more efficent to and slightly more readable to use this_rq()-> instead of cpu_rq(smp_processor_id())-> . Size comparison of kernel/sched/fair.o: text data bss dec hex filename 27972 122 26 28120 6dd8 fair.o.before 27956 122 26 28104 6dc8 fair.o.after Signed-off-by: Nathan Zimmer <nzimmer@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1368116643-87971-1-git-send-email-nzimmer@sgi.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 07 May, 2013 2 commits
-
-
Paul Gortmaker authored
These inlines are only used by kernel/sched/fair.c so they do not need to be present in the main kernel/sched/sched.h file. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1366398650-31599-3-git-send-email-paul.gortmaker@windriver.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Paul Gortmaker authored
This large chunk of load calculation code can be easily divorced from the main core.c scheduler file, with only a couple prototypes and externs added to a kernel/sched header. Some recent commits expanded the code and the documentation of it, making it large enough to warrant separation. For example, see: 556061b0, "sched/nohz: Fix rq->cpu_load[] calculations" 5aaa0b7a, "sched/nohz: Fix rq->cpu_load calculations some more" 5167e8d5, "sched/nohz: Rewrite and fix load-avg computation -- again" More importantly, it helps reduce the size of the main sched/core.c by yet another significant amount (~600 lines). Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1366398650-31599-2-git-send-email-paul.gortmaker@windriver.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 05 May, 2013 32 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull 'full dynticks' support from Ingo Molnar: "This tree from Frederic Weisbecker adds a new, (exciting! :-) core kernel feature to the timer and scheduler subsystems: 'full dynticks', or CONFIG_NO_HZ_FULL=y. This feature extends the nohz variable-size timer tick feature from idle to busy CPUs (running at most one task) as well, potentially reducing the number of timer interrupts significantly. This feature got motivated by real-time folks and the -rt tree, but the general utility and motivation of full-dynticks runs wider than that: - HPC workloads get faster: CPUs running a single task should be able to utilize a maximum amount of CPU power. A periodic timer tick at HZ=1000 can cause a constant overhead of up to 1.0%. This feature removes that overhead - and speeds up the system by 0.5%-1.0% on typical distro configs even on modern systems. - Real-time workload latency reduction: CPUs running critical tasks should experience as little jitter as possible. The last remaining source of kernel-related jitter was the periodic timer tick. - A single task executing on a CPU is a pretty common situation, especially with an increasing number of cores/CPUs, so this feature helps desktop and mobile workloads as well. The cost of the feature is mainly related to increased timer reprogramming overhead when a CPU switches its tick period, and thus slightly longer to-idle and from-idle latency. Configuration-wise a third mode of operation is added to the existing two NOHZ kconfig modes: - CONFIG_HZ_PERIODIC: [formerly !CONFIG_NO_HZ], now explicitly named as a config option. This is the traditional Linux periodic tick design: there's a HZ tick going on all the time, regardless of whether a CPU is idle or not. - CONFIG_NO_HZ_IDLE: [formerly CONFIG_NO_HZ=y], this turns off the periodic tick when a CPU enters idle mode. - CONFIG_NO_HZ_FULL: this new mode, in addition to turning off the tick when a CPU is idle, also slows the tick down to 1 Hz (one timer interrupt per second) when only a single task is running on a CPU. The .config behavior is compatible: existing !CONFIG_NO_HZ and CONFIG_NO_HZ=y settings get translated to the new values, without the user having to configure anything. CONFIG_NO_HZ_FULL is turned off by default. This feature is based on a lot of infrastructure work that has been steadily going upstream in the last 2-3 cycles: related RCU support and non-periodic cputime support in particular is upstream already. This tree adds the final pieces and activates the feature. The pull request is marked RFC because: - it's marked 64-bit only at the moment - the 32-bit support patch is small but did not get ready in time. - it has a number of fresh commits that came in after the merge window. The overwhelming majority of commits are from before the merge window, but still some aspects of the tree are fresh and so I marked it RFC. - it's a pretty wide-reaching feature with lots of effects - and while the components have been in testing for some time, the full combination is still not very widely used. That it's default-off should reduce its regression abilities and obviously there are no known regressions with CONFIG_NO_HZ_FULL=y enabled either. - the feature is not completely idempotent: there is no 100% equivalent replacement for a periodic scheduler/timer tick. In particular there's ongoing work to map out and reduce its effects on scheduler load-balancing and statistics. This should not impact correctness though, there are no known regressions related to this feature at this point. - it's a pretty ambitious feature that with time will likely be enabled by most Linux distros, and we'd like you to make input on its design/implementation, if you dislike some aspect we missed. Without flaming us to crisp! :-) Future plans: - there's ongoing work to reduce 1Hz to 0Hz, to essentially shut off the periodic tick altogether when there's a single busy task on a CPU. We'd first like 1 Hz to be exposed more widely before we go for the 0 Hz target though. - once we reach 0 Hz we can remove the periodic tick assumption from nr_running>=2 as well, by essentially interrupting busy tasks only as frequently as the sched_latency constraints require us to do - once every 4-40 msecs, depending on nr_running. I am personally leaning towards biting the bullet and doing this in v3.10, like the -rt tree this effort has been going on for too long - but the final word is up to you as usual. More technical details can be found in Documentation/timers/NO_HZ.txt" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) sched: Keep at least 1 tick per second for active dynticks tasks rcu: Fix full dynticks' dependency on wide RCU nocb mode nohz: Protect smp_processor_id() in tick_nohz_task_switch() nohz_full: Add documentation. cputime_nsecs: use math64.h for nsec resolution conversion helpers nohz: Select VIRT_CPU_ACCOUNTING_GEN from full dynticks config nohz: Reduce overhead under high-freq idling patterns nohz: Remove full dynticks' superfluous dependency on RCU tree nohz: Fix unavailable tick_stop tracepoint in dynticks idle nohz: Add basic tracing nohz: Select wide RCU nocb for full dynticks nohz: Disable the tick when irq resume in full dynticks CPU nohz: Re-evaluate the tick for the new task after a context switch nohz: Prepare to stop the tick on irq exit nohz: Implement full dynticks kick nohz: Re-evaluate the tick from the scheduler IPI sched: New helper to prevent from stopping the tick in full dynticks sched: Kick full dynticks CPU that have more than one task enqueued. perf: New helper to prevent full dynticks CPUs from stopping tick perf: Kick full dynticks CPU if events rotation is needed ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Ingo Molnar: "Misc fixes plus a small hw-enablement patch for Intel IB model 58 uncore events" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNEL perf/x86/intel/lbr: Fix LBR filter perf/x86: Blacklist all MEM_*_RETIRED events for Ivy Bridge perf: Fix vmalloc ring buffer pages handling perf/x86/intel: Fix unintended variable name reuse perf/x86/intel: Add support for IvyBridge model 58 Uncore perf/x86/intel: Fix typo in perf_event_intel_uncore.c x86: Eliminate irq_mis_count counted in arch_irq_stat
-
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linuxLinus Torvalds authored
Pull mudule updates from Rusty Russell: "We get rid of the general module prefix confusion with a binary config option, fix a remove/insert race which Never Happens, and (my favorite) handle the case when we have too many modules for a single commandline. Seriously, the kernel is full, please go away!" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: modpost: fix unwanted VMLINUX_SYMBOL_STR expansion X.509: Support parse long form of length octets in Authority Key Identifier module: don't unlink the module until we've removed all exposure. kernel: kallsyms: memory override issue, need check destination buffer length MODSIGN: do not send garbage to stderr when enabling modules signature modpost: handle huge numbers of modules. modpost: add -T option to read module names from file/stdin. modpost: minor cleanup. genksyms: pass symbol-prefix instead of arch module: fix symbol versioning with symbol prefixes CONFIG_SYMBOL_PREFIX: cleanup.
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull single_open() leak fixes from Al Viro: "A bunch of fixes for a moderately common class of bugs: file with single_open() done by its ->open() and seq_release as its ->release(). That leaks; fortunately, it's not _too_ common (either people manage to RTFM that says "When using single_open(), the programmer should use single_release() instead of seq_release() in the file_operations structure to avoid a memory leak", or they just copy a correct instance), but grepping through the tree has caught quite a pile. All of that is, AFAICS, -stable fodder, for as far as the patches apply. I tried to carve it up into reasonably-sized pieces (more or less "comes from the same tree")" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: rcutrace: single_open() leaks gadget: single_open() leaks staging: single_open() leaks megaraid: single_open() leak wireless: single_open() leaks input: single_open() leak rtc: single_open() leaks ds1620: single_open() leak sh: single_open() leaks parisc: single_open() leaks mips: single_open() leaks ia64: single_open() leaks h8300: single_open() leaks cris: single_open() leaks arm: single_open() leaks
-
Linus Torvalds authored
Merge ipc fixes and cleanups from my IPC branch. The ipc locking has always been pretty ugly, and the scalability fixes to some degree made it even less readable. We had two cases of double unlocks in error paths due to this (one rcu read unlock, one semaphore unlock), and this fixes the bugs I found while trying to clean things up a bit so that we are less likely to have more. * ipc-cleanups: ipc: simplify rcu_read_lock() in semctl_nolock() ipc: simplify semtimedop/semctl_main() common error path handling ipc: move sem_obtain_lock() rcu locking into the only caller ipc: fix double sem unlock in semctl error path ipc: move the rcu_read_lock() from sem_lock_and_putref() into callers ipc: sem_putref() does not need the semaphore lock any more ipc: move rcu_read_unlock() out of sem_unlock() and into callers
-
Peter Zijlstra authored
We should always have proper privileges when requesting kernel data. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: eranian@google.com Link: http://lkml.kernel.org/r/20130503121256.230745028@chello.nl [ Fix build error reported by fengguang.wu@intel.com, propagate error code back. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/n/tip-v0x9ky3ahzr6nm3c6ilwrili@git.kernel.org
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Several routines do not use netdev_features_t to hold such bitmasks, fixes from Patrick McHardy and Bjørn Mork. 2) Update cpsw IRQ software state and the actual HW irq enabling in the correct order. From Mugunthan V N. 3) When sending tipc packets to multiple bearers, we have to make copies of the SKB rather than just giving the original SKB directly. Fix from Gerlando Falauto. 4) Fix race with bridging topology change timer, from Stephen Hemminger. 5) Fix TCPv6 segmentation handling in GRE and VXLAN, from Pravin B Shelar. 6) Endian bug in USB pegasus driver, from Dan Carpenter. 7) Fix crashes on MTU reduction in USB asix driver, from Holger Eitzenberger. 8) Don't allow the kernel to BUG() just because the user puts some crap in an AF_PACKET mmap() ring descriptor. Fix from Daniel Borkmann. 9) Don't use variable sized arrays on the stack in xen-netback, from Wei Liu. 10) Fix stats reporting and an unbalanced napi_disable() in be2net driver. From Somnath Kotur and Ajit Khaparde. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (25 commits) cxgb4: fix error recovery when t4_fw_hello returns a positive value sky2: Fix crash on receiving VLAN frames packet: tpacket_v3: do not trigger bug() on wrong header status asix: fix BUG in receive path when lowering MTU net: qmi_wwan: Add Telewell TW-LTE 4G usbnet: pegasus: endian bug in write_mii_word() vxlan: Fix TCPv6 segmentation. gre: Fix GREv4 TCPv6 segmentation. bridge: fix race with topology change timer tipc: pskb_copy() buffers when sending on more than one bearer tipc: tipc_bcbearer_send(): simplify bearer selection tipc: cosmetic: clean up comments and break a long line drivers: net: cpsw: irq not disabled in cpsw isr in particular sequence xen-netback: better names for thresholds xen-netback: avoid allocating variable size array on stack xen-netback: remove redundent parameter in netbk_count_requests be2net: Fix to fail probe if MSI-X enable fails for a VF be2net: avoid napi_disable() when it has not been enabled be2net: Fix firmware download for Lancer be2net: Fix to receive Multicast Packets when Promiscuous mode is enabled on certain devices ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-nextLinus Torvalds authored
Pull sparc updates from David Miller: 1) Hibernation support, as well as removal of excess interrupt twiddling in MMU context allocation on sparc64 from Kirill Tkhai. 2) Kill references to __ARCH_WANT_UNLOCKED_CTXSW. 3) Sparc32 LEON bug fixes from Daniel Hellstrom and Andreas Larsson. 4) Provide cmpxchg64(), from Geert Uytterhoeven. 5) Device refcount and registry bug fixes from Federico Vaga and Wei Yongjun. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next: serial: sunsu: add missing platform_driver_unregister() when module exit sparc32, leon: Do not overwrite previously set irq flow handlers sparc/kernel/vio.c: add put_device() after device_find_child() sparc64: Do not save/restore interrupts in get_new_mmu_context() sparc: Consistently use 'wr' and 'rd' instructions for ASRs. sparc64: Kill __ARCH_WANT_UNLOCKED_CTXSW sparc64: Provide cmpxchg64() sparc64: Do not change num_physpages during initmem freeing sparc64: Hibernation support sparc,leon: updated GRPCI2 config name sparc,leon: support for GRPCI1 PCI host bridge controller sparc32,leon: add support for PCI busn resource for GRPCI2
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcDavid S. Miller authored
Merge sparc bug fixes that didn't make it into v3.9 into sparc-next. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
We have registered platform driver when module init, and need unregister it when module exit. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andreas Larsson authored
This is needed because when scan_of_devices finds the GAISLER_GPTIMER core that corresponds to the SMP "ticker" timer, the previously set proper irq flow handler gets overwritten with an incorrect one. This leads to very flaky timer interrupt handling on some hardware. Proper updates to handlers can still be done using leon_update_virq_handling. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Federico Vaga authored
The vio_remove() function uses device_find_child() but it does not drop the reference of the retrieved child. Signed-off-by: Federico Vaga <federico.vaga@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Torvalds authored
This trivially combines two rcu_read_lock() calls in both sides of a if-statement into one single one in front of the if-statement. Split out as an independent cleanup from the previous commit. Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
With various straight RCU lock/unlock movements, one common exit path pattern had become rcu_read_unlock(); goto out_wakeup; and in fact there were no cases where we wanted to exit to out_wakeup _without_ releasing the RCU read lock. So replace that pattern with "goto out_rcu_wakeup", and remove the old out_wakeup. Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
sem_obtain_lock() was another of those functions that returned with the RCU lock held for reading in the success case. Move the RCU locking to the caller (semtimedop()), making it more obvious. We already did RCU locking elsewhere in that function. Side note: why does semtimedop() re-do the semphore lookup after the sleep, rather than just getting a reference to the semaphore it already looked up originally? Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Fix another ipc locking buglet introduced by the scalability patches: when semctl_down() was changed to delay the semaphore locking, one error path for security_sem_semctl() went through the semaphore unlock logic even though the semaphore had never been locked. Introduced by commit 16df3674 ("ipc,sem: do not hold ipc lock more than necessary") Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
This is another ipc semaphore locking cleanup, trying to make the locking more straightforward. We move the rcu read locking into the callers of sem_lock_and_putref(), which in general means that we now mostly do the rcu_read_lock() and rcu_read_unlock() in the same function. Mostly. We still have the ipc_addid/newary/freeary mess, and things like ipcctl_pre_down_nolock(). Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-