- 12 Aug, 2014 40 commits
-
-
Johannes Weiner authored
When kswapd exits, it can end up taking locks that were previously held by allocating tasks while they waited for reclaim. Lockdep currently warns about this: On Wed, May 28, 2014 at 06:06:34PM +0800, Gu Zheng wrote: > inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage. > kswapd2/1151 [HC0[0]:SC0[0]:HE1:SE1] takes: > (&sig->group_rwsem){+++++?}, at: exit_signals+0x24/0x130 > {RECLAIM_FS-ON-W} state was registered at: > mark_held_locks+0xb9/0x140 > lockdep_trace_alloc+0x7a/0xe0 > kmem_cache_alloc_trace+0x37/0x240 > flex_array_alloc+0x99/0x1a0 > cgroup_attach_task+0x63/0x430 > attach_task_by_pid+0x210/0x280 > cgroup_procs_write+0x16/0x20 > cgroup_file_write+0x120/0x2c0 > vfs_write+0xc0/0x1f0 > SyS_write+0x4c/0xa0 > tracesys+0xdd/0xe2 > irq event stamp: 49 > hardirqs last enabled at (49): _raw_spin_unlock_irqrestore+0x36/0x70 > hardirqs last disabled at (48): _raw_spin_lock_irqsave+0x2b/0xa0 > softirqs last enabled at (0): copy_process.part.24+0x627/0x15f0 > softirqs last disabled at (0): (null) > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > ---- > lock(&sig->group_rwsem); > <Interrupt> > lock(&sig->group_rwsem); > > *** DEADLOCK *** > > no locks held by kswapd2/1151. > > stack backtrace: > CPU: 30 PID: 1151 Comm: kswapd2 Not tainted 3.10.39+ #4 > Call Trace: > dump_stack+0x19/0x1b > print_usage_bug+0x1f7/0x208 > mark_lock+0x21d/0x2a0 > __lock_acquire+0x52a/0xb60 > lock_acquire+0xa2/0x140 > down_read+0x51/0xa0 > exit_signals+0x24/0x130 > do_exit+0xb5/0xa50 > kthread+0xdb/0x100 > ret_from_fork+0x7c/0xb0 This is because the kswapd thread is still marked as a reclaimer at the time of exit. But because it is exiting, nobody is actually waiting on it to make reclaim progress anymore, and it's nothing but a regular thread at this point. Be tidy and strip it of all its powers (PF_MEMALLOC, PF_SWAPWRITE, PF_KSWAPD, and the lockdep reclaim state) before returning from the thread function. Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Reported-by:
Gu Zheng <guz.fnst@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 71abdc15) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bart Van Assche authored
Avoid that closing /dev/infiniband/umad<n> or /dev/infiniband/issm<n> triggers a use-after-free. __fput() invokes f_op->release() before it invokes cdev_put(). Make sure that the ib_umad_device structure is freed by the cdev_put() call instead of f_op->release(). This avoids that changing the port mode from IB into Ethernet and back to IB followed by restarting opensmd triggers the following kernel oops: general protection fault: 0000 [#1] PREEMPT SMP RIP: 0010:[<ffffffff810cc65c>] [<ffffffff810cc65c>] module_put+0x2c/0x170 Call Trace: [<ffffffff81190f20>] cdev_put+0x20/0x30 [<ffffffff8118e2ce>] __fput+0x1ae/0x1f0 [<ffffffff8118e35e>] ____fput+0xe/0x10 [<ffffffff810723bc>] task_work_run+0xac/0xe0 [<ffffffff81002a9f>] do_notify_resume+0x9f/0xc0 [<ffffffff814b8398>] int_signal+0x12/0x17 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=75051Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Yann Droneaud <ydroneaud@opteya.com> Cc: <stable@vger.kernel.org> # 3.x: 8ec0a0e6: IB/umad: Fix error handling Signed-off-by:
Roland Dreier <roland@purestorage.com> (cherry picked from commit 60e1751c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Nicholas Bellinger authored
This patch adds an explicit check in chap_server_compute_md5() to ensure the CHAP_C value received from the initiator during mutual authentication does not match the original CHAP_C provided by the target. This is in line with RFC-3720, section 8.2.1: Originators MUST NOT reuse the CHAP challenge sent by the Responder for the other direction of a bidirectional authentication. Responders MUST check for this condition and close the iSCSI TCP connection if it occurs. Reported-by:
Tejas Vaykole <tejas.vaykole@calsoftinc.com> Cc: stable@vger.kernel.org # 3.1+ Signed-off-by:
Nicholas Bellinger <nab@linux-iscsi.org> (cherry picked from commit 1d2b60a5) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Kailang Yang authored
New codec support for ALC891. Signed-off-by:
Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> (cherry picked from commit b6c5fbad) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Anton Blanchard authored
commit 8f9c0119 (compat: fs: Generic compat_sys_sendfile implementation) changed the PowerPC 64bit sendfile call from sys_sendile64 to sys_sendfile. Unfortunately this broke sendfile of lengths greater than 2G because sys_sendfile caps at MAX_NON_LFS. Restore what we had previously which fixes the bug. Cc: stable@vger.kernel.org Signed-off-by:
Anton Blanchard <anton@samba.org> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> (cherry picked from commit 5d73320a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Benjamin Herrenschmidt authored
We had a mix & match of flags used when creating legacy ports depending on where we found them in the device-tree. Among others we were missing UPF_SKIP_TEST for some kind of ISA ports which is a problem as quite a few UARTs out there don't support the loopback test (such as a lot of BMCs). Let's pick the set of flags used by the SoC code and generalize it which means autoconf, no loopback test, irq maybe shared and fixed port. Sending to stable as the lack of UPF_SKIP_TEST is breaking serial on some machines so I want this back into distros Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: stable@vger.kernel.org (cherry picked from commit c4cad90f) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Tony Luck authored
When Linux sees an "action optional" machine check (where h/w has reported an error that is not in the current execution path) we generally do not want to signal a process, since most processes do not have a SIGBUS handler - we'd just prematurely terminate the process for a problem that they might never actually see. task_early_kill() decides whether to consider a process - and it checks whether this specific process has been marked for early signals with "prctl", or if the system administrator has requested early signals for all processes using /proc/sys/vm/memory_failure_early_kill. But for MF_ACTION_REQUIRED case we must not defer. The error is in the execution path of the current thread so we must send the SIGBUS immediatley. Fix by passing a flag argument through collect_procs*() to task_early_kill() so it knows whether we can defer or must take action. Signed-off-by:
Tony Luck <tony.luck@intel.com> Signed-off-by:
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@suse.de> Cc: Chen Gong <gong.chen@linux.jf.intel.com> Cc: <stable@vger.kernel.org> [3.2+] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 74614de1) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Tony Luck authored
When a thread in a multi-threaded application hits a machine check because of an uncorrectable error in memory - we want to send the SIGBUS with si.si_code = BUS_MCEERR_AR to that thread. Currently we fail to do that if the active thread is not the primary thread in the process. collect_procs() just finds primary threads and this test: if ((flags & MF_ACTION_REQUIRED) && t == current) { will see that the thread we found isn't the current thread and so send a si.si_code = BUS_MCEERR_AO to the primary (and nothing to the active thread at this time). We can fix this by checking whether "current" shares the same mm with the process that collect_procs() said owned the page. If so, we send the SIGBUS to current (with code BUS_MCEERR_AR). Signed-off-by:
Tony Luck <tony.luck@intel.com> Signed-off-by:
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by:
Otto Bruggeman <otto.g.bruggeman@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@suse.de> Cc: Chen Gong <gong.chen@linux.jf.intel.com> Cc: <stable@vger.kernel.org> [3.2+] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit a70ffcac) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mel Gorman authored
The test_bit operations in get/set pageblock flags are expensive. This patch reads the bitmap on a word basis and use shifts and masks to isolate the bits of interest. Similarly masks are used to set a local copy of the bitmap and then use cmpxchg to update the bitmap if there have been no other changes made in parallel. In a test running dd onto tmpfs the overhead of the pageblock-related functions went from 1.27% in profiles to 0.5%. In addition to the performance benefits, this patch closes races that are possible between: a) get_ and set_pageblock_migratetype(), where get_pageblock_migratetype() reads part of the bits before and other part of the bits after set_pageblock_migratetype() has updated them. b) set_pageblock_migratetype() and set_pageblock_skip(), where the non-atomic read-modify-update set bit operation in set_pageblock_skip() will cause lost updates to some bits changed in the set_pageblock_migratetype(). Joonsoo Kim first reported the case a) via code inspection. Vlastimil Babka's testing with a debug patch showed that either a) or b) occurs roughly once per mmtests' stress-highalloc benchmark (although not necessarily in the same pageblock). Furthermore during development of unrelated compaction patches, it was observed that frequent calls to {start,undo}_isolate_page_range() the race occurs several thousands of times and has resulted in NULL pointer dereferences in move_freepages() and free_one_page() in places where free_list[migratetype] is manipulated by e.g. list_move(). Further debugging confirmed that migratetype had invalid value of 6, causing out of bounds access to the free_list array. That confirmed that the race exist, although it may be extremely rare, and currently only fatal where page isolation is performed due to memory hot remove. Races on pageblocks being updated by set_pageblock_migratetype(), where both old and new migratetype are lower MIGRATE_RESERVE, currently cannot result in an invalid value being observed, although theoretically they may still lead to unexpected creation or destruction of MIGRATE_RESERVE pageblocks. Furthermore, things could get suddenly worse when memory isolation is used more, or when new migratetypes are added. After this patch, the race has no longer been observed in testing. Signed-off-by:
Mel Gorman <mgorman@suse.de> Acked-by:
Vlastimil Babka <vbabka@suse.cz> Reported-by:
Joonsoo Kim <iamjoonsoo.kim@lge.com> Reported-and-tested-by:
Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit e58469ba) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Michal Hocko authored
Eric has reported that he can see task(s) stuck in memcg OOM handler regularly. The only way out is to echo 0 > $GROUP/memory.oom_control His usecase is: - Setup a hierarchy with memory and the freezer (disable kernel oom and have a process watch for oom). - In that memory cgroup add a process with one thread per cpu. - In one thread slowly allocate once per second I think it is 16M of ram and mlock and dirty it (just to force the pages into ram and stay there). - When oom is achieved loop: * attempt to freeze all of the tasks. * if frozen send every task SIGKILL, unfreeze, remove the directory in cgroupfs. Eric has then pinpointed the issue to be memcg specific. All tasks are sitting on the memcg_oom_waitq when memcg oom is disabled. Those that have received fatal signal will bypass the charge and should continue on their way out. The tricky part is that the exit path might trigger a page fault (e.g. exit_robust_list), thus the memcg charge, while its memcg is still under OOM because nobody has released any charges yet. Unlike with the in-kernel OOM handler the exiting task doesn't get TIF_MEMDIE set so it doesn't shortcut further charges of the killed task and falls to the memcg OOM again without any way out of it as there are no fatal signals pending anymore. This patch fixes the issue by checking PF_EXITING early in mem_cgroup_try_charge and bypass the charge same as if it had fatal signal pending or TIF_MEMDIE set. Normally exiting tasks (aka not killed) will bypass the charge now but this should be OK as the task is leaving and will release memory and increasing the memory pressure just to release it in a moment seems dubious wasting of cycles. Besides that charges after exit_signals should be rare. I am bringing this patch again (rebased on the current mmotm tree). I hope we can move forward finally. If there is still an opposition then I would really appreciate a concurrent approach so that we can discuss alternatives. http://comments.gmane.org/gmane.linux.kernel.stable/77650 is a reference to the followup discussion when the patch has been dropped from the mmotm last time. Reported-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Michal Hocko <mhocko@suse.cz> Acked-by:
David Rientjes <rientjes@google.com> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit d8dc595c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mel Gorman authored
throttle_direct_reclaim() is meant to trigger during swap-over-network during which the min watermark is treated as a pfmemalloc reserve. It throttes on the first node in the zonelist but this is flawed. The user-visible impact is that a process running on CPU whose local memory node has no ZONE_NORMAL will stall for prolonged periods of time, possibly indefintely. This is due to throttle_direct_reclaim thinking the pfmemalloc reserves are depleted when in fact they don't exist on that node. On a NUMA machine running a 32-bit kernel (I know) allocation requests from CPUs on node 1 would detect no pfmemalloc reserves and the process gets throttled. This patch adjusts throttling of direct reclaim to throttle based on the first node in the zonelist that has a usable ZONE_NORMAL or lower zone. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by:
Mel Gorman <mgorman@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 675becce) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Hugh Dickins authored
Trinity reports BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:47 in_atomic(): 0, irqs_disabled(): 0, pid: 5787, name: trinity-c27 __might_sleep < down_write < __put_anon_vma < page_get_anon_vma < migrate_pages < compact_zone < compact_zone_order < try_to_compact_pages .. Right, since conversion to mutex then rwsem, we should not put_anon_vma() from inside an rcu_read_lock()ed section: fix the two places that did so. And add might_sleep() to anon_vma_free(), as suggested by Peter Zijlstra. Fixes: 88c22088 ("mm: optimize page_lock_anon_vma() fast-path") Reported-by:
Dave Jones <davej@redhat.com> Signed-off-by:
Hugh Dickins <hughd@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 7f39dda9) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jérôme Carretero authored
This device normally comes with a proprietary driver, using a web GUI to configure RAID: http://www.highpoint-tech.com/USA_new/series_rr600-download.htm But thankfully it also works out of the box with the AHCI driver, being just a Marvell 88SE9235. Devices 640L, 644L, 644LS should also be supported but not tested here. Signed-off-by:
Jérôme Carretero <cJ-ko@zougloub.eu> Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org (cherry picked from commit d2518365) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Deucher authored
May fix display issues with non-HDMI displays. Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit 7d5ab300) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Deucher authored
We need to specify the encoder mode as LVDS for eDP when using the Crtc_Source atom table in order to properly set up the FMT hardware. bug: https://bugs.freedesktop.org/show_bug.cgi?id=73911Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit 64252835) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Deucher authored
Only DCE5+ asics support DP 1.2. Noticed by ArtForz on IRC. Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit 3b6d9fd2) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Deucher authored
We were checking the ext clock rather than the display clock. Noticed by ArtForz on IRC. Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit af5d3653) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jukka Taimisto authored
-[0x01 Introduction We have found a programming error causing a deadlock in Bluetooth subsystem of Linux kernel. The problem is caused by missing release_sock() call when L2CAP connection creation fails due full accept queue. The issue can be reproduced with 3.15-rc5 kernel and is also present in earlier kernels. -[0x02 Details The problem occurs when multiple L2CAP connections are created to a PSM which contains listening socket (like SDP) and left pending, for example, configuration (the underlying ACL link is not disconnected between connections). When L2CAP connection request is received and listening socket is found the l2cap_sock_new_connection_cb() function (net/bluetooth/l2cap_sock.c) is called. This function locks the 'parent' socket and then checks if the accept queue is full. 1178 lock_sock(parent); 1179 1180 /* Check for backlog size */ 1181 if (sk_acceptq_is_full(parent)) { 1182 BT_DBG("backlog full %d", parent->sk_ack_backlog); 1183 return NULL; 1184 } If case the accept queue is full NULL is returned, but the 'parent' socket is not released. Thus when next L2CAP connection request is received the code blocks on lock_sock() since the parent is still locked. Also note that for connections already established and waiting for configuration to complete a timeout will occur and l2cap_chan_timeout() (net/bluetooth/l2cap_core.c) will be called. All threads calling this function will also be blocked waiting for the channel mutex since the thread which is waiting on lock_sock() alread holds the channel mutex. We were able to reproduce this by sending continuously L2CAP connection request followed by disconnection request containing invalid CID. This left the created connections pending configuration. After the deadlock occurs it is impossible to kill bluetoothd, btmon will not get any more data etc. requiring reboot to recover. -[0x03 Fix Releasing the 'parent' socket when l2cap_sock_new_connection_cb() returns NULL seems to fix the issue. Signed-off-by:
Jukka Taimisto <jtt@codenomicon.com> Reported-by:
Tommi Mäkilä <tmakila@codenomicon.com> Signed-off-by:
Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org (cherry picked from commit 8a96f3cd) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
hujianyang authored
I hit the same assert failed as Dolev Raviv reported in Kernel v3.10 shows like this: [ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297) [ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G O 3.10.40 #1 [ 9641.234116] [<c0011a6c>] (unwind_backtrace+0x0/0x12c) from [<c000d0b0>] (show_stack+0x20/0x24) [ 9641.234137] [<c000d0b0>] (show_stack+0x20/0x24) from [<c0311134>] (dump_stack+0x20/0x28) [ 9641.234188] [<c0311134>] (dump_stack+0x20/0x28) from [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) [ 9641.234265] [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) [ 9641.234307] [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) [ 9641.234327] [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) from [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) [ 9641.234344] [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) from [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) [ 9641.234363] [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) from [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) [ 9641.234382] [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) from [<c00f62d8>] (new_slab+0x78/0x238) [ 9641.234400] [<c00f62d8>] (new_slab+0x78/0x238) from [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) [ 9641.234419] [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) from [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) [ 9641.234459] [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) from [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) [ 9641.234553] [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) from [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) [ 9641.234606] [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) from [<c00c17c0>] (filemap_fault+0x304/0x418) [ 9641.234638] [<c00c17c0>] (filemap_fault+0x304/0x418) from [<c00de694>] (__do_fault+0xd4/0x530) [ 9641.234665] [<c00de694>] (__do_fault+0xd4/0x530) from [<c00e10c0>] (handle_pte_fault+0x480/0xf54) [ 9641.234690] [<c00e10c0>] (handle_pte_fault+0x480/0xf54) from [<c00e2bf8>] (handle_mm_fault+0x140/0x184) [ 9641.234716] [<c00e2bf8>] (handle_mm_fault+0x140/0x184) from [<c0316688>] (do_page_fault+0x150/0x3ac) [ 9641.234737] [<c0316688>] (do_page_fault+0x150/0x3ac) from [<c000842c>] (do_DataAbort+0x3c/0xa0) [ 9641.234759] [<c000842c>] (do_DataAbort+0x3c/0xa0) from [<c0314e38>] (__dabt_usr+0x38/0x40) After analyzing the code, I found a condition that may cause this failed in correct operations. Thus, I think this assertion is wrong and should be removed. Suppose there are two clean znodes and one dirty znode in TNC. So the per-filesystem atomic_t @clean_zn_cnt is (2). If commit start, dirty_znode is set to COW_ZNODE in get_znodes_to_commit() in case of potentially ops on this znode. We clear COW bit and DIRTY bit in write_index() without @tnc_mutex locked. We don't increase @clean_zn_cnt in this place. As the comments in write_index() shows, if another process hold @tnc_mutex and dirty this znode after we clean it, @clean_zn_cnt would be decreased to (1). We will increase @clean_zn_cnt to (2) with @tnc_mutex locked in free_obsolete_znodes() to keep it right. If shrink_tnc() performs between decrease and increase, it will release other 2 clean znodes it holds and found @clean_zn_cnt is less than zero (1 - 2 = -1), then hit the assertion. Because free_obsolete_znodes() will soon correct @clean_zn_cnt and no harm to fs in this case, I think this assertion could be removed. 2 clean zondes and 1 dirty znode, @clean_zn_cnt == 2 Thread A (commit) Thread B (write or others) Thread C (shrinker) ->write_index ->clear_bit(DIRTY_NODE) ->clear_bit(COW_ZNODE) @clean_zn_cnt == 2 ->mutex_locked(&tnc_mutex) ->dirty_cow_znode ->!ubifs_zn_cow(znode) ->!test_and_set_bit(DIRTY_NODE) ->atomic_dec(&clean_zn_cnt) ->mutex_unlocked(&tnc_mutex) @clean_zn_cnt == 1 ->mutex_locked(&tnc_mutex) ->shrink_tnc ->destroy_tnc_subtree ->atomic_sub(&clean_zn_cnt, 2) ->ubifs_assert <- hit ->mutex_unlocked(&tnc_mutex) @clean_zn_cnt == -1 ->mutex_lock(&tnc_mutex) ->free_obsolete_znodes ->atomic_inc(&clean_zn_cnt) ->mutux_unlock(&tnc_mutex) @clean_zn_cnt == 0 (correct after shrink) Signed-off-by:
hujianyang <hujianyang@huawei.com> Cc: stable@vger.kernel.org Signed-off-by:
Artem Bityutskiy <artem.bityutskiy@linux.intel.com> (cherry picked from commit 72abc8f4) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Christoph Hellwig authored
Note nobody's ever noticed because the typical client probably never requests FILES_AVAIL without also requesting something else on the list. Signed-off-by:
Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by:
J. Bruce Fields <bfields@redhat.com> (cherry picked from commit 12337901) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Yann Droneaud authored
The i386 ABI disagrees with most other ABIs regarding alignment of data types larger than 4 bytes: on most ABIs a padding must be added at end of the structures, while it is not required on i386. So for most ABI struct c4iw_create_cq_resp gets implicitly padded to be aligned on a 8 bytes multiple, while for i386, such padding is not added. The tool pahole can be used to find such implicit padding: $ pahole --anon_include \ --nested_anon_include \ --recursive \ --class_name c4iw_create_cq_resp \ drivers/infiniband/hw/cxgb4/iw_cxgb4.o Then, structure layout can be compared between i386 and x86_64: +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 11:43:05.547432195 +0100 --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100 @@ -14,9 +13,8 @@ struct c4iw_create_cq_resp { __u32 size; /* 28 4 */ __u32 qid_mask; /* 32 4 */ - /* size: 36, cachelines: 1, members: 6 */ - /* last cacheline: 36 bytes */ + /* size: 40, cachelines: 1, members: 6 */ + /* padding: 4 */ + /* last cacheline: 40 bytes */ }; This ABI disagreement will make an x86_64 kernel try to write past the buffer provided by an i386 binary. When boundary check will be implemented, the x86_64 kernel will refuse to write past the i386 userspace provided buffer and the uverbs will fail. If the structure is on a page boundary and the next page is not mapped, ib_copy_to_udata() will fail and the uverb will fail. This patch adds an explicit padding at end of structure c4iw_create_cq_resp, and, like 92b0ca7c ("IB/mlx5: Fix stack info leak in mlx5_ib_alloc_ucontext()"), makes function c4iw_create_cq() not writting this padding field to userspace. This way, x86_64 kernel will be able to write struct c4iw_create_cq_resp as expected by unpatched and patched i386 libcxgb4. Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com Cc: <stable@vger.kernel.org> Fixes: cfdda9d7 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC") Fixes: e24a72a3 ("RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()") Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Yann Droneaud <ydroneaud@opteya.com> Acked-by:
Steve Wise <swise@opengridcomputing.com> Signed-off-by:
Roland Dreier <roland@purestorage.com> (cherry picked from commit b6f04d3d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bart Van Assche authored
Avoid leaking a kref count in ib_umad_open() if port->ib_dev == NULL or if nonseekable_open() fails. Avoid leaking a kref count, that sm_sem is kept down and also that the IB_PORT_SM capability mask is not cleared in ib_umad_sm_open() if nonseekable_open() fails. Since container_of() never returns NULL, remove the code that tests whether container_of() returns NULL. Moving the kref_get() call from the start of ib_umad_*open() to the end is safe since it is the responsibility of the caller of these functions to ensure that the cdev pointer remains valid until at least when these functions return. Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Cc: <stable@vger.kernel.org> [ydroneaud@opteya.com: rework a bit to reduce the amount of code changed] Signed-off-by:
Yann Droneaud <ydroneaud@opteya.com> [ nonseekable_open() can't actually fail, but.... - Roland ] Signed-off-by:
Roland Dreier <roland@purestorage.com> (cherry picked from commit 8ec0a0e6) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Aleksander Morgado authored
A set of new VID/PIDs retrieved from the out-of-tree GobiNet/GobiSerial Sierra Wireless drivers. Signed-off-by:
Aleksander Morgado <aleksander@aleksander.es> Link: http://marc.info/?l=linux-usb&m=140136310027293&w=2 Cc: <stable@vger.kernel.org> # backport in link above Signed-off-by:
Johan Hovold <jhovold@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 0ce5fb58) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Aleksander Morgado authored
Signed-off-by:
Aleksander Morgado <aleksander@aleksander.es> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit ff1fcd50) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Arik Nemtsov authored
Doing so will lead to an oops for a p2p-dev interface, since it has no netdev. Cc: stable@vger.kernel.org Signed-off-by:
Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by:
Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by:
Johannes Berg <johannes.berg@intel.com> (cherry picked from commit 923eaf36) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Christian Borntraeger authored
The IRB might be 96 bytes if the extended-I/O-measurement facility is used. This feature is currently not used by Linux, but struct irb already has the emw defined. So let's make the irb in lowcore match the size of the internal data structure to be future proof. We also have to add a pad, to correctly align the paste. The bigger irb field also circumvents a bug in some QEMU versions that always write the emw field on test subchannel and therefore destroy the paste definitions of this CPU. Running under these QEMU version broke some timing functions in the VDSO and all users of these functions, e.g. some JREs. Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: stable@vger.kernel.org (cherry picked from commit 993072ee) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Huang Rui authored
TEST 12 and TEST 24 unlinks the URB write request for N times. When host and gadget both initialize pattern 1 (mod 63) data series to transfer, the gadget side will complain the wrong data which is not expected. Because in host side, usbtest doesn't fill the data buffer as mod 63 and this patch fixed it. [20285.488974] dwc3 dwc3.0.auto: ep1out-bulk: Transfer Not Ready [20285.489181] dwc3 dwc3.0.auto: ep1out-bulk: reason Transfer Not Active [20285.489423] dwc3 dwc3.0.auto: ep1out-bulk: req ffff8800aa6cb480 dma aeb50800 length 512 last [20285.489727] dwc3 dwc3.0.auto: ep1out-bulk: cmd 'Start Transfer' params 00000000 a9eaf000 00000000 [20285.490055] dwc3 dwc3.0.auto: Command Complete --> 0 [20285.490281] dwc3 dwc3.0.auto: ep1out-bulk: Transfer Not Ready [20285.490492] dwc3 dwc3.0.auto: ep1out-bulk: reason Transfer Active [20285.490713] dwc3 dwc3.0.auto: ep1out-bulk: endpoint busy [20285.490909] dwc3 dwc3.0.auto: ep1out-bulk: Transfer Complete [20285.491117] dwc3 dwc3.0.auto: request ffff8800aa6cb480 from ep1out-bulk completed 512/512 ===> 0 [20285.491431] zero gadget: bad OUT byte, buf[1] = 0 [20285.491605] dwc3 dwc3.0.auto: ep1out-bulk: cmd 'Set Stall' params 00000000 00000000 00000000 [20285.491915] dwc3 dwc3.0.auto: Command Complete --> 0 [20285.492099] dwc3 dwc3.0.auto: queing request ffff8800aa6cb480 to ep1out-bulk length 512 [20285.492387] dwc3 dwc3.0.auto: ep1out-bulk: Transfer Not Ready [20285.492595] dwc3 dwc3.0.auto: ep1out-bulk: reason Transfer Not Active [20285.492830] dwc3 dwc3.0.auto: ep1out-bulk: req ffff8800aa6cb480 dma aeb51000 length 512 last [20285.493135] dwc3 dwc3.0.auto: ep1out-bulk: cmd 'Start Transfer' params 00000000 a9eaf000 00000000 [20285.493465] dwc3 dwc3.0.auto: Command Complete --> 0 Cc: <stable@vger.kernel.org> Signed-off-by:
Huang Rui <ray.huang@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit e4d58f5d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bin Wang authored
the vma range size is always page size aligned in mmap, while the real io space range may not be page aligned, thus leading to range check failure in the uio_mmap_physical(). for example, in a case of io range size "mem->size == 1KB", and we have (vma->vm_end - vma->vm_start) == 4KB, due to "len" is aligned to page size in do_mmap_pgoff(). now fix this issue by align mem->size to page size in the check. Signed-off-by:
Bin Wang <binw@marvell.com> Signed-off-by:
Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit ddb09754) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Linus Torvalds authored
In commit 7314e613 ("Fix a few incorrectly checked [io_]remap_pfn_range() calls") the uio driver started more properly checking the passed-in user mapping arguments against the size of the actual uio driver data. That in turn exposed that some driver authors apparently didn't realize that mmap can only work on a page granularity, and had tried to use it with smaller mappings, with the new size check catching that out. So since it's not just the user mmap() arguments that can be confused, make the uio mmap code also verify that the uio driver has the memory allocated at page boundaries in order for mmap to work. If the device memory isn't properly aligned, we return [ENODEV] The fildes argument refers to a file whose type is not supported by mmap(). as per the open group documentation on mmap. Reported-by:
Holger Brunck <holger.brunck@keymile.com> Acked-by:
Greg KH <gregkh@linuxfoundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit b6550287) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Dennis Dalessandro authored
This patch addresses an issue where the legacy diagpacket is sent in from the user, but the driver operates on only the extended diagpkt. This patch specifically initializes the extended diagpkt based on the legacy packet. Cc: <stable@vger.kernel.org> Reported-by:
Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Reviewed-by:
Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by:
Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by:
Roland Dreier <roland@purestorage.com> (cherry picked from commit 7e6d3e5c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mike Marciniszyn authored
The code used a literal 1 in dispatching an IB_EVENT_PKEY_CHANGE. As of the dual port qib QDR card, this is not necessarily correct. Change to use the port as specified in the call. Cc: <stable@vger.kernel.org> Reported-by:
Alex Estrin <alex.estrin@intel.com> Reviewed-by:
Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by:
Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by:
Roland Dreier <roland@purestorage.com> (cherry picked from commit 911eccd2) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Maurizio Lombardi authored
The variable "size" is expressed as number of blocks and not as number of clusters, this could trigger a kernel panic when using ext4 with the size of a cluster different from the size of a block. Cc: stable@vger.kernel.org Signed-off-by:
Maurizio Lombardi <mlombard@redhat.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> (cherry picked from commit b5b60778) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jan Kara authored
Tail of a page straddling inode size must be zeroed when being written out due to POSIX requirement that modifications of mmaped page beyond inode size must not be written to the file. ext4_bio_write_page() did this only for blocks fully beyond inode size but didn't properly zero blocks partially beyond inode size. Fix this. The problem has been uncovered by mmap_11-4 test in openposix test suite (part of LTP). Reported-by:
Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com> Fixes: 5a0dc736 Fixes: bd2d0210 CC: stable@vger.kernel.org Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> (cherry picked from commit eeece469) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Andreas Schrägle authored
Add support for Marvell Technology Group Ltd. 88SE91A0 SATA 6Gb/s Controller by adding its PCI ID. Signed-off-by:
Andreas Schrägle <ajs124.ajs124@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org (cherry picked from commit 754a292f) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Krzysztof Hałasa authored
Without this fix, freshly rebooted Linux creates a new IBSS instead of joining an existing one. Only when jiffies counter overflows after 5 minutes the IBSS can be successfully joined. Signed-off-by:
Krzysztof Hałasa <khalasa@piap.pl> [edit commit message slightly] Cc: stable@vger.kernel.org Signed-off-by:
Johannes Berg <johannes.berg@intel.com> (cherry picked from commit c7d37a66) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Leif Lindholm authored
A few platforms lack a 'device_type = "memory"' for their memory nodes, relying on an old ppc quirk in order to discover its memory. Add the missing data so that all parsing code can find memory nodes correctly. Signed-off-by:
Leif Lindholm <leif.lindholm@linaro.org> Acked-by:
John Crispin <blogic@openwrt.org> Signed-off-by:
Grant Likely <grant.likely@linaro.org> Cc: linux-mips@linux-mips.org Cc: devicetree@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> Cc: gaurav.minocha@alumni.ubc.ca Patchwork: https://patchwork.linux-mips.org/patch/6989/Signed-off-by:
Ralf Baechle <ralf@linux-mips.org> (cherry picked from commit 1d530fa4) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jonathan Cameron authored
Cc: Stable@vger.kernel.org> Reported-by:
Erik Habbinga <Erik.Habbinga@schneider-electric.com> Signed-off-by:
Jonathan Cameron <jic23@kernel.org> Acked-by:
Hartmut Knaack <knaack.h@gmx.de> (cherry picked from commit a91a73c8) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Takashi Iwai authored
When ivtv PCM device is accessed at the state where no firmware is loaded, it oopses like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000050 IP: [<ffffffffa049a881>] try_mailbox.isra.0+0x11/0x50 [ivtv] Call Trace: [<ffffffffa049aa20>] ivtv_api_call+0x160/0x6b0 [ivtv] [<ffffffffa049af86>] ivtv_api+0x16/0x40 [ivtv] [<ffffffffa049b10c>] ivtv_vapi+0xac/0xc0 [ivtv] [<ffffffffa049d40d>] ivtv_start_v4l2_encode_stream+0x19d/0x630 [ivtv] [<ffffffffa0530653>] snd_ivtv_pcm_capture_open+0x173/0x1c0 [ivtv_alsa] [<ffffffffa04526f1>] snd_pcm_open_substream+0x51/0x100 [snd_pcm] [<ffffffffa0452853>] snd_pcm_open+0xb3/0x260 [snd_pcm] [<ffffffffa0452a37>] snd_pcm_capture_open+0x37/0x50 [snd_pcm] [<ffffffffa033f557>] snd_open+0xa7/0x1e0 [snd] [<ffffffff8118a628>] chrdev_open+0x88/0x1d0 [<ffffffff811840be>] do_dentry_open+0x1de/0x270 [<ffffffff81193a73>] do_last+0x1c3/0xec0 [<ffffffff81194826>] path_openat+0xb6/0x670 [<ffffffff81195b65>] do_filp_open+0x35/0x80 [<ffffffff81185449>] do_sys_open+0x129/0x210 [<ffffffff815b782d>] system_call_fastpath+0x1a/0x1f This patch adds the check of firmware at PCM open callback like other open callbacks of this driver. Bugzilla: https://apibugzilla.novell.com/show_bug.cgi?id=875440 Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by:
Mauro Carvalho Chehab <m.chehab@samsung.com> (cherry picked from commit deb29e90) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Olivier Langlois authored
timestamps in v4l2 buffers returned to userspace are updated in uvc_video_clock_update() which uses timestamps fetched from uvc_video_clock_decode() by calling unconditionally ktime_get_ts(). Hence setting the module clock param to realtime has no effect before this patch. This has been tested with ffmpeg: ffmpeg -y -f v4l2 -input_format yuyv422 -video_size 640x480 -framerate 30 -i /dev/video0 \ -f alsa -acodec pcm_s16le -ar 16000 -ac 1 -i default \ -c:v libx264 -preset ultrafast \ -c:a libfdk_aac \ out.mkv and inspecting the v4l2 input starting timestamp. Cc: Stable <stable@vger.kernel.org> Signed-off-by:
Olivier Langlois <olivier@trillion01.com> Signed-off-by:
Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by:
Mauro Carvalho Chehab <m.chehab@samsung.com> (cherry picked from commit 3b35fc81) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Trond Myklebust authored
If the accept() call fails, we need to put the module reference. Signed-off-by:
Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by:
J. Bruce Fields <bfields@redhat.com> (cherry picked from commit c789102c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-