- 12 Aug, 2014 40 commits
-
-
Chris Mason authored
The mlx4 driver is triggering schedules while atomic inside mlx4_en_netpoll: spin_lock_irqsave(&cq->lock, flags); napi_synchronize(&cq->napi); ^^^^^ msleep here mlx4_en_process_rx_cq(dev, cq, 0); spin_unlock_irqrestore(&cq->lock, flags); This was part of a patch by Alexander Guller from Mellanox in 2011, but it still isn't upstream. Signed-off-by:
Chris Mason <clm@fb.com> cc: stable@vger.kernel.org Acked-By:
Amir Vadai <amirv@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit c98235cb) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mikulas Patocka authored
smp_read_barrier_depends() can be used if there is data dependency between the readers - i.e. if the read operation after the barrier uses address that was obtained from the read operation before the barrier. In this file, there is only control dependency, no data dependecy, so the use of smp_read_barrier_depends() is incorrect. The code could fail in the following way: * the cpu predicts that idx < entries is true and starts executing the body of the for loop * the cpu fetches map->extent[0].first and map->extent[0].count * the cpu fetches map->nr_extents * the cpu verifies that idx < extents is true, so it commits the instructions in the body of the for loop The problem is that in this scenario, the cpu read map->extent[0].first and map->nr_extents in the wrong order. We need a full read memory barrier to prevent it. Signed-off-by:
Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit e79323bd) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Kailang Yang authored
Signed-off-by:
Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> (cherry picked from commit 7c665932) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Helge Deller authored
This bug was detected with the libio-epoll-perl debian package where the test case IO-Ppoll-compat.t failed. Signed-off-by:
Helge Deller <deller@gmx.de> CC: stable@kernel.org # 3.0+ (cherry picked from commit ab3e55b1) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Felix Fietkau authored
Their power value is initialized to zero. This patch fixes an issue where the configured power drops to the minimum value when AP_VLAN interfaces are created/removed. Cc: stable@vger.kernel.org Signed-off-by:
Felix Fietkau <nbd@openwrt.org> Signed-off-by:
Johannes Berg <johannes.berg@intel.com> (cherry picked from commit 764152ff) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Johannes Berg authored
Jouni reported that when doing off-channel transmissions mixed with on-channel transmissions, the on-channel ones ended up on the off-channel in some cases. The reason for that is that during the refactoring of the off- channel code, I lost the part that stopped all activity and as a consequence the on-channel frames (including data frames) were no longer queued but would be transmitted on the temporary channel. Fix this by simply restoring the lost activity stop call. Cc: stable@vger.kernel.org Fixes: 2eb278e0 ("mac80211: unify SW/offload remain-on-channel") Reported-by:
Jouni Malinen <j@w1.fi> Signed-off-by:
Johannes Berg <johannes.berg@intel.com> (cherry picked from commit 115b943a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Kieran Clancy authored
Address a regression caused by commit ad332c8a: (ACPI / EC: Clear stale EC events on Samsung systems) After the earlier patch, there was found to be a race condition on some earlier Samsung systems (N150/N210/N220). The function acpi_ec_clear was sometimes discarding a new EC event before its GPE was triggered by the system. In the case of these systems, this meant that the "lid open" event was not registered on resume if that was the cause of the wake, leading to problems when attempting to close the lid to suspend again. After testing on a number of Samsung systems, both those affected by the previous EC bug and those affected by the race condition, it seemed that the best course of action was to process rather than discard the events. On Samsung systems which accumulate stale EC events, there does not seem to be any adverse side-effects of running the associated _Q methods. This patch adds an argument to the static function acpi_ec_sync_query so that it may be used within the acpi_ec_clear loop in place of acpi_ec_query_unlocked which was used previously. With thanks to Stefan Biereigel for reporting the issue, and for all the people who helped test the new patch on affected systems. Fixes: ad332c8a (ACPI / EC: Clear stale EC events on Samsung systems) References: https://lkml.kernel.org/r/532FE3B2.9060808@biereigel-wb.de References: https://bugzilla.kernel.org/show_bug.cgi?id=44161#c173Reported-by:
Stefan Biereigel <stefan@biereigel.de> Signed-off-by:
Kieran Clancy <clancy.kieran@gmail.com> Tested-by:
Stefan Biereigel <stefan@biereigel.de> Tested-by:
Dennis Jansen <dennis.jansen@web.de> Tested-by:
Nicolas Porcel <nicolasporcel06@gmail.com> Tested-by:
Maurizio D'Addona <mauritiusdadd@gmail.com> Tested-by:
Juan Manuel Cabo <juanmanuel.cabo@gmail.com> Tested-by:
Giannis Koutsou <giannis.koutsou@gmail.com> Tested-by:
Kieran Clancy <clancy.kieran@gmail.com> Cc: 3.14+ <stable@vger.kernel.org> # 3.14+ Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit 3eba563e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Kieran Clancy authored
A number of Samsung notebooks (530Uxx/535Uxx/540Uxx/550Pxx/900Xxx/etc) continue to log events during sleep (lid open/close, AC plug/unplug, battery level change), which accumulate in the EC until a buffer fills. After the buffer is full (tests suggest it holds 8 events), GPEs stop being triggered for new events. This state persists on wake or even on power cycle, and prevents new events from being registered until the EC is manually polled. This is the root cause of a number of bugs, including AC not being detected properly, lid close not triggering suspend, and low ambient light not triggering the keyboard backlight. The bug also seemed to be responsible for performance issues on at least one user's machine. Juan Manuel Cabo found the cause of bug and the workaround of polling the EC manually on wake. The loop which clears the stale events is based on an earlier patch by Lan Tianyu (see referenced attachment). This patch: - Adds a function acpi_ec_clear() which polls the EC for stale _Q events at most ACPI_EC_CLEAR_MAX (currently 100) times. A warning is logged if this limit is reached. - Adds a flag EC_FLAGS_CLEAR_ON_RESUME which is set to 1 if the DMI system vendor is Samsung. This check could be replaced by several more specific DMI vendor/product pairs, but it's likely that the bug affects more Samsung products than just the five series mentioned above. Further, it should not be harmful to run acpi_ec_clear() on systems without the bug; it will return immediately after finding no data waiting. - Runs acpi_ec_clear() on initialisation (boot), from acpi_ec_add() - Runs acpi_ec_clear() on wake, from acpi_ec_unblock_transactions() References: https://bugzilla.kernel.org/show_bug.cgi?id=44161 References: https://bugzilla.kernel.org/show_bug.cgi?id=45461 References: https://bugzilla.kernel.org/show_bug.cgi?id=57271 References: https://bugzilla.kernel.org/attachment.cgi?id=126801Suggested-by:
Juan Manuel Cabo <juanmanuel.cabo@gmail.com> Signed-off-by:
Kieran Clancy <clancy.kieran@gmail.com> Reviewed-by:
Lan Tianyu <tianyu.lan@intel.com> Reviewed-by:
Dennis Jansen <dennis.jansen@web.de> Tested-by:
Kieran Clancy <clancy.kieran@gmail.com> Tested-by:
Juan Manuel Cabo <juanmanuel.cabo@gmail.com> Tested-by:
Dennis Jansen <dennis.jansen@web.de> Tested-by:
Maurizio D'Addona <mauritiusdadd@gmail.com> Tested-by:
San Zamoyski <san@plusnet.pl> Cc: All applicable <stable@vger.kernel.org> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit ad332c8a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Matthew Daley authored
Do not leak kernel-only floppy_raw_cmd structure members to userspace. This includes the linked-list pointer and the pointer to the allocated DMA space. Signed-off-by:
Matthew Daley <mattd@bugfuzz.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 2145e15e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Matthew Daley authored
Always clear out these floppy_raw_cmd struct members after copying the entire structure from userspace so that the in-kernel version is always valid and never left in an interdeterminate state. Signed-off-by:
Matthew Daley <mattd@bugfuzz.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit ef87dbe7) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Peter Hurley authored
The tty atomic_write_lock does not provide an exclusion guarantee for the tty driver if the termios settings are LECHO & !OPOST. And since it is unexpected and not allowed to call TTY buffer helpers like tty_insert_flip_string concurrently, this may lead to crashes when concurrect writers call pty_write. In that case the following two writers: * the ECHOing from a workqueue and * pty_write from the process race and can overflow the corresponding TTY buffer like follows. If we look into tty_insert_flip_string_fixed_flag, there is: int space = __tty_buffer_request_room(port, goal, flags); struct tty_buffer *tb = port->buf.tail; ... memcpy(char_buf_ptr(tb, tb->used), chars, space); ... tb->used += space; so the race of the two can result in something like this: A B __tty_buffer_request_room __tty_buffer_request_room memcpy(buf(tb->used), ...) tb->used += space; memcpy(buf(tb->used), ...) ->BOOM B's memcpy is past the tty_buffer due to the previous A's tb->used increment. Since the N_TTY line discipline input processing can output concurrently with a tty write, obtain the N_TTY ldisc output_lock to serialize echo output with normal tty writes. This ensures the tty buffer helper tty_insert_flip_string is not called concurrently and everything is fine. Note that this is nicely reproducible by an ordinary user using forkpty and some setup around that (raw termios + ECHO). And it is present in kernels at least after commit d945cb9c (pty: Rework the pty layer to use the normal buffering logic) in 2.6.31-rc3. js: add more info to the commit log js: switch to bool js: lock unconditionally js: lock only the tty->ops->write call References: CVE-2014-0196 Reported-and-tested-by:
Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Peter Hurley <peter@hurleysoftware.com> Signed-off-by:
Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 4291086b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Daniel Borkmann authored
Some occurences in the netfilter tree use skb_header_pointer() in the following way ... struct dccp_hdr _dh, *dh; ... skb_header_pointer(skb, dataoff, sizeof(_dh), &dh); ... where dh itself is a pointer that is being passed as the copy buffer. Instead, we need to use &_dh as the forth argument so that we're copying the data into an actual buffer that sits on the stack. Currently, we probably could overwrite memory on the stack (e.g. with a possibly mal-formed DCCP packet), but unintentionally, as we only want the buffer to be placed into _dh variable. Fixes: 2bc78049 ("[NETFILTER]: nf_conntrack: add DCCP protocol support") Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> (cherry picked from commit b22f5126) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Fenghua Yu authored
This patch enables Opmask, ZMM_Hi256, and Hi16_ZMM AVX-512 states for xstate context switch. Signed-off-by:
Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1392931491-33237-2-git-send-email-fenghua.yu@intel.comSigned-off-by:
H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # hw enabling (cherry picked from commit c2bc11f1) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mike Marciniszyn authored
The code was incorrectly using sg_dma_address() and sg_dma_len() instead of ib_sg_dma_address() and ib_sg_dma_len(). This prevents srpt from functioning with the Intel HCA and indeed will corrupt memory badly. Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by:
Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by:
Vinod Kumar <vinod.kumar@intel.com> Signed-off-by:
Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: stable@vger.kernel.org # 3.3+ Signed-off-by:
Nicholas Bellinger <nab@linux-iscsi.org> (cherry picked from commit b0768080) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Andy Grover authored
ft_del_tpg checks tpg->tport is set before unlinking the tpg from the tport when the tpg is being removed. Set this pointer in ft_tport_create, or the unlinking won't happen in ft_del_tpg and tport->tpg will reference a deleted object. This patch sets tpg->tport in ft_tport_create, because that's what ft_del_tpg checks, and is the only way to get back to the tport to clear tport->tpg. The bug was occuring when: - lport created, tport (our per-lport, per-provider context) is allocated. tport->tpg = NULL - tpg created - a PRLI is received. ft_tport_create is called, tpg is found and tport->tpg is set - tpg removed. ft_tpg is freed in ft_del_tpg. Since tpg->tport was not set, tport->tpg is not cleared and points at freed memory - Future calls to ft_tport_create return tport via first conditional, instead of searching for new tpg by calling ft_lport_find_tpg. tport->tpg is still invalid, and will access freed memory. see https://bugzilla.redhat.com/show_bug.cgi?id=1071340 Cc: stable@vger.kernel.org # 3.0+ Signed-off-by:
Andy Grover <agrover@redhat.com> Signed-off-by:
Nicholas Bellinger <nab@linux-iscsi.org> (cherry picked from commit 2c42be2d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
H. Peter Anvin authored
The IRET instruction, when returning to a 16-bit segment, only restores the bottom 16 bits of the user space stack pointer. We have a software workaround for that ("espfix") for the 32-bit kernel, but it relies on a nonzero stack segment base which is not available in 32-bit mode. Since 16-bit support is somewhat crippled anyway on a 64-bit kernel (no V86 mode), and most (if not quite all) 64-bit processors support virtualization for the users who really need it, simply reject attempts at creating a 16-bit segment when running on top of a 64-bit kernel. Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/n/tip-kicdm89kzw9lldryb1br9od0@git.kernel.org Cc: <stable@vger.kernel.org> (cherry picked from commit b3b42ac2) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Rafał Miłecki authored
Register B43_MMIO_PSM_PHY_HDR is 16 bit one, so accessing it with 32b functions isn't safe. On my machine it causes delayed (!) CPU exception: Disabling lock debugging due to kernel taint mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 4: b200000000070f0f mce: [Hardware Error]: TSC 164083803dc mce: [Hardware Error]: PROCESSOR 2:20fc2 TIME 1396650505 SOCKET 0 APIC 0 microcode 0 mce: [Hardware Error]: Run the above through 'mcelog --ascii' mce: [Hardware Error]: Machine check: Processor context corrupt Kernel panic - not syncing: Fatal machine check on current CPU Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff) Signed-off-by:
Rafał Miłecki <zajec5@gmail.com> Acked-by:
Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@vger.kernel.org> [2.6.35+] Signed-off-by:
John W. Linville <linville@tuxdriver.com> (cherry picked from commit 12cd43c6) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jens Axboe authored
I got a bug report yesterday from Laszlo Ersek in which he states that his kvm instance fails to suspend. Laszlo bisected it down to this commit 1cf7e9c6 ("virtio_blk: blk-mq support") where virtio-blk is converted to use the blk-mq infrastructure. After digging a bit, it became clear that the issue was with the queue drain. blk-mq tracks queue usage in a percpu counter, which is incremented on request alloc and decremented when the request is freed. The initial hunt was for an inconsistency in blk-mq, but everything seemed fine. In fact, the counter only returned crazy values when suspend was in progress. When a CPU is unplugged, the percpu counters merges that CPU state with the general state. blk-mq takes care to register a hotcpu notifier with the appropriate priority, so we know it runs after the percpu counter notifier. However, the percpu counter notifier only merges the state when the CPU is fully gone. This leaves a state transition where the CPU going away is no longer in the online mask, yet it still holds private values. This means that in this state, percpu_counter_sum() returns invalid results, and the suspend then hangs waiting for abs(dead-cpu-value) requests to complete which of course will never happen. Fix this by clearing the state earlier, so we never have a case where the CPU isn't in online mask but still holds private state. This bug has been there since forever, I guess we don't have a lot of users where percpu counters needs to be reliable during the suspend cycle. Signed-off-by:
Jens Axboe <axboe@fb.com> Reported-by:
Laszlo Ersek <lersek@redhat.com> Tested-by:
Laszlo Ersek <lersek@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit e39435ce) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Takashi Iwai authored
PCM pointer callbacks in ice1712 driver check the buffer size boundary wrongly between bytes and frames. This leads to PCM core warnings like: snd_pcm_update_hw_ptr0: 105 callbacks suppressed ALSA pcm_lib.c:352 BUG: pcmC3D0c:0, pos = 5461, buffer size = 5461, period size = 2730 This patch fixes these checks to be placed after the proper unit conversions. Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> (cherry picked from commit 4f8e9400) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Liu Hua authored
As sysctl_hung_task_timeout_sec is unsigned long, when this value is larger then LONG_MAX/HZ, the function schedule_timeout_interruptible in watchdog will return immediately without sleep and with print : schedule_timeout: wrong timeout value ffffffffffffff83 and then the funtion watchdog will call schedule_timeout_interruptible again and again. The screen will be filled with "schedule_timeout: wrong timeout value ffffffffffffff83" This patch does some check and correction in sysctl, to let the function schedule_timeout_interruptible allways get the valid parameter. Signed-off-by:
Liu Hua <sdu.liu@huawei.com> Tested-by:
Satoru Takeuchi <satoru.takeuchi@gmail.com> Cc: <stable@vger.kernel.org> [3.4+] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 80df2847) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Oleg Nesterov authored
wait_task_zombie() first does EXIT_ZOMBIE->EXIT_DEAD transition and drops tasklist_lock. If this task is not the natural child and it is traced, we change its state back to EXIT_ZOMBIE for ->real_parent. The last transition is racy, this is even documented in 50b8d257 "ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE race". wait_consider_task() tries to detect this transition and clear ->notask_error but we can't rely on ptrace_reparented(), debugger can exit and do ptrace_unlink() before its sub-thread sets EXIT_ZOMBIE. And there is another problem which were missed before: this transition can also race with reparent_leader() which doesn't reset >exit_signal if EXIT_DEAD, assuming that this task must be reaped by someone else. So the tracee can be re-parented with ->exit_signal != SIGCHLD, and if /sbin/init doesn't use __WALL it becomes unreapable. Change reparent_leader() to update ->exit_signal even if EXIT_DEAD. Note: this is the simple temporary hack for -stable, it doesn't try to solve all problems, it will be reverted by the next changes. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Reported-by:
Jan Kratochvil <jan.kratochvil@redhat.com> Reported-by:
Michal Schmidt <mschmidt@redhat.com> Tested-by:
Michal Schmidt <mschmidt@redhat.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Lennart Poettering <lpoetter@redhat.com> Cc: Roland McGrath <roland@hack.frob.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit dfccbb5e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mizuma, Masayoshi authored
When I decrease the value of nr_hugepage in procfs a lot, softlockup happens. It is because there is no chance of context switch during this process. On the other hand, when I allocate a large number of hugepages, there is some chance of context switch. Hence softlockup doesn't happen during this process. So it's necessary to add the context switch in the freeing process as same as allocating process to avoid softlockup. When I freed 12 TB hugapages with kernel-2.6.32-358.el6, the freeing process occupied a CPU over 150 seconds and following softlockup message appeared twice or more. $ echo 6000000 > /proc/sys/vm/nr_hugepages $ cat /proc/sys/vm/nr_hugepages 6000000 $ grep ^Huge /proc/meminfo HugePages_Total: 6000000 HugePages_Free: 6000000 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB $ echo 0 > /proc/sys/vm/nr_hugepages BUG: soft lockup - CPU#16 stuck for 67s! [sh:12883] ... Pid: 12883, comm: sh Not tainted 2.6.32-358.el6.x86_64 #1 Call Trace: free_pool_huge_page+0xb8/0xd0 set_max_huge_pages+0x128/0x190 hugetlb_sysctl_handler_common+0x113/0x140 hugetlb_sysctl_handler+0x1e/0x20 proc_sys_call_handler+0x97/0xd0 proc_sys_write+0x14/0x20 vfs_write+0xb8/0x1a0 sys_write+0x51/0x90 __audit_syscall_exit+0x265/0x290 system_call_fastpath+0x16/0x1b I have not confirmed this problem with upstream kernels because I am not able to prepare the machine equipped with 12TB memory now. However I confirmed that the amount of decreasing hugepages was directly proportional to the amount of required time. I measured required times on a smaller machine. It showed 130-145 hugepages decreased in a millisecond. Amount of decreasing Required time Decreasing rate hugepages (msec) (pages/msec) ------------------------------------------------------------ 10,000 pages == 20GB 70 - 74 135-142 30,000 pages == 60GB 208 - 229 131-144 It means decrement of 6TB hugepages will trigger softlockup with the default threshold 20sec, in this decreasing rate. Signed-off-by:
Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 55f67141) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Vlastimil Babka authored
A BUG_ON(!PageLocked) was triggered in mlock_vma_page() by Sasha Levin fuzzing with trinity. The call site try_to_unmap_cluster() does not lock the pages other than its check_page parameter (which is already locked). The BUG_ON in mlock_vma_page() is not documented and its purpose is somewhat unclear, but apparently it serializes against page migration, which could otherwise fail to transfer the PG_mlocked flag. This would not be fatal, as the page would be eventually encountered again, but NR_MLOCK accounting would become distorted nevertheless. This patch adds a comment to the BUG_ON in mlock_vma_page() and munlock_vma_page() to that effect. The call site try_to_unmap_cluster() is fixed so that for page != check_page, trylock_page() is attempted (to avoid possible deadlocks as we already have check_page locked) and mlock_vma_page() is performed only upon success. If the page lock cannot be obtained, the page is left without PG_mlocked, which is again not a problem in the whole unevictable memory design. Signed-off-by:
Vlastimil Babka <vbabka@suse.cz> Signed-off-by:
Bob Liu <bob.liu@oracle.com> Reported-by:
Sasha Levin <sasha.levin@oracle.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by:
Rik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 57e68e9c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Nicholas Bellinger authored
This patch fixes a long-standing bug in iscsit_build_conn_drop_async_message() where during ERL=2 connection recovery, a bogus conn_p pointer could end up being used to send the ISCSI_OP_ASYNC_EVENT + DROPPING_CONNECTION notifying the initiator that cmd->logout_cid has failed. The bug was manifesting itself as an OOPs in iscsit_allocate_cmd() with a bogus conn_p pointer in iscsit_build_conn_drop_async_message(). Reported-by:
Arshad Hussain <arshad.hussain@calsoftinc.com> Reported-by:
santosh kulkarni <santosh.kulkarni@calsoftinc.com> Cc: <stable@vger.kernel.org> #3.1+ Signed-off-by:
Nicholas Bellinger <nab@linux-iscsi.org> (cherry picked from commit d444edc6) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Giacomo Comes authored
The Dell XPS 8700 has a onboard Display port and HDMI port and no VGA port. The call intel_crt_init freeze the machine, so skip such call. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73559 Signed-off-by: Giacomo Comes <comes at naic.edu> Cc: stable@vger.kernel.org Signed-off-by:
Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit 10b6ee4a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
alex chen authored
Do not put bh when buffer_uptodate failed in ocfs2_write_block and ocfs2_write_super_or_backup, because it will put bh in b_end_io. Otherwise it will hit a warning "VFS: brelse: Trying to free free buffer". Signed-off-by:
Alex Chen <alex.chen@huawei.com> Reviewed-by:
Joseph Qi <joseph.qi@huawei.com> Reviewed-by:
Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by:
Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit f7cf4f5b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Junxiao Bi authored
There is a race window in dlm_do_recovery() between dlm_remaster_locks() and dlm_reset_recovery() when the recovery master nearly finish the recovery process for a dead node. After the master sends FINALIZE_RECO message in dlm_remaster_locks(), another node may become the recovery master for another dead node, and then send the BEGIN_RECO message to all the nodes included the old master, in the handler of this message dlm_begin_reco_handler() of old master, dlm->reco.dead_node and dlm->reco.new_master will be set to the second dead node and the new master, then in dlm_reset_recovery(), these two variables will be reset to default value. This will cause new recovery master can not finish the recovery process and hung, at last the whole cluster will hung for recovery. old recovery master: new recovery master: dlm_remaster_locks() become recovery master for another dead node. dlm_send_begin_reco_message() dlm_begin_reco_handler() { if (dlm->reco.state & DLM_RECO_STATE_FINALIZE) { return -EAGAIN; } dlm_set_reco_master(dlm, br->node_idx); dlm_set_reco_dead_node(dlm, br->dead_node); } dlm_reset_recovery() { dlm_set_reco_dead_node(dlm, O2NM_INVALID_NODE_NUM); dlm_set_reco_master(dlm, O2NM_INVALID_NODE_NUM); } will hang in dlm_remaster_locks() for request dlm locks info Before send FINALIZE_RECO message, recovery master should set DLM_RECO_STATE_FINALIZE for itself and clear it after the recovery done, this can break the race windows as the BEGIN_RECO messages will not be handled before DLM_RECO_STATE_FINALIZE flag is cleared. A similar race may happen between new recovery master and normal node which is in dlm_finalize_reco_handler(), also fix it. Signed-off-by:
Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by:
Srinivas Eeda <srinivas.eeda@oracle.com> Reviewed-by:
Wengang Wang <wen.gang.wang@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit ded2cf71) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Junxiao Bi authored
This issue was introduced by commit 800deef3 ("ocfs2: use list_for_each_entry where benefical") in 2007 where it replaced list_for_each with list_for_each_entry. The variable "lock" will point to invalid data if "tmpq" list is empty and a panic will be triggered due to this. Sunil advised reverting it back, but the old version was also not right. At the end of the outer for loop, that list_for_each_entry will also set "lock" to an invalid data, then in the next loop, if the "tmpq" list is empty, "lock" will be an stale invalid data and cause the panic. So reverting the list_for_each back and reset "lock" to NULL to fix this issue. Another concern is that this seemes can not happen because the "tmpq" list should not be empty. Let me describe how. old lock resource owner(node 1): migratation target(node 2): image there's lockres with a EX lock from node 2 in granted list, a NR lock from node x with convert_type EX in converting list. dlm_empty_lockres() { dlm_pick_migration_target() { pick node 2 as target as its lock is the first one in granted list. } dlm_migrate_lockres() { dlm_mark_lockres_migrating() { res->state |= DLM_LOCK_RES_BLOCK_DIRTY; wait_event(dlm->ast_wq, !dlm_lockres_is_dirty(dlm, res)); //after the above code, we can not dirty lockres any more, // so dlm_thread shuffle list will not run downconvert lock from EX to NR upconvert lock from NR to EX <<< migration may schedule out here, then <<< node 2 send down convert request to convert type from EX to <<< NR, then send up convert request to convert type from NR to <<< EX, at this time, lockres granted list is empty, and two locks <<< in the converting list, node x up convert lock followed by <<< node 2 up convert lock. // will set lockres RES_MIGRATING flag, the following // lock/unlock can not run dlm_lockres_release_ast(dlm, res); } dlm_send_one_lockres() dlm_process_recovery_data() for (i=0; i<mres->num_locks; i++) if (ml->node == dlm->node_num) for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) { list_for_each_entry(lock, tmpq, list) if (lock) break; <<< lock is invalid as grant list is empty. } if (lock->ml.node != ml->node) BUG() >>> crash here } I see the above locks status from a vmcore of our internal bug. Signed-off-by:
Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by:
Wengang Wang <wen.gang.wang@oracle.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Reviewed-by:
Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 34aa8dac) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Matt Fleming authored
Kees reported the following error: arch/sh/kernel/dumpstack.c: In function 'print_trace_address': arch/sh/kernel/dumpstack.c:118:2: error: format not a string literal and no format arguments [-Werror=format-security] Use the "%s" format so that it's impossible to interpret 'data' as a format string. Signed-off-by:
Matt Fleming <matt.fleming@intel.com> Reported-by:
Kees Cook <keescook@chromium.org> Acked-by:
Kees Cook <keescook@chromium.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit a0c32761) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Deucher authored
This needs to be done to update some of the fields in the connector structure used by the audio code. Noticed by several users on irc. Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit 16086279) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Christopher Friedt authored
Previously, the vmwgfx_fb driver would allow users to call FBIOSET_VINFO, but it would not adjust the FINFO properly, resulting in distorted screen rendering. The patch corrects that behaviour. See https://bugs.gentoo.org/show_bug.cgi?id=494794 for examples. Cc: stable@vger.kernel.org Signed-off-by:
Christopher Friedt <chrisfriedt@gmail.com> Reviewed-by:
Thomas Hellstrom <thellstrom@vmware.com> (cherry picked from commit aa6de142) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Oleg Nesterov authored
pidns_get()->get_pid_ns() can hit ns == NULL. This task_struct can't go away, but task_active_pid_ns(task) is NULL if release_task(task) was already called. Alternatively we could change get_pid_ns(ns) to check ns != NULL, but it seems that other callers are fine. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Cc: Eric W. Biederman ebiederm@xmission.com> Cc: stable@kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit d2308225) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jeff Mahoney authored
jdm-20004 reiserfs_delete_xattrs: Couldn't delete all xattrs (-2) The -ENOENT is due to readdir calling dir_emit on the same entry twice. If the dir_emit callback sleeps and the tree is changed underneath us, we won't be able to trust deh_offset(deh) anymore. We need to save next_pos before we might sleep so we can find the next entry. CC: stable@vger.kernel.org Signed-off-by:
Jeff Mahoney <jeffm@suse.com> Signed-off-by:
Jan Kara <jack@suse.cz> (cherry picked from commit 01d88857) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Al Viro authored
it's pointless and actually leads to wrong behaviour in at least one moderately convoluted case (pipe(), close one end, try to get to another via /proc/*/fd and run into ETXTBUSY). Cc: stable@vger.kernel.org Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit dd20908a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Yann Droneaud authored
In case of error when writing to userspace, function ehca_create_cq() does not set an error code before following its error path. This patch sets the error code to -EFAULT when ib_copy_to_udata() fails. This was caught when using spatch (aka. coccinelle) to rewrite call to ib_copy_{from,to}_udata(). Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Cc: <stable@vger.kernel.org> Signed-off-by:
Yann Droneaud <ydroneaud@opteya.com> Signed-off-by:
Roland Dreier <roland@purestorage.com> (cherry picked from commit 5bdb0f02) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Yann Droneaud authored
In case of error when writing to userspace, the function mthca_create_cq() does not set an error code before following its error path. This patch sets the error code to -EFAULT when ib_copy_to_udata() fails. This was caught when using spatch (aka. coccinelle) to rewrite call to ib_copy_{from,to}_udata(). Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com Cc: <stable@vger.kernel.org> Signed-off-by:
Yann Droneaud <ydroneaud@opteya.com> Signed-off-by:
Roland Dreier <roland@purestorage.com> (cherry picked from commit 08e74c4b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Stanislav Kinsbursky authored
There could be a case, when NFSd file system is mounted in network, different to socket's one, like below: "ip netns exec" creates new network and mount namespace, which duplicates NFSd mount point, created in init_net context. And thus NFS server stop in nested network context leads to RPCBIND client destruction in init_net. Then, on NFSd start in nested network context, rpc.nfsd process creates socket in nested net and passes it into "write_ports", which leads to RPCBIND sockets creation in init_net context because of the same reason (NFSd monut point was created in init_net context). An attempt to register passed socket in nested net leads to panic, because no RPCBIND client present in nexted network namespace. This patch add check that passed socket's net matches NFSd superblock's one. And returns -EINVAL error to user psace otherwise. v2: Put socket on exit. Reported-by:
Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by:
Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org Signed-off-by:
J. Bruce Fields <bfields@redhat.com> (cherry picked from commit 30646394) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Neil Horman authored
Commit 03bbcb2e (iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets) properly disables irq remapping on the 5500/5520 chipsets that don't correctly perform that feature. However, when I wrote it, I followed the errata sheet linked in that commit too closely, and explicitly tied the activation of the quirk to revision 0x13 of the chip, under the assumption that earlier revisions were not in the field. Recently a system was reported to be suffering from this remap bug and the quirk hadn't triggered, because the revision id register read at a lower value that 0x13, so the quirk test failed improperly. Given this, it seems only prudent to adjust this quirk so that any revision less than 0x13 has the quirk asserted. [ tglx: Removed the 0x12 comparison of pci id 3405 as this is covered by the <= 0x13 check already ] Signed-off-by:
Neil Horman <nhorman@tuxdriver.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1394649873-14913-1-git-send-email-nhorman@tuxdriver.comSigned-off-by:
Thomas Gleixner <tglx@linutronix.de> (cherry picked from commit 6f8a1b33) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
W. Trevor King authored
The `lspci -nnvv` output contains (wrapped for line length): 00:1b.0 Audio device [0403]: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller [8086:1e20] (rev 04) Subsystem: ASUSTeK Computer Inc. Device [1043:115d] Signed-off-by:
W. Trevor King <wking@tremily.us> Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> (cherry picked from commit a4b7f21d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Huacai Chen authored
The original MIPS hibernate code flushes cache and TLB entries in swsusp_arch_resume(). But they are removed in Commit 44eeab67 (MIPS: Hibernation: Remove SMP TLB and cacheflushing code.). A cross- CPU flush is surely unnecessary because all but the local CPU have already been disabled. But a local flush (at least the TLB flush) is needed. When we do hibernation on Loongson-3 with an E1000E NIC, it is very easy to produce a kernel panic (kernel page fault, or unaligned access). The root cause is E1000E driver use vzalloc_node() to allocate pages, the stale TLB entries of the booting kernel will be misused by the resumed target kernel. Signed-off-by:
Huacai Chen <chenhc@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/6643/Signed-off-by:
Ralf Baechle <ralf@linux-mips.org> (cherry picked from commit c14af233) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-