- 23 Feb, 2017 40 commits
-
-
Manfred Spraul authored
commit 5864a2fd upstream. Commit 6d07b68c ("ipc/sem.c: optimize sem_lock()") introduced a race: sem_lock has a fast path that allows parallel simple operations. There are two reasons why a simple operation cannot run in parallel: - a non-simple operations is ongoing (sma->sem_perm.lock held) - a complex operation is sleeping (sma->complex_count != 0) As both facts are stored independently, a thread can bypass the current checks by sleeping in the right positions. See below for more details (or kernel bugzilla 105651). The patch fixes that by creating one variable (complex_mode) that tracks both reasons why parallel operations are not possible. The patch also updates stale documentation regarding the locking. With regards to stable kernels: The patch is required for all kernels that include the commit 6d07b68c ("ipc/sem.c: optimize sem_lock()") (3.10?) The alternative is to revert the patch that introduced the race. The patch is safe for backporting, i.e. it makes no assumptions about memory barriers in spin_unlock_wait(). Background: Here is the race of the current implementation: Thread A: (simple op) - does the first "sma->complex_count == 0" test Thread B: (complex op) - does sem_lock(): This includes an array scan. But the scan can't find Thread A, because Thread A does not own sem->lock yet. - the thread does the operation, increases complex_count, drops sem_lock, sleeps Thread A: - spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock) - sleeps before the complex_count test Thread C: (complex op) - does sem_lock (no array scan, complex_count==1) - wakes up Thread B. - decrements complex_count Thread A: - does the complex_count test Bug: Now both thread A and thread C operate on the same array, without any synchronization. Fixes: 6d07b68c ("ipc/sem.c: optimize sem_lock()") Link: http://lkml.kernel.org/r/1469123695-5661-1-git-send-email-manfred@colorfullife.com Reported-by: <felixh@informatik.uni-bremen.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: <1vier1@web.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.16: - We missed out on some earlier memory barrier changes - Use set_mb instead of smp_store_mb] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Joe Perches authored
commit 7f032d6e upstream. The seq_printf return value, because it's frequently misused, will eventually be converted to void. See: commit 1f33c41c ("seq_file: Rename seq_overflow() to seq_has_overflowed() and make public") Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Paul E. McKenney authored
commit 536fa402 upstream. CPUs without single-byte and double-byte loads and stores place some "interesting" requirements on concurrent code. For example (adapted from Peter Hurley's test code), suppose we have the following structure: struct foo { spinlock_t lock1; spinlock_t lock2; char a; /* Protected by lock1. */ char b; /* Protected by lock2. */ }; struct foo *foop; Of course, it is common (and good) practice to place data protected by different locks in separate cache lines. However, if the locks are rarely acquired (for example, only in rare error cases), and there are a great many instances of the data structure, then memory footprint can trump false-sharing concerns, so that it can be better to place them in the same cache cache line as above. But if the CPU does not support single-byte loads and stores, a store to foop->a will do a non-atomic read-modify-write operation on foop->b, which will come as a nasty surprise to someone holding foop->lock2. So we now require CPUs to support single-byte and double-byte loads and stores. Therefore, this commit adjusts the definition of __native_word() to allow these sizes to be used by smp_load_acquire() and smp_store_release(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Hidehiro Kawai authored
commit 54c721b8 upstream. Daniel Walker reported problems which happens when crash_kexec_post_notifiers kernel option is enabled (https://lkml.org/lkml/2015/6/24/44). In that case, smp_send_stop() is called before entering kdump routines which assume other CPUs are still online. As the result, kdump routines fail to save other CPUs' registers. Additionally for MIPS OCTEON, it misses to stop the watchdog timer. To fix this problem, call a new kdump friendly function, crash_smp_send_stop(), instead of the smp_send_stop() when crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a weak function, and it just call smp_send_stop(). Architecture codes should override it so that kdump can work appropriately. This patch provides MIPS version. Fixes: f06e5153 (kernel/panic.c: add "crash_kexec_post_notifiers" option) Link: http://lkml.kernel.org/r/20160810080950.11028.28000.stgit@sysi4-13.yrl.intra.hitachi.co.jpSigned-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Reported-by: Daniel Walker <dwalker@fifo99.com> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Daniel Walker <dwalker@fifo99.com> Cc: Xunlei Pang <xpang@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: "Steven J. Hill" <steven.hill@cavium.com> Cc: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Hidehiro Kawai authored
commit 0ee59413 upstream. Daniel Walker reported problems which happens when crash_kexec_post_notifiers kernel option is enabled (https://lkml.org/lkml/2015/6/24/44). In that case, smp_send_stop() is called before entering kdump routines which assume other CPUs are still online. As the result, for x86, kdump routines fail to save other CPUs' registers and disable virtualization extensions. To fix this problem, call a new kdump friendly function, crash_smp_send_stop(), instead of the smp_send_stop() when crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a weak function, and it just call smp_send_stop(). Architecture codes should override it so that kdump can work appropriately. This patch only provides x86-specific version. For Xen's PV kernel, just keep the current behavior. NOTES: - Right solution would be to place crash_smp_send_stop() before __crash_kexec() invocation in all cases and remove smp_send_stop(), but we can't do that until all architectures implement own crash_smp_send_stop() - crash_smp_send_stop()-like work is still needed by machine_crash_shutdown() because crash_kexec() can be called without entering panic() Fixes: f06e5153 (kernel/panic.c: add "crash_kexec_post_notifiers" option) Link: http://lkml.kernel.org/r/20160810080948.11028.15344.stgit@sysi4-13.yrl.intra.hitachi.co.jpSigned-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Reported-by: Daniel Walker <dwalker@fifo99.com> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Daniel Walker <dwalker@fifo99.com> Cc: Xunlei Pang <xpang@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: "Steven J. Hill" <steven.hill@cavium.com> Cc: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Paul Mackerras authored
commit 1a34439e upstream. Debugging a data corruption issue with virtio-net/vhost-net led to the observation that __copy_tofrom_user was occasionally returning a value 16 larger than it should. Since the return value from __copy_tofrom_user is the number of bytes not copied, this means that __copy_tofrom_user can occasionally return a value larger than the number of bytes it was asked to copy. In turn this can cause higher-level copy functions such as copy_page_to_iter_iovec to corrupt memory by copying data into the wrong memory locations. It turns out that the failing case involves a fault on the store at label 79, and at that point the first unmodified byte of the destination is at R3 + 16. Consequently the exception handler for that store needs to add 16 to R3 before using it to work out how many bytes were not copied, but in this one case it was not adding the offset to R3. To fix it, this moves the label 179 to the point where we add 16 to R3. I have checked manually all the exception handlers for the loads and stores in this code and the rest of them are correct (it would be excellent to have an automated test of all the exception cases). This bug has been present since this code was initially committed in May 2002 to Linux version 2.5.20. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Laurent Dufour authored
commit 05af40e8 upstream. This commit fixes a stack corruption in the pseries specific code dealing with the huge pages. In __pSeries_lpar_hugepage_invalidate() the buffer used to pass arguments to the hypervisor is not large enough. This leads to a stack corruption where a previously saved register could be corrupted leading to unexpected result in the caller, like the following panic: Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=2048 NUMA pSeries Modules linked in: virtio_balloon ip_tables x_tables autofs4 virtio_blk 8139too virtio_pci virtio_ring 8139cp virtio CPU: 11 PID: 1916 Comm: mmstress Not tainted 4.8.0 #76 task: c000000005394880 task.stack: c000000005570000 NIP: c00000000027bf6c LR: c00000000027bf64 CTR: 0000000000000000 REGS: c000000005573820 TRAP: 0300 Not tainted (4.8.0) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 84822884 XER: 20000000 CFAR: c00000000010a924 DAR: 420000000014e5e0 DSISR: 40000000 SOFTE: 1 GPR00: c00000000027bf64 c000000005573aa0 c000000000e02800 c000000004447964 GPR04: c00000000404de18 c000000004d38810 00000000042100f5 00000000f5002104 GPR08: e0000000f5002104 0000000000000001 042100f5000000e0 00000000042100f5 GPR12: 0000000000002200 c00000000fe02c00 c00000000404de18 0000000000000000 GPR16: c1ffffffffffe7ff 00003fff62000000 420000000014e5e0 00003fff63000000 GPR20: 0008000000000000 c0000000f7014800 0405e600000000e0 0000000000010000 GPR24: c000000004d38810 c000000004447c10 c00000000404de18 c000000004447964 GPR28: c000000005573b10 c000000004d38810 00003fff62000000 420000000014e5e0 NIP [c00000000027bf6c] zap_huge_pmd+0x4c/0x470 LR [c00000000027bf64] zap_huge_pmd+0x44/0x470 Call Trace: [c000000005573aa0] [c00000000027bf64] zap_huge_pmd+0x44/0x470 (unreliable) [c000000005573af0] [c00000000022bbd8] unmap_page_range+0xcf8/0xed0 [c000000005573c30] [c00000000022c2d4] unmap_vmas+0x84/0x120 [c000000005573c80] [c000000000235448] unmap_region+0xd8/0x1b0 [c000000005573d80] [c0000000002378f0] do_munmap+0x2d0/0x4c0 [c000000005573df0] [c000000000237be4] SyS_munmap+0x64/0xb0 [c000000005573e30] [c000000000009560] system_call+0x38/0x108 Instruction dump: fbe1fff8 fb81ffe0 7c7f1b78 7ca32b78 7cbd2b78 f8010010 7c9a2378 f821ffb1 7cde3378 4bfffea9 7c7b1b79 41820298 <e87f0000> 48000130 7fa5eb78 7fc4f378 Most of the time, the bug is surfacing in a caller up in the stack from __pSeries_lpar_hugepage_invalidate() which is quite confusing. This bug is pending since v3.11 but was hidden if a caller of the caller of __pSeries_lpar_hugepage_invalidate() has pushed the corruped register (r18 in this case) in the stack and is not using it until restoring it. GCC 6.2.0 seems to raise it more frequently. This commit also change the definition of the parameter buffer in pSeries_lpar_flush_hash_range() to rely on the global define PLPAR_HCALL9_BUFSIZE (no functional change here). Fixes: 1a527286 ("powerpc: Optimize hugepage invalidate") Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Wei Fang authored
commit c2a9737f upstream. We triggered a deadloop in truncate_inode_pages_range() on 32 bits architecture with the test case bellow: ... fd = open(); write(fd, buf, 4096); preadv64(fd, &iovec, 1, 0xffffffff000); ftruncate(fd, 0); ... Then ftruncate() will not return forever. The filesystem used in this case is ubifs, but it can be triggered on many other filesystems. When preadv64() is called with offset=0xffffffff000, a page with index=0xffffffff will be added to the radix tree of ->mapping. Then this page can be found in ->mapping with pagevec_lookup(). After that, truncate_inode_pages_range(), which is called in ftruncate(), will fall into an infinite loop: - find a page with index=0xffffffff, since index>=end, this page won't be truncated - index++, and index become 0 - the page with index=0xffffffff will be found again The data type of index is unsigned long, so index won't overflow to 0 on 64 bits architecture in this case, and the dead loop won't happen. Since truncate_inode_pages_range() is executed with holding lock of inode->i_rwsem, any operation related with this lock will be blocked, and a hung task will happen, e.g.: INFO: task truncate_test:3364 blocked for more than 120 seconds. ... call_rwsem_down_write_failed+0x17/0x30 generic_file_write_iter+0x32/0x1c0 ubifs_write_iter+0xcc/0x170 __vfs_write+0xc4/0x120 vfs_write+0xb2/0x1b0 SyS_write+0x46/0xa0 The page with index=0xffffffff added to ->mapping is useless. Fix this by checking the read position before allocating pages. Link: http://lkml.kernel.org/r/1475151010-40166-1-git-send-email-fangwei1@huawei.comSigned-off-by: Wei Fang <fangwei1@huawei.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Gerald Schaefer authored
commit 082d5b6b upstream. In dissolve_free_huge_pages(), free hugepages will be dissolved without making sure that there are enough of them left to satisfy hugepage reservations. Fix this by adding a return value to dissolve_free_huge_pages() and checking h->free_huge_pages vs. h->resv_huge_pages. Note that this may lead to the situation where dissolve_free_huge_page() returns an error and all free hugepages that were dissolved before that error are lost, while the memory block still cannot be set offline. Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Link: http://lkml.kernel.org/r/20160926172811.94033-3-gerald.schaefer@de.ibm.comSigned-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Rui Teng <rui.teng@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Gerald Schaefer authored
commit 2247bb33 upstream. Patch series "mm/hugetlb: memory offline issues with hugepages", v4. This addresses several issues with hugepages and memory offline. While the first patch fixes a panic, and is therefore rather important, the last patch is just a performance optimization. The second patch fixes a theoretical issue with reserved hugepages, while still leaving some ugly usability issue, see description. This patch (of 3): dissolve_free_huge_pages() will either run into the VM_BUG_ON() or a list corruption and addressing exception when trying to set a memory block offline that is part (but not the first part) of a "gigantic" hugetlb page with a size > memory block size. When no other smaller hugetlb page sizes are present, the VM_BUG_ON() will trigger directly. In the other case we will run into an addressing exception later, because dissolve_free_huge_page() will not work on the head page of the compound hugetlb page which will result in a NULL hstate from page_hstate(). To fix this, first remove the VM_BUG_ON() because it is wrong, and then use the compound head page in dissolve_free_huge_page(). This means that an unused pre-allocated gigantic page that has any part of itself inside the memory block that is going offline will be dissolved completely. Losing an unused gigantic hugepage is preferable to failing the memory offline, for example in the situation where a (possibly faulty) memory DIMM needs to go offline. Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Link: http://lkml.kernel.org/r/20160926172811.94033-2-gerald.schaefer@de.ibm.comSigned-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Rui Teng <rui.teng@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Dmitry Torokhov authored
commit 62837b3c upstream. Another Lifebook machine that needs the same quirk as other similar models to make the driver working. Also let's reorder elantech_dmi_force_crc_enabled list so LIfebook enries are in alphabetical order. Reported-by: William Linna <william.linna@gmail.com> Tested-by: William Linna <william.linna@gmail.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Bart Van Assche authored
commit 681cc360 upstream. Avoid that mapping an sg-list in which the first element has a non-zero offset triggers an infinite loop when using FMR. This patch makes the FMR mapping code similar to that of ib_sg_to_pages(). Note: older Mellanox HCAs do not support non-zero offsets for FMR. See also commit 8c4037b5 ("IB/srp: always avoid non-zero offsets into an FMR"). Reported-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Guenter Roeck authored
commit 35d04077 upstream. The definition of atomic_dec_if_positive() assumes that atomic_sub_if_positive() exists, which is only the case if metag specific atomics are used. This results in the following build error when trying to build metag1_defconfig. kernel/ucount.c: In function 'dec_ucount': kernel/ucount.c:211: error: implicit declaration of function 'atomic_sub_if_positive' Moving the definition of atomic_dec_if_positive() into the metag conditional code fixes the problem. Fixes: 6006c0d8 ("metag: Atomics, locks and bitops") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Eric Dumazet authored
commit d35c99ff upstream. Since linux-3.15, netlink_dump() can use up to 16384 bytes skb allocations. Due to struct skb_shared_info ~320 bytes overhead, we end up using order-3 (on x86) page allocations, that might trigger direct reclaim and add stress. The intent was really to attempt a large allocation but immediately fallback to a smaller one (order-1 on x86) in case of memory stress. On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to meet the goal. Old kernels would need to remove __GFP_WAIT While we are at it, since we do an order-3 allocation, allow to use all the allocated bytes instead of 16384 to reduce syscalls during large dumps. iproute2 already uses 32KB recvmsg() buffer sizes. Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384) Fixes: 9063e21f ("netlink: autosize skb lengthes") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Alexei Starovoitov <ast@kernel.org> Cc: Greg Thelen <gthelen@google.com> Reviewed-by: Greg Rose <grose@lightfleet.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.16: - Adjust context - Mask out __GFP_WAIT, as suggested] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Johannes Weiner authored
commit 3ddf40e8 upstream. Commit 22f2ac51 ("mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()") switched replace_page_cache() from raw radix tree operations to page_cache_tree_insert() but didn't take into account that the latter function, unlike the raw radix tree op, handles mapping->nrpages. As a result, that counter is bumped for each page replacement rather than balanced out even. The mapping->nrpages counter is used to skip needless radix tree walks when invalidating, truncating, syncing inodes without pages, as well as statistics for userspace. Since the error is positive, we'll do more page cache tree walks than necessary; we won't miss a necessary one. And we'll report more buffer pages to userspace than there are. The error is limited to fuse inodes. Fixes: 22f2ac51 ("mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Justin Maggard authored
commit c8475090 upstream. Add missing dmaengine_unmap_put(), so we don't OOM during RAID6 sync. Fixes: 1786b943 ("async_pq_val: convert to dmaengine_unmap_data") Signed-off-by: Justin Maggard <jmaggard@netgear.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Lu Baolu authored
commit 8dcc5ff8 upstream. Member "status" of struct usb_sg_request is managed by usb core. A spin lock is used to serialize the change of it. The driver could check the value of req->status, but should avoid changing it without the hold of the spinlock. Otherwise, it could cause race or error in usb core. This patch could be backported to stable kernels with version later than v3.14. Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Roger Tseng <rogerable@realtek.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Dan Carpenter authored
commit 9a6dc644 upstream. set_bit() and clear_bit() take the bit number so this code is really doing "1 << (1 << irq)" which is a double shift bug. It's done consistently so it won't cause a problem unless "irq" is more than 4. Fixes: 70c6cce0 ('mfd: Support 88pm80x in 80x driver') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Uwe Kleine-König authored
commit 88003fb1 upstream. This fixes a compile failure: drivers/built-in.o: In function `wm8350_i2c_probe': core.c:(.text+0x828b0): undefined reference to `__devm_regmap_init_i2c' Makefile:953: recipe for target 'vmlinux' failed Fixes: 52b461b8 ("mfd: Add regmap cache support for wm8350") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Gavin Shan authored
commit 5adaf862 upstream. This fixes the warnings reported from sparse: pci.c:312:33: warning: restricted __be64 degrades to integer pci.c:313:33: warning: restricted __be64 degrades to integer Fixes: cee72d5b ("powerpc/powernv: Display diag data on p7ioc EEH errors") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Guilherme G Piccoli authored
commit edfc23ee upstream. Although rare, it's possible to hit PCI error early on device probe, meaning possibly some structs are not entirely initialized, and some might even be completely uninitialized, leading to NULL pointer dereference. The i40e driver currently presents a "bad" behavior if device hits such early PCI error: firstly, the struct i40e_pf might not be attached to pci_dev yet, leading to a NULL pointer dereference on access to pf->state. Even checking if the struct is NULL and avoiding the access in that case isn't enough, since the driver cannot recover from PCI error that early; in our experiments we saw multiple failures on kernel log, like: [549.664] i40e 0007:01:00.1: Initial pf_reset failed: -15 [549.664] i40e: probe of 0007:01:00.1 failed with error -15 [...] [871.644] i40e 0007:01:00.1: The driver for the device stopped because the device firmware failed to init. Try updating your NVM image. [871.644] i40e: probe of 0007:01:00.1 failed with error -32 [...] [872.516] i40e 0007:01:00.0: ARQ: Unknown event 0x0000 ignored Between the first probe failure (error -15) and the second (error -32) another PCI error happened due to the first bad probe. Also, driver started to flood console with those ARQ event messages. This patch will prevent these issues by allowing error recovery mechanism to remove the failed device from the system instead of trying to recover from early PCI errors during device probe. Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Richard Weinberger authored
commit f7d11b33 upstream. Usually Fastmap is free to consider every PEB in one of the pools as newer than the existing PEB. Since PEBs in a pool are by definition newer than everything else. But update_vol() missed the case that a pool can contain more than one candidate. Fixes: dbb7d2a8 ("UBI: Add fastmap core") Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Richard Weinberger <richard@nod.at> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Richard Weinberger authored
commit 2e8f08de upstream. When writing a new Fastmap the first thing that happens is refilling the pools in memory. At this stage it is possible that new PEBs from the new pools get already claimed and written with data. If this happens before the new Fastmap data structure hits the flash and we face power cut the freshly written PEB will not scanned and unnoticed. Solve the issue by locking the pools until Fastmap is written. Fixes: dbb7d2a8 ("UBI: Add fastmap core") Signed-off-by: Richard Weinberger <richard@nod.at> [bwh: Backported to 3.16: - Adjust filename, context, indentation - s/ubi->fm_eba_sem/ubi->fm_sem/] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Richard Weinberger authored
commit 23654188 upstream. When Fastmap is used we can face here an -EBADMSG since Fastmap cannot know about unmaps. If the erasure was interrupted the PEB may show ECC errors and UBI would go to ro-mode as it assumes that the PEB was check during attach time, which is not the case with Fastmap. Fixes: dbb7d2a8 ("UBI: Add fastmap core") Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Boris Brezillon authored
commit ecbfa8ea upstream. scan_pool() does not mark the PEB for scrubing when bitflips are detected in the EC header of a free PEB (VID header region left to 0xff). Make sure we scrub the PEB in this case. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Fixes: dbb7d2a8 ("UBI: Add fastmap core") Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Wei Yongjun authored
commit 23bf4042 upstream. Fix following static checker warning: drivers/staging/rtl8188eu/os_dep/usb_intf.c:311 rtw_resume_process() error: double unlock 'mutex:&pwrpriv->mutex_lock' Fixes: eaf47b71 ("staging: rtl8188eu: fix missing unlock on error in rtw_resume_process()") Reported-By: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - Adjust context - Unlock pwrctrl_priv::lock] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Wei Yongjun authored
commit eaf47b71 upstream. Add the missing unlock before return from function rtw_resume_process() in the error handling case. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - Adjust context - Unlock pwrctrl_priv::lock] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Ondrej Mosnáček authored
commit 50d2e6dc upstream. The cipher block size for GCM is 16 bytes, and thus the CTR transform used in crypto_gcm_setkey() will also expect a 16-byte IV. However, the code currently reserves only 8 bytes for the IV, causing an out-of-bounds access in the CTR transform. This patch fixes the issue by setting the size of the IV buffer to 16 bytes. Fixes: 84c91152 ("[CRYPTO] gcm: Add support for async ciphers") Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Miklos Szeredi authored
commit cb3ae6d2 upstream. Make sure userspace filesystem is returning a well formed list of xattr names (zero or more nonzero length, null terminated strings). [Michael Theall: only verify in the nonzero size case] Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> [bwh: Backported to 3.16: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Miklos Szeredi authored
commit a09f99ed upstream. Fuse allowed VFS to set mode in setattr in order to clear suid/sgid on chown and truncate, and (since writeback_cache) write. The problem with this is that it'll potentially restore a stale mode. The poper fix would be to let the filesystems do the suid/sgid clearing on the relevant operations. Possibly some are already doing it but there's no way we can detect this. So fix this by refreshing and recalculating the mode. Do this only if ATTR_KILL_S[UG]ID is set to not destroy performance for writes. This is still racy but the size of the window is reduced. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Miklos Szeredi authored
commit 5e2b8828 upstream. Without "default_permissions" the userspace filesystem's lookup operation needs to perform the check for search permission on the directory. If directory does not allow search for everyone (this is quite rare) then userspace filesystem has to set entry timeout to zero to make sure permissions are always performed. Changing the mode bits of the directory should also invalidate the (previously cached) dentry to make sure the next lookup will have a chance of updating the timeout, if needed. Reported-by: Jean-Pierre André <jean-pierre.andre@wanadoo.fr> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Sascha Silbe authored
commit 6cd997db upstream. con3270 contains an optimisation that reduces the amount of data to be transmitted to the 3270 terminal by putting a Repeat to Address (RA) order into the data stream. The RA order itself takes up space, so con3270 only uses it if there's enough space left in the line buffer. Otherwise it just pads out the line manually. For lines that were _just_ short enough that the RA order still fit in the line buffer, the line was instead padded with an insufficient amount of spaces. This was caused by examining the size of the allocated line buffer rather than the length of the string to be displayed. For con3270_cline_end(), we just compare against the line length. For con3270_update_string() however that isn't available anymore, so we check whether the Repeat to Address order is present. Fixes: f51320a5 ("[PATCH] s390: new 3270 driver.") (tglx/history.git) Tested-by: Jing Liu <liujbjl@linux.vnet.ibm.com> Tested-by: Yang Chen <bjcyang@linux.vnet.ibm.com> Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Sascha Silbe authored
commit c14f2aac upstream. con3270 contains an optimisation that reduces the amount of data to be transmitted to the 3270 terminal by putting a Repeat to Address (RA) order into the data stream. The RA order itself takes up space, so con3270 only uses it if there's enough space left in the line buffer. Otherwise it just pads out the line manually. For lines too long to include the RA order, one byte was left uninitialised. This was caused by an off-by-one bug in the loop that pads out the line. Since the buffer is allocated from a common pool, the single byte left uninitialised contained some previous buffer content. Usually this was just a space or some character (which can result in clutter but is otherwise harmless). Sometimes, however, it was a Repeat to Address order, messing up the entire screen layout and causing the display to send the entire buffer content on every keystroke. Fixes: f51320a5 ("[PATCH] s390: new 3270 driver.") (tglx/history.git) Reported-by: Liu Jing <liujbjl@linux.vnet.ibm.com> Tested-by: Jing Liu <liujbjl@linux.vnet.ibm.com> Tested-by: Yang Chen <bjcyang@linux.vnet.ibm.com> Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
gmail authored
commit e81d4477 upstream. The commit 6050d47a: "ext4: bail out from make_indexed_dir() on first error" could end up leaking bh2 in the error path. [ Also avoid renaming bh2 to bh, which just confuses things --tytso ] Signed-off-by: yangsheng <yngsion@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Anton Blanchard authored
commit 5045ea37 upstream. __kernel_get_syscall_map() and __kernel_clock_getres() use cmpli to check if the passed in pointer is non zero. cmpli maps to a 32 bit compare on binutils, so we ignore the top 32 bits. A simple test case can be created by passing in a bogus pointer with the bottom 32 bits clear. Using a clk_id that is handled by the VDSO, then one that is handled by the kernel shows the problem: printf("%d\n", clock_getres(CLOCK_REALTIME, (void *)0x100000000)); printf("%d\n", clock_getres(CLOCK_BOOTTIME, (void *)0x100000000)); And we get: 0 -1 The bigger issue is if we pass a valid pointer with the bottom 32 bits clear, in this case we will return success but won't write any data to the pointer. I stumbled across this issue because the LLVM integrated assembler doesn't accept cmpli with 3 arguments. Fix this by converting them to cmpldi. Fixes: a7f290da ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Alex Deucher authored
commit 42792029 upstream. Used the wrong index to setup the phase shedding mask. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> [bwh: Backported to 3.16: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Greg Kroah-Hartman authored
commit ab21b63e upstream. This reverts commit e6c7efdc. Turns out it was totally wrong. The memory is supposed to be bound to the kref, as the original code was doing correctly, not the device/driver binding as the devm_kzalloc() would cause. This fixes an oops when read would be called after the device was unbound from the driver. Reported-by: Ladislav Michl <ladis@linux-mips.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Trond Myklebust authored
commit 304020fe upstream. If the file permissions change on the server, then we may not be able to recover open state. If so, we need to ensure that we mark the file descriptor appropriately. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Kyle Jones authored
commit decc5360 upstream. Signed-off-by: Kyle Jones <kyle@kf5jwc.us> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Thomas Huth authored
commit fa73c3b2 upstream. The MMCR2 register is available twice, one time with number 785 (privileged access), and one time with number 769 (unprivileged, but it can be disabled completely). In former times, the Linux kernel was using the unprivileged register 769 only, but since commit 8dd75ccb ("powerpc: Use privileged SPR number for MMCR2"), it uses the privileged register 785 instead. The KVM-PR code then of course also switched to use the SPR 785, but this is causing older guest kernels to crash, since these kernels still access 769 instead. So to support older kernels with KVM-PR again, we have to support register 769 in KVM-PR, too. Fixes: 8dd75ccbSigned-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-