- 20 Nov, 2016 40 commits
-
-
Ben Hutchings authored
-
Johannes Weiner authored
commit d3798ae8 upstream. When the underflow checks were added to workingset_node_shadow_dec(), they triggered immediately: kernel BUG at ./include/linux/swap.h:276! invalid opcode: 0000 [#1] SMP Modules linked in: isofs usb_storage fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_REJECT nf_reject_ipv6 soundcore wmi acpi_als pinctrl_sunrisepoint kfifo_buf tpm_tis industrialio acpi_pad pinctrl_intel tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_crypt CPU: 0 PID: 20929 Comm: blkid Not tainted 4.8.0-rc8-00087-gbe67d60b #1 Hardware name: System manufacturer System Product Name/Z170-K, BIOS 1803 05/06/2016 task: ffff8faa93ecd940 task.stack: ffff8faa7f478000 RIP: page_cache_tree_insert+0xf1/0x100 Call Trace: __add_to_page_cache_locked+0x12e/0x270 add_to_page_cache_lru+0x4e/0xe0 mpage_readpages+0x112/0x1d0 blkdev_readpages+0x1d/0x20 __do_page_cache_readahead+0x1ad/0x290 force_page_cache_readahead+0xaa/0x100 page_cache_sync_readahead+0x3f/0x50 generic_file_read_iter+0x5af/0x740 blkdev_read_iter+0x35/0x40 __vfs_read+0xe1/0x130 vfs_read+0x96/0x130 SyS_read+0x55/0xc0 entry_SYSCALL_64_fastpath+0x13/0x8f Code: 03 00 48 8b 5d d8 65 48 33 1c 25 28 00 00 00 44 89 e8 75 19 48 83 c4 18 5b 41 5c 41 5d 41 5e 5d c3 0f 0b 41 bd ef ff ff ff eb d7 <0f> 0b e8 88 68 ef ff 0f 1f 84 00 RIP page_cache_tree_insert+0xf1/0x100 This is a long-standing bug in the way shadow entries are accounted in the radix tree nodes. The shrinker needs to know when radix tree nodes contain only shadow entries, no pages, so node->count is split in half to count shadows in the upper bits and pages in the lower bits. Unfortunately, the radix tree implementation doesn't know of this and assumes all entries are in node->count. When there is a shadow entry directly in root->rnode and the tree is later extended, the radix tree implementation will copy that entry into the new node and and bump its node->count, i.e. increases the page count bits. Once the shadow gets removed and we subtract from the upper counter, node->count underflows and triggers the warning. Afterwards, without node->count reaching 0 again, the radix tree node is leaked. Limit shadow entries to when we have actual radix tree nodes and can count them properly. That means we lose the ability to detect refaults from files that had only the first page faulted in at eviction time. Fixes: 449dd698 ("mm: keep page cache radix tree nodes in check") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [Johannes Weiner: it's drastically different than the upstream change, but a lot simpler because it predates the DAX stuff.] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Linus Torvalds authored
commit 21f54dda upstream. That just generally kills the machine, and makes debugging only much harder, since the traces may long be gone. Debugging by assert() is a disease. Don't do it. If you can continue, you're much better off doing so with a live machine where you have a much higher chance that the report actually makes it to the system logs, rather than result in a machine that is just completely dead. The only valid situation for BUG_ON() is when continuing is not an option, because there is massive corruption. But if you are just verifying that something is true, you warn about your broken assumptions (preferably just once), and limp on. Fixes: 22f2ac51 ("mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()") Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
James Hogan authored
commit 91e4f1b6 upstream. When a guest TLB entry is replaced by TLBWI or TLBWR, we only invalidate TLB entries on the local CPU. This doesn't work correctly on an SMP host when the guest is migrated to a different physical CPU, as it could pick up stale TLB mappings from the last time the vCPU ran on that physical CPU. Therefore invalidate both user and kernel host ASIDs on other CPUs, which will cause new ASIDs to be generated when it next runs on those CPUs. We're careful only to do this if the TLB entry was already valid, and only for the kernel ASID where the virtual address it mapped is outside of the guest user address range. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org [james.hogan@imgtec.com: Backport to 3.10..3.16] Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Xiaolong Ye authored
commit 5f25f066 upstream. time_in_state in struct devfreq is defined as unsigned long, so devm_kzalloc should use sizeof(unsigned long) as argument instead of sizeof(unsigned int), otherwise it will cause unexpected result in 64bit system. Signed-off-by: Xiaolong Ye <yexl@marvell.com> Signed-off-by: Kevin Liu <kliu5@marvell.com> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Paolo Bonzini authored
commit 95272c29 upstream. -ftracer can duplicate asm blocks causing compilation to fail in noclone functions. For example, KVM declares a global variable in an asm like asm("2: ... \n .pushsection data \n .global vmx_return \n vmx_return: .long 2b"); and -ftracer causes a double declaration. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: kvm@vger.kernel.org Reported-by: Linda Walsh <lkml@tlinx.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Philip Müller <philm@manjaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Jan Beulich authored
commit 9a035a40 upstream. This should really only be done for XS_TRANSACTION_END messages, or else at least some of the xenstore-* tools don't work anymore. Fixes: 0beef634 ("xenbus: don't BUG() on user mode induced condition") Reported-by: Richard Schütz <rschuetz@uni-koblenz.de> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Richard Schütz <rschuetz@uni-koblenz.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Jan Beulich authored
commit 0beef634 upstream. Inability to locate a user mode specified transaction ID should not lead to a kernel crash. For other than XS_TRANSACTION_START also don't issue anything to xenbus if the specified ID doesn't match that of any active transaction. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: Ed Swierk <eswierk@skyportsystems.com> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Ian Abbott authored
commit 5ca05345 upstream. For counter subdevices, the `s->insn_write` handler is being set to the wrong function, `ni_tio_insn_read()`. It should be `ni_tio_insn_write()`. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Reported-by: Éric Piel <piel@delmic.com> Fixes: 10f74377 ("staging: comedi: ni_tio: make ni_tio_winsn() a proper comedi (*insn_write)") Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Vladis Dronov authored
commit d5468d7a upstream. Commit 588afcc1 ("[media] usbvision fix overflow of interfaces array")' should be reverted, because: * "!dev->actconfig->interface[ifnum]" won't catch a case where the value is not NULL but some garbage. This way the system may crash later with GPF. * "(ifnum >= USB_MAXINTERFACES)" does not cover all the error conditions. "ifnum" should be compared to "dev->actconfig-> desc.bNumInterfaces", i.e. compared to the number of "struct usb_interface" kzalloc()-ed, not to USB_MAXINTERFACES. * There is a "struct usb_device" leak in this error path, as there is usb_get_dev(), but no usb_put_dev() on this path. * There is a bug of the same type several lines below with number of endpoints. The code is accessing hard-coded second endpoint ("interface->endpoint[1].desc") which may not exist. It would be great to handle this in the same patch too. * All the concerns above are resolved by already-accepted commit fa52bd50 ("[media] usbvision: fix crash on detecting device with invalid configuration") * Mailing list message: http://www.spinics.net/lists/linux-media/msg94832.htmlSigned-off-by: Vladis Dronov <vdronov@redhat.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Vineet Gupta authored
commit a6416f57 upstream. ARCompact and ARCv2 only have ASL, while binutils used to support LSL as a alias mnemonic. Newer binutils (upstream) don't want to do that so replace it. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Jan Kara authored
commit 07393101 upstream. When file permissions are modified via chmod(2) and the user is not in the owning group or capable of CAP_FSETID, the setgid bit is cleared in inode_change_ok(). Setting a POSIX ACL via setxattr(2) sets the file permissions as well as the new ACL, but doesn't clear the setgid bit in a similar way; this allows to bypass the check in chmod(2). Fix that. References: CVE-2016-7097 Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> [bwh: Backported to 3.16: - Drop changes to orangefs - Adjust context - Update ext3 as well] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Jan Kara authored
commit 030b533c upstream. Currently, notify_change() clears capabilities or IMA attributes by calling security_inode_killpriv() before calling into ->setattr. Thus it happens before any other permission checks in inode_change_ok() and user is thus allowed to trigger clearing of capabilities or IMA attributes for any file he can look up e.g. by calling chown for that file. This is unexpected and can lead to user DoSing a system. Fix the problem by calling security_inode_killpriv() at the end of inode_change_ok() instead of from notify_change(). At that moment we are sure user has permissions to do the requested change. References: CVE-2015-1350 Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Jan Kara authored
commit 31051c85 upstream. inode_change_ok() will be resposible for clearing capabilities and IMA extended attributes and as such will need dentry. Give it as an argument to inode_change_ok() instead of an inode. Also rename inode_change_ok() to setattr_prepare() to better relect that it does also some modifications in addition to checks. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> [bwh: Backported to 3.16: - Drop changes to orangefs, overlayfs - Adjust filenames, context - In fuse, pass dentry to fuse_do_setattr() - In nfsd, pass dentry to nfsd_sanitize_attrs() - In xfs, pass dentry to xfs_setattr_nonsize() and xfs_setattr_size() - Update ext3 as well] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Vlad Tsyrklevich authored
commit 05692d70 upstream. The VFIO_DEVICE_SET_IRQS ioctl did not sufficiently sanitize user-supplied integers, potentially allowing memory corruption. This patch adds appropriate integer overflow checks, checks the range bounds for VFIO_IRQ_SET_DATA_NONE, and also verifies that only single element in the VFIO_IRQ_SET_DATA_TYPE_MASK bitmask is set. VFIO_IRQ_SET_ACTION_TYPE_MASK is already correctly checked later in vfio_pci_set_irqs_ioctl(). Furthermore, a kzalloc is changed to a kcalloc because the use of a kzalloc with an integer multiplication allowed an integer overflow condition to be reached without this patch. kcalloc checks for overflow and should prevent a similar occurrence. Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Arend Van Spriel authored
commit ded89912 upstream. User-space can choose to omit NL80211_ATTR_SSID and only provide raw IE TLV data. When doing so it can provide SSID IE with length exceeding the allowed size. The driver further processes this IE copying it into a local variable without checking the length. Hence stack can be corrupted and used as exploit. Reported-by: Daxing Guo <freener.gdx@gmail.com> Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> [bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Stefan Richter authored
commit 667121ac upstream. The IP-over-1394 driver firewire-net lacked input validation when handling incoming fragmented datagrams. A maliciously formed fragment with a respectively large datagram_offset would cause a memcpy past the datagram buffer. So, drop any packets carrying a fragment with offset + length larger than datagram_size. In addition, ensure that - GASP header, unfragmented encapsulation header, or fragment encapsulation header actually exists before we access it, - the encapsulated datagram or fragment is of nonzero size. Reported-by: Eyal Itkin <eyal.itkin@gmail.com> Reviewed-by: Eyal Itkin <eyal.itkin@gmail.com> Fixes: CVE 2016-8633 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Dan Carpenter authored
commit 7bc2b55a upstream. We need to put an upper bound on "user_len" so the memcpy() doesn't overflow. Reported-by: Marco Grassi <marco.gra@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [bwh: Backported to 3.16: - Adjust context - Use literal 1032 insetad of ARCMSR_API_DATA_BUFLEN] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
David Howells authored
commit 03dab869 upstream. This fixes CVE-2016-7042. Fix a short sprintf buffer in proc_keys_show(). If the gcc stack protector is turned on, this can cause a panic due to stack corruption. The problem is that xbuf[] is not big enough to hold a 64-bit timeout rendered as weeks: (gdb) p 0xffffffffffffffffULL/(60*60*24*7) $2 = 30500568904943 That's 14 chars plus NUL, not 11 chars plus NUL. Expand the buffer to 16 chars. I think the unpatched code apparently works if the stack-protector is not enabled because on a 32-bit machine the buffer won't be overflowed and on a 64-bit machine there's a 64-bit aligned pointer at one side and an int that isn't checked again on the other side. The panic incurred looks something like: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6 ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679 Call Trace: [<ffffffff813d941f>] dump_stack+0x63/0x84 [<ffffffff811b2cb6>] panic+0xde/0x22a [<ffffffff81352ebe>] ? proc_keys_show+0x3ce/0x3d0 [<ffffffff8109f7f9>] __stack_chk_fail+0x19/0x30 [<ffffffff81352ebe>] proc_keys_show+0x3ce/0x3d0 [<ffffffff81350410>] ? key_validate+0x50/0x50 [<ffffffff8134db30>] ? key_default_cmp+0x20/0x20 [<ffffffff8126b31c>] seq_read+0x2cc/0x390 [<ffffffff812b6b12>] proc_reg_read+0x42/0x70 [<ffffffff81244fc7>] __vfs_read+0x37/0x150 [<ffffffff81357020>] ? security_file_permission+0xa0/0xc0 [<ffffffff81246156>] vfs_read+0x96/0x130 [<ffffffff81247635>] SyS_read+0x55/0xc0 [<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4 Reported-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Jaganath Kanakkassery authored
commit 951b6a07 upstream. addr can be NULL and it should not be dereferenced before NULL checking. Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Suzuki K. Poulose authored
commit 8fff105e upstream. The perf core implicitly rejects events spanning multiple HW PMUs, as in these cases the event->ctx will differ. However this validation is performed after pmu::event_init() is called in perf_init_event(), and thus pmu::event_init() may be called with a group leader from a different HW PMU. The ARM64 PMU driver does not take this fact into account, and when validating groups assumes that it can call to_arm_pmu(event->pmu) for any HW event. When the event in question is from another HW PMU this is wrong, and results in dereferencing garbage. This patch updates the ARM64 PMU driver to first test for and reject events from other PMUs, moving the to_arm_pmu and related logic after this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with a CCI PMU present: Bad mode in Synchronous Abort handler detected, code 0x86000006 -- IABT (current EL) CPU: 0 PID: 1371 Comm: perf_fuzzer Not tainted 3.19.0+ #249 Hardware name: V2F-1XV7 Cortex-A53x2 SMM (DT) task: ffffffc07c73a280 ti: ffffffc07b0a0000 task.ti: ffffffc07b0a0000 PC is at 0x0 LR is at validate_event+0x90/0xa8 pc : [<0000000000000000>] lr : [<ffffffc000090228>] pstate: 00000145 sp : ffffffc07b0a3ba0 [< (null)>] (null) [<ffffffc0000907d8>] armpmu_event_init+0x174/0x3cc [<ffffffc00015d870>] perf_try_init_event+0x34/0x70 [<ffffffc000164094>] perf_init_event+0xe0/0x10c [<ffffffc000164348>] perf_event_alloc+0x288/0x358 [<ffffffc000164c5c>] SyS_perf_event_open+0x464/0x98c Code: bad PC value Also cleans up the code to use the arm_pmu only when we know that we are dealing with an arm pmu event. Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Peter Ziljstra (Intel) <peterz@infradead.org> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Srinivas Ramana authored
commit 117e5e9c upstream. If the bootloader uses the long descriptor format and jumps to kernel decompressor code, TTBCR may not be in a right state. Before enabling the MMU, it is required to clear the TTBCR.PD0 field to use TTBR0 for translation table walks. The commit dbece458 ("ARM: 7501/1: decompressor: reset ttbcr for VMSA ARMv7 cores") does the reset of TTBCR.N, but doesn't consider all the bits for the size of TTBCR.N. Clear TTBCR.PD0 field and reset all the three bits of TTBCR.N to indicate the use of TTBR0 and the correct base address width. Fixes: dbece458 ("ARM: 7501/1: decompressor: reset ttbcr for VMSA ARMv7 cores") Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Srinivas Ramana <sramana@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Johannes Weiner authored
commit 22f2ac51 upstream. Antonio reports the following crash when using fuse under memory pressure: kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346! invalid opcode: 0000 [#1] SMP Modules linked in: all of them CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013 task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000 RIP: shadow_lru_isolate+0x181/0x190 Call Trace: __list_lru_walk_one.isra.3+0x8f/0x130 list_lru_walk_one+0x23/0x30 scan_shadow_nodes+0x34/0x50 shrink_slab.part.40+0x1ed/0x3d0 shrink_zone+0x2ca/0x2e0 kswapd+0x51e/0x990 kthread+0xd8/0xf0 ret_from_fork+0x3f/0x70 which corresponds to the following sanity check in the shadow node tracking: BUG_ON(node->count & RADIX_TREE_COUNT_MASK); The workingset code tracks radix tree nodes that exclusively contain shadow entries of evicted pages in them, and this (somewhat obscure) line checks whether there are real pages left that would interfere with reclaim of the radix tree node under memory pressure. While discussing ways how fuse might sneak pages into the radix tree past the workingset code, Miklos pointed to replace_page_cache_page(), and indeed there is a problem there: it properly accounts for the old page being removed - __delete_from_page_cache() does that - but then does a raw raw radix_tree_insert(), not accounting for the replacement page. Eventually the page count bits in node->count underflow while leaving the node incorrectly linked to the shadow node LRU. To address this, make sure replace_page_cache_page() uses the tracked page insertion code, page_cache_tree_insert(). This fixes the page accounting and makes sure page-containing nodes are properly unlinked from the shadow node LRU again. Also, make the sanity checks a bit less obscure by using the helpers for checking the number of pages and shadows in a radix tree node. Fixes: 449dd698 ("mm: keep page cache radix tree nodes in check") Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.orgSigned-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Antonio SJ Musumeci <trapexit@spawn.link> Debugged-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.16: - Implementation of page_cache_tree_insert() is different - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Paul Burton authored
commit 305723ab upstream. Malta boards used with CPU emulators feature a switch to disable use of an IOCU. Software has to check this switch & ignore any present IOCU if the switch is closed. The read used to do this was unsafe for 64 bit kernels, as it simply casted the address 0xbf403000 to a pointer & dereferenced it. Whilst in a 32 bit kernel this would access kseg1, in a 64 bit kernel this attempts to access xuseg & results in an address error exception. Fix by accessing a correctly formed ckseg1 address generated using the CKSEG1ADDR macro. Whilst modifying this code, define the name of the register and the bit we care about within it, which indicates whether PCI DMA is routed to the IOCU or straight to DRAM. The code previously checked that bit 0 was also set, but the least significant 7 bits of the CONFIG_GEN0 register contain the value of the MReqInfo signal provided to the IOCU OCP bus, so singling out bit 0 makes little sense & that part of the check is dropped. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: b6d92b4a ("MIPS: Add option to disable software I/O coherency.") Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Kees Cook <keescook@chromium.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/14187/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Roger Quadros authored
commit d248220f upstream. Since commit 6ce0d200 ("ARM: dma: Use dma_pfn_offset for dma address translation"), dma_to_pfn() already returns the PFN with the physical memory start offset so we don't need to add it again. This fixes USB mass storage lock-up problem on systems that can't do DMA over the entire physical memory range (e.g.) Keystone 2 systems with 4GB RAM can only do DMA over the first 2GB. [K2E-EVM]. What happens there is that without this patch SCSI layer sets a wrong bounce buffer limit in scsi_calculate_bounce_limit() for the USB mass storage device. dma_max_pfn() evaluates to 0x8fffff and bounce_limit is set to 0x8fffff000 whereas maximum DMA'ble physical memory on Keystone 2 is 0x87fffffff. This results in non DMA'ble pages being given to the USB controller and hence the lock-up. NOTE: in the above case, USB-SCSI-device's dma_pfn_offset was showing as 0. This should have really been 0x780000 as on K2e, LOWMEM_START is 0x80000000 and HIGHMEM_START is 0x800000000. DMA zone is 2GB so dma_max_pfn should be 0x87ffff. The incorrect dma_pfn_offset for the USB storage device is because USB devices are not correctly inheriting the dma_pfn_offset from the USB host controller. This will be fixed by a separate patch. Fixes: 6ce0d200 ("ARM: dma: Use dma_pfn_offset for dma address translation") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Olof Johansson <olof@lixom.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Linus Walleij <linus.walleij@linaro.org> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
zhong jiang authored
commit 5b398e41 upstream. I hit the following hung task when runing a OOM LTP test case with 4.1 kernel. Call trace: [<ffffffc000086a88>] __switch_to+0x74/0x8c [<ffffffc000a1bae0>] __schedule+0x23c/0x7bc [<ffffffc000a1c09c>] schedule+0x3c/0x94 [<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350 [<ffffffc000a1e32c>] down_write+0x64/0x80 [<ffffffc00021f794>] __ksm_exit+0x90/0x19c [<ffffffc0000be650>] mmput+0x118/0x11c [<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74 [<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4 [<ffffffc0000d0f34>] get_signal+0x444/0x5e0 [<ffffffc000089fcc>] do_signal+0x1d8/0x450 [<ffffffc00008a35c>] do_notify_resume+0x70/0x78 The oom victim cannot terminate because it needs to take mmap_sem for write while the lock is held by ksmd for read which loops in the page allocator ksm_do_scan scan_get_next_rmap_item down_read get_next_rmap_item alloc_rmap_item #ksmd will loop permanently. There is no way forward because the oom victim cannot release any memory in 4.1 based kernel. Since 4.6 we have the oom reaper which would solve this problem because it would release the memory asynchronously. Nevertheless we can relax alloc_rmap_item requirements and use __GFP_NORETRY because the allocation failure is acceptable as ksm_do_scan would just retry later after the lock got dropped. Such a patch would be also easy to backport to older stable kernels which do not have oom_reaper. While we are at it add GFP_NOWARN so the admin doesn't have to be alarmed by the allocation failure. Link: http://lkml.kernel.org/r/1474165570-44398-1-git-send-email-zhongjiang@huawei.comSigned-off-by: zhong jiang <zhongjiang@huawei.com> Suggested-by: Hugh Dickins <hughd@google.com> Suggested-by: Michal Hocko <mhocko@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Alex Deucher authored
commit 670bb4fd upstream. Add clock quirks for Jet parts. Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Tested-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Nikolay Aleksandrov authored
commit 2cf75070 upstream. Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid instead of the previous dst_pid which was copied from in_skb's portid. Since the skb is new the portid is 0 at that point so the packets are sent to the kernel and we get scheduling while atomic or a deadlock (depending on where it happens) by trying to acquire rtnl two times. Also since this is RTM_GETROUTE, it can be triggered by a normal user. Here's the sleeping while atomic trace: [ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620 [ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0 [ 7858.212881] 2 locks held by swapper/0/0: [ 7858.213013] #0: (((&mrt->ipmr_expire_timer))){+.-...}, at: [<ffffffff810fbbf5>] call_timer_fn+0x5/0x350 [ 7858.213422] #1: (mfc_unres_lock){+.....}, at: [<ffffffff8161e005>] ipmr_expire_process+0x25/0x130 [ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179 [ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 7858.214108] 0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000 [ 7858.214412] ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e [ 7858.214716] 000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f [ 7858.215251] Call Trace: [ 7858.215412] <IRQ> [<ffffffff813a7804>] dump_stack+0x85/0xc1 [ 7858.215662] [<ffffffff810a4a72>] ___might_sleep+0x192/0x250 [ 7858.215868] [<ffffffff810a4b9f>] __might_sleep+0x6f/0x100 [ 7858.216072] [<ffffffff8165bea3>] mutex_lock_nested+0x33/0x4d0 [ 7858.216279] [<ffffffff815a7a5f>] ? netlink_lookup+0x25f/0x460 [ 7858.216487] [<ffffffff8157474b>] rtnetlink_rcv+0x1b/0x40 [ 7858.216687] [<ffffffff815a9a0c>] netlink_unicast+0x19c/0x260 [ 7858.216900] [<ffffffff81573c70>] rtnl_unicast+0x20/0x30 [ 7858.217128] [<ffffffff8161cd39>] ipmr_destroy_unres+0xa9/0xf0 [ 7858.217351] [<ffffffff8161e06f>] ipmr_expire_process+0x8f/0x130 [ 7858.217581] [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180 [ 7858.217785] [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180 [ 7858.217990] [<ffffffff810fbc95>] call_timer_fn+0xa5/0x350 [ 7858.218192] [<ffffffff810fbbf5>] ? call_timer_fn+0x5/0x350 [ 7858.218415] [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180 [ 7858.218656] [<ffffffff810fde10>] run_timer_softirq+0x260/0x640 [ 7858.218865] [<ffffffff8166379b>] ? __do_softirq+0xbb/0x54f [ 7858.219068] [<ffffffff816637c8>] __do_softirq+0xe8/0x54f [ 7858.219269] [<ffffffff8107a948>] irq_exit+0xb8/0xc0 [ 7858.219463] [<ffffffff81663452>] smp_apic_timer_interrupt+0x42/0x50 [ 7858.219678] [<ffffffff816625bc>] apic_timer_interrupt+0x8c/0xa0 [ 7858.219897] <EOI> [<ffffffff81055f16>] ? native_safe_halt+0x6/0x10 [ 7858.220165] [<ffffffff810d64dd>] ? trace_hardirqs_on+0xd/0x10 [ 7858.220373] [<ffffffff810298e3>] default_idle+0x23/0x190 [ 7858.220574] [<ffffffff8102a20f>] arch_cpu_idle+0xf/0x20 [ 7858.220790] [<ffffffff810c9f8c>] default_idle_call+0x4c/0x60 [ 7858.221016] [<ffffffff810ca33b>] cpu_startup_entry+0x39b/0x4d0 [ 7858.221257] [<ffffffff8164f995>] rest_init+0x135/0x140 [ 7858.221469] [<ffffffff81f83014>] start_kernel+0x50e/0x51b [ 7858.221670] [<ffffffff81f82120>] ? early_idt_handler_array+0x120/0x120 [ 7858.221894] [<ffffffff81f8243f>] x86_64_start_reservations+0x2a/0x2c [ 7858.222113] [<ffffffff81f8257c>] x86_64_start_kernel+0x13b/0x14a Fixes: 2942e900 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Steven Rostedt (Red Hat) authored
commit 1245800c upstream. The iter->seq can be reset outside the protection of the mutex. So can reading of user data. Move the mutex up to the beginning of the function. Fixes: d7350c3f ("tracing/core: make the read callbacks reentrants") Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Lance Richardson authored
commit db32e4e4 upstream. Similar to commit 3be07244 ("ip6_gre: fix flowi6_proto value in xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup. Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value. This affected output route lookup for packets sent on an ip6gretap device in cases where routing was dependent on the value of flowi6_proto. Since the correct proto is already set in the tunnel flowi6 template via commit 252f3f5a ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit path."), simply delete the line setting the incorrect flowi6_proto value. Suggested-by: Jiri Benc <jbenc@redhat.com> Fixes: c12b395a ("gre: Support GRE over IPv6") Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Haishuang Yan authored
commit 252f3f5a upstream. In gre6 xmit path, we are sending a GRE packet, so set fl6 proto to IPPROTO_GRE properly. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Eric Dumazet authored
commit 019b1c9f upstream. If DBGUNDO() is enabled (FASTRETRANS_DEBUG > 1), a compile error will happen, since inet6_sk(sk)->daddr became sk->sk_v6_daddr Fixes: efe4208f ("ipv6: make lookups simpler and faster") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Sudeep Holla authored
commit 331dcf42 upstream. If the i2c device is already runtime suspended, if qup_i2c_suspend is executed during suspend-to-idle or suspend-to-ram it will result in the following splat: WARNING: CPU: 3 PID: 1593 at drivers/clk/clk.c:476 clk_core_unprepare+0x80/0x90 Modules linked in: CPU: 3 PID: 1593 Comm: bash Tainted: G W 4.8.0-rc3 #14 Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) PC is at clk_core_unprepare+0x80/0x90 LR is at clk_unprepare+0x28/0x40 pc : [<ffff0000086eecf0>] lr : [<ffff0000086f0c58>] pstate: 60000145 Call trace: clk_core_unprepare+0x80/0x90 qup_i2c_disable_clocks+0x2c/0x68 qup_i2c_suspend+0x10/0x20 platform_pm_suspend+0x24/0x68 ... This patch fixes the issue by executing qup_i2c_pm_suspend_runtime conditionally in qup_i2c_suspend. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Andy Gross <andy.gross@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Sergei Miroshnichenko authored
commit 9abefcb1 upstream. A timer was used to restart after the bus-off state, leading to a relatively large can_restart() executed in an interrupt context, which in turn sets up pinctrl. When this happens during system boot, there is a high probability of grabbing the pinctrl_list_mutex, which is locked already by the probe() of other device, making the kernel suspect a deadlock condition [1]. To resolve this issue, the restart_timer is replaced by a delayed work. [1] https://github.com/victronenergy/venus/issues/24Signed-off-by: Sergei Miroshnichenko <sergeimir@emcraft.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Jeff Mahoney authored
commit 325c50e3 upstream. If the subvol/snapshot create/destroy ioctls are passed a regular file with execute permissions set, we'll eventually Oops while trying to do inode->i_op->lookup via lookup_one_len. This patch ensures that the file descriptor refers to a directory. Fixes: cb8e7090 (Btrfs: Fix subvolume creation locking rules) Fixes: 76dda93c (Btrfs: add snapshot/subvolume destroy ioctl) Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Peter Rosin authored
commit 463e8f84 upstream. The cached value of the last selected channel prevents retries on the next call, even on failure to update the selected channel. Fix that. Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Yadi.hu authored
commit 371a0153 upstream. the eg20t driver call request_irq() function before the pch_base_address, base address of i2c controller's register, is assigned an effective value. there is one possible scenario that an interrupt which isn't inside eg20t arrives immediately after request_irq() is executed when i2c controller shares an interrupt number with others. since the interrupt handler pch_i2c_handler() has already active as shared action, it will be called and read its own register to determine if this interrupt is from itself. At that moment, since base address of i2c registers is not remapped in kernel space yet,so the INT handler will access an illegal address and then a error occurs. Signed-off-by: Yadi.hu <yadi.hu@windriver.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Al Viro authored
commit e23d4159 upstream. Switching iov_iter fault-in to multipages variants has exposed an old bug in underlying fault_in_multipages_...(); they break if the range passed to them wraps around. Normally access_ok() done by callers will prevent such (and it's a guaranteed EFAULT - ERR_PTR() values fall into such a range and they should not point to any valid objects). However, on architectures where userland and kernel live in different MMU contexts (e.g. s390) access_ok() is a no-op and on those a range with a wraparound can reach fault_in_multipages_...(). Since any wraparound means EFAULT there, the fix is trivial - turn those while (uaddr <= end) ... into if (unlikely(uaddr > end)) return -EFAULT; do ... while (uaddr <= end); Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Ashish Samant authored
commit d21c353d upstream. If we punch a hole on a reflink such that following conditions are met: 1. start offset is on a cluster boundary 2. end offset is not on a cluster boundary 3. (end offset is somewhere in another extent) or (hole range > MAX_CONTIG_BYTES(1MB)), we dont COW the first cluster starting at the start offset. But in this case, we were wrongly passing this cluster to ocfs2_zero_range_for_truncate() to zero out. This will modify the cluster in place and zero it in the source too. Fix this by skipping this cluster in such a scenario. To reproduce: 1. Create a random file of say 10 MB xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile 2. Reflink it reflink -f 10MBfile reflnktest 3. Punch a hole at starting at cluster boundary with range greater that 1MB. You can also use a range that will put the end offset in another extent. fallocate -p -o 0 -l 1048615 reflnktest 4. sync 5. Check the first cluster in the source file. (It will be zeroed out). dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C Link: http://lkml.kernel.org/r/1470957147-14185-1-git-send-email-ashish.samant@oracle.comSigned-off-by: Ashish Samant <ashish.samant@oracle.com> Reported-by: Saar Maoz <saar.maoz@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Eric Ren <zren@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Jan Kara authored
commit 96d41019 upstream. fanotify_get_response() calls fsnotify_remove_event() when it finds that group is being released from fanotify_release() (bypass_perm is set). However the event it removes need not be only in the group's notification queue but it can have already moved to access_list (userspace read the event before closing the fanotify instance fd) which is protected by a different lock. Thus when fsnotify_remove_event() races with fanotify_release() operating on access_list, the list can get corrupted. Fix the problem by moving all the logic removing permission events from the lists to one place - fanotify_release(). Fixes: 5838d444 ("fanotify: fix double free of pending permission events") Link: http://lkml.kernel.org/r/1473797711-14111-3-git-send-email-jack@suse.czSigned-off-by: Jan Kara <jack@suse.cz> Reported-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.16: - s/fsnotify_remove_first_event/fsnotify_remove_notify_event/ - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-