- 17 Jul, 2018 9 commits
-
-
Hans de Goede authored
commit 240630e6 upstream. There have been several reports of LPM related hard freezes about once a day on multiple Lenovo 50 series models. Strange enough these reports where not disk model specific as LPM issues usually are and some users with the exact same disk + laptop where seeing them while other users where not seeing these issues. It turns out that enabling LPM triggers a firmware bug somewhere, which has been fixed in later BIOS versions. This commit adds a new ahci_broken_lpm() function and a new ATA_FLAG_NO_LPM for dealing with this. The ahci_broken_lpm() function contains DMI match info for the 4 models which are known to be affected by this and the DMI BIOS date field for known good BIOS versions. If the BIOS date is older then the one in the table LPM will be disabled and a warning will be printed. Note the BIOS dates are for known good versions, some older versions may work too, but we don't know for sure, the table is using dates from BIOS versions for which users have confirmed that upgrading to that version makes the problem go away. Unfortunately I've been unable to get hold of the reporter who reported that BIOS version 2.35 fixed the problems on the W541 for him. I've been able to verify the DMI_SYS_VENDOR and DMI_PRODUCT_VERSION from an older dmidecode, but I don't know the exact BIOS date as reported in the DMI. Lenovo keeps a changelog with dates in their release notes, but the dates there are the release dates not the build dates which are in DMI. So I've chosen to set the date to which we compare to one day past the release date of the 2.34 BIOS. I plan to fix this with a follow up commit once I've the necessary info. Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nadav Amit authored
commit 90d72ce0 upstream. Embarrassingly, the recent fix introduced worse problem than it solved, causing the balloon not to inflate. The VM informed the hypervisor that the pages for lock/unlock are sitting in the wrong address, as it used the page that is used the uninitialized page variable. Fixes: b23220fe ("vmw_balloon: fixing double free when batching mode is off") Cc: stable@vger.kernel.org Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Damien Le Moal authored
commit 6edf1d4c upstream. If the ALL bit is set in the ZBC_OUT command, the command zone ID field (block) should be ignored. Reported-by: David Butterfield <david.butterfield@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Damien Le Moal authored
commit b320a0a9 upstream. The block (LBA) specified must not exceed the last addressable LBA, which is dev->nr_sectors - 1. So fix the correct check is "if (block >= dev->n_sectors)" and not "if (block > dev->n_sectords)". Additionally, the asc/ascq to return for an LBA that is not a zone start LBA should be ILLEGAL REQUEST, regardless if the bad LBA is out of range. Reported-by: David Butterfield <david.butterfield@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jann Horn authored
commit a0341fc1 upstream. This read handler had a lot of custom logic and wrote outside the bounds of the provided buffer. This could lead to kernel and userspace memory corruption. Just use simple_read_from_buffer() with a stack buffer. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
x00270170 authored
commit 7a6b9f4d upstream. Card write threshold control is supposed to be set since controller version 2.80a for data write in HS400 mode and data read in HS200/HS400/SDR104 mode. However the current code returns without configuring it in the case of data writing in HS400 mode. Meanwhile the patch fixes that the current code goes to 'disable' when doing data reading in HS400 mode. Fixes: 7e4bf1bc ("mmc: dw_mmc: add the card write threshold for HS400 mode") Signed-off-by: Qing Xia <xiaqing17@hisilicon.com> Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Burton authored
commit 523402fa upstream. We currently attempt to check whether a physical address range provided to __ioremap() may be in use by the page allocator by examining the value of PageReserved for each page in the region - lowmem pages not marked reserved are presumed to be in use by the page allocator, and requests to ioremap them fail. The way we check this has been broken since commit 92923ca3 ("mm: meminit: only set page reserved in the memblock region"), because memblock will typically not have any knowledge of non-RAM pages and therefore those pages will not have the PageReserved flag set. Thus when we attempt to ioremap a region outside of RAM we incorrectly fail believing that the region is RAM that may be in use. In most cases ioremap() on MIPS will take a fast-path to use the unmapped kseg1 or xkphys virtual address spaces and never hit this path, so the only way to hit it is for a MIPS32 system to attempt to ioremap() an address range in lowmem with flags other than _CACHE_UNCACHED. Perhaps the most straightforward way to do this is using ioremap_uncached_accelerated(), which is how the problem was discovered. Fix this by making use of walk_system_ram_range() to test the address range provided to __ioremap() against only RAM pages, rather than all lowmem pages. This means that if we have a lowmem I/O region, which is very common for MIPS systems, we're free to ioremap() address ranges within it. A nice bonus is that the test is no longer limited to lowmem. The approach here matches the way x86 performed the same test after commit c81c8a1e ("x86, ioremap: Speed up check for RAM pages") until x86 moved towards a slightly more complicated check using walk_mem_res() for unrelated reasons with commit 0e4c12b4 ("x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory pages"). Signed-off-by: Paul Burton <paul.burton@mips.com> Reported-by: Serge Semin <fancer.lancer@gmail.com> Tested-by: Serge Semin <fancer.lancer@gmail.com> Fixes: 92923ca3 ("mm: meminit: only set page reserved in the memblock region") Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.2+ Patchwork: https://patchwork.linux-mips.org/patch/19786/Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Burton authored
commit 5a267832 upstream. The generic nmi_cpu_backtrace() function calls show_regs() when a struct pt_regs is available, and dump_stack() otherwise. If we were to make use of the generic nmi_cpu_backtrace() with MIPS' current implementation of show_regs() this would mean that we see only register data with no accompanying stack information, in contrast with our current implementation which calls dump_stack() regardless of whether register state is available. In preparation for making use of the generic nmi_cpu_backtrace() to implement arch_trigger_cpumask_backtrace(), have our implementation of show_regs() call dump_stack() and drop the explicit dump_stack() call in arch_dump_stack() which is invoked by arch_trigger_cpumask_backtrace(). This will allow the output we produce to remain the same after a later patch switches to using nmi_cpu_backtrace(). It may mean that we produce extra stack output in other uses of show_regs(), but this: 1) Seems harmless. 2) Is good for consistency between arch_trigger_cpumask_backtrace() and other users of show_regs(). 3) Matches the behaviour of the ARM & PowerPC architectures. Marked for stable back to v4.9 as a prerequisite of the following patch "MIPS: Call dump_stack() from show_regs()". Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/19596/ Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Huacai Chen <chenhc@lemote.com> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Scott Bauer authored
commit 7dd1ab16 upstream. With a misbehaving controller it's possible we'll never enter the live state and create an admin queue. When we fail out of reset work it's possible we failed out early enough without setting up the admin queue. We tear down queues after a failed reset, but needed to do some more sanitization. Fixes 443bd90f: "nvme: host: unquiesce queue in nvme_kill_queues()" [ 189.650995] nvme nvme1: pci function 0000:0b:00.0 [ 317.680055] nvme nvme0: Device not ready; aborting reset [ 317.680183] nvme nvme0: Removing after probe failure status: -19 [ 317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 317.681397] general protection fault: 0000 [#1] SMP KASAN [ 317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5 [ 317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 [ 317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme] [ 317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000 [ 317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0 [ 317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282 [ 317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3 [ 317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000 [ 317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000 [ 317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0 [ 317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0 [ 317.684371] FS: 0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000 [ 317.684519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0 [ 317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 317.685018] Call Trace: [ 317.685084] nvme_kill_queues+0x4d/0x170 [nvme_core] [ 317.685185] nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme] [ 317.685289] process_one_work+0x771/0x1170 [ 317.685372] worker_thread+0xde/0x11e0 [ 317.685452] ? pci_mmcfg_check_reserved+0x110/0x110 [ 317.685550] kthread+0x2d3/0x3d0 [ 317.685617] ? process_one_work+0x1170/0x1170 [ 317.685704] ? kthread_create_on_node+0xc0/0xc0 [ 317.685785] ret_from_fork+0x25/0x30 [ 317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3 [ 317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40 [ 317.685908] ---[ end trace a3f8704150b1e8b4 ]--- Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [ adapted for 4.9: added check around blk_mq_start_hw_queues() call instead of upstream blk_mq_unquiesce_queue() ] Fixes: 4aae4388 ("nvme: fix hang in remove path") Signed-off-by: Simon Veith <sveith@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Amit Shah <aams@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 11 Jul, 2018 31 commits
-
-
Greg Kroah-Hartman authored
-
Dan Carpenter authored
commit 1376b0a2 upstream. There is a '>' vs '<' typo so this loop is a no-op. Fixes: d35dcc89 ("staging: comedi: quatech_daqp_cs: fix daqp_ao_insn_write()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jann Horn authored
commit ce00bf07 upstream. The old code would indefinitely block other users of nf_log_mutex if a userspace access in proc_dostring() blocked e.g. due to a userfaultfd region. Fix it by moving proc_dostring() out of the locked region. This is a followup to commit 266d07cb ("netfilter: nf_log: fix sleeping function called from invalid context"), which changed this code from using rcu_read_lock() to taking nf_log_mutex. Fixes: 266d07cb ("netfilter: nf_log: fix sleeping function calle[...]") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tokunori Ikegami authored
commit 79ca484b upstream. Currently the functions use to check both chip ready and good. But the chip ready is not enough to check the operation status. So change this to check the chip good instead of this. About the retry functions to make sure the error handling remain it. Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp> Reviewed-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: Brian Norris <computersforpeace@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr> Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tokunori Ikegami authored
commit 45f75b8a upstream. For the word write functions it is retried for error. But it is not implemented to retry for the erase functions. To make sure for the erase functions change to retry as same. This is needed to prevent the flash erase error caused only once. It was caused by the error case of chip_good() in the do_erase_oneblock(). Also it was confirmed on the MACRONIX flash device MX29GL512FHT2I-11G. But the error issue behavior is not able to reproduce at this moment. The flash controller is parallel Flash interface integrated on BCM53003. Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp> Reviewed-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: Brian Norris <computersforpeace@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr> Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tokunori Ikegami authored
commit 85a82e28 upstream. The definition can be used for other program and erase operations also. So change the naming to MAX_RETRIES from MAX_WORD_RETRIES. Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp> Reviewed-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: Brian Norris <computersforpeace@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr> Cc: linux-mtd@lists.infradead.org Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mikulas Patocka authored
commit d12067f4 upstream. dm_bufio_shrink_count() is called from do_shrink_slab to find out how many freeable objects are there. The reported value doesn't have to be precise, so we don't need to take the dm-bufio lock. Suggested-by: David Rientjes <rientjes@google.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Martin Kaiser authored
commit 3f77f244 upstream. The v21 version of the NAND flash controller contains a Spare Area Size Register (SPAS) at offset 0x10. Its setting defaults to the maximum spare area size of 218 bytes. The size that is set in this register is used by the controller when it calculates the ECC bytes internally in hardware. Usually, this register is updated from settings in the IIM fuses when the system is booting from NAND flash. For other boot media, however, the SPAS register remains at the default setting, which may not work for the particular flash chip on the board. The same goes for flash chips whose configuration cannot be set in the IIM fuses (e.g. chips with 2k sector size and 128 bytes spare area size can't be configured in the IIM fuses on imx25 systems). Set the SPAS register explicitly during the preset operation. Derive the register value from mtd->oobsize that was detected during probe by decoding the flash chip's ID bytes. While at it, rename the define for the spare area register's offset to NFC_V21_RSLTSPARE_AREA. The register at offset 0x10 on v1 controllers is different from the register on v21 controllers. Fixes: d4840180 ("mtd: mxc_nand: set NFC registers after reset") Cc: stable@vger.kernel.org Signed-off-by: Martin Kaiser <martin@kaiser.cx> Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mikulas Patocka authored
commit 41c73a49 upstream. If the first allocation attempt using GFP_NOWAIT fails, drop the lock and retry using GFP_NOIO allocation (lock is dropped because the allocation can take some time). Note that we won't do GFP_NOIO allocation when we loop for the second time, because the lock shouldn't be dropped between __wait_for_free_buffer and __get_unclaimed_buffer. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Douglas Anderson authored
commit 9ea61cac upstream. We've seen in-field reports showing _lots_ (18 in one case, 41 in another) of tasks all sitting there blocked on: mutex_lock+0x4c/0x68 dm_bufio_shrink_count+0x38/0x78 shrink_slab.part.54.constprop.65+0x100/0x464 shrink_zone+0xa8/0x198 In the two cases analyzed, we see one task that looks like this: Workqueue: kverityd verity_prefetch_io __switch_to+0x9c/0xa8 __schedule+0x440/0x6d8 schedule+0x94/0xb4 schedule_timeout+0x204/0x27c schedule_timeout_uninterruptible+0x44/0x50 wait_iff_congested+0x9c/0x1f0 shrink_inactive_list+0x3a0/0x4cc shrink_lruvec+0x418/0x5cc shrink_zone+0x88/0x198 try_to_free_pages+0x51c/0x588 __alloc_pages_nodemask+0x648/0xa88 __get_free_pages+0x34/0x7c alloc_buffer+0xa4/0x144 __bufio_new+0x84/0x278 dm_bufio_prefetch+0x9c/0x154 verity_prefetch_io+0xe8/0x10c process_one_work+0x240/0x424 worker_thread+0x2fc/0x424 kthread+0x10c/0x114 ...and that looks to be the one holding the mutex. The problem has been reproduced on fairly easily: 0. Be running Chrome OS w/ verity enabled on the root filesystem 1. Pick test patch: http://crosreview.com/412360 2. Install launchBalloons.sh and balloon.arm from http://crbug.com/468342 ...that's just a memory stress test app. 3. On a 4GB rk3399 machine, run nice ./launchBalloons.sh 4 900 100000 ...that tries to eat 4 * 900 MB of memory and keep accessing. 4. Login to the Chrome web browser and restore many tabs With that, I've seen printouts like: DOUG: long bufio 90758 ms ...and stack trace always show's we're in dm_bufio_prefetch(). The problem is that we try to allocate memory with GFP_NOIO while we're holding the dm_bufio lock. Instead we should be using GFP_NOWAIT. Using GFP_NOIO can cause us to sleep while holding the lock and that causes the above problems. The current behavior explained by David Rientjes: It will still try reclaim initially because __GFP_WAIT (or __GFP_KSWAPD_RECLAIM) is set by GFP_NOIO. This is the cause of contention on dm_bufio_lock() that the thread holds. You want to pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a mutex that can be contended by a concurrent slab shrinker (if count_objects didn't use a trylock, this pattern would trivially deadlock). This change significantly increases responsiveness of the system while in this state. It makes a real difference because it unblocks kswapd. In the bug report analyzed, kswapd was hung: kswapd0 D ffffffc000204fd8 0 72 2 0x00000000 Call trace: [<ffffffc000204fd8>] __switch_to+0x9c/0xa8 [<ffffffc00090b794>] __schedule+0x440/0x6d8 [<ffffffc00090bac0>] schedule+0x94/0xb4 [<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44 [<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac [<ffffffc00090d9d8>] mutex_lock+0x4c/0x68 [<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78 [<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464 [<ffffffc00030dbd8>] shrink_zone+0xa8/0x198 [<ffffffc00030e578>] balance_pgdat+0x328/0x508 [<ffffffc00030eb7c>] kswapd+0x424/0x51c [<ffffffc00023f06c>] kthread+0x10c/0x114 [<ffffffc000203dd0>] ret_from_fork+0x10/0x40 By unblocking kswapd memory pressure should be reduced. Suggested-by: David Rientjes <rientjes@google.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vlastimil Babka authored
commit 7810e678 upstream. In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for allocations that can ignore memory policies. The zonelist is obtained from current CPU's node. This is a problem for __GFP_THISNODE allocations that want to allocate on a different node, e.g. because the allocating thread has been migrated to a different CPU. This has been observed to break SLAB in our 4.4-based kernel, because there it relies on __GFP_THISNODE working as intended. If a slab page is put on wrong node's list, then further list manipulations may corrupt the list because page_to_nid() is used to determine which node's list_lock should be locked and thus we may take a wrong lock and race. Current SLAB implementation seems to be immune by luck thanks to commit 511e3a05 ("mm/slab: make cache_grow() handle the page allocated on arbitrary node") but there may be others assuming that __GFP_THISNODE works as promised. We can fix it by simply removing the zonelist reset completely. There is actually no reason to reset it, because memory policies and cpusets don't affect the zonelist choice in the first place. This was different when commit 183f6371 ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their own restricted zonelists. We might consider this for 4.17 although I don't know if there's anything currently broken. SLAB is currently not affected, but in kernels older than 4.7 that don't yet have 511e3a05 ("mm/slab: make cache_grow() handle the page allocated on arbitrary node") it is. That's at least 4.4 LTS. Older ones I'll have to check. So stable backports should be more important, but will have to be reviewed carefully, as the code went through many changes. BTW I think that also the ac->preferred_zoneref reset is currently useless if we don't also reset ac->nodemask from a mempolicy to NULL first (which we probably should for the OOM victims etc?), but I would leave that for a separate patch. Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.czSigned-off-by: Vlastimil Babka <vbabka@suse.cz> Fixes: 183f6371 ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK") Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Brad Love authored
commit 3ee9bc12 upstream. The cx25840 driver currently configures 885, 887, and 888 using default divisors for each chip. This check to see if the cx23885 driver has passed the cx25840 a non-default clock rate for a specific chip. If a cx23885 board has left clk_freq at 0, the clock default values will be used to configure the PLLs. This patch only has effect on 888 boards who set clk_freq to 25M. Signed-off-by: Brad Love <brad@nextdimension.cc> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rasmus Villemoes authored
commit 9564a8cf upstream. I tried building using a freshly built Make (4.2.1-69-g8a731d1), but already the objtool build broke with orc_dump.c: In function ‘orc_dump’: orc_dump.c:106:2: error: ‘elf_getshnum’ is deprecated [-Werror=deprecated-declarations] if (elf_getshdrnum(elf, &nr_sections)) { Turns out that with that new Make, the backslash was not removed, so cpp didn't see a #include directive, grep found nothing, and -DLIBELF_USE_DEPRECATED was wrongly put in CFLAGS. Now, that new Make behaviour is documented in their NEWS file: * WARNING: Backward-incompatibility! Number signs (#) appearing inside a macro reference or function invocation no longer introduce comments and should not be escaped with backslashes: thus a call such as: foo := $(shell echo '#') is legal. Previously the number sign needed to be escaped, for example: foo := $(shell echo '\#') Now this latter will resolve to "\#". If you want to write makefiles portable to both versions, assign the number sign to a variable: C := \# foo := $(shell echo '$C') This was claimed to be fixed in 3.81, but wasn't, for some reason. To detect this change search for 'nocomment' in the .FEATURES variable. This also fixes up the two make-cmd instances to replace # with $(pound) rather than with \#. There might very well be other places that need similar fixup in preparation for whatever future Make release contains the above change, but at least this builds an x86_64 defconfig with the new make. Link: https://bugzilla.kernel.org/show_bug.cgi?id=197847 Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Waldemar Rymarkiewicz authored
commit c5c2a97b upstream. This commit fixes a rare but possible case when the clk rate is updated without update of the regulator voltage. At boot up, CPUfreq checks if the system is running at the right freq. This is a sanity check in case a bootloader set clk rate that is outside of freq table present with cpufreq core. In such cases system can be unstable so better to change it to a freq that is preset in freq-table. The CPUfreq takes next freq that is >= policy->cur and this is our target_freq that needs to be set now. dev_pm_opp_set_rate(dev, target_freq) checks the target_freq and the old_freq (a current rate). If these are equal it returns early. If not, it searches for OPP (old_opp) that fits best to old_freq (not listed in the table) and updates old_freq (!). Here, we can end up with old_freq = old_opp.rate = target_freq, which is not handled in _generic_set_opp_regulator(). It's supposed to update voltage only when freq > old_freq || freq > old_freq. if (freq > old_freq) { ret = _set_opp_voltage(dev, reg, new_supply); [...] if (freq < old_freq) { ret = _set_opp_voltage(dev, reg, new_supply); if (ret) It results in, no voltage update while clk rate is updated. Example: freq-table = { 1000MHz 1.15V 666MHZ 1.10V 333MHz 1.05V } boot-up-freq = 800MHz # not listed in freq-table freq = target_freq = 1GHz old_freq = 800Mhz old_opp = _find_freq_ceil(opp_table, &old_freq); #(old_freq is modified!) old_freq = 1GHz Fixes: 6a0712f6 ("PM / OPP: Add dev_pm_opp_set_rate()") Cc: 4.6+ <stable@vger.kernel.org> # v4.6+ Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Rosenberg authored
commit 717adfda upstream. If our length is greater than the size of the buffer, we overflow the buffer Cc: stable@vger.kernel.org Signed-off-by: Daniel Rosenberg <drosen@google.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Gustavo A. R. Silva authored
commit 4f65245f upstream. uref->field_index, uref->usage_index, finfo.field_index and cinfo.index can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/hid/usbhid/hiddev.c:473 hiddev_ioctl_usage() warn: potential spectre issue 'report->field' (local cap) drivers/hid/usbhid/hiddev.c:477 hiddev_ioctl_usage() warn: potential spectre issue 'field->usage' (local cap) drivers/hid/usbhid/hiddev.c:757 hiddev_ioctl() warn: potential spectre issue 'report->field' (local cap) drivers/hid/usbhid/hiddev.c:801 hiddev_ioctl() warn: potential spectre issue 'hid->collection' (local cap) Fix this by sanitizing such structure fields before using them to index report->field, field->usage and hid->collection Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jason Andryuk authored
commit ef6eaf27 upstream. Commit ac75a041 ("HID: i2c-hid: fix size check and type usage") started writing messages when the ret_size is <= 2 from i2c_master_recv. However, my device i2c-DLL07D1 returns 2 for a short period of time (~0.5s) after I stop moving the pointing stick or touchpad. It varies, but you get ~50 messages each time which spams the log hard. [ 95.925055] i2c_hid i2c-DLL07D1:01: i2c_hid_get_input: incomplete report (83/2) This has also been observed with a i2c-ALP0017. [ 1781.266353] i2c_hid i2c-ALP0017:00: i2c_hid_get_input: incomplete report (30/2) Only print the message when ret_size is totally invalid and less than 2 to cut down on the log spam. Fixes: ac75a041 ("HID: i2c-hid: fix size check and type usage") Reported-by: John Smith <john-s-84@gmx.net> Cc: stable@vger.kernel.org Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ido Schimmel authored
Jiri Slaby noticed that the backport of upstream commit 25cc72a3 ("mlxsw: spectrum: Forbid linking to devices that have uppers") to kernel 4.9.y introduced the same check twice in the same function instead of in two different places. Fix this by relocating one of the checks to its intended place, thus preventing unsupported configurations as described in the original commit. Fixes: 73ee5a73 ("mlxsw: spectrum: Forbid linking to devices that have uppers") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jon Derrick authored
commit a17712c8 upstream. This patch attempts to close a hole leading to a BUG seen with hot removals during writes [1]. A block device (NVME namespace in this test case) is formatted to EXT4 without partitions. It's mounted and write I/O is run to a file, then the device is hot removed from the slot. The superblock attempts to be written to the drive which is no longer present. The typical chain of events leading to the BUG: ext4_commit_super() __sync_dirty_buffer() submit_bh() submit_bh_wbc() BUG_ON(!buffer_mapped(bh)); This fix checks for the superblock's buffer head being mapped prior to syncing. [1] https://www.spinics.net/lists/linux-ext4/msg56527.htmlSigned-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit bfe0a5f4 upstream. The kernel's ext4 mount-time checks were more permissive than e2fsprogs's libext2fs checks when opening a file system. The superblock is considered too insane for debugfs or e2fsck to operate on it, the kernel has no business trying to mount it. This will make file system fuzzing tools work harder, but the failure cases that they find will be more useful and be easier to evaluate. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit c37e9e01 upstream. If there is a directory entry pointing to a system inode (such as a journal inode), complain and declare the file system to be corrupted. Also, if the superblock's first inode number field is too small, refuse to mount the file system. This addresses CVE-2018-10882. https://bugzilla.kernel.org/show_bug.cgi?id=200069Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit 6e8ab72a upstream. When converting from an inode from storing the data in-line to a data block, ext4_destroy_inline_data_nolock() was only clearing the on-disk copy of the i_blocks[] array. It was not clearing copy of the i_blocks[] in ext4_inode_info, in i_data[], which is the copy actually used by ext4_map_blocks(). This didn't matter much if we are using extents, since the extents header would be invalid and thus the extents could would re-initialize the extents tree. But if we are using indirect blocks, the previous contents of the i_blocks array will be treated as block numbers, with potentially catastrophic results to the file system integrity and/or user data. This gets worse if the file system is using a 1k block size and s_first_data is zero, but even without this, the file system can get quite badly corrupted. This addresses CVE-2018-10881. https://bugzilla.kernel.org/show_bug.cgi?id=200015Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit bdbd6ce0 upstream. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit bc890a60 upstream. If there is a corupted file system where the claimed depth of the extent tree is -1, this can cause a massive buffer overrun leading to sadness. This addresses CVE-2018-10877. https://bugzilla.kernel.org/show_bug.cgi?id=199417Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit 8844618d upstream. The bg_flags field in the block group descripts is only valid if the uninit_bg or metadata_csum feature is enabled. We were not consistently looking at this field; fix this. Also block group #0 must never have uninitialized allocation bitmaps, or need to be zeroed, since that's where the root inode, and other special inodes are set up. Check for these conditions and mark the file system as corrupted if they are detected. This addresses CVE-2018-10876. https://bugzilla.kernel.org/show_bug.cgi?id=199403Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit 819b23f1 upstream. Regardless of whether the flex_bg feature is set, we should always check to make sure the bits we are setting in the block bitmap are within the block group bounds. https://bugzilla.kernel.org/show_bug.cgi?id=199865Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit 77260807 upstream. It's really bad when the allocation bitmaps and the inode table overlap with the block group descriptors, since it causes random corruption of the bg descriptors. So we really want to head those off at the pass. https://bugzilla.kernel.org/show_bug.cgi?id=199865Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit e09463f2 upstream. Do not set the b_modified flag in block's journal head should not until after we're sure that jbd2_journal_dirty_metadat() will not abort with an error due to there not being enough space reserved in the jbd2 handle. Otherwise, future attempts to modify the buffer may lead a large number of spurious errors and warnings. This addresses CVE-2018-10883. https://bugzilla.kernel.org/show_bug.cgi?id=200071Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mikulas Patocka authored
commit 99ec9e77 upstream. The displaylink hardware has such a peculiarity that it doesn't render a command until next command is received. This produces occasional corruption, such as when setting 22x11 font on the console, only the first line of the cursor will be blinking if the cursor is located at some specific columns. When we end up with a repeating pixel, the driver has a bug that it leaves one uninitialized byte after the command (and this byte is enough to flush the command and render it - thus it fixes the screen corruption), however whe we end up with a non-repeating pixel, there is no byte appended and this results in temporary screen corruption. This patch fixes the screen corruption by always appending a byte 0xAF at the end of URB. It also removes the uninitialized byte. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paulo Alcantara authored
commit 7ffbe655 upstream. For every request we send, whether it is SMB1 or SMB2+, we attempt to reconnect tcon (cifs_reconnect_tcon or smb2_reconnect) before carrying out the request. So, while server->tcpStatus != CifsNeedReconnect, we wait for the reconnection to succeed on wait_event_interruptible_timeout(). If it returns, that means that either the condition was evaluated to true, or timeout elapsed, or it was interrupted by a signal. Since we're not handling the case where the process woke up due to a received signal (-ERESTARTSYS), the next call to wait_event_interruptible_timeout() will _always_ fail and we end up looping forever inside either cifs_reconnect_tcon() or smb2_reconnect(). Here's an example of how to trigger that: $ mount.cifs //foo/share /mnt/test -o username=foo,password=foo,vers=1.0,hard (break connection to server before executing bellow cmd) $ stat -f /mnt/test & sleep 140 [1] 2511 $ ps -aux -q 2511 USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 2511 0.0 0.0 12892 1008 pts/0 S 12:24 0:00 stat -f /mnt/test $ kill -9 2511 (wait for a while; process is stuck in the kernel) $ ps -aux -q 2511 USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 2511 83.2 0.0 12892 1008 pts/0 R 12:24 30:01 stat -f /mnt/test By using 'hard' mount point means that cifs.ko will keep retrying indefinitely, however we must allow the process to be killed otherwise it would hang the system. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Cc: stable@vger.kernel.org Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lars Ellenberg authored
commit 64dafbc9 upstream. We have struct drbd_requests { ... struct bio *private_bio; ... } to hold a bio clone for local submission. On local IO completion, we put that bio, and in case we want to use the result later, we overload that member to hold the ERR_PTR() of the completion result, Which, before v4.3, used to be the passed in "int error", so we could first bio_put(), then assign. v4.3-rc1~100^2~21 4246a0b6 block: add a bi_error field to struct bio changed that: bio_put(req->private_bio); - req->private_bio = ERR_PTR(error); + req->private_bio = ERR_PTR(bio->bi_error); Which introduces an access after free, because it was non obvious that req->private_bio == bio. Impact of that was mostly unnoticable, because we only use that value in a multiple-failure case, and even then map any "unexpected" error code to EIO, so worst case we could potentially mask a more specific error with EIO in a multiple failure case. Unless the pointed to memory region was unmapped, as is the case with CONFIG_DEBUG_PAGEALLOC, in which case this results in BUG: unable to handle kernel paging request v4.13-rc1~70^2~75 4e4cbee9 block: switch bios to blk_status_t changes it further to bio_put(req->private_bio); req->private_bio = ERR_PTR(blk_status_to_errno(bio->bi_status)); And blk_status_to_errno() now contains a WARN_ON_ONCE() for unexpected values, which catches this "sometimes", if the memory has been reused quickly enough for other things. Should also go into stable since 4.3, with the trivial change around 4.13. Cc: stable@vger.kernel.org Fixes: 4246a0b6 block: add a bi_error field to struct bio Reported-by: Sarah Newman <srn@prgmr.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-