- 13 Mar, 2018 40 commits
-
-
Bin Liu authored
BugLink: http://bugs.launchpad.net/bugs/1745052 commit bd3486de upstream. When babble condition happens, the musb controller might automatically turns off VBUS. On DA8xx platform, the controller generates drvvbus interrupt for turning off VBUS along with the babble interrupt. In this case, we should handle the babble interrupt first and recover from the babble condition. This change ignores the drvvbus interrupt if babble interrupt is also generated at the same time, so the babble recovery routine works properly. Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
nixiaoming authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit c79dde62 ] After rmmod 8250.ko tty_kref_put starts kwork (release_one_tty) to release proc interface oops when accessing driver->driver_name in proc_tty_unregister_driver Use jprobe, found driver->driver_name point to 8250.ko static static struct uart_driver serial8250_reg .driver_name= serial, Use name in proc_dir_entry instead of driver->driver_name to fix oops test on linux 4.1.12: BUG: unable to handle kernel paging request at ffffffffa01979de IP: [<ffffffff81310f40>] strchr+0x0/0x30 PGD 1a0d067 PUD 1a0e063 PMD 851c1f067 PTE 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: ... ... [last unloaded: 8250] CPU: 7 PID: 116 Comm: kworker/7:1 Tainted: G O 4.1.12 #1 Hardware name: Insyde RiverForest/Type2 - Board Product Name1, BIOS NE5KV904 12/21/2015 Workqueue: events release_one_tty task: ffff88085b684960 ti: ffff880852884000 task.ti: ffff880852884000 RIP: 0010:[<ffffffff81310f40>] [<ffffffff81310f40>] strchr+0x0/0x30 RSP: 0018:ffff880852887c90 EFLAGS: 00010282 RAX: ffffffff81a5eca0 RBX: ffffffffa01979de RCX: 0000000000000004 RDX: ffff880852887d10 RSI: 000000000000002f RDI: ffffffffa01979de RBP: ffff880852887cd8 R08: 0000000000000000 R09: ffff88085f5d94d0 R10: 0000000000000195 R11: 0000000000000000 R12: ffffffffa01979de R13: ffff880852887d00 R14: ffffffffa01979de R15: ffff88085f02e840 FS: 0000000000000000(0000) GS:ffff88085f5c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa01979de CR3: 0000000001a0c000 CR4: 00000000001406e0 Stack: ffffffff812349b1 ffff880852887cb8 ffff880852887d10 ffff88085f5cd6c2 ffff880852800a80 ffffffffa01979de ffff880852800a84 0000000000000010 ffff88085bb28bd8 ffff880852887d38 ffffffff812354f0 ffff880852887d08 Call Trace: [<ffffffff812349b1>] ? __xlate_proc_name+0x71/0xd0 [<ffffffff812354f0>] remove_proc_entry+0x40/0x180 [<ffffffff815f6811>] ? _raw_spin_lock_irqsave+0x41/0x60 [<ffffffff813be520>] ? destruct_tty_driver+0x60/0xe0 [<ffffffff81237c68>] proc_tty_unregister_driver+0x28/0x40 [<ffffffff813be548>] destruct_tty_driver+0x88/0xe0 [<ffffffff813be5bd>] tty_driver_kref_put+0x1d/0x20 [<ffffffff813becca>] release_one_tty+0x5a/0xd0 [<ffffffff81074159>] process_one_work+0x139/0x420 [<ffffffff810745a1>] worker_thread+0x121/0x450 [<ffffffff81074480>] ? process_scheduled_works+0x40/0x40 [<ffffffff8107a16c>] kthread+0xec/0x110 [<ffffffff81080000>] ? tg_rt_schedulable+0x210/0x220 [<ffffffff8107a080>] ? kthread_freezable_should_stop+0x80/0x80 [<ffffffff815f7292>] ret_from_fork+0x42/0x70 [<ffffffff8107a080>] ? kthread_freezable_should_stop+0x80/0x80 Signed-off-by: nixiaoming <nixiaoming@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Michael Ellerman authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 05c14c03 ] In the hv-24x7 code there is a function memord() which tries to implement a sort function return -1, 0, 1. However one of the conditions is incorrect, such that it can never be true, because we will have already returned. I don't believe there is a bug in practice though, because the comparisons are an optimisation prior to calling memcmp(). Fix it by swapping the second comparision, so it can be true. Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Martin Wilck authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit dfb2e6f4 ] This patch cleans up a lot of warnings when unloading the driver. A current example of the stack trace starts with: [ 142.570715] sysfs group 'power' not found for kobject 'port-5:0' There can be hundreds of these messages during a driver unload. I am resubmitting this patch on behalf of Martin Wilck with his permission. His original patch can be found here: https://www.spinics.net/lists/linux-scsi/msg102085.html This patch did not help until Hannes's commit 9441284fbc39 ("scsi-fixup-kernel-warning-during-rmmod") was applied to the kernel. --------------------------- Original patch description: --------------------------- Unloading the hpsa driver causes warnings [ 1063.793652] WARNING: CPU: 1 PID: 4850 at ../fs/sysfs/group.c:237 device_del+0x54/0x240() [ 1063.793659] sysfs group ffffffff81cf21a0 not found for kobject 'port-2:0' with two different stacks: 1) [ 1063.793774] [<ffffffff81448af4>] device_del+0x54/0x240 [ 1063.793780] [<ffffffff8145178a>] transport_remove_classdev+0x4a/0x60 [ 1063.793784] [<ffffffff81451216>] attribute_container_device_trigger+0xa6/0xb0 [ 1063.793802] [<ffffffffa0105d46>] sas_port_delete+0x126/0x160 [scsi_transport_sas] [ 1063.793819] [<ffffffffa036ebcc>] hpsa_free_sas_port+0x3c/0x70 [hpsa] 2) [ 1063.797103] [<ffffffff81448af4>] device_del+0x54/0x240 [ 1063.797118] [<ffffffffa0105d4e>] sas_port_delete+0x12e/0x160 [scsi_transport_sas] [ 1063.797134] [<ffffffffa036ebcc>] hpsa_free_sas_port+0x3c/0x70 [hpsa] This is caused by the fact that host device hostX is deleted before the SAS transport devices hostX/port-a:b. This patch fixes this by reverting the order of device deletions. Tested-by: Don Brace <don.brace@microsemi.com> Reviewed-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin Wilck <mwilck@suse.de> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Martin Wilck authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 55ca38b4 ] I am resubmitting this patch on behalf of Martin Wilck with his permission. The original patch can be found here: https://www.spinics.net/lists/linux-scsi/msg102083.html This patch did not help until Hannes's commit 9441284fbc39 ("scsi-fixup-kernel-warning-during-rmmod") was applied to the kernel. -------------------------------------- Original patch description from Martin: -------------------------------------- When the hpsa module is unloaded using rmmod, dangling symlinks remain under /sys/class/sas_phy. Fix this by calling sas_phy_delete() rather than sas_phy_free (which, according to comments, should not be called for PHYs that have been set up successfully, anyway). Tested-by: Don Brace <don.brace@microsemi.com> Reviewed-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin Wilck <mwilck@suse.de> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Alex Williamson authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 16b6c8bb ] When removing a device, for example a VF being removed due to SR-IOV teardown, a "soft" hot-unplug via 'echo 1 > remove' in sysfs, or an actual hot-unplug, we first remove the procfs and sysfs attributes for the device before attempting to release the device from any driver bound to it. Unbinding the driver from the device can take time. The device might need to write out data or it might be actively in use. If it's in use by userspace through a vfio driver, the unbind might block until the user releases the device. This leads to a potentially non-trivial amount of time where the device exists, but we've torn down the interfaces that userspace uses to examine devices, for instance lspci might generate this sort of error: pcilib: Cannot open /sys/bus/pci/devices/0000:01:0a.3/config lspci: Unable to read the standard configuration space header of device 0000:01:0a.3 We don't seem to have any dependence on this teardown ordering in the kernel, so let's unbind the driver first, which is also more symmetric with the instantiation of the device in pci_bus_add_device(). Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Christoph Hellwig authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 5e422f5e ] There was one spot in xfs_bmap_add_extent_unwritten_real that didn't use the passed in new extent state but always converted to normal, leading to wrong behavior when converting from normal to unwritten. Only found by code inspection, it seems like this code path to move partial extent from written to unwritten while merging it with the next extent is rarely exercised. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Brian Foster authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 9f2a4505 ] It is possible for mkfs to format very small filesystems with too small of an internal log with respect to the various minimum size and block count requirements. If this occurs when the log happens to be smaller than the scan window used for cycle verification and the scan wraps the end of the log, the start_blk calculation in xlog_find_head() underflows and leads to an attempt to scan an invalid range of log blocks. This results in log recovery failure and a failed mount. Since there may be filesystems out in the wild with this kind of geometry, we cannot simply refuse to mount. Instead, cap the scan window for cycle verification to the size of the physical log. This ensures that the cycle verification proceeds as expected when the scan wraps the end of the log. Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Jiri Slaby authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 4dc12ffe ] l2tp_tunnel_delete does not return anything since commit 62b982ee ("l2tp: fix race condition in l2tp_tunnel_delete"). But call sites of l2tp_tunnel_delete still do casts to void to avoid unused return value warnings. Kill these now useless casts. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: Guillaume Nault <g.nault@alphalink.fr> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
tang.junhui authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit c1573137 ] Currently, Cache missed IOs are identified by s->cache_miss, but actually, there are many situations that missed IOs are not assigned a value for s->cache_miss in cached_dev_cache_miss(), for example, a bypassed IO (s->iop.bypass = 1), or the cache_bio allocate failed. In these situations, it will go to out_put or out_submit, and s->cache_miss is null, which leads bch_mark_cache_accounting() to treat this IO as a hit IO. [ML: applied by 3-way merge] Signed-off-by: tang.junhui <tang.junhui@zte.com.cn> Reviewed-by: Michael Lyle <mlyle@lyle.org> Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Liang Chen authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 330a4db8 ] mutex_destroy does nothing most of time, but it's better to call it to make the code future proof and it also has some meaning for like mutex debug. As Coly pointed out in a previous review, bcache_exit() may not be able to handle all the references properly if userspace registers cache and backing devices right before bch_debug_init runs and bch_debug_init failes later. So not exposing userspace interface until everything is ready to avoid that issue. Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Reviewed-by: Michael Lyle <mlyle@lyle.org> Reviewed-by: Coly Li <colyli@suse.de> Reviewed-by: Eric Wheeler <bcache@linux.ewheeler.net> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Bob Peterson authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit cc555b09 ] This patch fixes a deadlock caused when the jdata flag is set for inodes that are already on the ordered write list. Since it is on the ordered write list, log_flush calls gfs2_ordered_write which calls filemap_fdatawrite. But since the inode had the jdata flag set, that calls gfs2_jdata_writepages, which tries to start a new transaction. A new transaction cannot be started because it tries to acquire the log_flush rwsem which is already locked by the log flush operation. The bottom line is: We cannot switch an inode from ordered to jdata until we eliminate any ordered data pages (via log flush) or any log_flush operation afterward will create the circular dependency above. So we need to flush the log before setting the diskflags to switch the file mode, then we need to remove the inode from the ordered writes list. Before this patch, the log flush was done for jdata->ordered, but that's wrong. If we're going from jdata to ordered, we don't need to call gfs2_log_flush because the call to filemap_fdatawrite will do it for us: filemap_fdatawrite() -> __filemap_fdatawrite_range() __filemap_fdatawrite_range() -> do_writepages() do_writepages() -> gfs2_jdata_writepages() gfs2_jdata_writepages() -> gfs2_log_flush() This patch modifies function do_gfs2_set_flags so that if a file has its jdata flag set, and it's already on the ordered write list, the log will be flushed and it will be removed from the list before setting the flag. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Abhijith Das <adas@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Daniel Lezcano authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 07209fcf ] There is a particular situation when the cooling device is cpufreq and the heat dissipation is not efficient enough where the temperature increases little by little until reaching the critical threshold and leading to a SoC reset. The behavior is reproducible on a hikey6220 with bad heat dissipation (eg. stacked with other boards). Running a simple C program doing while(1); for each CPU of the SoC makes the temperature to reach the passive regulation trip point and ends up to the maximum allowed temperature followed by a reset. This issue has been also reported by running the libhugetlbfs test suite. What is observed is a ping pong between two cpu frequencies, 1.2GHz and 900MHz while the temperature continues to grow. It appears the step wise governor calls get_target_state() the first time with the throttle set to true and the trend to 'raising'. The code selects logically the next state, so the cpu frequency decreases from 1.2GHz to 900MHz, so far so good. The temperature decreases immediately but still stays greater than the trip point, then get_target_state() is called again, this time with the throttle set to true *and* the trend to 'dropping'. From there the algorithm assumes we have to step down the state and the cpu frequency jumps back to 1.2GHz. But the temperature is still higher than the trip point, so get_target_state() is called with throttle=1 and trend='raising' again, we jump to 900MHz, then get_target_state() is called with throttle=1 and trend='dropping', we jump to 1.2GHz, etc ... but the temperature does not stabilizes and continues to increase. [ 237.922654] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 237.922678] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 237.922690] thermal cooling_device0: cur_state=0 [ 237.922701] thermal cooling_device0: old_target=0, target=1 [ 238.026656] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 238.026680] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1 [ 238.026694] thermal cooling_device0: cur_state=1 [ 238.026707] thermal cooling_device0: old_target=1, target=0 [ 238.134647] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 238.134667] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 238.134679] thermal cooling_device0: cur_state=0 [ 238.134690] thermal cooling_device0: old_target=0, target=1 In this situation the temperature continues to increase while the trend is oscillating between 'dropping' and 'raising'. We need to keep the current state untouched if the throttle is set, so the temperature can decrease or a higher state could be selected, thus preventing this oscillation. Keeping the next_target untouched when 'throttle' is true at 'dropping' time fixes the issue. The following traces show the governor does not change the next state if trend==2 (dropping) and throttle==1. [ 2306.127987] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 2306.128009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 2306.128021] thermal cooling_device0: cur_state=0 [ 2306.128031] thermal cooling_device0: old_target=0, target=1 [ 2306.231991] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 2306.232016] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1 [ 2306.232030] thermal cooling_device0: cur_state=1 [ 2306.232042] thermal cooling_device0: old_target=1, target=1 [ 2306.335982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1 [ 2306.336006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=1 [ 2306.336021] thermal cooling_device0: cur_state=1 [ 2306.336034] thermal cooling_device0: old_target=1, target=1 [ 2306.439984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 2306.440008] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0 [ 2306.440022] thermal cooling_device0: cur_state=1 [ 2306.440034] thermal cooling_device0: old_target=1, target=0 [ ... ] After a while, if the temperature continues to increase, the next state becomes 2 which is 720MHz on the hikey. That results in the temperature stabilizing around the trip point. [ 2455.831982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 2455.832006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=0 [ 2455.832019] thermal cooling_device0: cur_state=1 [ 2455.832032] thermal cooling_device0: old_target=1, target=1 [ 2455.935985] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1 [ 2455.936013] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0 [ 2455.936027] thermal cooling_device0: cur_state=1 [ 2455.936040] thermal cooling_device0: old_target=1, target=1 [ 2456.043984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1 [ 2456.044009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0 [ 2456.044023] thermal cooling_device0: cur_state=1 [ 2456.044036] thermal cooling_device0: old_target=1, target=1 [ 2456.148001] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 2456.148028] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 2456.148042] thermal cooling_device0: cur_state=1 [ 2456.148055] thermal cooling_device0: old_target=1, target=2 [ 2456.252009] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 2456.252041] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0 [ 2456.252058] thermal cooling_device0: cur_state=2 [ 2456.252075] thermal cooling_device0: old_target=2, target=1 IOW, this change is needed to keep the state for a cooling device if the temperature trend is oscillating while the temperature increases slightly. Without this change, the situation above leads to a catastrophic crash by a hardware reset on hikey. This issue has been reported to happen on an OMAP dra7xx also. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Keerthy <j-keerthy@ti.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Tested-by: Keerthy <j-keerthy@ti.com> Reviewed-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Gao Feng authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit f02b2320 ] The mutex_destroy only makes sense when enable DEBUG_MUTEX. For the good readbility, it's better to invoke it in exit func when the init func invokes mutex_init. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Michał Mirosław authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 54eff226 ] According to comments in code and common sense, cclk_lp uses its own divisor, not cclk_g's. Fixes: b08e8c0e ("clk: tegra: add clock support for Tegra30") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Acked-By: Peter De Schrijver <pdeschrijver@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Sébastien Szymanski authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit c68ee58d ] On i.MX6 SoCs without VPU (in my case MCIMX6D4AVT10AC), the hdmi driver fails to probe: [ 2.540030] dwhdmi-imx 120000.hdmi: Unsupported HDMI controller (0000:00:00) [ 2.548199] imx-drm display-subsystem: failed to bind 120000.hdmi (ops dw_hdmi_imx_ops): -19 [ 2.557403] imx-drm display-subsystem: master bind failed: -19 That's because hdmi_isfr's parent, video_27m, is not correctly ungated. As explained in commit 5ccc248c ("ARM: imx6q: clk: Add support for mipi_core_cfg clock as a shared clock gate"), video_27m is gated by CCM_CCGR3[CG8]. On i.MX6 SoCs with VPU, the hdmi is working thanks to the CCM_CMEOR[mod_en_ov_vpu] bit which makes the video_27m ungated whatever is in CCM_CCGR3[CG8]. The issue can be reproduced by setting CCMEOR[mod_en_ov_vpu] to 0. Make the HDMI work in every case by setting hdmi_isfr's parent to mipi_core_cfg. Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Chen Zhong authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit c955bf39 ] Since the previous setup always sets the PLL using crystal 26MHz, this doesn't always happen in every MediaTek platform. So the patch added flexibility for assigning extra member for determining the PLL source clock. Signed-off-by: Chen Zhong <chen.zhong@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Jan Kara authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 592e2545 ] _calc_vm_trans() does not handle the situation when some of the passed flags are 0 (which can happen if these VM flags do not make sense for the architecture). Improve the _calc_vm_trans() macro to return 0 in such situation. Since all passed flags are constant, this does not add any runtime overhead. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Robert Baronescu authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 7aacbfcb ] Fix the way the length of the buffers used for encryption / decryption are computed. For e.g. in case of encryption, input buffer does not contain an authentication tag. Signed-off-by: Robert Baronescu <robert.baronescu@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Suzuki K Poulose authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit c7f5828b ] When the PMU driver is built as a module, the perf expects the pmu->module to be valid, so that the driver is prevented from being unloaded while it is in use. Fix the CCN pmu driver to fill in this field. Fixes: a33b0daa ("bus: ARM CCN PMU driver") Cc: Pawel Moll <pawel.moll@arm.com> Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Jiang Yi authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 594e25e7 ] The function fd_execute_unmap() in target_core_file.c calles ret = file->f_op->fallocate(file, mode, pos, len); Some filesystems implement fallocate() to return error if length is zero (e.g. btrfs) but according to SCSI Block Commands spec UNMAP should return success for zero length. Signed-off-by: Jiang Yi <jiangyilism@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
tangwenji authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 24528f08 ] When is pr_reg->isid_present_at_reg is false,this function should return. This fixes a regression originally introduced by: commit d2843c17 Author: Andy Grover <agrover@redhat.com> Date: Thu May 16 10:40:55 2013 -0700 target: Alter core_pr_dump_initiator_port for ease of use Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
tangwenji authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 12d5a43b ] tpg must free when call core_tpg_register() return fail Signed-off-by: tangwenji <tang.wenji@zte.com.cn> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Bart Van Assche authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit cfe2b621 ] Avoid that cmd->se_cmd.se_tfo is read after a command has already been freed. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Christie <mchristi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Christophe Leroy authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 6b148a7c ] IPIC Status is provided by register IPIC_SERSR and not by IPIC_SERMR which is the mask register. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
William A. Kennington III authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 71e24d77 ] The current code checks the completion map to look for the first token that is complete. In some cases, a completion can come in but the token can still be on lease to the caller processing the completion. If this completed but unreleased token is the first token found in the bitmap by another tasks trying to acquire a token, then the __test_and_set_bit call will fail since the token will still be on lease. The acquisition will then fail with an EBUSY. This patch reorganizes the acquisition code to look at the opal_async_token_map for an unleased token. If the token has no lease it must have no outstanding completions so we should never see an EBUSY, unless we have leased out too many tokens. Since opal_async_get_token_inrerruptible is protected by a semaphore, we will practically never see EBUSY anymore. Fixes: 8d724823 ("powerpc/powernv: Infrastructure to support OPAL async completion") Signed-off-by: William A. Kennington III <wak@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
KUWAZAWA Takuya authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit c5504f72 ] Information about ipvs in different network namespace can be seen via procfs. How to reproduce: # ip netns add ns01 # ip netns add ns02 # ip netns exec ns01 ip a add dev lo 127.0.0.1/8 # ip netns exec ns02 ip a add dev lo 127.0.0.1/8 # ip netns exec ns01 ipvsadm -A -t 10.1.1.1:80 # ip netns exec ns02 ipvsadm -A -t 10.1.1.2:80 The ipvsadm displays information about its own network namespace only. # ip netns exec ns01 ipvsadm -Ln IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 10.1.1.1:80 wlc # ip netns exec ns02 ipvsadm -Ln IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 10.1.1.2:80 wlc But I can see information about other network namespace via procfs. # ip netns exec ns01 cat /proc/net/ip_vs IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 0A010101:0050 wlc TCP 0A010102:0050 wlc # ip netns exec ns02 cat /proc/net/ip_vs IP Virtual Server version 1.2.1 (size=4096) Prot LocalAddress:Port Scheduler Flags -> RemoteAddress:Port Forward Weight ActiveConn InActConn TCP 0A010102:0050 wlc Signed-off-by: KUWAZAWA Takuya <albatross0@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Shriya authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit cd77b5ce ] The call to /proc/cpuinfo in turn calls cpufreq_quick_get() which returns the last frequency requested by the kernel, but may not reflect the actual frequency the processor is running at. This patch makes a call to cpufreq_get() instead which returns the current frequency reported by the hardware. Fixes: fb5153d0 ("powerpc: powernv: Implement ppc_md.get_proc_freq()") Signed-off-by: Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Qiang authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 3ad3f8ce ] PCIe PME and native hotplug share the same interrupt number, so hotplug interrupts are also processed by PME. In some cases, e.g., a Link Down interrupt, a device may be present but unreachable, so when we try to read its Root Status register, the read fails and we get all ones data (0xffffffff). Previously, we interpreted that data as PCI_EXP_RTSTA_PME being set, i.e., "some device has asserted PME," so we scheduled pcie_pme_work_fn(). This caused an infinite loop because pcie_pme_work_fn() tried to handle PME requests until PCI_EXP_RTSTA_PME is cleared, but with the link down, PCI_EXP_RTSTA_PME can't be cleared. Check for the invalid 0xffffffff data everywhere we read the Root Status register. 1469d17d ("PCI: pciehp: Handle invalid data when reading from non-existent devices") added similar checks in the hotplug driver. Signed-off-by: Qiang Zheng <zhengqiang10@huawei.com> [bhelgaas: changelog, also check in pcie_pme_work_fn(), use "~0" to follow other similar checks] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Peter Ujfalusi authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 288e7560 ] The used 0x1f mask is only valid for am335x family of SoC, different family using this type of crossbar might have different number of electable events. In case of am43xx family 0x3f mask should have been used for example. Instead of trying to handle each family's mask, just use u8 type to store the mux value since the event offsets are aligned to byte offset. Fixes: 42dbdcc6 ("dmaengine: ti-dma-crossbar: Add support for crossbar on AM33xx/AM43xx") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Philipp Zabel authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit a3350f9c ] The pcf8563_clkout_recalc_rate function erroneously ignores the frequency index read from the CLKO register and always returns 32768 Hz. Fixes: a39a6405 ("rtc: pcf8563: add CLKOUT to common clock framework") Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Christophe JAILLET authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 8cae353e ] 'ret' is known to be 0 at this point. In case of memory allocation error in 'framebuffer_alloc()', return -ENOMEM instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Christophe JAILLET authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 451f1306 ] We should go through the error handling code instead of returning -ENOMEM directly. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Ladislav Michl authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit c9876947 ] While usb_control_msg function expects timeout in miliseconds, a value of HZ is used. Replace it with USB_CTRL_GET_TIMEOUT and also fix error message which looks like: udlfb: Read EDID byte 78 failed err ffffff92 as error is either negative errno or number of bytes transferred use %d format specifier. Returned EDID is in second byte, so return error when less than two bytes are received. Fixes: 18dffdf8 ("staging: udlfb: enhance EDID and mode handling support") Signed-off-by: Ladislav Michl <ladis@linux-mips.org> Cc: Bernie Thompson <bernie@plugable.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Geert Uytterhoeven authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit ac831a37 ] Dan's static analysis says: drivers/video/fbdev/controlfb.c:560 control_setup() error: buffer overflow 'control_mac_modes' 20 <= 21 Indeed, control_mac_modes[] has only 20 elements, while VMODE_MAX is 22, which may lead to an out of bounds read when parsing vmode commandline options. The bug was introduced in v2.4.5.6, when 2 new modes were added to macmodes.h, but control_mac_modes[] wasn't updated: https://kernel.opensuse.org/cgit/kernel/diff/include/video/macmodes.h?h=v2.5.2&id=29f279c764808560eaceb88fef36cbc35c529aad Augment control_mac_modes[] with the two new video modes to fix this. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Robert Stonehouse authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit cbad52e9 ] Fixes: 535a6177 ("sfc: suppress handled MCDI failures when changing the MAC address") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Mike Christie authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 760bf578 ] This fixes the following races: 1. core_alua_do_transition_tg_pt could have read tg_pt_gp_alua_access_state and gone into this if chunk: if (!explicit && atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) == ALUA_ACCESS_STATE_TRANSITION) { and then core_alua_do_transition_tg_pt_work could update the state. core_alua_do_transition_tg_pt would then only set tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would not get updated with the second calls state. 2. core_alua_do_transition_tg_pt could be setting tg_pt_gp_transition_complete while the tg_pt_gp_transition_work is already completing. core_alua_do_transition_tg_pt then waits on the completion that will never be called. To handle these issues, we just call flush_work which will return when core_alua_do_transition_tg_pt_work has completed so there is no need to do the complete/wait. And, if core_alua_do_transition_tg_pt_work was running, instead of trying to sneak in the state change, we just schedule up another core_alua_do_transition_tg_pt_work call. Note that this does not handle a possible race where there are multiple threads call core_alua_do_transition_tg_pt at the same time. I think we need a mutex in target_tg_pt_gp_alua_access_state_store. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Mike Christie authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit d7175373 ] The implicit transition time tells initiators the min time to wait before timing out a transition. We currently schedule the transition to occur in tg_pt_gp_implicit_trans_secs seconds so there is no room for delays. If core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata needs to write out info to a remote file, then the initiator can easily time out the operation. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Mike Christie authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit 207ee841 ] If tcmu-runner is processing a STPG and needs to change the kernel's ALUA state then we cannot use the same work queue for task management requests and ALUA transitions, because we could deadlock. The problem occurs when a STPG times out before tcmu-runner is able to call into target_tg_pt_gp_alua_access_state_store-> core_alua_do_port_transition -> core_alua_do_transition_tg_pt -> queue_work. In this case, the tmr is on the work queue waiting for the STPG to complete, but the STPG transition is now queued behind the waiting tmr. Note: This bug will also be fixed by this patch: http://www.spinics.net/lists/target-devel/msg14560.html which switches the tmr code to use the system workqueues. For both, I am not sure if we need a dedicated workqueue since it is not a performance path and I do not think we need WQ_MEM_RECLAIM to make forward progress to free up memory like the block layer does. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Zygo Blaxell authored
BugLink: http://bugs.launchpad.net/bugs/1745052 [ Upstream commit e1699d2d ] This is a story about 4 distinct (and very old) btrfs bugs. Commit c8b97818 ("Btrfs: Add zlib compression support") added three data corruption bugs for inline extents (bugs #1-3). Commit 93c82d57 ("Btrfs: zero page past end of inline file items") fixed bug #1: uncompressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read. The fix was to add a memset in btrfs_get_extent to zero out the hole. Commit 166ae5a4 ("btrfs: fix inline compressed read err corruption") fixed bug #2: compressed inline extents which contained non-zero bytes might be replaced with zero bytes in some cases. This patch removed an unhelpful memset from uncompress_inline, but the case where memset is required was missed. There is also a memset in the decompression code, but this only covers decompressed data that is shorter than the ram_bytes from the extent ref record. This memset doesn't cover the region between the end of the decompressed data and the end of the page. It has also moved around a few times over the years, so there's no single patch to refer to. This patch fixes bug #3: compressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read (i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/). The fix is the same: zero out the hole in the compressed case too, by putting a memset back in uncompress_inline, but this time with correct parameters. The last and oldest bug, bug #0, is the cause of the offending inline extent/hole/extent pattern. Bug #0 is a subtle and mostly-harmless quirk of behavior somewhere in the btrfs write code. In a few special cases, an inline extent and hole are allowed to persist where they normally would be combined with later extents in the file. A fast reproducer for bug #0 is presented below. A few offending extents are also created in the wild during large rsync transfers with the -S flag. A Linux kernel build (git checkout; make allyesconfig; make -j8) will produce a handful of offending files as well. Once an offending file is created, it can present different content to userspace each time it is read. Bug #0 is at least 4 and possibly 8 years old. I verified every vX.Y kernel back to v3.5 has this behavior. There are fossil records of this bug's effects in commits all the way back to v2.6.32. I have no reason to believe bug #0 wasn't present at the beginning of btrfs compression support in v2.6.29, but I can't easily test kernels that old to be sure. It is not clear whether bug #0 is worth fixing. A fix would likely require injecting extra reads into currently write-only paths, and most of the exceptional cases caused by bug #0 are already handled now. Whether we like them or not, bug #0's inline extents followed by holes are part of the btrfs de-facto disk format now, and we need to be able to read them without data corruption or an infoleak. So enough about bug #0, let's get back to bug #3 (this patch). An example of on-disk structure leading to data corruption found in the wild: item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160 inode generation 50 transid 50 size 47424 nbytes 49141 block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0 flags 0x0(none) item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20 inode ref index 3 namelen 10 name: DB_File.so item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362 inline extent data size 1341 ram 4085 compress(zlib) item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53 extent data disk byte 5367308288 nr 20480 extent data offset 0 nr 45056 ram 45056 extent compression(zlib) Different data appears in userspace during each read of the 11 bytes between 4085 and 4096. The extent in item 63 is not long enough to fill the first page of the file, so a memset is required to fill the space between item 63 (ending at 4085) and item 64 (beginning at 4096) with zero. Here is a reproducer from Liu Bo, which demonstrates another method of creating the same inline extent and hole pattern: Using 'page_poison=on' kernel command line (or enable CONFIG_PAGE_POISONING) run the following: # touch foo # chattr +c foo # xfs_io -f -c "pwrite -W 0 1000" foo # xfs_io -f -c "falloc 4 8188" foo # od -x foo # echo 3 >/proc/sys/vm/drop_caches # od -x foo This produce the following on my box: Correct output: file contains 1000 data bytes followed by zeros: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000 0001760 0000 0000 0000 0000 0000 0000 0000 0000 * 0020000 Actual output: the data after the first 1000 bytes will be different each run: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d 0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400 0002000 435f 0056 5f74 6164 7400 645f 0062 5f74 (...) Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Chris Mason <clm@fb.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-