- 13 Mar, 2018 38 commits
-
-
Andy Whitcroft authored
We currently disable the zfs module changes when we disable zfs builds as part of cross-compilation. We should disable the zfs module checks whenever zfs itself is disabled. Pull the zfs module disablement support such that it is always present. BugLink: http://bugs.launchpad.net/bugs/1737176Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin King <colin.king@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Tadeusz Struk authored
BugLink: http://bugs.launchpad.net/bugs/1735977 Convert asymmetric_verify to akcipher api. Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David Howells <dhowells@redhat.com> (cherry picked from commit eb5798f2) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Kevin Cernekee authored
CVE-2017-17450 The capability check in nfnetlink_rcv() verifies that the caller has CAP_NET_ADMIN in the namespace that "owns" the netlink socket. However, xt_osf_fingers is shared by all net namespaces on the system. An unprivileged user can create user and net namespaces in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable() check: vpnns -- nfnl_osf -f /tmp/pf.os vpnns -- nfnl_osf -f /tmp/pf.os -d These non-root operations successfully modify the systemwide OS fingerprint list. Add new capable() checks so that they can't. Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> (cherry picked from commit 916a2790) Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Eric W. Biederman authored
CVE-2017-15129 (I can trivially verify that that idr_remove in cleanup_net happens after the network namespace count has dropped to zero --EWB) Function get_net_ns_by_id() does not check for net::count after it has found a peer in netns_ids idr. It may dereference a peer, after its count has already been finaly decremented. This leads to double free and memory corruption: put_net(peer) rtnl_lock() atomic_dec_and_test(&peer->count) [count=0] ... __put_net(peer) get_net_ns_by_id(net, id) spin_lock(&cleanup_list_lock) list_add(&net->cleanup_list, &cleanup_list) spin_unlock(&cleanup_list_lock) queue_work() peer = idr_find(&net->netns_ids, id) | get_net(peer) [count=1] | ... | (use after final put) v ... cleanup_net() ... spin_lock(&cleanup_list_lock) ... list_replace_init(&cleanup_list, ..) ... spin_unlock(&cleanup_list_lock) ... ... ... ... put_net(peer) ... atomic_dec_and_test(&peer->count) [count=0] ... spin_lock(&cleanup_list_lock) ... list_add(&net->cleanup_list, &cleanup_list) ... spin_unlock(&cleanup_list_lock) ... queue_work() ... rtnl_unlock() rtnl_lock() ... for_each_net(tmp) { ... id = __peernet2id(tmp, peer) ... spin_lock_irq(&tmp->nsid_lock) ... idr_remove(&tmp->netns_ids, id) ... ... ... net_drop_ns() ... net_free(peer) ... } ... | v cleanup_net() ... (Second free of peer) Also, put_net() on the right cpu may reorder with left's cpu list_replace_init(&cleanup_list, ..), and then cleanup_list will be corrupted. Since cleanup_net() is executed in worker thread, while put_net(peer) can happen everywhere, there should be enough time for concurrent get_net_ns_by_id() to pick the peer up, and the race does not seem to be unlikely. The patch fixes the problem in standard way. (Also, there is possible problem in peernet2id_alloc(), which requires check for net::count under nsid_lock and maybe_get_net(peer), but in current stable kernel it's used under rtnl_lock() and it has to be safe. Openswitch begun to use peernet2id_alloc(), and possibly it should be fixed too. While this is not in stable kernel yet, so I'll send a separate message to netdev@ later). Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Fixes: 0c7aecd4 "netns: add rtnl cmd to add and get peer netns ids" Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> (backported from commit 21b59443) Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Linus Torvalds authored
CVE-2018-5344 范龙飞 reports that KASAN can report a use-after-free in __lock_acquire. The reason is due to insufficient serialization in lo_release(), which will continue to use the loop device even after it has decremented the lo_refcnt to zero. In the meantime, another process can come in, open the loop device again as it is being shut down. Confusion ensues. Reported-by: 范龙飞 <long7573@126.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> (cherry picked from commit ae665016) Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Paolo Bonzini authored
BugLink: https://bugs.launchpad.net/bugs/1724614 In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to increase heavily even in cases where the performance improvement was small. In particular, bandwidth divided by CPU usage was as much as 60% lower. To some extent this is the expected effect of the patch, and the additional CPU utilization is only visible when running the benchmarks. However, halving the threshold also halves the extra CPU utilization (from +30-130% to +20-70%) and has no negative effect on performance. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> (backported from commit b401ee0b) Signed-off-by: Victor Tapia <victor.tapia@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Wen-chien Jesse Sung authored
BugLink: https://launchpad.net/bugs/1744077Signed-off-by: Wen-chien Jesse Sung <jesse.sung@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Prameela Rani Garnepudi authored
BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1742090 BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1742094 Two issues were observed, kernel warning at S4 restore and other is failing to wakeup at times. Kernel warning is because, at hibernate resume while mac80211 is resuming, driver is issuing mac80211 detach. The warning is as below: [ 374.972073] WARNING: CPU: 1 PID: 3725 at linux-4.4.0/net/mac80211/iface.c:1000 ieee80211_do_stop+0x6ea/0x810 [mac80211]() .... [ 374.972211] CPU: 1 PID: 3725 Comm: kworker/u4:44 Tainted: G W 4.4.0-98-generic #121-Ubuntu [ 374.972213] Hardware name: Dell Inc. Edge Gateway 3002/ , BIOS 01.00.05 11/22/2017 [ 374.972223] Workqueue: events_unbound async_run_entry_fn [ 374.972230] 0000000000000286 bf3948ba9db4c154 ffff88005a733ad8 ffffffff813fb2c3 [ 374.972235] 0000000000000000 ffffffffc04b8ac8 ffff88005a733b10 ffffffff810812e2 [ 374.972239] ffff8800787c0840 ffff88006f18e700 0000000000000000 ffff88006f18ee90 [ 374.972240] Call Trace: [ 374.972249] [<ffffffff813fb2c3>] dump_stack+0x63/0x90 [ 374.972256] [<ffffffff810812e2>] warn_slowpath_common+0x82/0xc0 [ 374.972260] [<ffffffff8108142a>] warn_slowpath_null+0x1a/0x20 [ 374.972305] [<ffffffffc045915a>] ieee80211_do_stop+0x6ea/0x810 [mac80211] [ 374.972312] [<ffffffff818441ee>] ? _raw_spin_unlock_bh+0x1e/0x20 [ 374.972317] [<ffffffff817608ba>] ? dev_deactivate_many+0x20a/0x240 [ 374.972359] [<ffffffffc045929a>] ieee80211_stop+0x1a/0x20 [mac80211] [ 374.972365] [<ffffffff81732a39>] __dev_close_many+0x99/0x100 [ 374.972369] [<ffffffff81732b31>] dev_close_many+0x91/0x140 [ 374.972374] [<ffffffff810e6171>] ? synchronize_sched_expedited+0x4e1/0x880 [ 374.972379] [<ffffffff81734e2a>] dev_close.part.79+0x4a/0x70 [ 374.972383] [<ffffffff81734e6a>] dev_close+0x1a/0x20 [ 374.972425] [<ffffffffc035fac1>] cfg80211_shutdown_all_interfaces+0x41/0xa0 [cfg80211] [ 374.972467] [<ffffffffc045a6c6>] ieee80211_remove_interfaces+0x56/0x1f0 [mac80211] [ 374.972506] [<ffffffffc0441bca>] ieee80211_unregister_hw+0x4a/0x120 [mac80211] This is avoided by calling ieee80211_restart_hw and reinitializing device as usual in sdio restore and waiting in mac80211_resume until device is ready. Other issue may be due to firmware assertion observed at times for the length of bgscan probe request at restore. To avoid this, unnecessary IEs are cut from the frame at end. Signed-off-by: Prameela Rani Garnepudi <prameela.j04cs@gmail.com> Signed-off-by: Amitkumar Karwar <amit.karwar@redpinesignals.com> Acked-by: Kai Heng Feng <kai.heng.feng@canonical.com> Acked-by: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Prameela Rani Garnepudi authored
BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1742090 BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1742094 Sometimes we don't get response for SDIO commands during reset. Additional parameter 'expected response' is added to cmd52readbyte() and cmd52writebyte(). This parameter is false during reset (To avoid waiting for response) and true while disabling or enabling SDIO interrupts. Signed-off-by: Prameela Rani Garnepudi <prameela.j04cs@gmail.com> Signed-off-by: Amitkumar Karwar <amit.karwar@redpinesignals.com> Acked-by: Kai Heng Feng <kai.heng.feng@canonical.com> Acked-by: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Prameela Rani Garnepudi authored
BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1742090 BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1742094 UAPSD parameter configuration in power save request should be under UAPSD bitmap check. Otherwise data block issue occurs with non-UAPSD APs . Signed-off-by: Prameela Rani Garnepudi <prameela.j04cs@gmail.com> Signed-off-by: Amitkumar Karwar <amit.karwar@redpinesignals.com> Acked-by: Kai Heng Feng <kai.heng.feng@canonical.com> Acked-by: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Pavani Muthyala authored
BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1742090 BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1742094 It is observed that magic packet is sometimes missed by firmware which results in wakeup failure. This happens only in coex mode when power save is enabled. Issue is resolved by disabling power save to avoid radio loss for wlan Signed-off-by: Pavani Muthyala <pavanimuthyala1992@gmail.com> Signed-off-by: Amitkumar Karwar <amit.karwar@redpinesignals.com> Acked-by: Kai Heng Feng <kai.heng.feng@canonical.com> Acked-by: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Kai-Heng Feng authored
BugLink: https://bugs.launchpad.net/bugs/1664602 The NVMe device in question drops off the PCIe bus after system suspend. I've tried several approaches to workaround this issue, but none of them works: - NVME_QUIRK_DELAY_BEFORE_CHK_RDY - NVME_QUIRK_NO_DEEPEST_PS - Disable APST before controller shutdown - Delay between controller shutdown and system suspend - Explicitly set power state to 0 before controller shutdown Fortunately it's a desktop, so disable APST won't hurt the battery. Also, change the quirk function name to reflect it's for vendor combination quirks. BugLink: https://bugs.launchpad.net/bugs/1705748Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Christoph Hellwig <hch@lst.de> (cherry picked from commit 8427bbc2) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Andy Lutomirski authored
BugLink: https://bugs.launchpad.net/bugs/1664602 They have known firmware bugs. A fix is apparently in the works -- once fixed firmware is available, someone from Intel (Hi, Keith!) can adjust the quirk accordingly. Cc: stable@vger.kernel.org # v4.11 Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: Mario Limonciello <mario_limonciello@dell.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> (backported from commit 50af47d0) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Kai-Heng Feng authored
BugLink: https://bugs.launchpad.net/bugs/1664602 Christoph Hellwig suggests we should to make APST work out of the box. Hence relax the the default max latency to make them able to enter deepest power state on default. Here are id-ctrl excerpts from two high latency NVMes: vid : 0x14a4 ssvid : 0x1b4b mn : CX2-GB1024-Q11 NVMe LITEON 1024GB ps 3 : mp:0.1000W non-operational enlat:5000 exlat:5000 rrt:3 rrl:3 rwt:3 rwl:3 idle_power:- active_power:- ps 4 : mp:0.0100W non-operational enlat:50000 exlat:100000 rrt:4 rrl:4 rwt:4 rwl:4 idle_power:- active_power:- vid : 0x15b7 ssvid : 0x1b4b mn : A400 NVMe SanDisk 512GB ps 3 : mp:0.0500W non-operational enlat:51000 exlat:10000 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:- active_power:- ps 4 : mp:0.0055W non-operational enlat:1000000 exlat:100000 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:- active_power:- Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Christoph Hellwig <hch@lst.de> (cherry picked from commit 9947d6a0) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Kai-Heng Feng authored
BugLink: https://bugs.launchpad.net/bugs/1664602 When a NVMe is in non-op states, the latency is exlat. The latency will be enlat + exlat only when the NVMe tries to transit from operational state right atfer it begins to transit to non-operational state, which should be a rare case. Therefore, as Andy Lutomirski suggests, use exlat only when deciding power states to trainsit to. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Christoph Hellwig <hch@lst.de> (backported from commit da87591b) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Andy Lutomirski authored
BugLink: https://bugs.launchpad.net/bugs/1664602 There's a report that it malfunctions with APST on. See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678184 Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com> (cherry picked from commit be56945c) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Andy Lutomirski authored
BugLink: https://bugs.launchpad.net/bugs/1664602 I got a couple more reports: the Samsung APST issues appears to affect multiple 950-series devices in Dell XPS 15 9550 and Precision 5510 laptops. Change the quirk: rather than blacklisting the firmware on the first problematic SSD that was reported, disable APST on all 144d:a802 devices if they're installed in the two affected Dell models. While we're at it, disable only the deepest sleep state instead of all of them -- the reporters say that this is sufficient to fix the problem. (I have a device that appears to be entirely identical to one of the affected devices, but I have a different Dell laptop, so it's not the case that all Samsung devices with firmware BXW75D0Q are broken under all circumstances.) Samsung engineers have an affected system, and hopefully they'll give us a better workaround some time soon. In the mean time, this should minimize regressions. See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678184 Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com> (backported from commit ff5350a8) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Andy Lutomirski authored
BugLink: https://bugs.launchpad.net/bugs/1664602 NVMe devices can advertise multiple power states. These states can be either "operational" (the device is fully functional but possibly slow) or "non-operational" (the device is asleep until woken up). Some devices can automatically enter a non-operational state when idle for a specified amount of time and then automatically wake back up when needed. The hardware configuration is a table. For each state, an entry in the table indicates the next deeper non-operational state, if any, to autonomously transition to and the idle time required before transitioning. This patch teaches the driver to program APST so that each successive non-operational state will be entered after an idle time equal to 100% of the total latency (entry plus exit) associated with that state. The maximum acceptable latency is controlled using dev_pm_qos (e.g. power/pm_qos_latency_tolerance_us in sysfs); non-operational states with total latency greater than this value will not be used. As a special case, setting the latency tolerance to 0 will disable APST entirely. On hardware without APST support, the sysfs file will not be exposed. The latency tolerance for newly-probed devices is set by the module parameter nvme_core.default_ps_max_latency_us. In theory, the device can expose "default" APST table, but this doesn't seem to function correctly on my device (Samsung 950), nor does it seem particularly useful. There is also an optional mechanism by which a configuration can be "saved" so it will be automatically loaded on reset. This can be configured from userspace, but it doesn't seem useful to support in the driver. On my laptop, enabling APST seems to save nearly 1W. The hardware tables can be decoded in userspace with nvme-cli. 'nvme id-ctrl /dev/nvmeN' will show the power state table and 'nvme get-feature -f 0x0c -H /dev/nvme0' will show the current APST configuration. This feature is quirked off on a known-buggy Samsung device. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com> (backported from commit c5552fde) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Andy Lutomirski authored
BugLink: https://bugs.launchpad.net/bugs/1664602 Currently, all NVMe quirks are based on PCI IDs. Add a mechanism to define quirks based on identify_ctrl's vendor id, model number, and/or firmware revision. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com> (backported from commit bd4da3ab) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Andy Lutomirski authored
BugLink: https://bugs.launchpad.net/bugs/1664602 Any user I can imagine that needs a buffer at all will want to pass a pointer directly. There are no currently callers that use buffers, so this change is painless, and it will make it much easier to start using features that use buffers (e.g. APST). Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jay Freyensee <james_p_freyensee@linux.intel.com> Tested-by: Jay Freyensee <james_p_freyensee@linux.intel.com> Signed-off-by: Jens Axboe <axboe@fb.com> (cherry picked from commit 1a6fe74d) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Andy Lutomirski authored
BugLink: https://bugs.launchpad.net/bugs/1664602 nvme_set_features() callers seem to expect that passing NULL as the result pointer is acceptable. Teach nvme_set_features() not to try to write to the NULL address. For symmetry, make the same change to nvme_get_features(), despite the fact that all current callers pass a valid result pointer. I assume that this bug hasn't been reported in practice because the callers that pass NULL are all in the SCSI translation layer and no one uses the relevant operations. Cc: stable@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com> (cherry picked from commit 9b47f77a) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Christoph Hellwig authored
BugLink: https://bugs.launchpad.net/bugs/1664602 NVMe over fabrics will use __nvme_submit_sync_cmd in the the transport and require a few tweaks to it. For that we export it and add a few more paramters: 1. allow passing a queue ID to the block layer For the NVMe over Fabrics connect command we need to able to specify a queue ID that we want to send the command on. Add a qid parameter to the relevant functions to enable this behavior. 2. allow submitting at_head commands In cases where we want to (re)connect to a controller where we have inflight queued commands we want to first connect and only then allow the other queued commands to be kicked. This will prevents failures in controller resets and reconnects. 3. allow passing flags to blk_mq_allocate_request Both for Fabrics connect the the keep-alive feature in NVMe 1.2.1 we want to be able to use reserved requests. Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com> (cherry picked from commit eb71f435) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Christoph Hellwig authored
BugLink: https://bugs.launchpad.net/bugs/1664602 Centralize the check if a given NVMe command reads or writes data. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com> (backported from commit 7a5abb4b) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Christoph Hellwig authored
BugLink: https://bugs.launchpad.net/bugs/1664602 Both LighNVM and NVMe over Fabrics need to look at more than just the status and result field. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Matias Bj?rling <m@bjorling.me> Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com> (backported from commit 1cb3cce5) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Andy Lutomirski authored
BugLink: https://bugs.launchpad.net/bugs/1664602 As far as I can tell, there is basically nothing correct about this code. It misinterprets npss (off-by-one). It hardcodes a bunch of power states, which is nonsense, because they're all just indices into a table that software needs to parse. It completely ignores the distinction between operational and non-operational states. And, until 4.8, if all of the above magically succeeded, it would dereference a NULL pointer and OOPS. Since this code appears to be useless, just delete it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jay Freyensee <james_p_freyensee@linux.intel.com> Tested-by: Jay Freyensee <james_p_freyensee@linux.intel.com> Signed-off-by: Jens Axboe <axboe@fb.com> (cherry picked from commit 26501db8) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-By: AceLan Kao <acelan.kao@canonical.com> Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Alexei Starovoitov authored
when the verifier detects that register contains a runtime constant and it's compared with another constant it will prune exploration of the branch that is guaranteed not to be taken at runtime. This is all correct, but malicious program may be constructed in such a way that it always has a constant comparison and the other branch is never taken under any conditions. In this case such path through the program will not be explored by the verifier. It won't be taken at run-time either, but since all instructions are JITed the malicious program may cause JITs to complain about using reserved fields, etc. To fix the issue we have to track the instructions explored by the verifier and sanitize instructions that are dead at run time with NOPs. We cannot reject such dead code, since llvm generates it for valid C code, since it doesn't do as much data flow analysis as the verifier does. Fixes: 17a52670 ("bpf: verifier (add verifier core)") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> (backported from commit c131187d) CVE-2017-17862 [ saf: Add partial backport of 3df126f3 ("bpf: don't (ab)use instructions to store state") to add bpf_insn_aux_data state to verifier_env ] Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Jann Horn authored
[ Upstream commit 95a762e2 ] Distinguish between BPF_ALU64|BPF_MOV|BPF_K (load 32-bit immediate, sign-extended to 64-bit) and BPF_ALU|BPF_MOV|BPF_K (load 32-bit immediate, zero-padded to 64-bit); only perform sign extension in the first case. Starting with v4.14, this is exploitable by unprivileged users as long as the unprivileged_bpf_disabled sysctl isn't set. Debian assigned CVE-2017-16995 for this issue. v3: - add CVE number (Ben Hutchings) Fixes: 48461135 ("bpf: allow access into map value arrays") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> CVE-2017-16995 [ saf: Backport to 4.4. Include partial backports of 4923ec0b ("bpf: simplify verifier register state assignments") and 969bf05e ("bpf: direct packet access") to extend reg_state.imm to 64-bit. ] Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Wanpeng Li authored
CVE-2017-17741 Reported by syzkaller: BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm] Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298 CPU: 6 PID: 32298 Comm: syz-executor Tainted: G OE 4.15.0-rc2+ #18 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016 Call Trace: dump_stack+0xab/0xe1 print_address_description+0x6b/0x290 kasan_report+0x28a/0x370 write_mmio+0x11e/0x270 [kvm] emulator_read_write_onepage+0x311/0x600 [kvm] emulator_read_write+0xef/0x240 [kvm] emulator_fix_hypercall+0x105/0x150 [kvm] em_hypercall+0x2b/0x80 [kvm] x86_emulate_insn+0x2b1/0x1640 [kvm] x86_emulate_instruction+0x39a/0xb90 [kvm] handle_exception+0x1b4/0x4d0 [kvm_intel] vcpu_enter_guest+0x15a0/0x2640 [kvm] kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm] kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 entry_SYSCALL_64_fastpath+0x23/0x9a The path of patched vmmcall will patch 3 bytes opcode 0F 01 C1(vmcall) to the guest memory, however, write_mmio tracepoint always prints 8 bytes through *(u64 *)val since kvm splits the mmio access into 8 bytes. This leaks 5 bytes from the kernel stack (CVE-2017-17741). This patch fixes it by just accessing the bytes which we operate on. Before patch: syz-executor-5567 [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f After patch: syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (backported from e39d200f) Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Mohamed Ghannam authored
set rm->atomic.op_active to 0 when rds_pin_pages() fails or the user supplied address is invalid, this prevents a NULL pointer usage in rds_atomic_free_op() Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> CVE-2018-5333 (cherry picked from commit 7d11f77f) Signed-off-by: Benjamin M Romer <benjamin.romer@canonical.com> Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Benjamin M Romer <benjamin.romer@canonical.com>
-
Ido Schimmel authored
BugLink: http://bugs.launchpad.net/bugs/1738219 When the 'ignore_routes_with_linkdown' sysctl is set, we should not consider linkdown nexthops during route lookup. While the code correctly verifies that the initially selected route ('match') has a carrier, it does not perform the same check in the subsequent multipath selection, resulting in a potential packet loss. In case the chosen route does not have a carrier and the sysctl is set, choose the initially selected route. Fixes: 35103d11 ("net: ipv6 sysctl option to ignore routes when nexthop link is down") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit bbfcd776) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Ryan Harper authored
BugLink: http://bugs.launchpad.net/bugs/1729145 - decouple emitting a cached_dev CHANGE uevent which includes dev.uuid and dev.label from bch_cached_dev_run() which only happens when a bcacheX device is bound to the actual backing block device (bcache0 -> vdb) - update bch_cached_dev_run() to invoke bch_cached_dev_emit_change() as needed; no functional code path changes here - Modify register_bcache to detect a re-registering of a bcache cached_dev, and in that case call bcache_cached_dev_emit_change() to Signed-off-by: Ryan Harper <ryan.harper@canonical.com> Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Benjamin Poirier authored
BugLink: http://bugs.launchpad.net/bugs/1730550 Lennart reported the following race condition: \ e1000_watchdog_task \ e1000e_has_link \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link /* link is up */ mac->get_link_status = false; /* interrupt */ \ e1000_msix_other hw->mac.get_link_status = true; link_active = !hw->mac.get_link_status /* link_active is false, wrongly */ This problem arises because the single flag get_link_status is used to signal two different states: link status needs checking and link status is down. Avoid the problem by using the return value of .check_for_link to signal the link status to e1000e_has_link(). Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca> Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> (cherry picked from commit 19110cfb) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Benjamin Poirier authored
BugLink: http://bugs.launchpad.net/bugs/1730550 When e1000e_poll() is not fast enough to keep up with incoming traffic, the adapter (when operating in msix mode) raises the Other interrupt to signal Receiver Overrun. This is a double problem because 1) at the moment e1000_msix_other() assumes that it is only called in case of Link Status Change and 2) if the condition persists, the interrupt is repeatedly raised again in quick succession. Ideally we would configure the Other interrupt to not be raised in case of receiver overrun but this doesn't seem possible on this adapter. Instead, we handle the first part of the problem by reverting to the practice of reading ICR in the other interrupt handler, like before commit 16ecba59 ("e1000e: Do not read ICR in Other interrupt"). Thanks to commit 0a8047ac ("e1000e: Fix msi-x interrupt automask") which cleared IAME from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts anymore. We handle the second part of the problem by not re-enabling the Other interrupt right away when there is overrun. Instead, we wait until traffic subsides, napi polling mode is exited and interrupts are re-enabled. Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca> Fixes: 16ecba59 ("e1000e: Do not read ICR in Other interrupt") Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> (cherry picked from commit 4aea7a5c) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Alan Liu authored
BugLink: https://bugs.launchpad.net/bugs/1736317 QCA6174 WLAN.RM.2.0 firmware uses max_tx_power instead of using max_reg_power to set transmission power. The tx power was about -50dbm, after applying this change, it become -32dbm. Signed-off-by: Alan Liu <alanliu@qca.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> (cherry picked from commit 513527c8) Signed-off-by: Shrirang Bagul <shrirang.bagul@canonical.com> Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Acked-by: Kai Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Dan Carpenter authored
BugLink: https://bugs.launchpad.net/bugs/1720228 It's possible to use "err" without initializing it. If it happens to be a 2 which is SCSI_DH_RETRY then that could cause a bug. Bart Van Assche pointed out that we should probably re-initialize it for every iteration through the retry loop. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <jejb@linux.vnet.ibm.com> (cherry picked from commit a4bd8520) Signed-off-by: Dragan Stancevic <dragan.stancevic@canonical.com> Acked-by: Kleber Souza <kleber.souza@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Vinson Lee authored
BugLink: http://bugs.launchpad.net/bugs/1734757Suggested-by: Juerg Haefliger <juerg.haefliger@canonical.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Kleber Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Colin Ian King authored
BugLink: https://bugs.launchpad.net/bugs/1703742 The current configuration is set to always use transparent hugepages by default. There exists plenty of anecdotal evidence that this is less than perfect a choice and in some scenarios it leads to some performance issues. My own investigations with stress-ng stream and malloc tests show that the current default impacts performance. I ran various test scenarios on different MADVISE configurations, each result below is based on the average of 5 runs on an i7-3770 CPU @ 3.4GHz with 8GB memory, 8MB L3 cache, 256K L2 cache, 32K/32K L1 cache. All the above results are from an average of 5 rounds of tests. malloc allocation stressor: malloc always madvise size (MB) ops/sec ops/sec 32 1254.43 2422.49 64 2100.36 4300.28 128 3768.57 7215.38 256 7940.73 14893.85 512 17618.62 26861.29 1024 32777.17 48029.37 Clearly madvise is more performent. stream bandwidth/compute stressor: stream always madvise NOHUGEPAGE size (MB) MB/sec MB/sec 1 17713.54 18439.69 2 12460.34 13015.46 4 12195.81 12694.51 8 12085.11 12674.26 16 12054.09 12649.00 32 12082.42 12409.65 64 12262.88 12084.85 128 12235.25 11788.49 256 11808.69 11283.69 512 11970.01 12434.82 For small allocations, always is less performant. Large allocations can enable the more performant transparent huge pages with madvise(2) if we disable always as default. Other stress-ng memory allocation/writing/freeing and madvise operations showed little significant differences. I have also experimented with boot testing Ubuntu with kernels configured with different MADVISE configs and found there is little noticeable difference in performance, so I believe that there is little scope for any kitten killer performance regressions with this change. This change will by default not use transparent huge pages unless madvise(2) is used to instruct the kernel to do so on a memory mapping. According to the madvise(2) manual, this only takes effect on private anonymous mappings with MADV_HUGEPAGE. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Kleber Sacilotto de Souza authored
Ignore: yes Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
- 12 Feb, 2018 2 commits
-
-
Khalid Elmously authored
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Andy Whitcroft authored
Fix a miss-backport of the upstream commit. Fixes: 63da13a9 ("net: ipv4: fix for a race condition in raw_sendmsg -- fix backport") BugLink: http://bugs.launchpad.net/bugs/1748671Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-