Commits · 44fee75247182715c63af8dc8df2df0f12be98a7 · Kirill Smelkov / linux

13 Mar, 2018 22 commits

nvme: Adjust the Samsung APST quirk · 44fee752

Andy Lutomirski authored Jan 05, 2018

BugLink: https://bugs.launchpad.net/bugs/1664602

I got a couple more reports: the Samsung APST issue appears to
affect multiple 950-series devices in Dell XPS 15 9550 and Precision
5510 laptops.  Change the quirk: rather than blacklisting the
firmware on the first problematic SSD that was reported, disable
APST on all 144d:a802 devices if they're installed in the two
affected Dell models.  While we're at it, disable only the deepest
sleep state instead of all of them -- the reporters say that this is
sufficient to fix the problem.

(I have a device that appears to be entirely identical to one of the
affected devices, but I have a different Dell laptop, so it's not
the case that all Samsung devices with firmware BXW75D0Q are broken
under all circumstances.)

Samsung engineers have an affected system, and hopefully they'll
give us a better workaround some time soon.  In the meantime, this
should minimize regressions.

See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1678184

Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
(backported from commit ff5350a8)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

44fee752

nvme: Enable autonomous power state transitions · 03ef54e6

Andy Lutomirski authored Jan 05, 2018

BugLink: https://bugs.launchpad.net/bugs/1664602

NVMe devices can advertise multiple power states.  These states can
be either "operational" (the device is fully functional but possibly
slow) or "non-operational" (the device is asleep until woken up).
Some devices can automatically enter a non-operational state when
idle for a specified amount of time and then automatically wake back
up when needed.

The hardware configuration is a table.  For each state, an entry in
the table indicates the next deeper non-operational state, if any,
to autonomously transition to and the idle time required before
transitioning.

This patch teaches the driver to program APST so that each successive
non-operational state will be entered after an idle time equal to 100%
of the total latency (entry plus exit) associated with that state.
The maximum acceptable latency is controlled using dev_pm_qos
(e.g. power/pm_qos_latency_tolerance_us in sysfs); non-operational
states with total latency greater than this value will not be used.
As a special case, setting the latency tolerance to 0 will disable
APST entirely.  On hardware without APST support, the sysfs file will
not be exposed.
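
The policy described above can be sketched in C. This is only an illustration, not the driver's code: the struct layouts, the function name build_apst_table(), and the use of -1 for "no transition" are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified power state descriptor (not the NVMe layout). */
struct ps_desc {
	int      non_operational;  /* 1 if the state is non-operational */
	uint32_t entry_lat_us;     /* entry latency in microseconds */
	uint32_t exit_lat_us;      /* exit latency in microseconds */
};

/* One row of the illustrative table: where to go from a state, and when. */
struct apst_entry {
	int      target;   /* index of the next deeper usable state, -1 if none */
	uint64_t idle_us;  /* idle time required before transitioning */
};

/*
 * Walk from the deepest state to the shallowest so that each row points at
 * the next deeper non-operational state whose total latency fits within
 * max_latency_us.  A tolerance of 0 leaves every row at -1 (APST disabled).
 */
void build_apst_table(const struct ps_desc *ps, size_t nstates,
		      uint64_t max_latency_us, struct apst_entry *table)
{
	int target = -1;
	uint64_t idle_us = 0;

	assert(ps != NULL && table != NULL);

	for (size_t i = nstates; i-- > 0; ) {
		uint64_t total;

		/* Row i describes the transition taken from state i. */
		table[i].target = target;
		table[i].idle_us = idle_us;

		if (max_latency_us == 0 || !ps[i].non_operational)
			continue;

		total = (uint64_t)ps[i].entry_lat_us + ps[i].exit_lat_us;
		if (total > max_latency_us)
			continue;

		/* State i is usable: idle time is 100% of its total latency. */
		target = (int)i;
		idle_us = total;
	}
}
```

With two operational states followed by non-operational states of 300 us and 5000 us total latency, a tolerance of 1000 us would make both operational states target the 300 us state and would leave the 5000 us state unused.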

The latency tolerance for newly-probed devices is set by the module
parameter nvme_core.default_ps_max_latency_us.

In theory, the device can expose a "default" APST table, but this
doesn't seem to function correctly on my device (Samsung 950), nor
does it seem particularly useful.  There is also an optional
mechanism by which a configuration can be "saved" so it will be
automatically loaded on reset.  This can be configured from
userspace, but it doesn't seem useful to support in the driver.

On my laptop, enabling APST seems to save nearly 1W.

The hardware tables can be decoded in userspace with nvme-cli.
'nvme id-ctrl /dev/nvmeN' will show the power state table and
'nvme get-feature -f 0x0c -H /dev/nvmeN' will show the current APST
configuration.

This feature is quirked off on a known-buggy Samsung device.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
(backported from commit c5552fde)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

03ef54e6

nvme: Add a quirk mechanism that uses identify_ctrl · 73b4d470

Andy Lutomirski authored Jan 05, 2018

BugLink: https://bugs.launchpad.net/bugs/1664602

Currently, all NVMe quirks are based on PCI IDs.  Add a mechanism to
define quirks based on identify_ctrl's vendor id, model number,
and/or firmware revision.
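
A minimal sketch of such a lookup, under simplifying assumptions: the structs and the function lookup_quirks_sketch() are hypothetical, strings are NUL-terminated here (the real identify data is fixed-width and space-padded), and a NULL model or firmware field is treated as a wildcard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of the identify-controller fields used. */
struct id_ctrl_sketch {
	unsigned short vid;     /* vendor id */
	char           mn[41];  /* model number, NUL-terminated in this sketch */
	char           fr[9];   /* firmware revision, NUL-terminated here */
};

/* One quirk table row; NULL mn or fr means "match anything". */
struct ident_quirk_sketch {
	unsigned short vid;
	const char    *mn;
	const char    *fr;
	unsigned long  quirks;
};

/* OR together the quirk bits of every row that matches the controller. */
unsigned long lookup_quirks_sketch(const struct ident_quirk_sketch *tbl,
				   size_t n, const struct id_ctrl_sketch *id)
{
	unsigned long quirks = 0;

	assert(tbl != NULL && id != NULL);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].vid != id->vid)
			continue;
		if (tbl[i].mn && strcmp(tbl[i].mn, id->mn) != 0)
			continue;
		if (tbl[i].fr && strcmp(tbl[i].fr, id->fr) != 0)
			continue;
		quirks |= tbl[i].quirks;
	}
	return quirks;
}
```
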
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
(backported from commit bd4da3ab)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

73b4d470

nvme: Pass pointers, not dma addresses, to nvme_get/set_features() · 2b50a40b

Andy Lutomirski authored Jan 05, 2018

BugLink: https://bugs.launchpad.net/bugs/1664602

Any user I can imagine that needs a buffer at all will want to pass
a pointer directly.  There are currently no callers that use
buffers, so this change is painless, and it will make it much easier
to start using features that use buffers (e.g. APST).
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jay Freyensee <james_p_freyensee@linux.intel.com>
Tested-by: Jay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 1a6fe74d)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

2b50a40b

nvme: Fix nvme_get/set_features() with a NULL result pointer · fa2aaddc

Andy Lutomirski authored Jan 05, 2018

BugLink: https://bugs.launchpad.net/bugs/1664602

nvme_set_features() callers seem to expect that passing NULL as the
result pointer is acceptable.  Teach nvme_set_features() not to try to
write to the NULL address.

For symmetry, make the same change to nvme_get_features(), despite the
fact that all current callers pass a valid result pointer.
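
The pattern can be sketched as follows. The struct and function names are hypothetical stand-ins, not the driver's own; the point is only that the result is written when the command succeeded and the caller supplied somewhere to put it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for what a completed features command returns. */
struct completion_sketch {
	int      status;  /* 0 on success, nonzero on failure */
	uint32_t result;  /* command-specific result value */
};

/* Copy the result out only if the caller asked for it. */
int features_result_sketch(const struct completion_sketch *cqe,
			   uint32_t *result)
{
	assert(cqe != NULL);

	if (cqe->status == 0 && result != NULL)
		*result = cqe->result;
	return cqe->status;
}
```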

I assume that this bug hasn't been reported in practice because
the callers that pass NULL are all in the SCSI translation layer
and no one uses the relevant operations.

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 9b47f77a)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

fa2aaddc

nvme: Modify and export sync command submission for fabrics · e8f85779

Christoph Hellwig authored Jan 05, 2018

BugLink: https://bugs.launchpad.net/bugs/1664602

NVMe over fabrics will use __nvme_submit_sync_cmd in the
transport and require a few tweaks to it.  For that we export it
and add a few more parameters:

1. allow passing a queue ID to the block layer

   For the NVMe over Fabrics connect command we need to be able to specify a
   queue ID that we want to send the command on.  Add a qid parameter to
   the relevant functions to enable this behavior.

2. allow submitting at_head commands

   In cases where we want to (re)connect to a controller
   where we have inflight queued commands we want to first
   connect and only then allow the other queued commands to
   be kicked. This will prevent failures in controller resets
   and reconnects.

3. allow passing flags to blk_mq_allocate_request

   For both Fabrics connect and the keep-alive feature in NVMe 1.2.1 we
   want to be able to use reserved requests.
Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit eb71f435)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

e8f85779

nvme: factor out a add nvme_is_write helper · e37c5cca

Christoph Hellwig authored Jan 05, 2018

BugLink: https://bugs.launchpad.net/bugs/1664602

Centralize the check if a given NVMe command reads or writes data.
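
As an illustration of what such a helper can look like: NVMe opcodes encode the data transfer direction in their two low bits, and bit 0 is set when data moves from host to controller. The struct and function names below are hypothetical, and the sketch ignores any command types that encode direction elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal command: only the opcode matters for this check. */
struct cmd_sketch {
	uint8_t opcode;
};

/* True if the command transfers data from the host to the controller. */
bool cmd_is_write_sketch(const struct cmd_sketch *cmd)
{
	assert(cmd != NULL);
	return (cmd->opcode & 1) != 0;
}
```
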
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(backported from commit 7a5abb4b)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

e37c5cca

nvme: return the whole CQE through the request passthrough interface · 0aa7cb3a

Christoph Hellwig authored Jan 05, 2018

BugLink: https://bugs.launchpad.net/bugs/1664602

Both LightNVM and NVMe over Fabrics need to look at more than just the
status and result field.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matias Bjørling <m@bjorling.me>
Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(backported from commit 1cb3cce5)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

0aa7cb3a

nvme/scsi: Remove power management support · 07e482c7

Andy Lutomirski authored Jan 05, 2018

BugLink: https://bugs.launchpad.net/bugs/1664602

As far as I can tell, there is basically nothing correct about this
code.  It misinterprets npss (off-by-one).  It hardcodes a bunch of
power states, which is nonsense, because they're all just indices
into a table that software needs to parse.  It completely ignores
the distinction between operational and non-operational states.
And, until 4.8, if all of the above magically succeeded, it would
dereference a NULL pointer and OOPS.

Since this code appears to be useless, just delete it.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jay Freyensee <james_p_freyensee@linux.intel.com>
Tested-by: Jay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 26501db8)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Acked-By: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

07e482c7

bpf: fix branch pruning logic · 68dd63b2

Alexei Starovoitov authored Jan 04, 2018

When the verifier detects that a register contains a runtime constant
and it is compared with another constant, it prunes exploration
of the branch that is guaranteed not to be taken at runtime.
This is all correct, but a malicious program may be constructed
in such a way that it always has a constant comparison and
the other branch is never taken under any conditions.
In this case, that path through the program will not be explored
by the verifier. It won't be taken at run-time either, but since
all instructions are JITed, the malicious program may cause JITs
to complain about using reserved fields, etc.
To fix the issue, we have to track the instructions explored by
the verifier and sanitize instructions that are dead at run time
with NOPs. We cannot reject such dead code, because llvm generates
it for valid C code: it doesn't do as much data flow
analysis as the verifier does.
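
A minimal sketch of the idea, assuming a per-instruction "seen" flag that is set while the verifier walks the program. The types, the function name, and the NOP encoding are placeholders for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction and per-instruction verifier state. */
struct insn_sketch {
	uint8_t code;
	int32_t imm;
};

struct insn_aux_sketch {
	bool seen;  /* set when the verifier explored this instruction */
};

/* Placeholder NOP encoding, chosen only for this illustration. */
static const struct insn_sketch sketch_nop = { 0, 0 };

/* Overwrite every unexplored instruction with a NOP; return the count. */
size_t sanitize_dead_code_sketch(struct insn_sketch *insns,
				 const struct insn_aux_sketch *aux,
				 size_t len)
{
	size_t replaced = 0;

	assert(insns != NULL && aux != NULL);

	for (size_t i = 0; i < len; i++) {
		if (aux[i].seen)
			continue;
		insns[i] = sketch_nop;
		replaced++;
	}
	return replaced;
}
```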

Fixes: 17a52670 ("bpf: verifier (add verifier core)")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
(backported from commit c131187d)
CVE-2017-17862
[ saf: Add partial backport of 3df126f3 ("bpf: don't (ab)use
  instructions to store state") to add bpf_insn_aux_data state to
  verifier_env ]
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

68dd63b2

bpf: fix incorrect sign extension in check_alu_op() · 04e1c8db

Jann Horn authored Jan 04, 2018

[ Upstream commit 95a762e2 ]

Distinguish between
BPF_ALU64|BPF_MOV|BPF_K (load 32-bit immediate, sign-extended to 64-bit)
and BPF_ALU|BPF_MOV|BPF_K (load 32-bit immediate, zero-padded to 64-bit);
only perform sign extension in the first case.
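
The difference between the two cases can be shown with plain C integer conversions. The function name and the boolean parameter are assumptions for the example; the verifier tracks this on its own register state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Value a 64-bit register holds after loading a 32-bit immediate:
 * sign-extended for the 64-bit ALU class, zero-extended for the
 * 32-bit ALU class.
 */
uint64_t mov_imm_value_sketch(bool is_alu64, int32_t imm)
{
	if (is_alu64)
		return (uint64_t)(int64_t)imm;  /* sign extension */
	return (uint64_t)(uint32_t)imm;         /* zero extension */
}
```

For an immediate of -1, the 64-bit move yields 0xffffffffffffffff while the 32-bit move yields 0x00000000ffffffff; treating both as sign-extended makes the verifier's view differ from what runs.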

Starting with v4.14, this is exploitable by unprivileged users as long as
the unprivileged_bpf_disabled sysctl isn't set.

Debian assigned CVE-2017-16995 for this issue.

v3:
 - add CVE number (Ben Hutchings)

Fixes: 48461135 ("bpf: allow access into map value arrays")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
CVE-2017-16995
[ saf: Backport to 4.4. Include partial backports of 4923ec0b ("bpf:
  simplify verifier register state assignments") and 969bf05e ("bpf:
  direct packet access") to extend reg_state.imm to 64-bit. ]
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

04e1c8db

KVM: Fix stack-out-of-bounds read in write_mmio · 48a03f4f

Wanpeng Li authored Jan 04, 2018

CVE-2017-17741

Reported by syzkaller:

  BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm]
  Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298

  CPU: 6 PID: 32298 Comm: syz-executor Tainted: G           OE    4.15.0-rc2+ #18
  Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
  Call Trace:
   dump_stack+0xab/0xe1
   print_address_description+0x6b/0x290
   kasan_report+0x28a/0x370
   write_mmio+0x11e/0x270 [kvm]
   emulator_read_write_onepage+0x311/0x600 [kvm]
   emulator_read_write+0xef/0x240 [kvm]
   emulator_fix_hypercall+0x105/0x150 [kvm]
   em_hypercall+0x2b/0x80 [kvm]
   x86_emulate_insn+0x2b1/0x1640 [kvm]
   x86_emulate_instruction+0x39a/0xb90 [kvm]
   handle_exception+0x1b4/0x4d0 [kvm_intel]
   vcpu_enter_guest+0x15a0/0x2640 [kvm]
   kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm]
   kvm_vcpu_ioctl+0x479/0x880 [kvm]
   do_vfs_ioctl+0x142/0x9a0
   SyS_ioctl+0x74/0x80
   entry_SYSCALL_64_fastpath+0x23/0x9a

The vmmcall patching path writes the 3-byte opcode 0F 01 C1 (vmcall)
to guest memory; however, the write_mmio tracepoint always prints 8 bytes
through *(u64 *)val, since kvm splits mmio accesses into 8-byte chunks. This
leaks 5 bytes from the kernel stack (CVE-2017-17741).  This patch fixes
it by accessing only the bytes that we operate on.

Before patch:

syz-executor-5567  [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f

After patch:

syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f
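
The effect of reading only the bytes in use can be sketched as below. The function name is hypothetical, and the value is assembled byte by byte in little-endian order so the example does not depend on host endianness; it is not how the tracepoint itself is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build the traced value from at most len bytes; the rest stays zero. */
uint64_t mmio_trace_value_sketch(const uint8_t *val, size_t len)
{
	uint64_t v = 0;

	assert(val != NULL);

	if (len > sizeof(v))
		len = sizeof(v);
	for (size_t i = 0; i < len; i++)
		v |= (uint64_t)val[i] << (8 * i);
	return v;
}
```

Given a buffer holding 0F 01 C1 followed by five unrelated bytes, a length of 3 yields 0xc1010f, as in the "After patch" trace above.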
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(backported from e39d200f)
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

48a03f4f

RDS: null pointer dereference in rds_atomic_free_op · a0a14798

Mohamed Ghannam authored Jan 03, 2018

Set rm->atomic.op_active to 0 when rds_pin_pages() fails
or the user-supplied address is invalid.
This prevents a NULL pointer dereference in rds_atomic_free_op().
Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

CVE-2018-5333
(cherry picked from commit 7d11f77f)
Signed-off-by: Benjamin M Romer <benjamin.romer@canonical.com>
Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Benjamin M Romer <benjamin.romer@canonical.com>

a0a14798

ipv6: Do not consider linkdown nexthops during multipath · b9625607

Ido Schimmel authored Dec 15, 2017

BugLink: http://bugs.launchpad.net/bugs/1738219

When the 'ignore_routes_with_linkdown' sysctl is set, we should not
consider linkdown nexthops during route lookup.

While the code correctly verifies that the initially selected route
('match') has a carrier, it does not perform the same check in the
subsequent multipath selection, resulting in a potential packet loss.

In case the chosen route does not have a carrier and the sysctl is set,
choose the initially selected route.
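
The fallback can be sketched as follows, assuming a hypothetical nexthop struct with a linkdown flag and a boolean standing in for the sysctl. It illustrates the selection rule only, not the routing code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical nexthop: an id for illustration and its carrier state. */
struct nexthop_sketch {
	int  id;
	bool linkdown;
};

/*
 * Keep the multipath choice unless the sysctl is set and the chosen
 * nexthop has no carrier; in that case fall back to the initial match.
 */
const struct nexthop_sketch *
multipath_select_sketch(const struct nexthop_sketch *match,
			const struct nexthop_sketch *chosen,
			bool ignore_routes_with_linkdown)
{
	assert(match != NULL && chosen != NULL);

	if (ignore_routes_with_linkdown && chosen->linkdown)
		return match;
	return chosen;
}
```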

Fixes: 35103d11 ("net: ipv6 sysctl option to ignore routes when nexthop link is down")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bbfcd776)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

b9625607

UBUNTU: SAUCE: (no-up) bcache: decouple emitting a cached_dev CHANGE uevent · 577fff51

Ryan Harper authored Dec 11, 2017

BugLink: http://bugs.launchpad.net/bugs/1729145

- decouple emitting a cached_dev CHANGE uevent which includes dev.uuid
  and dev.label from bch_cached_dev_run() which only happens when a
  bcacheX device is bound to the actual backing block device (bcache0 -> vdb)

- update bch_cached_dev_run() to invoke bch_cached_dev_emit_change() as
  needed; no functional code path changes here

- Modify register_bcache to detect a re-registering of a bcache
  cached_dev, and in that case call bcache_cached_dev_emit_change() to
Signed-off-by: Ryan Harper <ryan.harper@canonical.com>
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

577fff51

e1000e: Separate signaling for link check/link up · 2ebc9658

Benjamin Poirier authored Dec 14, 2017

BugLink: http://bugs.launchpad.net/bugs/1730550

Lennart reported the following race condition:

\ e1000_watchdog_task
    \ e1000e_has_link
        \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link
            /* link is up */
            mac->get_link_status = false;

                            /* interrupt */
                            \ e1000_msix_other
                                hw->mac.get_link_status = true;

        link_active = !hw->mac.get_link_status
        /* link_active is false, wrongly */

This problem arises because the single flag get_link_status is used to
signal two different states: link status needs checking and link status is
down.

Avoid the problem by using the return value of .check_for_link to signal
the link status to e1000e_has_link().
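
A minimal sketch of the race and of the fix. All names are hypothetical and the "interrupt" is simulated by a callback invoked between the check and the read; the sketch assumes a positive return value means the link is up and 0 means it is down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down MAC state; not the driver's real struct. */
struct mac_sketch {
	bool get_link_status;  /* true: link status needs (re)checking */
	bool phy_link_up;      /* what the PHY would report */
};

/* Returns 1 if the link is up, 0 if it is down. */
int check_for_link_sketch(struct mac_sketch *mac)
{
	assert(mac != NULL);

	if (!mac->phy_link_up)
		return 0;
	mac->get_link_status = false;  /* checked, link is up */
	return 1;
}

/* Stand-in for the interrupt handler that requests a new link check. */
void link_irq_sketch(struct mac_sketch *mac)
{
	mac->get_link_status = true;
}

/* Racy variant: infers link state from a flag the interrupt can flip. */
bool has_link_racy_sketch(struct mac_sketch *mac,
			  void (*irq)(struct mac_sketch *))
{
	check_for_link_sketch(mac);
	if (irq)
		irq(mac);  /* the interrupt lands here */
	return !mac->get_link_status;
}

/* Fixed variant: the return value carries the link state. */
bool has_link_fixed_sketch(struct mac_sketch *mac,
			   void (*irq)(struct mac_sketch *))
{
	int ret = check_for_link_sketch(mac);

	if (irq)
		irq(mac);  /* the flag may change, the result does not */
	return ret > 0;
}
```

In the fixed variant the flag still ends up set after the simulated interrupt, so the next pass rechecks the link, but the current pass no longer reports a working link as down.
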
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 19110cfb)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

2ebc9658

e1000e: Avoid receiver overrun interrupt bursts · 6d6b5fd1

Benjamin Poirier authored Dec 14, 2017

BugLink: http://bugs.launchpad.net/bugs/1730550

When e1000e_poll() is not fast enough to keep up with incoming traffic, the
adapter (when operating in msix mode) raises the Other interrupt to signal
Receiver Overrun.

This is a double problem because 1) at the moment e1000_msix_other()
assumes that it is only called in case of Link Status Change and 2) if the
condition persists, the interrupt is repeatedly raised again in quick
succession.

Ideally we would configure the Other interrupt to not be raised in case of
receiver overrun but this doesn't seem possible on this adapter. Instead,
we handle the first part of the problem by reverting to the practice of
reading ICR in the other interrupt handler, like before commit 16ecba59
("e1000e: Do not read ICR in Other interrupt"). Thanks to commit
0a8047ac ("e1000e: Fix msi-x interrupt automask") which cleared IAME
from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts
anymore. We handle the second part of the problem by not re-enabling the
Other interrupt right away when there is overrun. Instead, we wait until
traffic subsides, napi polling mode is exited and interrupts are
re-enabled.
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Fixes: 16ecba59 ("e1000e: Do not read ICR in Other interrupt")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 4aea7a5c)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

6d6b5fd1

ath10k: add max_tx_power for QCA6174 WLAN.RM.2.0 firmware · 357bf2a7

Alan Liu authored Dec 05, 2017

BugLink: https://bugs.launchpad.net/bugs/1736317

QCA6174 WLAN.RM.2.0 firmware uses max_tx_power instead of using max_reg_power
to set transmission power. The tx power was about -50 dBm; after applying this
change, it becomes -32 dBm.
Signed-off-by: Alan Liu <alanliu@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
(cherry picked from commit 513527c8)
Signed-off-by: Shrirang Bagul <shrirang.bagul@canonical.com>
Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Acked-by: Kai Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

357bf2a7

scsi_dh_alua: uninitialized variable in alua_rtpg() · 80a90e5b

Dan Carpenter authored Nov 29, 2017

BugLink: https://bugs.launchpad.net/bugs/1720228

It's possible to use "err" without initializing it.  If it happens to be
a 2 which is SCSI_DH_RETRY then that could cause a bug.  Bart Van Assche
pointed out that we should probably re-initialize it for every iteration
through the retry loop.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <jejb@linux.vnet.ibm.com>
(cherry picked from commit a4bd8520)
Signed-off-by: Dragan Stancevic <dragan.stancevic@canonical.com>
Acked-by: Kleber Souza <kleber.souza@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

80a90e5b

UBUNTU: d-i: Add bnxt_en_bpo to nic-modules. · c512ae78

Vinson Lee authored Nov 27, 2017

BugLink: http://bugs.launchpad.net/bugs/1734757

Suggested-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Kleber Souza <kleber.souza@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

c512ae78

UBUNTU: SAUCE: use CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y as default · 0895a081

Colin Ian King authored Nov 23, 2017

BugLink: https://bugs.launchpad.net/bugs/1703742

The current configuration is set to always use transparent hugepages
by default. There exists plenty of anecdotal evidence that this is
less than perfect a choice and in some scenarios it leads to some
performance issues.

My own investigations with stress-ng stream and malloc tests show that
the current default impacts performance. I ran various test scenarios
on different MADVISE configurations, each result below is based on
the average of 5 runs on an i7-3770 CPU @ 3.4GHz with 8GB memory,
8MB L3 cache, 256K L2 cache, 32K/32K L1 cache.

All the above results are from an average of 5 rounds of tests.

malloc allocation stressor:

     malloc     always    madvise
    size (MB)   ops/sec   ops/sec
         32     1254.43   2422.49
         64     2100.36   4300.28
        128     3768.57   7215.38
        256     7940.73  14893.85
        512    17618.62  26861.29
       1024    32777.17  48029.37

Clearly madvise is more performant.

stream bandwidth/compute stressor:

    stream      always    madvise
                         NOHUGEPAGE
    size (MB)   MB/sec     MB/sec
          1   17713.54   18439.69
          2   12460.34   13015.46
          4   12195.81   12694.51
          8   12085.11   12674.26
         16   12054.09   12649.00
         32   12082.42   12409.65
         64   12262.88   12084.85
        128   12235.25   11788.49
        256   11808.69   11283.69
        512   11970.01   12434.82

For small allocations, always is less performant. Large
allocations can enable the more performant transparent
huge pages with madvise(2) if we disable always as default.

Other stress-ng memory allocation/writing/freeing and madvise
operations showed little significant differences.

I have also experimented with boot testing Ubuntu with kernels
configured with different MADVISE configs and found there is
little noticeable difference in performance, so I believe that
there is little scope for any kitten killer performance regressions
with this change.

This change will by default not use transparent huge pages unless
madvise(2) is used to instruct the kernel to do so on a memory
mapping.  According to the madvise(2) manual, this only takes
effect on private anonymous mappings with MADV_HUGEPAGE.
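
For applications that want huge pages under this default, opting in could look like the sketch below. The helper name alloc_with_thp_hint() is made up for the example; mmap(2) and madvise(2) are the real calls, and the hint may be rejected on kernels built without transparent hugepage support, so the example reports whether it was accepted instead of assuming it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map len bytes of private anonymous memory and ask for transparent
 * huge pages on it.  Returns the mapping, or NULL if mmap() failed.
 * If advised is non-NULL it is set to 1 when the hint was accepted,
 * 0 otherwise.
 */
void *alloc_with_thp_hint(size_t len, int *advised)
{
	void *p;
	int ok = 0;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

#ifdef MADV_HUGEPAGE
	ok = (madvise(p, len, MADV_HUGEPAGE) == 0);
#endif
	if (advised)
		*advised = ok;
	return p;
}
```

The caller releases the region with munmap(p, len) as usual; mappings that never receive the hint keep using regular pages.
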
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

0895a081

UBUNTU: Start new release · c40862fd
Kleber Sacilotto de Souza authored Mar 13, 2018
```
Ignore: yes
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
```
c40862fd

12 Feb, 2018 3 commits

UBUNTU: Ubuntu-4.4.0-116.140 · 855cff54
Khalid Elmously authored Feb 12, 2018
```
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
```
855cff54

UBUNTU: SAUCE: net: ipv4: fix for a race condition in raw_sendmsg -- fix backport · 0abf64f1

Andy Whitcroft authored Feb 12, 2018

Fix a mis-backport of the upstream commit.

Fixes: 63da13a9 ("net: ipv4: fix for a race condition in raw_sendmsg -- fix backport")
BugLink: http://bugs.launchpad.net/bugs/1748671
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

0abf64f1

UBUNTU: Start new release · 63fca73b
Khalid Elmously authored Feb 12, 2018
```
Ignore: yes
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
```
63fca73b

11 Feb, 2018 8 commits
- UBUNTU: Ubuntu-4.4.0-115.139 · 3f9f8be8
  Andy Whitcroft authored Feb 11, 2018
```
Signed-off-by: Andy Whitcroft <apw@canonical.com>
```
  3f9f8be8
- UBUNTU: add missing ABI files · e3a11466
  Andy Whitcroft authored Feb 11, 2018
```
Signed-off-by: Andy Whitcroft <apw@canonical.com>
```
  e3a11466
- UBUNTU: Ubuntu-4.4.0-115.138 · b642a6b7
  Andy Whitcroft authored Feb 11, 2018
```
Signed-off-by: Andy Whitcroft <apw@canonical.com>
```
  b642a6b7
- UBUNTU: Start new release · cd20f9bb
  Andy Whitcroft authored Feb 11, 2018
```
Ignore: yes
Signed-off-by: Andy Whitcroft <apw@canonical.com>
```
  cd20f9bb
- UBUNTU: [Packaging] pull in retpoline files · 5f3d410a
  Andy Whitcroft authored Feb 11, 2018
```
CVE-2017-5715 (Spectre v2 Intel)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
```
  5f3d410a
- UBUNTU: [Packaging] retpoline files must be sorted · e1b6ef58
  Andy Whitcroft authored Feb 11, 2018
```
CVE-2017-5715 (Spectre v2 Intel)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
```
  e1b6ef58
- UBUNTU: SAUCE: turn off IBRS when full retpoline is present · 8eee103a
  Andy Whitcroft authored Feb 10, 2018
```
CVE-2017-5715 (Spectre v2 Intel)

When we have full retpoline enabled then we do not actually need to toggle
IBRS on entering and leaving the kernel.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
```
  8eee103a
- Revert "UBUNTU: SAUCE: turn off IBPB when full retpoline is present" · 238ae398
  Andy Whitcroft authored Feb 11, 2018
```
CVE-2017-5715 (Spectre v2 Intel)

This reverts commit d31a04f8.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
```
  238ae398
09 Feb, 2018 7 commits

UBUNTU: Ubuntu-4.4.0-114.137 · 41f4a476
Khalid Elmously authored Feb 09, 2018
```
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
```
41f4a476

ALSA: hda - Add missing NVIDIA GPU codec IDs to patch table · 86747369

Eric Desrochers authored Jan 23, 2018

BugLink: https://bugs.launchpad.net/bugs/1744117

Add codec IDs for several recently released, pending, and historical
NVIDIA GPU audio controllers to the patch table, to allow the correct
patch functions to be selected for them.
Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Andy Ritger <aritger@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 74ec1181)
Signed-off-by: Eric Desrochers <eric.desrochers@canonical.com>
Acked-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

86747369

scsi: libiscsi: Allow sd_shutdown on bad transport · d3eeae9e

Rafael David Tinoco authored Jan 23, 2018

BugLink: https://bugs.launchpad.net/bugs/1569925

If, for any reason, userland shuts down iscsi transport interfaces
before proper logouts - like when logging in to LUNs manually, without
logging out on server shutdown, or when automated scripts can't
umount/logout from logged LUNs - kernel will hang forever on its
sd_sync_cache() logic, after issuing the SYNCHRONIZE_CACHE cmd to all
still existent paths.

PID: 1 TASK: ffff8801a69b8000 CPU: 1 COMMAND: "systemd-shutdow"
 #0 [ffff8801a69c3a30] __schedule at ffffffff8183e9ee
 #1 [ffff8801a69c3a80] schedule at ffffffff8183f0d5
 #2 [ffff8801a69c3a98] schedule_timeout at ffffffff81842199
 #3 [ffff8801a69c3b40] io_schedule_timeout at ffffffff8183e604
 #4 [ffff8801a69c3b70] wait_for_completion_io_timeout at ffffffff8183fc6c
 #5 [ffff8801a69c3bd0] blk_execute_rq at ffffffff813cfe10
 #6 [ffff8801a69c3c88] scsi_execute at ffffffff815c3fc7
 #7 [ffff8801a69c3cc8] scsi_execute_req_flags at ffffffff815c60fe
 #8 [ffff8801a69c3d30] sd_sync_cache at ffffffff815d37d7
 #9 [ffff8801a69c3da8] sd_shutdown at ffffffff815d3c3c

This happens because iscsi_eh_cmd_timed_out(), the transport layer
timeout helper, tells the queue timeout function (scsi_times_out) to
reset the request timer over and over until the session is back in the
logged-in state. During server shutdown, that may never happen.

Another option would be not to handle the issue in the transport
layer. That would trigger the error handler logic, which also needs
the session to be logged in again.

The best option in this case is to tell the upper layers that the
command was handled by the transport layer error handler helper and to
mark it as DID_NO_CONNECT, which allows the command to complete and
reports the problem.

Once the session has been marked as ISCSI_STATE_FAILED, due to the
first timeout during the server shutdown phase, all subsequent cmds
fail to be queued, allowing the upper logic to fail faster.

Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry-picked from commit d7549412 next-20180117)
Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com>
Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

d3eeae9e
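
The timeout decision described in the libiscsi commit above can be modeled in a few lines of user-space C. This is only a sketch of the reasoning in the commit message, not the kernel code: the enum names, the `cmd_timed_out()` helper, and the `shutting_down` flag are hypothetical, and the real handler performs further checks once the session is logged in. `DID_NO_CONNECT` is stored in the host byte of the result, as in the SCSI layer.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the session state and the verdict that a
 * transport timeout helper hands back to the queue timeout function. */
enum session_state { SESSION_LOGGED_IN, SESSION_FAILED };
enum eh_verdict { EH_NOT_HANDLED, EH_HANDLED, EH_RESET_TIMER };

#define DID_NO_CONNECT 0x01 /* host byte: could not connect */

struct command {
    int result; /* host byte lives in bits 16..23 */
};

/* Model of the decision: while the session is down, either keep
 * re-arming the timer (normal recovery) or, during shutdown, complete
 * the command with DID_NO_CONNECT so that callers stop waiting. */
static enum eh_verdict cmd_timed_out(enum session_state state,
                                     bool shutting_down,
                                     struct command *cmd)
{
    if (state != SESSION_LOGGED_IN) {
        if (shutting_down) {
            cmd->result = DID_NO_CONNECT << 16;
            return EH_HANDLED;
        }
        /* Recovery may still bring the session back: wait again. */
        return EH_RESET_TIMER;
    }
    /* Logged in: the remaining checks are not modeled here. */
    return EH_NOT_HANDLED;
}
```

With this model, a command that times out on a failed session during shutdown completes with an error instead of resetting its timer forever, which is the behavior the commit message argues for.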

libata: apply MAX_SEC_1024 to all LITEON EP1 series devices · 6dd8531e

Xinyu Lin authored Feb 01, 2018

BugLink: http://bugs.launchpad.net/bugs/1743053

LITEON EP1 has the same timeout issues as CX1 series devices.

Revert max_sectors to the value of 1024.

e0edc8c5 ("libata: apply MAX_SEC_1024 to all CX1-JB*-HP devices")

Signed-off-by: Xinyu Lin <xinyu0123@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
(cherry picked from commit db5ff909)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Souza <kleber.souza@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

6dd8531e
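
A quirk of this kind can be pictured as a table of model-name patterns that caps the transfer size for matching drives. The sketch below is a simplified user-space model, not the libata implementation: the table layout, the `QUIRK_MAX_SEC_1024` flag, the helper names, and the exact pattern strings are assumptions derived from the two commit titles.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUIRK_MAX_SEC_1024 (1u << 0) /* hypothetical flag bit */
#define MAX_SECTORS_1024   1024u

struct quirk_entry {
    const char *model_pattern; /* '*' matches any run of characters */
    unsigned int flags;
};

/* Assumed patterns, based on the series named in the commit titles. */
static const struct quirk_entry quirk_table[] = {
    { "LITEON CX1-JB*-HP", QUIRK_MAX_SEC_1024 },
    { "LITEON EP1-*",      QUIRK_MAX_SEC_1024 },
};

/* Minimal glob matcher that supports only the '*' wildcard. */
static bool glob_match(const char *pat, const char *str)
{
    if (*pat == '\0')
        return *str == '\0';
    if (*pat == '*')
        return glob_match(pat + 1, str) ||
               (*str != '\0' && glob_match(pat, str + 1));
    return *str != '\0' && *pat == *str && glob_match(pat + 1, str + 1);
}

/* Collect the quirk flags of every table entry matching the model. */
static unsigned int lookup_quirks(const char *model)
{
    unsigned int flags = 0;
    size_t i;

    for (i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++)
        if (glob_match(quirk_table[i].model_pattern, model))
            flags |= quirk_table[i].flags;
    return flags;
}

/* Cap the requested max_sectors at 1024 for quirky devices. */
static unsigned int effective_max_sectors(const char *model,
                                          unsigned int requested)
{
    if ((lookup_quirks(model) & QUIRK_MAX_SEC_1024) &&
        requested > MAX_SECTORS_1024)
        return MAX_SECTORS_1024;
    return requested;
}
```

Matching a whole series by pattern avoids listing every individual model, which is what extending the limit to "all LITEON EP1 series devices" suggests.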

KVM: s390: Enable all facility bits that are known good for passthrough · f1dac4b9

Alexander Yarygin authored Feb 02, 2018

BugLink: http://bugs.launchpad.net/bugs/1747090

Some facility bits are in a range that is defined to be "ok for guests
without any necessary hypervisor changes". Enable those bits.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit ed8dda0b)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

f1dac4b9
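
As a rough illustration of what "enabling" facility bits for passthrough means, the sketch below computes a guest facility list as the host list ANDed with a mask of known-good bits. It assumes the s390 convention that facility bit 0 is the leftmost bit of the first byte. The list size, the helper names, and the bit numbers used in any example are hypothetical; the commit message does not say which range of bits it enables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAC_BYTES 32 /* hypothetical list size for this sketch */

/* Facility bits are numbered from the most significant bit of byte 0. */
static bool test_fac(const uint8_t *list, unsigned int nr)
{
    return (list[nr >> 3] & (0x80u >> (nr & 7))) != 0;
}

static void set_fac(uint8_t *list, unsigned int nr)
{
    list[nr >> 3] |= (uint8_t)(0x80u >> (nr & 7));
}

/* The guest sees a facility only if the host has it AND the mask
 * marks it as safe to pass through. */
static void mask_facilities(uint8_t *guest, const uint8_t *host,
                            const uint8_t *mask, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        guest[i] = host[i] & mask[i];
}
```

Under this model, widening the mask is all that is needed to expose an additional host facility to guests, which fits the "without any necessary hypervisor changes" wording.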

KVM: s390: wire up bpb feature · 94f92401

Christian Borntraeger authored Feb 02, 2018

BugLink: http://bugs.launchpad.net/bugs/1747090

The new firmware interfaces for branch prediction behaviour changes
are transparently available to the guest. Nevertheless, there is
new state attached that should be migrated and properly reset.
Provide a mechanism for handling reset and migration.

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(backported from commit 35b3fde6)
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

94f92401

UBUNTU: SAUCE: turn off IBPB when full retpoline is present · d31a04f8

Andy Whitcroft authored Feb 08, 2018

CVE-2017-5715 (Spectre v2 Intel)

When full retpoline is enabled, we do not actually require IBPB
flushes when entering the kernel.  Add a new use_ibpb bit to represent
when we have retpoline enabled.  Further split the enable bit into two:
0x1, representing whether entry IBPB is enabled, and 0x10, representing
whether kernel flushes for userspace/VMs etc. are applied.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

d31a04f8
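
One possible reading of the two-bit split in the IBPB commit above is sketched here. The constant and function names are hypothetical and the patch itself may structure this differently; the sketch only shows that with full retpoline the entry bit (0x1) can be dropped while the flush bit (0x10) is left alone.

```c
#include <stdbool.h>

#define IBPB_ENTRY 0x1u  /* IBPB on kernel entry */
#define IBPB_FLUSH 0x10u /* kernel flushes for userspace/VMs etc. */

/* Given the requested IBPB setting, return the bits that remain in
 * effect: full retpoline makes the entry flush unnecessary. */
static unsigned int effective_ibpb(unsigned int requested,
                                   bool full_retpoline)
{
    if (full_retpoline)
        requested &= ~IBPB_ENTRY;
    return requested;
}
```

Keeping the two behaviors in separate bits lets the entry flush be turned off without touching the flushes applied for userspace and VMs.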