Commits · 5b1c2c5f9b749595a45c6b9b4fdab85bde0ceca4 · Kirill Smelkov / linux

28 Sep, 2018 22 commits

net: ena: fix potential double ena_destroy_device() · 5b1c2c5f

Netanel Belgazal authored Sep 09, 2018

BugLink: http://bugs.launchpad.net/bugs/1792044

ena_destroy_device() can potentially be called twice.
To avoid this, check that the device is running and
only then proceed destroying it.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fe870c77 linux-next)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

5b1c2c5f

net: ena: fix device destruction to gracefully free resources · 85aeb9d7

Netanel Belgazal authored Sep 09, 2018

BugLink: http://bugs.launchpad.net/bugs/1792044

When ena_destroy_device() is called from ena_suspend(), the device is
still reachable from the driver. Therefore, the driver can send a command
to the device to free all resources.
However, in all other cases of calling ena_destroy_device(), the device is
potentially in an error state and unreachable from the driver. In these
cases the driver must not send commands to the device.

The current implementation does not request resource freeing from the
device even when possible. We add the graceful parameter to
ena_destroy_device() to enable resource freeing when possible, and
use it in ena_suspend().
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cfa324a5 linux-next)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

85aeb9d7

net: ena: fix driver when PAGE_SIZE == 64kB · b9120b1e

Netanel Belgazal authored Sep 09, 2018

BugLink: http://bugs.launchpad.net/bugs/1792044

The buffer length field in the ena rx descriptor is 16 bit, and the
current driver passes a full page in each ena rx descriptor.
When PAGE_SIZE equals 64kB or more, the buffer length field becomes
zero.
To solve this issue, limit the ena Rx descriptor to use 16kB even
when allocating 64kB kernel pages. This change would not impact ena
device functionality, as 16kB is still larger than maximum MTU.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ef5b0771 linux-next)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

b9120b1e

net: ena: fix surprise unplug NULL dereference kernel crash · e5ec3324

Netanel Belgazal authored Sep 09, 2018

BugLink: http://bugs.launchpad.net/bugs/1792044

Starting with driver version 1.5.0, in case of a surprise device
unplug, there is a race caused by invoking ena_destroy_device()
from two different places. As a result, the readless register might
be accessed after it was destroyed.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 772ed869 linux-next)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

e5ec3324

net: ena: Fix use of uninitialized DMA address bits field · 04bca928

Gal Pressman authored Jul 26, 2018

BugLink: http://bugs.launchpad.net/bugs/1792044

UBSAN triggers the following undefined behaviour warnings:
[...]
[   13.236124] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:468:22
[   13.240043] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]
[   13.744769] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:373:4
[   13.748694] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]

When splitting the address to high and low, GENMASK_ULL is used to generate
a bitmask with dma_addr_bits field from io_sq (in ena_com_prepare_tx and
ena_com_add_single_rx_desc).
The problem is that dma_addr_bits is not initialized with a proper value
(besides being cleared in ena_com_create_io_queue).
Assign dma_addr_bits the correct value that is stored in ena_dev when
initializing the SQ.

Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Gal Pressman <pressmangal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 101f0cd4)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

04bca928

UBUNTU: SAUCE: ena: devm_kzalloc() -> devm_kcalloc() · c9063089

Kees Cook authored Sep 11, 2018

BugLink: http://bugs.launchpad.net/bugs/1792044

As implemented in upstream commit a86854d0
"treewide: devm_kzalloc() -> devm_kcalloc()", for ena only.

(backported from commit a86854d0)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

c9063089

net: ena: Eliminate duplicate barriers on weakly-ordered archs · ed2759f5

Sinan Kaya authored Mar 25, 2018

BugLink: http://bugs.launchpad.net/bugs/1792044

Code includes barrier() followed by writel(). writel() already has a
barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a barrier().

Since code already has an explicit barrier call, changing writel() to
writel_relaxed() and adding mmiowb() for ordering protection.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6d2e1a8d)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

ed2759f5

net: ena: fix error handling in ena_down() sequence · 40c8b895

Netanel Belgazal authored Jan 03, 2018

BugLink: http://bugs.launchpad.net/bugs/1792044

ENA admin command queue errors are not handled as part of ena_down().
As a result, in case of error admin queue transitions to non-running
state and aborts all subsequent commands including those coming from
ena_up(). Reset scheduled by the driver from the timer service
context would not proceed due to sharing rtnl with ena_up()/ena_down()
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ee4552aa)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

40c8b895

net: ena: increase ena driver version to 1.5.0 · bafc7511

Netanel Belgazal authored Dec 28, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 70097445)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

bafc7511

net: ena: add detection and recovery mechanism for handling missed/misrouted MSI-X · a49fafdb

Netanel Belgazal authored Dec 28, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044

A mechanism for detection of stuck Rx/Tx rings due to missed or
misrouted interrupts.
Check if there are unhandled completion descriptors before the first
MSI-X interrupt arrived.
The check is per queue and per interrupt vector.
Once such condition is detected, driver and device reset is scheduled.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8510e1a3)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

a49fafdb

net: ena: fix race condition between device reset and link up setup · f03a0410

Netanel Belgazal authored Nov 19, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044

In rare cases, ena driver would reset and re-start the device,
for example, in case of misbehaving application that causes
transmit timeout

The first step in the reset procedure is to stop the Tx traffic by
calling ena_carrier_off().

After the driver have just started the device reset procedure, device
happens to send an asynchronous notification (via AENQ) to the driver
than there was a link change (to link-up state).
This link change is mapped to a call to netif_carrier_on() which
re-activates the Tx queues, violating the assumption of no tx traffic
until device reset is completed, as the reset task might still be in
the process of queues initialization, leading to an access to
uninitialized memory.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d18e4f68)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

f03a0410

net: ena: increase ena driver version to 1.3.0 · 31b60681

Netanel Belgazal authored Oct 17, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 046b3071)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

31b60681

net: ena: add new admin define for future support of IPv6 RSS · 70946709

Netanel Belgazal authored Oct 17, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 58894d52)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

70946709

net: ena: add statistics for missed tx packets · 7bd1dfa4

Netanel Belgazal authored Oct 17, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044

Add a new statistic to ethtool stats that show the number of packets
without transmit acknowledgement from ENA device.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 11095fdb)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

7bd1dfa4

net: ena: add power management ops to the ENA driver · 2e4ae3d7

Netanel Belgazal authored Oct 17, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8c5c7abd)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

2e4ae3d7

net: ena: remove legacy suspend suspend/resume support · 7c9f4160

Netanel Belgazal authored Oct 17, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044

Remove ena_device_io_suspend/resume() methods
Those methods were intend to be used by the device to trigger
suspend/resume but eventually it was dropped.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit dbeaf1e3)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

7c9f4160

net: ena: improve ENA driver boot time. · a0ecdeb8

Netanel Belgazal authored Oct 17, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044

The ena admin commands timeout is in resolutions of 100ms.
Therefore, When the driver works in polling mode, it sleeps for 100ms
each time. The overall boot time of the ENA driver is ~1.5 sec.
To reduce the boot time, This change modifies the granularity of
the sleeps to 5ms.
This change improves the boot time to 220ms.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 88aef2f5)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

a0ecdeb8

net: ena: fix wrong max Tx/Rx queues on ethtool · 1168aa00

Netanel Belgazal authored Oct 17, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044

ethtool ena_get_channels() expose the max number of queues as the max
number of queues ENA supports (128 queues) and not the actual number
of created queues.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a59df396)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

1168aa00

net: ena: fix rare kernel crash when bar memory remap fails · 9fa1b332

Netanel Belgazal authored Oct 17, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044

This failure is rare and only found on testing where deliberately fail
devm_ioremap()

[  451.170464] ena 0000:04:00.0: failed to remap regs bar
451.170549] Workqueue: pciehp-1 pciehp_power_thread
[  451.170551] task: ffff88085a5f2d00 task.stack: ffffc9000756c000
[  451.170552] RIP: 0010:devm_iounmap+0x2d/0x40
[  451.170553] RSP: 0018:ffffc9000756fac0 EFLAGS: 00010282
[  451.170554] RAX: 00000000fffffffe RBX: 0000000000000000 RCX:
0000000000000000
[  451.170555] RDX: ffffffff813a7e00 RSI: 0000000000000282 RDI:
0000000000000282
[  451.170556] RBP: ffffc9000756fac8 R08: 00000000fffffffe R09:
00000000000009b7
[  451.170557] R10: 0000000000000005 R11: 00000000000009b6 R12:
ffff880856c9d0a0
[  451.170558] R13: ffffc9000f5c90c0 R14: ffff880856c9d0a0 R15:
0000000000000028
[  451.170559] FS:  0000000000000000(0000) GS:ffff88085f400000(0000)
knlGS:0000000000000000
[  451.170560] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  451.170561] CR2: 00007f169038b000 CR3: 0000000001c09000 CR4:
00000000003406f0
[  451.170562] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  451.170562] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  451.170563] Call Trace:
[  451.170572]  ena_release_bars.isra.48+0x34/0x60 [ena]
[  451.170574]  ena_probe+0x144/0xd90 [ena]
[  451.170579]  ? ida_simple_get+0x98/0x100
[  451.170585]  ? kernfs_next_descendant_post+0x40/0x50
[  451.170591]  local_pci_probe+0x45/0xa0
[  451.170592]  pci_device_probe+0x157/0x180
[  451.170599]  driver_probe_device+0x2a8/0x460
[  451.170600]  __device_attach_driver+0x7e/0xe0
[  451.170602]  ? driver_allows_async_probing+0x30/0x30
[  451.170603]  bus_for_each_drv+0x68/0xb0
[  451.170605]  __device_attach+0xdd/0x160
[  451.170607]  device_attach+0x10/0x20
[  451.170610]  pci_bus_add_device+0x4f/0xa0
[  451.170611]  pci_bus_add_devices+0x39/0x70
[  451.170613]  pciehp_configure_device+0x96/0x120
[  451.170614]  pciehp_enable_slot+0x1b3/0x290
[  451.170616]  pciehp_power_thread+0x3b/0xb0
[  451.170622]  process_one_work+0x149/0x360
[  451.170623]  worker_thread+0x4d/0x3c0
[  451.170626]  kthread+0x109/0x140
[  451.170627]  ? rescuer_thread+0x380/0x380
[  451.170628]  ? kthread_park+0x60/0x60
[  451.170632]  ret_from_fork+0x25/0x30
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 411838e7)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

9fa1b332

net: ena: reduce the severity of some printouts · 322edf4f

Netanel Belgazal authored Oct 17, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044

Decrease log level of checksum errors as these messages can be
triggered remotely by bad packets.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cd7aea18)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

322edf4f

net: ena: Remove redundant unlikely() · 70166cf8

Tobias Klauser authored Sep 26, 2017

BugLink: http://bugs.launchpad.net/bugs/1792044

IS_ERR() already implies unlikely(), so it can be omitted.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1f4cf93b)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

70166cf8

UBUNTU: Start new release · b44c25ec
Kleber Sacilotto de Souza authored Sep 28, 2018
```
Ignore: yes
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
```
b44c25ec

24 Sep, 2018 4 commits

UBUNTU: Ubuntu-4.4.0-137.163 · f299608c
Stefan Bader authored Sep 24, 2018
```
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
```
f299608c

iscsi target: Use hex2bin instead of a re-implementation · 5772cc87

Vincent Pelletier authored Sep 09, 2018

This change has the following effects, in order of descreasing importance:
1) Prevent a stack buffer overflow.
2) Do not append an unnecessary NULL to an anyway binary buffer, which
   is writing one byte past client_digest when caller is:
     chap_string_to_hex(client_digest, chap_r, strlen(chap_r));

The latter was found by KASAN (see below) when input value hes expected
size (32 hex chars), and further analysis revealed a stack buffer overflow
can  happen when network-received value is longer, allowing an
unauthenticated remote attacker to smash up to 17 bytes after destination
buffer (16 bytes attacker-controlled and one null).
As switching to hex2bin requires specifying destination buffer length, and
does not internally append any null, it solves both issues.

This addresses CVE-2018-14633.

Beyond this:
- Validate received value length and check hex2bin accepted the input, to
  log this rejection reason instead of just failing authentication.
- Only log received CHAP_R and CHAP_C values once they passed sanity
  checks.

==================================================================
BUG: KASAN: stack-out-of-bounds in chap_string_to_hex+0x32/0x60 [iscsi_target_mod]
Write of size 1 at addr ffff8801090ef7c8 by task kworker/0:0/1021

CPU: 0 PID: 1021 Comm: kworker/0:0 Tainted: G           O      4.17.8kasan.sess.connops+ #2
Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 05/19/2014
Workqueue: events iscsi_target_do_login_rx [iscsi_target_mod]
Call Trace:
 dump_stack+0x71/0xac
 print_address_description+0x65/0x22e
 ? chap_string_to_hex+0x32/0x60 [iscsi_target_mod]
 kasan_report.cold.6+0x241/0x2fd
 chap_string_to_hex+0x32/0x60 [iscsi_target_mod]
 chap_server_compute_md5.isra.2+0x2cb/0x860 [iscsi_target_mod]
 ? chap_binaryhex_to_asciihex.constprop.5+0x50/0x50 [iscsi_target_mod]
 ? ftrace_caller_op_ptr+0xe/0xe
 ? __orc_find+0x6f/0xc0
 ? unwind_next_frame+0x231/0x850
 ? kthread+0x1a0/0x1c0
 ? ret_from_fork+0x35/0x40
 ? ret_from_fork+0x35/0x40
 ? iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod]
 ? deref_stack_reg+0xd0/0xd0
 ? iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod]
 ? is_module_text_address+0xa/0x11
 ? kernel_text_address+0x4c/0x110
 ? __save_stack_trace+0x82/0x100
 ? ret_from_fork+0x35/0x40
 ? save_stack+0x8c/0xb0
 ? 0xffffffffc1660000
 ? iscsi_target_do_login+0x155/0x8d0 [iscsi_target_mod]
 ? iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod]
 ? process_one_work+0x35c/0x640
 ? worker_thread+0x66/0x5d0
 ? kthread+0x1a0/0x1c0
 ? ret_from_fork+0x35/0x40
 ? iscsi_update_param_value+0x80/0x80 [iscsi_target_mod]
 ? iscsit_release_cmd+0x170/0x170 [iscsi_target_mod]
 chap_main_loop+0x172/0x570 [iscsi_target_mod]
 ? chap_server_compute_md5.isra.2+0x860/0x860 [iscsi_target_mod]
 ? rx_data+0xd6/0x120 [iscsi_target_mod]
 ? iscsit_print_session_params+0xd0/0xd0 [iscsi_target_mod]
 ? cyc2ns_read_begin.part.2+0x90/0x90
 ? _raw_spin_lock_irqsave+0x25/0x50
 ? memcmp+0x45/0x70
 iscsi_target_do_login+0x875/0x8d0 [iscsi_target_mod]
 ? iscsi_target_check_first_request.isra.5+0x1a0/0x1a0 [iscsi_target_mod]
 ? del_timer+0xe0/0xe0
 ? memset+0x1f/0x40
 ? flush_sigqueue+0x29/0xd0
 iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod]
 ? iscsi_target_nego_release+0x80/0x80 [iscsi_target_mod]
 ? iscsi_target_restore_sock_callbacks+0x130/0x130 [iscsi_target_mod]
 process_one_work+0x35c/0x640
 worker_thread+0x66/0x5d0
 ? flush_rcu_work+0x40/0x40
 kthread+0x1a0/0x1c0
 ? kthread_bind+0x30/0x30
 ret_from_fork+0x35/0x40

The buggy address belongs to the page:
page:ffffea0004243bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x17fffc000000000()
raw: 017fffc000000000 0000000000000000 0000000000000000 00000000ffffffff
raw: ffffea0004243c20 ffffea0004243ba0 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8801090ef680: f2 f2 f2 f2 f2 f2 f2 01 f2 f2 f2 f2 f2 f2 f2 00
 ffff8801090ef700: f2 f2 f2 f2 f2 f2 f2 00 02 f2 f2 f2 f2 f2 f2 00
>ffff8801090ef780: 00 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f2 f2 f2 f2 00
                                              ^
 ffff8801090ef800: 00 f2 f2 f2 f2 f2 f2 00 00 00 00 02 f2 f2 f2 f2
 ffff8801090ef880: f2 f2 f2 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00
==================================================================
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>

CVE-2018-14633
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

5772cc87

mm: get rid of vmacache_flush_all() entirely · 7ed926e6

Linus Torvalds authored Sep 12, 2018

Jann Horn points out that the vmacache_flush_all() function is not only
potentially expensive, it's buggy too.  It also happens to be entirely
unnecessary, because the sequence number overflow case can be avoided by
simply making the sequence number be 64-bit.  That doesn't even grow the
data structures in question, because the other adjacent fields are
already 64-bit.

So simplify the whole thing by just making the sequence number overflow
case go away entirely, which gets rid of all the complications and makes
the code faster too.  Win-win.

[ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics
  also just goes away entirely with this ]
Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

CVE-2018-17182

(backported from commit 7a9cdebd)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

7ed926e6

UBUNTU: Start new release · fff97d72
Stefan Bader authored Sep 24, 2018
```
Ignore: yes
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
```
fff97d72

11 Sep, 2018 7 commits

UBUNTU: Ubuntu-4.4.0-136.162 · 8a1a8196
Kleber Sacilotto de Souza authored Sep 11, 2018
```
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
```
8a1a8196

Revert "bpf: prevent speculative execution in eBPF interpreter" · 48a02848

Tyler Hicks authored Sep 11, 2018

This reverts commit 81774d48 which was
part of an out-of-tree mitigation for CVE-2017-5753 (Spectre variant 1),
in the BPF subsystem, that was available at the time of the coordinated
release date. The Ubuntu kernel has since rebased on top of newer
linux-stable releases and picked up commit b2157399 ("bpf: prevent
out-of-bounds speculation") which is upstream's mitigation of Spectre
variant 1 in the BPF code.

CVE-2017-5753
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

48a02848

Revert "UBUNTU: SAUCE: bpf: Use barrier_nospec() instead of osb()" · cb0321f0

Tyler Hicks authored Sep 11, 2018

This reverts commit 5b9ee259 which was
part of an out-of-tree mitigation for CVE-2017-5753 (Spectre variant 1),
in the BPF subsystem, that was available at the time of the coordinated
release date. The Ubuntu kernel has since rebased on top of newer
linux-stable releases and picked up commit b2157399 ("bpf: prevent
out-of-bounds speculation") which is upstream's mitigation of Spectre
variant 1 in the BPF code.

CVE-2017-5753
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

cb0321f0

bpf: properly enforce index mask to prevent out-of-bounds speculation · f0fe227f

Daniel Borkmann authored Sep 11, 2018

While reviewing the verifier code, I recently noticed that the
following two program variants in relation to tail calls can be
loaded.

Variant 1:

  # bpftool p d x i 15
    0: (15) if r1 == 0x0 goto pc+3
    1: (18) r2 = map[id:5]
    3: (05) goto pc+2
    4: (18) r2 = map[id:6]
    6: (b7) r3 = 7
    7: (35) if r3 >= 0xa0 goto pc+2
    8: (54) (u32) r3 &= (u32) 255
    9: (85) call bpf_tail_call#12
   10: (b7) r0 = 1
   11: (95) exit

  # bpftool m s i 5
    5: prog_array  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B
  # bpftool m s i 6
    6: prog_array  flags 0x0
        key 4B  value 4B  max_entries 160  memlock 4096B

Variant 2:

  # bpftool p d x i 20
    0: (15) if r1 == 0x0 goto pc+3
    1: (18) r2 = map[id:8]
    3: (05) goto pc+2
    4: (18) r2 = map[id:7]
    6: (b7) r3 = 7
    7: (35) if r3 >= 0x4 goto pc+2
    8: (54) (u32) r3 &= (u32) 3
    9: (85) call bpf_tail_call#12
   10: (b7) r0 = 1
   11: (95) exit

  # bpftool m s i 8
    8: prog_array  flags 0x0
        key 4B  value 4B  max_entries 160  memlock 4096B
  # bpftool m s i 7
    7: prog_array  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B

In both cases the index masking inserted by the verifier in order
to control out of bounds speculation from a CPU via b2157399
("bpf: prevent out-of-bounds speculation") seems to be incorrect
in what it is enforcing. In the 1st variant, the mask is applied
from the map with the significantly larger number of entries where
we would allow to a certain degree out of bounds speculation for
the smaller map, and in the 2nd variant where the mask is applied
from the map with the smaller number of entries, we get buggy
behavior since we truncate the index of the larger map.

The original intent from commit b2157399 is to reject such
occasions where two or more different tail call maps are used
in the same tail call helper invocation. However, the check on
the BPF_MAP_PTR_POISON is never hit since we never poisoned the
saved pointer in the first place! We do this explicitly for map
lookups but in case of tail calls we basically used the tail
call map in insn_aux_data that was processed in the most recent
path which the verifier walked. Thus any prior path that stored
a pointer in insn_aux_data at the helper location was always
overridden.

Fix it by moving the map pointer poison logic into a small helper
that covers both BPF helpers with the same logic. After that in
fixup_bpf_calls() the poison check is then hit for tail calls
and the program rejected. Latter only happens in unprivileged
case since this is the *only* occasion where a rewrite needs to
happen, and where such rewrite is specific to the map (max_entries,
index_mask). In the privileged case the rewrite is generic for
the insn->imm / insn->code update so multiple maps from different
paths can be handled just fine since all the remaining logic
happens in the instruction processing itself. This is similar
to the case of map lookups: in case there is a collision of
maps in fixup_bpf_calls() we must skip the inlined rewrite since
this will turn the generic instruction sequence into a non-
generic one. Thus the patch_call_imm will simply update the
insn->imm location where the bpf_map_lookup_elem() will later
take care of the dispatch. Given we need this 'poison' state
as a check, the information of whether a map is an unpriv_array
gets lost, so enforcing it prior to that needs an additional
state. In general this check is needed since there are some
complex and tail call intensive BPF programs out there where
LLVM tends to generate such code occasionally. We therefore
convert the map_ptr rather into map_state to store all this
w/o extra memory overhead, and the bit whether one of the maps
involved in the collision was from an unpriv_array thus needs
to be retained as well there.

Fixes: b2157399 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

CVE-2017-5753

(backported from commit c93552c4)
[tyhicks: Ignore pointer poison related changes since poisoning is not
          part of 4.4]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

f0fe227f

x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+ · 4eee1abe

Andi Kleen authored Sep 10, 2018

BugLink: https://launchpad.net/bugs/1788563

On Nehalem and newer core CPUs the CPU cache internally uses 44 bits
physical address space. The L1TF workaround is limited by this internal
cache address width, and needs to have one bit free there for the
mitigation to work.

Older client systems report only 36bit physical address space so the range
check decides that L1TF is not mitigated for a 36bit phys/32GB system with
some memory holes.

But since these actually have the larger internal cache width this warning
is bogus because it would only really be needed if the system had more than
43bits of memory.

Add a new internal x86_cache_bits field. Normally it is the same as the
physical bits field reported by CPUID, but for Nehalem and newerforce it to
be at least 44bits.

Change the L1TF memory size warning to use the new cache_bits field to
avoid bogus warnings and remove the bogus comment about memory size.

Fixes: 17dbca11 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Reported-by: George Anchev <studio@anchev.net>
Reported-by: Christopher Snowhill <kode54@gmail.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Michael Hocko <mhocko@suse.com>
Cc: vbabka@suse.cz
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180824170351.34874-1-andi@firstfloor.org

CVE-2018-3620
CVE-2018-3646

(backported from commit cc51e542)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

4eee1abe

x86/speculation/l1tf: Fix off-by-one error when warning that system has too much RAM · c3a13447

Vlastimil Babka authored Sep 10, 2018

BugLink: https://launchpad.net/bugs/1788563

Two users have reported [1] that they have an "extremely unlikely" system
with more than MAX_PA/2 memory and L1TF mitigation is not effective. In
fact it's a CPU with 36bits phys limit (64GB) and 32GB memory, but due to
holes in the e820 map, the main region is almost 500MB over the 32GB limit:

[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000081effffff] usable

Suggestions to use 'mem=32G' to enable the L1TF mitigation while losing the
500MB revealed, that there's an off-by-one error in the check in
l1tf_select_mitigation().

l1tf_pfn_limit() returns the last usable pfn (inclusive) and the range
check in the mitigation path does not take this into account.

Instead of amending the range check, make l1tf_pfn_limit() return the first
PFN which is over the limit which is less error prone. Adjust the other
users accordingly.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1105536

Fixes: 17dbca11 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Reported-by: George Anchev <studio@anchev.net>
Reported-by: Christopher Snowhill <kode54@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180823134418.17008-1-vbabka@suse.cz

CVE-2018-3620
CVE-2018-3646

(cherry picked from commit b0a182f8)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

c3a13447

x86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit · 1649aa35

Vlastimil Babka authored Sep 10, 2018

BugLink: https://launchpad.net/bugs/1788563

On 32bit PAE kernels on 64bit hardware with enough physical bits,
l1tf_pfn_limit() will overflow unsigned long. This in turn affects
max_swapfile_size() and can lead to swapon returning -EINVAL. This has been
observed in a 32bit guest with 42 bits physical address size, where
max_swapfile_size() overflows exactly to 1 << 32, thus zero, and produces
the following warning to dmesg:

[    6.396845] Truncating oversized swap area, only using 0k out of 2047996k

Fix this by using unsigned long long instead.

Fixes: 17dbca11 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Fixes: 377eeaa8 ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2")
Reported-by: Dominique Leuenberger <dimstar@suse.de>
Reported-by: Adrian Schroeter <adrian@suse.de>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180820095835.5298-1-vbabka@suse.cz

CVE-2018-3620
CVE-2018-3646

(cherry picked from commit 9df95169)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

1649aa35

10 Sep, 2018 7 commits

x86/paravirt: Fix spectre-v2 mitigations for paravirt guests · 14493b56

Peter Zijlstra authored Aug 03, 2018

CVE-2018-15594

Nadav reported that on guests we're failing to rewrite the indirect
calls to CALLEE_SAVE paravirt functions. In particular the
pv_queued_spin_unlock() call is left unpatched and that is all over the
place. This obviously wrecks Spectre-v2 mitigation (for paravirt
guests) which relies on not actually having indirect calls around.

The reason is an incorrect clobber test in paravirt_patch_call(); this
function rewrites an indirect call with a direct call to the _SAME_
function, there is no possible way the clobbers can be different
because of this.

Therefore remove this clobber check. Also put WARNs on the other patch
failure case (not enough room for the instruction) which I've not seen
trigger in my (limited) testing.

Three live kernel image disassemblies for lock_sock_nested (as a small
function that illustrates the problem nicely). PRE is the current
situation for guests, POST is with this patch applied and NATIVE is with
or without the patch for !guests.

PRE:

(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
   0xffffffff817be970 <+0>:     push   %rbp
   0xffffffff817be971 <+1>:     mov    %rdi,%rbp
   0xffffffff817be974 <+4>:     push   %rbx
   0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
   0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
   0xffffffff817be981 <+17>:    mov    %rbx,%rdi
   0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
   0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
   0xffffffff817be98f <+31>:    test   %eax,%eax
   0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
   0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
   0xffffffff817be99d <+45>:    mov    %rbx,%rdi
   0xffffffff817be9a0 <+48>:    callq  *0xffffffff822299e8
   0xffffffff817be9a7 <+55>:    pop    %rbx
   0xffffffff817be9a8 <+56>:    pop    %rbp
   0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
   0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
   0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
   0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
   0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
   0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.

POST:

(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
   0xffffffff817be970 <+0>:     push   %rbp
   0xffffffff817be971 <+1>:     mov    %rdi,%rbp
   0xffffffff817be974 <+4>:     push   %rbx
   0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
   0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
   0xffffffff817be981 <+17>:    mov    %rbx,%rdi
   0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
   0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
   0xffffffff817be98f <+31>:    test   %eax,%eax
   0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
   0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
   0xffffffff817be99d <+45>:    mov    %rbx,%rdi
   0xffffffff817be9a0 <+48>:    callq  0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock>
   0xffffffff817be9a5 <+53>:    xchg   %ax,%ax
   0xffffffff817be9a7 <+55>:    pop    %rbx
   0xffffffff817be9a8 <+56>:    pop    %rbp
   0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
   0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
   0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063aa0 <__local_bh_enable_ip>
   0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
   0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
   0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.

NATIVE:

(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
   0xffffffff817be970 <+0>:     push   %rbp
   0xffffffff817be971 <+1>:     mov    %rdi,%rbp
   0xffffffff817be974 <+4>:     push   %rbx
   0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
   0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
   0xffffffff817be981 <+17>:    mov    %rbx,%rdi
   0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
   0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
   0xffffffff817be98f <+31>:    test   %eax,%eax
   0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
   0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
   0xffffffff817be99d <+45>:    mov    %rbx,%rdi
   0xffffffff817be9a0 <+48>:    movb   $0x0,(%rdi)
   0xffffffff817be9a3 <+51>:    nopl   0x0(%rax)
   0xffffffff817be9a7 <+55>:    pop    %rbx
   0xffffffff817be9a8 <+56>:    pop    %rbp
   0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
   0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
   0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
   0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
   0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
   0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.

Fixes: 63f70270 ("[PATCH] i386: PARAVIRT: add common patching machinery")
Fixes: 3010a066 ("x86/paravirt, objtool: Annotate indirect calls")
Reported-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org
(cherry picked from commit 5800dc5c)
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

14493b56

Linux 4.4.144 · e3f1595b

Greg Kroah-Hartman authored Jul 25, 2018

BugLink: https://bugs.launchpad.net/bugs/1791080Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

e3f1595b

ubi: fastmap: Erase outdated anchor PEBs during attach · b986efbe

Sascha Hauer authored Dec 05, 2017

BugLink: https://bugs.launchpad.net/bugs/1791080

commit f78e5623 upstream.

The fastmap update code might erase the current fastmap anchor PEB
in case it doesn't find any new free PEB. When a power cut happens
in this situation we must not have any outdated fastmap anchor PEB
on the device, because that would be used to attach during next
boot.
The easiest way to make that sure is to erase all outdated fastmap
anchor PEBs synchronously during attach.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Fixes: dbb7d2a8 ("UBI: Add fastmap core")
Cc: <stable@vger.kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

b986efbe

ubi: Fix Fastmap's update_vol() · b5902a56

Richard Weinberger authored Aug 24, 2016

BugLink: https://bugs.launchpad.net/bugs/1791080

commit f7d11b33 upstream.

Usually Fastmap is free to consider every PEB in one of the pools
as newer than the existing PEB. Since PEBs in a pool are by definition
newer than everything else.
But update_vol() missed the case that a pool can contain more than
one candidate.

Cc: <stable@vger.kernel.org>
Fixes: dbb7d2a8 ("UBI: Add fastmap core")
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

b5902a56

ubi: Fix races around ubi_refill_pools() · 1409b55f

Richard Weinberger authored Aug 24, 2016

BugLink: https://bugs.launchpad.net/bugs/1791080

commit 2e8f08de upstream.

When writing a new Fastmap the first thing that happens
is refilling the pools in memory.
At this stage it is possible that new PEBs from the new pools
get already claimed and written with data.
If this happens before the new Fastmap data structure hits the
flash and we face power cut the freshly written PEB will not
scanned and unnoticed.

Solve the issue by locking the pools until Fastmap is written.

Cc: <stable@vger.kernel.org>
Fixes: dbb7d2a8 ("UBI: Add fastmap core")
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

1409b55f

ubi: Be more paranoid while seaching for the most recent Fastmap · 22e96ba8

Richard Weinberger authored Jun 14, 2016

BugLink: https://bugs.launchpad.net/bugs/1791080

commit 74f2c6e9 upstream.

Since PEB erasure is asynchornous it can happen that there is
more than one Fastmap on the MTD. This is fine because the attach logic
will pick the Fastmap data structure with the highest sequence number.

On a not so well configured MTD stack spurious ECC errors are common.
Causes can be different, bad hardware, wrong operating modes, etc...
If the most current Fastmap renders bad due to ECC errors UBI might
pick an older Fastmap to attach from.
While this can only happen on an anyway broken setup it will show
completely different sympthoms and makes finding the root cause much
more difficult.
So, be debug friendly and fall back to scanning mode of we're facing
an ECC error while scanning for Fastmap.

Cc: <stable@vger.kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

22e96ba8

ubi: Rework Fastmap attach base code · 96a24fbd

Richard Weinberger authored Jun 14, 2016

BugLink: https://bugs.launchpad.net/bugs/1791080

commit fdf10ed7 upstream.

Introduce a new list to the UBI attach information
object to be able to deal better with old and corrupted
Fastmap eraseblocks.
Also move more Fastmap specific code into fastmap.c.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

96a24fbd