- 02 Oct, 2018 3 commits
-
-
Ross Lagerwall authored
BugLink: https://bugs.launchpad.net/bugs/1793753 Fix a race between ip_set_dump_start() and ip_set_swap(). The race is as follows: * Without holding the ref lock, ip_set_swap() checks ref_netlink of the set and it is 0. * ip_set_dump_start() takes a reference on the set. * ip_set_swap() does the swap (even though it now has a non-zero reference count). * ip_set_dump_start() gets the set from ip_set_list again which is now a different set since it has been swapped. * ip_set_dump_start() calls __ip_set_put_netlink() and hits a BUG_ON due to the reference count being 0. Fix this race by extending the critical region in which the ref lock is held to include checking the ref counts. The race can be reproduced with the following script: while :; do ipset destroy hash_ip1 ipset destroy hash_ip2 ipset create hash_ip1 hash:ip family inet hashsize 1024 \ maxelem 500000 ipset create hash_ip2 hash:ip family inet hashsize 300000 \ maxelem 500000 ipset create hash_ip3 hash:ip family inet hashsize 1024 \ maxelem 500000 ipset save & ipset swap hash_ip3 hash_ip2 ipset destroy hash_ip3 wait done Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> (cherry picked from commit e5173418) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Vishwanath Pai authored
BugLink: https://bugs.launchpad.net/bugs/1793753 This fix adds a new reference counter (ref_netlink) for the struct ip_set. The other reference counter (ref) can be swapped out by ip_set_swap and we need a separate counter to keep track of references for netlink events like dump. Using the same ref counter for dump causes a race condition which can be demonstrated by the following script: ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \ counters ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \ counters ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \ counters ipset save & ipset swap hash_ip3 hash_ip2 ipset destroy hash_ip3 /* will crash the machine */ Swap will exchange the values of ref so destroy will see ref = 0 instead of ref = 1. With this fix in place swap will not succeed because ipset save still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink). Both delete and swap will error out if ref_netlink != 0 on the set. Note: The changes to *_head functions is because previously we would increment ref whenever we called these functions, we don't do that anymore. Reviewed-by: Joshua Hunt <johunt@akamai.com> Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> (cherry picked from commit 596cf3fe) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Thadeu Lima de Souza Cascardo authored
When we are checking for the existence of ABI files, we will use the previous changelog stanza version as part of the lookup path. On backport kernels where we just copy the master kernel ABI, that ABI path will not have the same version as the backport kernel. We didn't catch this before because we didn't call the insertchanges rule for backports, which would call final-checks, and also because we changed the changelog to use the version from the master kernel. As we plan to change those two factors, we will start seeing check failures that we shouldn't. Ignore: yes Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
- 28 Sep, 2018 26 commits
-
-
Marcelo Henrique Cerri authored
BugLink: http://bugs.launchpad.net/bugs/1793461 Update the startnewrelease target to support backport version numbers when creating the new changelog entry. Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 Added memory barriers where they were missing to support multiple architectures, and removed redundant ones. As part of removing the redundant memory barriers and improving performance, we moved to more relaxed versions of memory barriers, as well as to the more relaxed version of writel - writel_relaxed, while maintaining correctness. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 37dff155 linux-next) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 Add READ_ONCE calls where necessary (for example when iterating over a memory field that gets updated by the hardware). Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 28abf4e9 linux-next) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 acquire the rtnl_lock during device destruction to avoid using partially destroyed device. ena_remove() shares almost the same logic as ena_destroy_device(), so use ena_destroy_device() and avoid duplications. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 944b28aa linux-next) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 ena_destroy_device() can potentially be called twice. To avoid this, check that the device is running and only then proceed destroying it. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit fe870c77 linux-next) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 When ena_destroy_device() is called from ena_suspend(), the device is still reachable from the driver. Therefore, the driver can send a command to the device to free all resources. However, in all other cases of calling ena_destroy_device(), the device is potentially in an error state and unreachable from the driver. In these cases the driver must not send commands to the device. The current implementation does not request resource freeing from the device even when possible. We add the graceful parameter to ena_destroy_device() to enable resource freeing when possible, and use it in ena_suspend(). Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit cfa324a5 linux-next) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 The buffer length field in the ena rx descriptor is 16 bit, and the current driver passes a full page in each ena rx descriptor. When PAGE_SIZE equals 64kB or more, the buffer length field becomes zero. To solve this issue, limit the ena Rx descriptor to use 16kB even when allocating 64kB kernel pages. This change would not impact ena device functionality, as 16kB is still larger than maximum MTU. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit ef5b0771 linux-next) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 Starting with driver version 1.5.0, in case of a surprise device unplug, there is a race caused by invoking ena_destroy_device() from two different places. As a result, the readless register might be accessed after it was destroyed. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 772ed869 linux-next) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Gal Pressman authored
BugLink: http://bugs.launchpad.net/bugs/1792044 UBSAN triggers the following undefined behaviour warnings: [...] [ 13.236124] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:468:22 [ 13.240043] shift exponent 64 is too large for 64-bit type 'long long unsigned int' [...] [ 13.744769] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:373:4 [ 13.748694] shift exponent 64 is too large for 64-bit type 'long long unsigned int' [...] When splitting the address to high and low, GENMASK_ULL is used to generate a bitmask with dma_addr_bits field from io_sq (in ena_com_prepare_tx and ena_com_add_single_rx_desc). The problem is that dma_addr_bits is not initialized with a proper value (besides being cleared in ena_com_create_io_queue). Assign dma_addr_bits the correct value that is stored in ena_dev when initializing the SQ. Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Gal Pressman <pressmangal@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 101f0cd4) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Kees Cook authored
BugLink: http://bugs.launchpad.net/bugs/1792044 As implemented in upstream commit a86854d0 "treewide: devm_kzalloc() -> devm_kcalloc()", for ena only. (backported from commit a86854d0) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Sinan Kaya authored
BugLink: http://bugs.launchpad.net/bugs/1792044 Code includes barrier() followed by writel(). writel() already has a barrier on some architectures like arm64. This ends up CPU observing two barriers back to back before executing the register write. Create a new wrapper function with relaxed write operator. Use the new wrapper when a write is following a barrier(). Since code already has an explicit barrier call, changing writel() to writel_relaxed() and adding mmiowb() for ordering protection. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 6d2e1a8d) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 ENA admin command queue errors are not handled as part of ena_down(). As a result, in case of error admin queue transitions to non-running state and aborts all subsequent commands including those coming from ena_up(). Reset scheduled by the driver from the timer service context would not proceed due to sharing rtnl with ena_up()/ena_down() Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit ee4552aa) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 70097445) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 A mechanism for detection of stuck Rx/Tx rings due to missed or misrouted interrupts. Check if there are unhandled completion descriptors before the first MSI-X interrupt arrived. The check is per queue and per interrupt vector. Once such condition is detected, driver and device reset is scheduled. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 8510e1a3) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 In rare cases, ena driver would reset and re-start the device, for example, in case of misbehaving application that causes transmit timeout The first step in the reset procedure is to stop the Tx traffic by calling ena_carrier_off(). After the driver have just started the device reset procedure, device happens to send an asynchronous notification (via AENQ) to the driver than there was a link change (to link-up state). This link change is mapped to a call to netif_carrier_on() which re-activates the Tx queues, violating the assumption of no tx traffic until device reset is completed, as the reset task might still be in the process of queues initialization, leading to an access to uninitialized memory. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit d18e4f68) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 046b3071) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 58894d52) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 Add a new statistic to ethtool stats that show the number of packets without transmit acknowledgement from ENA device. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 11095fdb) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 8c5c7abd) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 Remove ena_device_io_suspend/resume() methods Those methods were intend to be used by the device to trigger suspend/resume but eventually it was dropped. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit dbeaf1e3) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 The ena admin commands timeout is in resolutions of 100ms. Therefore, When the driver works in polling mode, it sleeps for 100ms each time. The overall boot time of the ENA driver is ~1.5 sec. To reduce the boot time, This change modifies the granularity of the sleeps to 5ms. This change improves the boot time to 220ms. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 88aef2f5) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 ethtool ena_get_channels() expose the max number of queues as the max number of queues ENA supports (128 queues) and not the actual number of created queues. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit a59df396) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 This failure is rare and only found on testing where deliberately fail devm_ioremap() [ 451.170464] ena 0000:04:00.0: failed to remap regs bar 451.170549] Workqueue: pciehp-1 pciehp_power_thread [ 451.170551] task: ffff88085a5f2d00 task.stack: ffffc9000756c000 [ 451.170552] RIP: 0010:devm_iounmap+0x2d/0x40 [ 451.170553] RSP: 0018:ffffc9000756fac0 EFLAGS: 00010282 [ 451.170554] RAX: 00000000fffffffe RBX: 0000000000000000 RCX: 0000000000000000 [ 451.170555] RDX: ffffffff813a7e00 RSI: 0000000000000282 RDI: 0000000000000282 [ 451.170556] RBP: ffffc9000756fac8 R08: 00000000fffffffe R09: 00000000000009b7 [ 451.170557] R10: 0000000000000005 R11: 00000000000009b6 R12: ffff880856c9d0a0 [ 451.170558] R13: ffffc9000f5c90c0 R14: ffff880856c9d0a0 R15: 0000000000000028 [ 451.170559] FS: 0000000000000000(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000 [ 451.170560] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 451.170561] CR2: 00007f169038b000 CR3: 0000000001c09000 CR4: 00000000003406f0 [ 451.170562] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 451.170562] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 451.170563] Call Trace: [ 451.170572] ena_release_bars.isra.48+0x34/0x60 [ena] [ 451.170574] ena_probe+0x144/0xd90 [ena] [ 451.170579] ? ida_simple_get+0x98/0x100 [ 451.170585] ? kernfs_next_descendant_post+0x40/0x50 [ 451.170591] local_pci_probe+0x45/0xa0 [ 451.170592] pci_device_probe+0x157/0x180 [ 451.170599] driver_probe_device+0x2a8/0x460 [ 451.170600] __device_attach_driver+0x7e/0xe0 [ 451.170602] ? driver_allows_async_probing+0x30/0x30 [ 451.170603] bus_for_each_drv+0x68/0xb0 [ 451.170605] __device_attach+0xdd/0x160 [ 451.170607] device_attach+0x10/0x20 [ 451.170610] pci_bus_add_device+0x4f/0xa0 [ 451.170611] pci_bus_add_devices+0x39/0x70 [ 451.170613] pciehp_configure_device+0x96/0x120 [ 451.170614] pciehp_enable_slot+0x1b3/0x290 [ 451.170616] pciehp_power_thread+0x3b/0xb0 [ 451.170622] process_one_work+0x149/0x360 [ 451.170623] worker_thread+0x4d/0x3c0 [ 451.170626] kthread+0x109/0x140 [ 451.170627] ? rescuer_thread+0x380/0x380 [ 451.170628] ? kthread_park+0x60/0x60 [ 451.170632] ret_from_fork+0x25/0x30 Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 411838e7) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Netanel Belgazal authored
BugLink: http://bugs.launchpad.net/bugs/1792044 Decrease log level of checksum errors as these messages can be triggered remotely by bad packets. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit cd7aea18) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Tobias Klauser authored
BugLink: http://bugs.launchpad.net/bugs/1792044 IS_ERR() already implies unlikely(), so it can be omitted. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 1f4cf93b) Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Kleber Sacilotto de Souza authored
Ignore: yes Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
- 24 Sep, 2018 4 commits
-
-
Stefan Bader authored
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Vincent Pelletier authored
This change has the following effects, in order of descreasing importance: 1) Prevent a stack buffer overflow. 2) Do not append an unnecessary NULL to an anyway binary buffer, which is writing one byte past client_digest when caller is: chap_string_to_hex(client_digest, chap_r, strlen(chap_r)); The latter was found by KASAN (see below) when input value hes expected size (32 hex chars), and further analysis revealed a stack buffer overflow can happen when network-received value is longer, allowing an unauthenticated remote attacker to smash up to 17 bytes after destination buffer (16 bytes attacker-controlled and one null). As switching to hex2bin requires specifying destination buffer length, and does not internally append any null, it solves both issues. This addresses CVE-2018-14633. Beyond this: - Validate received value length and check hex2bin accepted the input, to log this rejection reason instead of just failing authentication. - Only log received CHAP_R and CHAP_C values once they passed sanity checks. ================================================================== BUG: KASAN: stack-out-of-bounds in chap_string_to_hex+0x32/0x60 [iscsi_target_mod] Write of size 1 at addr ffff8801090ef7c8 by task kworker/0:0/1021 CPU: 0 PID: 1021 Comm: kworker/0:0 Tainted: G O 4.17.8kasan.sess.connops+ #2 Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 05/19/2014 Workqueue: events iscsi_target_do_login_rx [iscsi_target_mod] Call Trace: dump_stack+0x71/0xac print_address_description+0x65/0x22e ? chap_string_to_hex+0x32/0x60 [iscsi_target_mod] kasan_report.cold.6+0x241/0x2fd chap_string_to_hex+0x32/0x60 [iscsi_target_mod] chap_server_compute_md5.isra.2+0x2cb/0x860 [iscsi_target_mod] ? chap_binaryhex_to_asciihex.constprop.5+0x50/0x50 [iscsi_target_mod] ? ftrace_caller_op_ptr+0xe/0xe ? __orc_find+0x6f/0xc0 ? unwind_next_frame+0x231/0x850 ? kthread+0x1a0/0x1c0 ? ret_from_fork+0x35/0x40 ? ret_from_fork+0x35/0x40 ? iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod] ? deref_stack_reg+0xd0/0xd0 ? iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod] ? is_module_text_address+0xa/0x11 ? kernel_text_address+0x4c/0x110 ? __save_stack_trace+0x82/0x100 ? ret_from_fork+0x35/0x40 ? save_stack+0x8c/0xb0 ? 0xffffffffc1660000 ? iscsi_target_do_login+0x155/0x8d0 [iscsi_target_mod] ? iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod] ? process_one_work+0x35c/0x640 ? worker_thread+0x66/0x5d0 ? kthread+0x1a0/0x1c0 ? ret_from_fork+0x35/0x40 ? iscsi_update_param_value+0x80/0x80 [iscsi_target_mod] ? iscsit_release_cmd+0x170/0x170 [iscsi_target_mod] chap_main_loop+0x172/0x570 [iscsi_target_mod] ? chap_server_compute_md5.isra.2+0x860/0x860 [iscsi_target_mod] ? rx_data+0xd6/0x120 [iscsi_target_mod] ? iscsit_print_session_params+0xd0/0xd0 [iscsi_target_mod] ? cyc2ns_read_begin.part.2+0x90/0x90 ? _raw_spin_lock_irqsave+0x25/0x50 ? memcmp+0x45/0x70 iscsi_target_do_login+0x875/0x8d0 [iscsi_target_mod] ? iscsi_target_check_first_request.isra.5+0x1a0/0x1a0 [iscsi_target_mod] ? del_timer+0xe0/0xe0 ? memset+0x1f/0x40 ? flush_sigqueue+0x29/0xd0 iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod] ? iscsi_target_nego_release+0x80/0x80 [iscsi_target_mod] ? iscsi_target_restore_sock_callbacks+0x130/0x130 [iscsi_target_mod] process_one_work+0x35c/0x640 worker_thread+0x66/0x5d0 ? flush_rcu_work+0x40/0x40 kthread+0x1a0/0x1c0 ? kthread_bind+0x30/0x30 ret_from_fork+0x35/0x40 The buggy address belongs to the page: page:ffffea0004243bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x17fffc000000000() raw: 017fffc000000000 0000000000000000 0000000000000000 00000000ffffffff raw: ffffea0004243c20 ffffea0004243ba0 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801090ef680: f2 f2 f2 f2 f2 f2 f2 01 f2 f2 f2 f2 f2 f2 f2 00 ffff8801090ef700: f2 f2 f2 f2 f2 f2 f2 00 02 f2 f2 f2 f2 f2 f2 00 >ffff8801090ef780: 00 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f2 f2 f2 f2 00 ^ ffff8801090ef800: 00 f2 f2 f2 f2 f2 f2 00 00 00 00 02 f2 f2 f2 f2 ffff8801090ef880: f2 f2 f2 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 ================================================================== Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com> CVE-2018-14633 Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Linus Torvalds authored
Jann Horn points out that the vmacache_flush_all() function is not only potentially expensive, it's buggy too. It also happens to be entirely unnecessary, because the sequence number overflow case can be avoided by simply making the sequence number be 64-bit. That doesn't even grow the data structures in question, because the other adjacent fields are already 64-bit. So simplify the whole thing by just making the sequence number overflow case go away entirely, which gets rid of all the complications and makes the code faster too. Win-win. [ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics also just goes away entirely with this ] Reported-by: Jann Horn <jannh@google.com> Suggested-by: Will Deacon <will.deacon@arm.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> CVE-2018-17182 (backported from commit 7a9cdebd) Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Stefan Bader authored
Ignore: yes Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
- 11 Sep, 2018 7 commits
-
-
Kleber Sacilotto de Souza authored
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
Tyler Hicks authored
This reverts commit 81774d48 which was part of an out-of-tree mitigation for CVE-2017-5753 (Spectre variant 1), in the BPF subsystem, that was available at the time of the coordinated release date. The Ubuntu kernel has since rebased on top of newer linux-stable releases and picked up commit b2157399 ("bpf: prevent out-of-bounds speculation") which is upstream's mitigation of Spectre variant 1 in the BPF code. CVE-2017-5753 Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
Tyler Hicks authored
This reverts commit 5b9ee259 which was part of an out-of-tree mitigation for CVE-2017-5753 (Spectre variant 1), in the BPF subsystem, that was available at the time of the coordinated release date. The Ubuntu kernel has since rebased on top of newer linux-stable releases and picked up commit b2157399 ("bpf: prevent out-of-bounds speculation") which is upstream's mitigation of Spectre variant 1 in the BPF code. CVE-2017-5753 Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
Daniel Borkmann authored
While reviewing the verifier code, I recently noticed that the following two program variants in relation to tail calls can be loaded. Variant 1: # bpftool p d x i 15 0: (15) if r1 == 0x0 goto pc+3 1: (18) r2 = map[id:5] 3: (05) goto pc+2 4: (18) r2 = map[id:6] 6: (b7) r3 = 7 7: (35) if r3 >= 0xa0 goto pc+2 8: (54) (u32) r3 &= (u32) 255 9: (85) call bpf_tail_call#12 10: (b7) r0 = 1 11: (95) exit # bpftool m s i 5 5: prog_array flags 0x0 key 4B value 4B max_entries 4 memlock 4096B # bpftool m s i 6 6: prog_array flags 0x0 key 4B value 4B max_entries 160 memlock 4096B Variant 2: # bpftool p d x i 20 0: (15) if r1 == 0x0 goto pc+3 1: (18) r2 = map[id:8] 3: (05) goto pc+2 4: (18) r2 = map[id:7] 6: (b7) r3 = 7 7: (35) if r3 >= 0x4 goto pc+2 8: (54) (u32) r3 &= (u32) 3 9: (85) call bpf_tail_call#12 10: (b7) r0 = 1 11: (95) exit # bpftool m s i 8 8: prog_array flags 0x0 key 4B value 4B max_entries 160 memlock 4096B # bpftool m s i 7 7: prog_array flags 0x0 key 4B value 4B max_entries 4 memlock 4096B In both cases the index masking inserted by the verifier in order to control out of bounds speculation from a CPU via b2157399 ("bpf: prevent out-of-bounds speculation") seems to be incorrect in what it is enforcing. In the 1st variant, the mask is applied from the map with the significantly larger number of entries where we would allow to a certain degree out of bounds speculation for the smaller map, and in the 2nd variant where the mask is applied from the map with the smaller number of entries, we get buggy behavior since we truncate the index of the larger map. The original intent from commit b2157399 is to reject such occasions where two or more different tail call maps are used in the same tail call helper invocation. However, the check on the BPF_MAP_PTR_POISON is never hit since we never poisoned the saved pointer in the first place! We do this explicitly for map lookups but in case of tail calls we basically used the tail call map in insn_aux_data that was processed in the most recent path which the verifier walked. Thus any prior path that stored a pointer in insn_aux_data at the helper location was always overridden. Fix it by moving the map pointer poison logic into a small helper that covers both BPF helpers with the same logic. After that in fixup_bpf_calls() the poison check is then hit for tail calls and the program rejected. Latter only happens in unprivileged case since this is the *only* occasion where a rewrite needs to happen, and where such rewrite is specific to the map (max_entries, index_mask). In the privileged case the rewrite is generic for the insn->imm / insn->code update so multiple maps from different paths can be handled just fine since all the remaining logic happens in the instruction processing itself. This is similar to the case of map lookups: in case there is a collision of maps in fixup_bpf_calls() we must skip the inlined rewrite since this will turn the generic instruction sequence into a non- generic one. Thus the patch_call_imm will simply update the insn->imm location where the bpf_map_lookup_elem() will later take care of the dispatch. Given we need this 'poison' state as a check, the information of whether a map is an unpriv_array gets lost, so enforcing it prior to that needs an additional state. In general this check is needed since there are some complex and tail call intensive BPF programs out there where LLVM tends to generate such code occasionally. We therefore convert the map_ptr rather into map_state to store all this w/o extra memory overhead, and the bit whether one of the maps involved in the collision was from an unpriv_array thus needs to be retained as well there. Fixes: b2157399 ("bpf: prevent out-of-bounds speculation") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> CVE-2017-5753 (backported from commit c93552c4) [tyhicks: Ignore pointer poison related changes since poisoning is not part of 4.4] Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
Andi Kleen authored
BugLink: https://launchpad.net/bugs/1788563 On Nehalem and newer core CPUs the CPU cache internally uses 44 bits physical address space. The L1TF workaround is limited by this internal cache address width, and needs to have one bit free there for the mitigation to work. Older client systems report only 36bit physical address space so the range check decides that L1TF is not mitigated for a 36bit phys/32GB system with some memory holes. But since these actually have the larger internal cache width this warning is bogus because it would only really be needed if the system had more than 43bits of memory. Add a new internal x86_cache_bits field. Normally it is the same as the physical bits field reported by CPUID, but for Nehalem and newerforce it to be at least 44bits. Change the L1TF memory size warning to use the new cache_bits field to avoid bogus warnings and remove the bogus comment about memory size. Fixes: 17dbca11 ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Reported-by: George Anchev <studio@anchev.net> Reported-by: Christopher Snowhill <kode54@gmail.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Cc: Michael Hocko <mhocko@suse.com> Cc: vbabka@suse.cz Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180824170351.34874-1-andi@firstfloor.org CVE-2018-3620 CVE-2018-3646 (backported from commit cc51e542) Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Joseph Salisbury <joseph.salisbury@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
Vlastimil Babka authored
BugLink: https://launchpad.net/bugs/1788563 Two users have reported [1] that they have an "extremely unlikely" system with more than MAX_PA/2 memory and L1TF mitigation is not effective. In fact it's a CPU with 36bits phys limit (64GB) and 32GB memory, but due to holes in the e820 map, the main region is almost 500MB over the 32GB limit: [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000081effffff] usable Suggestions to use 'mem=32G' to enable the L1TF mitigation while losing the 500MB revealed, that there's an off-by-one error in the check in l1tf_select_mitigation(). l1tf_pfn_limit() returns the last usable pfn (inclusive) and the range check in the mitigation path does not take this into account. Instead of amending the range check, make l1tf_pfn_limit() return the first PFN which is over the limit which is less error prone. Adjust the other users accordingly. [1] https://bugzilla.suse.com/show_bug.cgi?id=1105536 Fixes: 17dbca11 ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Reported-by: George Anchev <studio@anchev.net> Reported-by: Christopher Snowhill <kode54@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180823134418.17008-1-vbabka@suse.cz CVE-2018-3620 CVE-2018-3646 (cherry picked from commit b0a182f8) Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Joseph Salisbury <joseph.salisbury@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
Vlastimil Babka authored
BugLink: https://launchpad.net/bugs/1788563 On 32bit PAE kernels on 64bit hardware with enough physical bits, l1tf_pfn_limit() will overflow unsigned long. This in turn affects max_swapfile_size() and can lead to swapon returning -EINVAL. This has been observed in a 32bit guest with 42 bits physical address size, where max_swapfile_size() overflows exactly to 1 << 32, thus zero, and produces the following warning to dmesg: [ 6.396845] Truncating oversized swap area, only using 0k out of 2047996k Fix this by using unsigned long long instead. Fixes: 17dbca11 ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Fixes: 377eeaa8 ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2") Reported-by: Dominique Leuenberger <dimstar@suse.de> Reported-by: Adrian Schroeter <adrian@suse.de> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180820095835.5298-1-vbabka@suse.cz CVE-2018-3620 CVE-2018-3646 (cherry picked from commit 9df95169) Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Joseph Salisbury <joseph.salisbury@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-