- 25 May, 2019 38 commits
-
-
NeilBrown authored
commit 2bc13b83 upstream. Currently if many flush requests are submitted to an md device is quick succession, they are serialized and can take a long to process them all. We don't really need to call flush all those times - a single flush call can satisfy all requests submitted before it started. So keep track of when the current flush started and when it finished, allow any pending flush that was requested before the flush started to complete without waiting any more. Test results from Xiao: Test is done on a raid10 device which is created by 4 SSDs. The tool is dbench. 1. The latest linux stable kernel Operation Count AvgLat MaxLat -------------------------------------------------- Deltree 768 10.509 78.305 Flush 2078376 0.013 10.094 Close 21787697 0.019 18.821 LockX 96580 0.007 3.184 Mkdir 384 0.008 0.062 Rename 1255883 0.191 23.534 ReadX 46495589 0.020 14.230 WriteX 14790591 7.123 60.706 Unlink 5989118 0.440 54.551 UnlockX 96580 0.005 2.736 FIND_FIRST 10393845 0.042 12.079 SET_FILE_INFORMATION 2415558 0.129 10.088 QUERY_FILE_INFORMATION 4711725 0.005 8.462 QUERY_PATH_INFORMATION 26883327 0.032 21.715 QUERY_FS_INFORMATION 4929409b 0.010 8.238 NTCreateX 29660080 0.100 53.268 Throughput 1034.88 MB/sec (sync open) 128 clients 128 procs max_latency=60.712 ms 2. With patch1 "Revert "MD: fix lock contention for flush bios"" Operation Count AvgLat MaxLat -------------------------------------------------- Deltree 256 8.326 36.761 Flush 693291 3.974 180.269 Close 7266404 0.009 36.929 LockX 32160 0.006 0.840 Mkdir 128 0.008 0.021 Rename 418755 0.063 29.945 ReadX 15498708 0.007 7.216 WriteX 4932310 22.482 267.928 Unlink 1997557 0.109 47.553 UnlockX 32160 0.004 1.110 FIND_FIRST 3465791 0.036 7.320 SET_FILE_INFORMATION 805825 0.015 1.561 QUERY_FILE_INFORMATION 1570950 0.005 2.403 QUERY_PATH_INFORMATION 8965483 0.013 14.277 QUERY_FS_INFORMATION 1643626 0.009 3.314 NTCreateX 9892174 0.061 41.278 Throughput 345.009 MB/sec (sync open) 128 clients 128 procs max_latency=267.939 m 3. With patch1 and patch2 Operation Count AvgLat MaxLat -------------------------------------------------- Deltree 768 9.570 54.588 Flush 2061354 0.666 15.102 Close 21604811 0.012 25.697 LockX 95770 0.007 1.424 Mkdir 384 0.008 0.053 Rename 1245411 0.096 12.263 ReadX 46103198 0.011 12.116 WriteX 14667988 7.375 60.069 Unlink 5938936 0.173 30.905 UnlockX 95770 0.005 4.147 FIND_FIRST 10306407 0.041 11.715 SET_FILE_INFORMATION 2395987 0.048 7.640 QUERY_FILE_INFORMATION 4672371 0.005 9.291 QUERY_PATH_INFORMATION 26656735 0.018 19.719 QUERY_FS_INFORMATION 4887940 0.010 7.654 NTCreateX 29410811 0.059 28.551 Throughput 1026.21 MB/sec (sync open) 128 clients 128 procs max_latency=60.075 ms Cc: <stable@vger.kernel.org> # v4.19+ Tested-by: Xiao Ni <xni@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
NeilBrown authored
commit 4bc034d3 upstream. This reverts commit 5a409b4f. This patch has two problems. 1/ it make multiple calls to submit_bio() from inside a make_request_fn. The bios thus submitted will be queued on current->bio_list and not submitted immediately. As the bios are allocated from a mempool, this can theoretically result in a deadlock - all the pool of requests could be in various ->bio_list queues and a subsequent mempool_alloc could block waiting for one of them to be released. 2/ It aims to handle a case when there are many concurrent flush requests. It handles this by submitting many requests in parallel - all of which are identical and so most of which do nothing useful. It would be more efficient to just send one lower-level request, but allow that to satisfy multiple upper-level requests. Fixes: 5a409b4f ("MD: fix lock contention for flush bios") Cc: <stable@vger.kernel.org> # v4.19+ Tested-by: Xiao Ni <xni@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Moore authored
commit 35a196be upstream. Prevent userspace from changing the the /proc/PID/attr values if the task's credentials are currently overriden. This not only makes sense conceptually, it also prevents some really bizarre error cases caused when trying to commit credentials to a task with overridden credentials. Cc: <stable@vger.kernel.org> Reported-by: "chengjian (D)" <cj.chengjian@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: James Morris <james.morris@microsoft.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hou Tao authored
commit f6b50160 upstream. __GFP_HIGHMEM is disabled if dax is enabled on brd, however dax support for brd has been removed since commit (7a862fbb "brd: remove dax support"), so restore __GFP_HIGHMEM in brd_insert_page(). Also remove the no longer applicable comments about DAX and highmem. Cc: stable@vger.kernel.org Fixes: 7a862fbb ("brd: remove dax support") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexander Shishkin authored
commit 51e0f227 upstream. Commit 7bd1d409 ("stm class: Introduce an abstraction for System Trace Module devices") naively calculates the channel bitmap size in 64-bit chunks regardless of the size of underlying unsigned long, making the bitmap half as big on a 32-bit system. This leads to an out of bounds access with the upper half of the bitmap. Fix this by using BITS_TO_LONGS. While at it, convert to using struct_size() for the total size calculation of the master struct. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Fixes: 7bd1d409 ("stm class: Introduce an abstraction for System Trace Module devices") Reported-by: Mulu He <muluhe@codeaurora.org> Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tingwei Zhang authored
commit ee496da4 upstream. Number of free masters is not set correctly in stm free path. Fix this by properly adding the number of output channels before setting them to 0 in stm_output_disclaim(). Currently it is equivalent to doing nothing since master->nr_free is incremented by 0. Fixes: 7bd1d409 ("stm class: Introduce an abstraction for System Trace Module devices") Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Cc: stable@vger.kernel.org # v4.4 Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit 1829dda0 upstream. LEVEL is a very common word, and now after many years it suddenly clashed with another LEVEL define in the DRBD code. Rename it to PA_ASM_LEVEL instead. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit bdca5d64 upstream. The LEVEL define clashed with the DRBD code. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit d19a1290 upstream. When making the text sections writeable with set_kernel_text_rw(1), include all text sections including those in the __init section. Otherwise functions marked with __meminit will stay read-only. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # 4.20+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit 2d94a832 upstream. Add compiler memory barriers to ensure the compiler doesn't reorder memory operations around these instructions. Cc: stable@vger.kernel.org # v4.20+ Fixes: 3847dab7 ("parisc: Add alternative coding infrastructure") Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit b4387490 upstream. No need to spend CPU cycles when we run on QEMU. Signed-off-by: Helge Deller <deller@gmx.de> CC: stable@vger.kernel.org # v4.9+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John David Anglin authored
commit 44224bdb upstream. The pdtlb and pitlb instructions are strongly ordered. The asms invoking these instructions should be compiler memory barriers to ensure the compiler doesn't reorder memory operations around these instructions. Signed-off-by: John David Anglin <dave.anglin@bell.net> CC: stable@vger.kernel.org # v4.20+ Fixes: 3847dab7 ("parisc: Add alternative coding infrastructure") Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit 3e1120f4 upstream. Signed-off-by: Helge Deller <deller@gmx.de> CC: stable@vger.kernel.org # v4.9+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steve Twiss authored
commit 70b46491 upstream. During several error paths in the function regulator_set_voltage_unlocked() the value of 'ret' can take on negative error values. However, in calls that go through the 'goto out' statement, this return value is lost and return 0 is used instead, indicating a 'pass'. There are several cases where this function should legitimately return a fail instead of a pass: one such case includes constraints check during voltage selection in the call to regulator_check_voltage(), which can have -EINVAL for the case when an unsupported voltage is incorrectly requested. In that case, -22 is expected as the return value, not 0. Fixes: 9243a195 ("regulator: core: Change voltage setting path") Cc: stable <stable@vger.kernel.org> Signed-off-by: Steve Twiss <stwiss.opensource@diasemi.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ming Lei authored
commit c7e2d94b upstream. Once blk_cleanup_queue() returns, tags shouldn't be used any more, because blk_mq_free_tag_set() may be called. Commit 45a9c9d9 ("blk-mq: Fix a use-after-free") fixes this issue exactly. However, that commit introduces another issue. Before 45a9c9d9, we are allowed to run queue during cleaning up queue if the queue's kobj refcount is held. After that commit, queue can't be run during queue cleaning up, otherwise oops can be triggered easily because some fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue(). We have invented ways for addressing this kind of issue before, such as: 8dc765d4 ("SCSI: fix queue cleanup race before queue initialization is done") c2856ae2 ("blk-mq: quiesce queue before freeing queue") But still can't cover all cases, recently James reports another such kind of issue: https://marc.info/?l=linux-scsi&m=155389088124782&w=2 This issue can be quite hard to address by previous way, given scsi_run_queue() may run requeues for other LUNs. Fixes the above issue by freeing hctx's resources in its release handler, and this way is safe becasue tags isn't needed for freeing such hctx resource. This approach follows typical design pattern wrt. kobject's release handler. Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: James Smart <james.smart@broadcom.com> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen <martin.petersen@oracle.com>, Cc: Christoph Hellwig <hch@lst.de>, Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>, Reported-by: James Smart <james.smart@broadcom.com> Fixes: 45a9c9d9 ("blk-mq: Fix a use-after-free") Cc: stable@vger.kernel.org Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: James Smart <james.smart@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Saeed Mahameed authored
[ Upstream commit 8f0916c6 ] ethtool user spaces needs to know ring count via ETHTOOL_GRXRINGS when executing (ethtool -x) which is retrieved via ethtool get_rxnfc callback, in mlx5 this callback is disabled when CONFIG_MLX5_EN_RXNFC=n. This patch allows only ETHTOOL_GRXRINGS command on mlx5e_get_rxnfc() when CONFIG_MLX5_EN_RXNFC is disabled, so ethtool -x will continue working. Fixes: fe6d86b3 ("net/mlx5e: Add CONFIG_MLX5_EN_RXNFC for ethtool rx nfc") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Saeed Mahameed authored
[ Upstream commit bad861f3 ] mlxfw can be compiled as external module while mlx5_core can be builtin, in such case mlx5 will act like mlxfw is disabled. Since mlxfw is just a service library for mlx* drivers, imply it in mlx5_core to make it always reachable if it was enabled. Fixes: 3ffaabec ("net/mlx5e: Support the flash device ethtool callback") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dmytro Linkin authored
[ Upstream commit c979c445 ] Flow destination comparison has an inaccuracy: code see no difference between same vf ports, which belong to different pfs. Example: If start ping from VF0 (PF1) to VF1 (PF1) and mirror all traffic to VF0 (PF2), icmp reply to VF0 (PF1) and mirrored flow to VF0 (PF2) would be determined as same destination. It lead to creating flow handler with rule nodes, which not added to node tree. When later driver try to delete this flow rules we got kernel crash. Add comparison of vhca_id field to avoid this. Fixes: 1228e912 ("net/mlx5: Consider encapsulation properties when comparing destinations") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dmytro Linkin authored
[ Upstream commit cf83c8fd ] For all representors added firmware version info to show in ethtool driver info. For uplink representor, because only it is tied to the pci device sysfs, added pci bus info. Fixes: ff9b85de ("net/mlx5e: Add some ethtool port control entries to the uplink rep netdev") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Gavi Teitz <gavi@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jorge E. Moreira authored
[ Upstream commit ba95e5df ] Avoid a race in which static variables in net/vmw_vsock/af_vsock.c are accessed (while handling interrupts) before they are initialized. [ 4.201410] BUG: unable to handle kernel paging request at ffffffffffffffe8 [ 4.207829] IP: vsock_addr_equals_addr+0x3/0x20 [ 4.211379] PGD 28210067 P4D 28210067 PUD 28212067 PMD 0 [ 4.211379] Oops: 0000 [#1] PREEMPT SMP PTI [ 4.211379] Modules linked in: [ 4.211379] CPU: 1 PID: 30 Comm: kworker/1:1 Not tainted 4.14.106-419297-gd7e28cc1f241 #1 [ 4.211379] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 4.211379] Workqueue: virtio_vsock virtio_transport_rx_work [ 4.211379] task: ffffa3273d175280 task.stack: ffffaea1800e8000 [ 4.211379] RIP: 0010:vsock_addr_equals_addr+0x3/0x20 [ 4.211379] RSP: 0000:ffffaea1800ebd28 EFLAGS: 00010286 [ 4.211379] RAX: 0000000000000002 RBX: 0000000000000000 RCX: ffffffffb94e42f0 [ 4.211379] RDX: 0000000000000400 RSI: ffffffffffffffe0 RDI: ffffaea1800ebdd0 [ 4.211379] RBP: ffffaea1800ebd58 R08: 0000000000000001 R09: 0000000000000001 [ 4.211379] R10: 0000000000000000 R11: ffffffffb89d5d60 R12: ffffaea1800ebdd0 [ 4.211379] R13: 00000000828cbfbf R14: 0000000000000000 R15: ffffaea1800ebdc0 [ 4.211379] FS: 0000000000000000(0000) GS:ffffa3273fd00000(0000) knlGS:0000000000000000 [ 4.211379] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4.211379] CR2: ffffffffffffffe8 CR3: 000000002820e001 CR4: 00000000001606e0 [ 4.211379] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4.211379] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4.211379] Call Trace: [ 4.211379] ? vsock_find_connected_socket+0x6c/0xe0 [ 4.211379] virtio_transport_recv_pkt+0x15f/0x740 [ 4.211379] ? detach_buf+0x1b5/0x210 [ 4.211379] virtio_transport_rx_work+0xb7/0x140 [ 4.211379] process_one_work+0x1ef/0x480 [ 4.211379] worker_thread+0x312/0x460 [ 4.211379] kthread+0x132/0x140 [ 4.211379] ? process_one_work+0x480/0x480 [ 4.211379] ? kthread_destroy_worker+0xd0/0xd0 [ 4.211379] ret_from_fork+0x35/0x40 [ 4.211379] Code: c7 47 08 00 00 00 00 66 c7 07 28 00 c7 47 08 ff ff ff ff c7 47 04 ff ff ff ff c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 8b 47 08 <3b> 46 08 75 0a 8b 47 04 3b 46 04 0f 94 c0 c3 31 c0 c3 90 66 2e [ 4.211379] RIP: vsock_addr_equals_addr+0x3/0x20 RSP: ffffaea1800ebd28 [ 4.211379] CR2: ffffffffffffffe8 [ 4.211379] ---[ end trace f31cc4a2e6df3689 ]--- [ 4.211379] Kernel panic - not syncing: Fatal exception in interrupt [ 4.211379] Kernel Offset: 0x37000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 4.211379] Rebooting in 5 seconds.. Fixes: 22b5c0b6 ("vsock/virtio: fix kernel panic after device hot-unplug") Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: netdev@vger.kernel.org Cc: kernel-team@android.com Cc: stable@vger.kernel.org [4.9+] Signed-off-by: Jorge E. Moreira <jemoreira@google.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bodong Wang authored
[ Upstream commit dd064867 ] The command was mistakenly using enable_hca in embedded CPU field. Fixes: 22e939a9 (net/mlx5: Update enable HCA dependency) Signed-off-by: Bodong Wang <bodong@mellanox.com> Reported-by: Alex Rosenbaum <alexr@mellanox.com> Signed-off-by: Alex Rosenbaum <alexr@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jianbo Liu authored
[ Upstream commit 12d5cbf8 ] When flow_rule_match_XYZ() functions were first introduced, flow_rule_match_cvlan() for inner vlan is missing. In mlx5_core driver, to get inner vlan key and mask, flow_rule_match_vlan() is just called, which is wrong because it obtains outer vlan information by FLOW_DISSECTOR_KEY_VLAN. This commit fixes this by changing to call flow_rule_match_cvlan() after it's added. Fixes: 8f256622 ("flow_offload: add flow_rule and flow_match structures and use them") Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Edward Cree authored
[ Upstream commit bae9ed69 ] Plumb it through from the flow_dissector. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vadim Pasternak authored
[ Upstream commit f1436c80 ] Prevent reading unsupported slave address from SFP EEPROM by testing Diagnostic Monitoring Type byte in EEPROM. Read only page zero of EEPROM, in case this byte is zero. If some SFP transceiver does not support Digital Optical Monitoring (DOM), reading SFP EEPROM slave address 0x51 could return an error. Availability of DOM support is verified by reading from zero page Diagnostic Monitoring Type byte describing how diagnostic monitoring is implemented by transceiver. If bit 6 of this byte is set, it indicates that digital diagnostic monitoring has been implemented. Otherwise it is not and transceiver could fail to reply to transaction for slave address 0x51 [1010001X (A2h)], which is used to access measurements page. Such issue has been observed when reading cable MCP2M00-xxxx, MCP7F00-xxxx, and few others. Fixes: 2ea10903 ("mlxsw: spectrum: Add support for access cable info via ethtool") Fixes: 4400081b ("mlxsw: spectrum: Fix EEPROM access in case of SFP/SFP+") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vadim Pasternak authored
[ Upstream commit c52ecff7 ] Old Mellanox silicons, like switchx-2, switch-ib do not support reading QSFP modules temperature through MTMP register. Attempt to access this register on systems equipped with the this kind of silicon will cause initialization flow failure. Test for hardware resource capability is added in order to distinct between old and new silicon - old silicons do not have such capability. Fixes: 6a79507c ("mlxsw: core: Extend thermal module with per QSFP module thermal zones") Fixes: 5c42eaa0 ("mlxsw: core: Extend hwmon interface with QSFP module temperature attributes") Reported-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Junwei Hu authored
[ Upstream commit 532b0f7e ] Error message printed: modprobe: ERROR: could not insert 'tipc': Address family not supported by protocol. when modprobe tipc after the following patch: switch order of device registration, commit 7e27e8d6 ("tipc: switch order of device registration to fix a crash") Because sock_create_kern(net, AF_TIPC, ...) is called by tipc_topsrv_create_listener() in the initialization process of tipc_net_ops, tipc_socket_init() must be execute before that. I move tipc_socket_init() into function tipc_init_net(). Fixes: 7e27e8d6 ("tipc: switch order of device registration to fix a crash") Signed-off-by: Junwei Hu <hujunwei4@huawei.com> Reported-by: Wang Wang <wangwang2@huawei.com> Reviewed-by: Kang Zhou <zhoukang7@huawei.com> Reviewed-by: Suanming Mou <mousuanming@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stefano Garzarella authored
[ Upstream commit ac03046e ] When the socket is released, we should free all packets queued in the per-socket list in order to avoid a memory leak. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Junwei Hu authored
[ Upstream commit 7e27e8d6 ] When tipc is loaded while many processes try to create a TIPC socket, a crash occurs: PANIC: Unable to handle kernel paging request at virtual address "dfff20000000021d" pc : tipc_sk_create+0x374/0x1180 [tipc] lr : tipc_sk_create+0x374/0x1180 [tipc] Exception class = DABT (current EL), IL = 32 bits Call trace: tipc_sk_create+0x374/0x1180 [tipc] __sock_create+0x1cc/0x408 __sys_socket+0xec/0x1f0 __arm64_sys_socket+0x74/0xa8 ... This is due to race between sock_create and unfinished register_pernet_device. tipc_sk_insert tries to do "net_generic(net, tipc_net_id)". but tipc_net_id is not initialized yet. So switch the order of the two to close the race. This can be reproduced with multiple processes doing socket(AF_TIPC, ...) and one process doing module removal. Fixes: a62fbcce ("tipc: make subscriber server support net namespace") Signed-off-by: Junwei Hu <hujunwei4@huawei.com> Reported-by: Wang Wang <wangwang2@huawei.com> Reviewed-by: Xiaogang Wang <wangxiaogang3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sabrina Dubroca authored
[ Upstream commit feadc4b6 ] Currently, nla_put_iflink() doesn't put the IFLA_LINK attribute when iflink == ifindex. In some cases, a device can be created in a different netns with the same ifindex as its parent. That device will not dump its IFLA_LINK attribute, which can confuse some userspace software that expects it. For example, if the last ifindex created in init_net and foo are both 8, these commands will trigger the issue: ip link add parent type dummy # ifindex 9 ip link add link parent netns foo type macvlan # ifindex 9 in ns foo So, in case a device puts the IFLA_LINK_NETNSID attribute in a dump, always put the IFLA_LINK attribute as well. Thanks to Dan Winship for analyzing the original OpenShift bug down to the missing netlink attribute. v2: change Fixes tag, it's been here forever, as Nicolas Dichtel said add Nicolas' ack v3: change Fixes tag fix subject typo, spotted by Edward Cree Analyzed-by: Dan Winship <danw@redhat.com> Fixes: d8a5ec67 ("[NET]: netlink support for moving devices between network namespaces.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
YueHaibing authored
[ Upstream commit 3ebe1bca ] BUG: unable to handle kernel paging request at ffffffffa018f000 PGD 3270067 P4D 3270067 PUD 3271063 PMD 2307eb067 PTE 0 Oops: 0000 [#1] PREEMPT SMP CPU: 0 PID: 4138 Comm: modprobe Not tainted 5.1.0-rc7+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:ppp_register_compressor+0x3e/0xd0 [ppp_generic] Code: 98 4a 3f e2 48 8b 15 c1 67 00 00 41 8b 0c 24 48 81 fa 40 f0 19 a0 75 0e eb 35 48 8b 12 48 81 fa 40 f0 19 a0 74 RSP: 0018:ffffc90000d93c68 EFLAGS: 00010287 RAX: ffffffffa018f000 RBX: ffffffffa01a3000 RCX: 000000000000001a RDX: ffff888230c750a0 RSI: 0000000000000000 RDI: ffffffffa019f000 RBP: ffffc90000d93c80 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0194080 R13: ffff88822ee1a700 R14: 0000000000000000 R15: ffffc90000d93e78 FS: 00007f2339557540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa018f000 CR3: 000000022bde4000 CR4: 00000000000006f0 Call Trace: ? 0xffffffffa01a3000 deflate_init+0x11/0x1000 [ppp_deflate] ? 0xffffffffa01a3000 do_one_initcall+0x6c/0x3cc ? kmem_cache_alloc_trace+0x248/0x3b0 do_init_module+0x5b/0x1f1 load_module+0x1db1/0x2690 ? m_show+0x1d0/0x1d0 __do_sys_finit_module+0xc5/0xd0 __x64_sys_finit_module+0x15/0x20 do_syscall_64+0x6b/0x1d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe If ppp_deflate fails to register in deflate_init, module initialization failed out, however ppp_deflate_draft may has been regiestred and not unregistered before return. Then the seconed modprobe will trigger crash like this. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pieter Jansen van Vuuren authored
[ Upstream commit cb07d915 ] Add rcu locks when accessing netdev when processing route request and tunnel keep alive messages received from hardware. Fixes: 8e6a9046 ("nfp: flower vxlan neighbour offload") Fixes: 856f5b13 ("nfp: flower vxlan neighbour keep-alive") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniele Palmas authored
[ Upstream commit b4e467c8 ] Added support for Telit LE910Cx 0x1260 and 0x1261 compositions. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Willem de Bruijn authored
[ Upstream commit 185ce5c3 ] Zerocopy skbs without completion notification were added for packet sockets with PACKET_TX_RING user buffers. Those signal completion through the TP_STATUS_USER bit in the ring. Zerocopy annotation was added only to avoid premature notification after clone or orphan, by triggering a copy on these paths for these packets. The mechanism had to define a special "no-uarg" mode because packet sockets already use skb_uarg(skb) == skb_shinfo(skb)->destructor_arg for a different pointer. Before deferencing skb_uarg(skb), verify that it is a real pointer. Fixes: 5cd8d46e ("packet: copy user buffers before orphan or clone") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yunjian Wang authored
[ Upstream commit 00f9fec4 ] The error print within mlx4_flow_steer_promisc_add() should be a info print. Fixes: 592e49dd ('net/mlx4: Implement promiscuous mode with device managed flow-steering') Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit d7c04b05 ] When host is under high stress, it is very possible thread running netdev_wait_allrefs() returns from msleep(250) 10 seconds late. This leads to these messages in the syslog : [...] unregister_netdevice: waiting for syz_tun to become free. Usage count = 0 If the device refcount is zero, the wait is over. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Fainelli authored
[ Upstream commit 0fe9f173 ] Jiri reported that with a kernel built with CONFIG_FIXED_PHY=y, CONFIG_NET_DSA=m and CONFIG_NET_DSA_LOOP=m, we would not get to a functional state where the mock-up driver is registered. Turns out that we are not descending into drivers/net/dsa/ unconditionally, and we won't be able to link-in dsa_loop_bdinfo.o which does the actual mock-up mdio device registration. Reported-by: Jiri Pirko <jiri@resnulli.us> Fixes: 40013ff2 ("net: dsa: Fix functional dsa-loop dependency on FIXED_PHY") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Tested-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit 61fb0d01 ] At ipv6 route dismantle, fib6_drop_pcpu_from() is responsible for finding all percpu routes and set their ->from pointer to NULL, so that fib6_ref can reach its expected value (1). The problem right now is that other cpus can still catch the route being deleted, since there is no rcu grace period between the route deletion and call to fib6_drop_pcpu_from() This can leak the fib6 and associated resources, since no notifier will take care of removing the last reference(s). I decided to add another boolean (fib6_destroying) instead of reusing/renaming exception_bucket_flushed to ease stable backports, and properly document the memory barriers used to implement this fix. This patch has been co-developped with Wei Wang. Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Wei Wang <weiwan@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Martin Lau <kafai@fb.com> Acked-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wei Wang authored
[ Upstream commit 510e2ced ] When inserting route cache into the exception table, the key is generated with both src_addr and dest_addr with src addr routing. However, current logic always assumes the src_addr used to generate the key is a /128 host address. This is not true in the following scenarios: 1. When the route is a gateway route or does not have next hop. (rt6_is_gw_or_nonexthop() == false) 2. When calling ip6_rt_cache_alloc(), saddr is passed in as NULL. This means, when looking for a route cache in the exception table, we have to do the lookup twice: first time with the passed in /128 host address, second time with the src_addr stored in fib6_info. This solves the pmtu discovery issue reported by Mikael Magnusson where a route cache with a lower mtu info is created for a gateway route with src addr. However, the lookup code is not able to find this route cache. Fixes: 2b760fcf ("ipv6: hook up exception table to store dst cache") Reported-by: Mikael Magnusson <mikael.kernel@lists.m7n.se> Bisected-by: David Ahern <dsahern@gmail.com> Signed-off-by: Wei Wang <weiwan@google.com> Cc: Martin Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 22 May, 2019 2 commits
-
-
Greg Kroah-Hartman authored
-
Martin Schwidefsky authored
commit 1a42010c upstream. Define the gup_fast_permitted to check against the asce_limit of the mm attached to the current task, then replace the s390 specific gup code with the generic implementation in mm/gup.c. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-