- 27 Oct, 2022 40 commits
-
-
Eric Dumazet authored
Similar to changes done in TCP in blamed commit. We should not sense pfmemalloc status in sendpage() methods. Fixes: 32614006 ("tcp: TX zerocopy should not sense pfmemalloc status") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221027040637.1107703-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
skb_append_pagefrags() is used by af_unix and udp sendpage() implementation so far. In commit 32614006 ("tcp: TX zerocopy should not sense pfmemalloc status") we explained why we should not sense pfmemalloc status for pages owned by user space. We should also use skb_fill_page_desc_noacc() in skb_append_pagefrags() to avoid following KCSAN report: BUG: KCSAN: data-race in lru_add_fn / skb_append_pagefrags write to 0xffffea00058fc1c8 of 8 bytes by task 17319 on cpu 0: __list_add include/linux/list.h:73 [inline] list_add include/linux/list.h:88 [inline] lruvec_add_folio include/linux/mm_inline.h:323 [inline] lru_add_fn+0x327/0x410 mm/swap.c:228 folio_batch_move_lru+0x1e1/0x2a0 mm/swap.c:246 lru_add_drain_cpu+0x73/0x250 mm/swap.c:669 lru_add_drain+0x21/0x60 mm/swap.c:773 free_pages_and_swap_cache+0x16/0x70 mm/swap_state.c:311 tlb_batch_pages_flush mm/mmu_gather.c:59 [inline] tlb_flush_mmu_free mm/mmu_gather.c:256 [inline] tlb_flush_mmu+0x5b2/0x640 mm/mmu_gather.c:263 tlb_finish_mmu+0x86/0x100 mm/mmu_gather.c:363 exit_mmap+0x190/0x4d0 mm/mmap.c:3098 __mmput+0x27/0x1b0 kernel/fork.c:1185 mmput+0x3d/0x50 kernel/fork.c:1207 copy_process+0x19fc/0x2100 kernel/fork.c:2518 kernel_clone+0x166/0x550 kernel/fork.c:2671 __do_sys_clone kernel/fork.c:2812 [inline] __se_sys_clone kernel/fork.c:2796 [inline] __x64_sys_clone+0xc3/0xf0 kernel/fork.c:2796 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd read to 0xffffea00058fc1c8 of 8 bytes by task 17325 on cpu 1: page_is_pfmemalloc include/linux/mm.h:1817 [inline] __skb_fill_page_desc include/linux/skbuff.h:2432 [inline] skb_fill_page_desc include/linux/skbuff.h:2453 [inline] skb_append_pagefrags+0x210/0x600 net/core/skbuff.c:3974 unix_stream_sendpage+0x45e/0x990 net/unix/af_unix.c:2338 kernel_sendpage+0x184/0x300 net/socket.c:3561 sock_sendpage+0x5a/0x70 net/socket.c:1054 pipe_to_sendpage+0x128/0x160 fs/splice.c:361 splice_from_pipe_feed fs/splice.c:415 [inline] __splice_from_pipe+0x222/0x4d0 fs/splice.c:559 splice_from_pipe fs/splice.c:594 [inline] generic_splice_sendpage+0x89/0xc0 fs/splice.c:743 do_splice_from fs/splice.c:764 [inline] direct_splice_actor+0x80/0xa0 fs/splice.c:931 splice_direct_to_actor+0x305/0x620 fs/splice.c:886 do_splice_direct+0xfb/0x180 fs/splice.c:974 do_sendfile+0x3bf/0x910 fs/read_write.c:1255 __do_sys_sendfile64 fs/read_write.c:1323 [inline] __se_sys_sendfile64 fs/read_write.c:1309 [inline] __x64_sys_sendfile64+0x10c/0x150 fs/read_write.c:1309 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x0000000000000000 -> 0xffffea00058fc188 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 17325 Comm: syz-executor.0 Not tainted 6.1.0-rc1-syzkaller-00158-g440b7895-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022 Fixes: 32614006 ("tcp: TX zerocopy should not sense pfmemalloc status") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221027040346.1104204-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Raed Salem authored
The cited commit at rx sa update operation passes the sci object attribute, in the wrong endianness and not as expected by the HW effectively create malformed hw sa context in case of update rx sa consequently, HW produces unexpected MACsec packets which uses this sa. Fix by passing sci to create macsec object with the correct endianness, while at it add __force u64 to prevent sparse check error of type "sparse: error: incorrect type in assignment". Fixes: aae3454e ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-16-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Raed Salem authored
The cited commit produces a sparse check error of type "sparse: error: restricted __be64 degrades to integer". The offending line wrongly did a bitwise operation between two different storage types one of 64 bit when the other smaller side is 16 bit which caused the above sparse error, furthermore bitwise operation usage here is wrong in the first place as the constant MACSEC_PORT_ES is not a bitwise field. Fix by using the right mask to get the lower 16 bit if the sci number, and use comparison operator '==' instead of bitwise '&' operator. Fixes: 3b20949c ("net/mlx5e: Add MACsec RX steering rules") Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-15-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Raed Salem authored
The cited commit adds the support for update/delete MACsec Rx SA, naturally, these operations need to check if the SA in question exists to update/delete the SA and return error code otherwise, however they do just the opposite i.e. return with error if the SA exists Fix by change the check to return error in case the SA in question does not exist, adjust error message and code accordingly. Fixes: aae3454e ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-14-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Raed Salem authored
The cited commit at update rx sa operation passes object attributes to MACsec object create function without initializing/setting all attributes fields leaving some of them with garbage values, therefore violating the implicit assumption at create object function, which assumes that all input object attributes fields are set. Fix by initializing the object attributes struct to zero, thus leaving unset fields with the legal zero value. Fixes: aae3454e ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-13-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Suresh Devarakonda authored
When setting Bluefield to DPU NIC mode using mlxconfig tool + sync firmware reset flow, we run into scenario where the host was not eswitch manager at the time of mlx5 driver load but becomes eswitch manager after the sync firmware reset flow. This results in null pointer access of mpfs structure during mac filter add. This change prevents null pointer access but mpfs table entries will not be added. Fixes: 5ec69744 ("net/mlx5: Add support for devlink reload action fw activate") Signed-off-by: Suresh Devarakonda <ramad@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Bodong Wang <bodong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-12-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Roy Novich authored
Update devlink health fw fatal reporter state to "healthy" is needed by strictly calling devlink_health_reporter_state_update() after recovery was done by PCI error handler. This is needed when fw_fatal reporter was triggered due to PCI error. Poll health is called and set reporter state to error. Health recovery failed (since EEH didn't re-enable the PCI). PCI handlers keep on recover flow and succeed later without devlink acknowledgment. Fix this by adding devlink state update at the end of the PCI handler recovery process. Fixes: 6181e5cb ("devlink: add support for reporter recovery completion") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-11-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Roi Dayan authored
On multi table split the driver creates a new attr instance with data being copied from prev attr instance zeroing action flags. Also need to reset dests properties to avoid incorrect dests per attr. Fixes: 8300f225 ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-10-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ariel Levkovich authored
Reject TC rules that forward from internal port to internal port as it is not supported. This include rules that are explicitly have internal port as the filter device as well as rules that apply on tunnel interfaces as the route device for the tunnel interface can be an internal port. Fixes: 27484f71 ("net/mlx5e: Offload tc rules that redirect to ovs internal port") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-9-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Tariq Toukan authored
mlx5_cmd_cleanup_async_ctx should return only after all its callback handlers were completed. Before this patch, the below race between mlx5_cmd_cleanup_async_ctx and mlx5_cmd_exec_cb_handler was possible and lead to a use-after-free: 1. mlx5_cmd_cleanup_async_ctx is called while num_inflight is 2 (i.e. elevated by 1, a single inflight callback). 2. mlx5_cmd_cleanup_async_ctx decreases num_inflight to 1. 3. mlx5_cmd_exec_cb_handler is called, decreases num_inflight to 0 and is about to call wake_up(). 4. mlx5_cmd_cleanup_async_ctx calls wait_event, which returns immediately as the condition (num_inflight == 0) holds. 5. mlx5_cmd_cleanup_async_ctx returns. 6. The caller of mlx5_cmd_cleanup_async_ctx frees the mlx5_async_ctx object. 7. mlx5_cmd_exec_cb_handler goes on and calls wake_up() on the freed object. Fix it by syncing using a completion object. Mark it completed when num_inflight reaches 0. Trace: BUG: KASAN: use-after-free in do_raw_spin_lock+0x23d/0x270 Read of size 4 at addr ffff888139cd12f4 by task swapper/5/0 CPU: 5 PID: 0 Comm: swapper/5 Not tainted 6.0.0-rc3_for_upstream_debug_2022_08_30_13_10 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0x57/0x7d print_report.cold+0x2d5/0x684 ? do_raw_spin_lock+0x23d/0x270 kasan_report+0xb1/0x1a0 ? do_raw_spin_lock+0x23d/0x270 do_raw_spin_lock+0x23d/0x270 ? rwlock_bug.part.0+0x90/0x90 ? __delete_object+0xb8/0x100 ? lock_downgrade+0x6e0/0x6e0 _raw_spin_lock_irqsave+0x43/0x60 ? __wake_up_common_lock+0xb9/0x140 __wake_up_common_lock+0xb9/0x140 ? __wake_up_common+0x650/0x650 ? destroy_tis_callback+0x53/0x70 [mlx5_core] ? kasan_set_track+0x21/0x30 ? destroy_tis_callback+0x53/0x70 [mlx5_core] ? kfree+0x1ba/0x520 ? do_raw_spin_unlock+0x54/0x220 mlx5_cmd_exec_cb_handler+0x136/0x1a0 [mlx5_core] ? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core] ? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core] mlx5_cmd_comp_handler+0x65a/0x12b0 [mlx5_core] ? dump_command+0xcc0/0xcc0 [mlx5_core] ? lockdep_hardirqs_on_prepare+0x400/0x400 ? cmd_comp_notifier+0x7e/0xb0 [mlx5_core] cmd_comp_notifier+0x7e/0xb0 [mlx5_core] atomic_notifier_call_chain+0xd7/0x1d0 mlx5_eq_async_int+0x3ce/0xa20 [mlx5_core] atomic_notifier_call_chain+0xd7/0x1d0 ? irq_release+0x140/0x140 [mlx5_core] irq_int_handler+0x19/0x30 [mlx5_core] __handle_irq_event_percpu+0x1f2/0x620 handle_irq_event+0xb2/0x1d0 handle_edge_irq+0x21e/0xb00 __common_interrupt+0x79/0x1a0 common_interrupt+0x78/0xa0 </IRQ> <TASK> asm_common_interrupt+0x22/0x40 RIP: 0010:default_idle+0x42/0x60 Code: c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 04 84 d2 75 14 8b 05 eb 47 22 02 85 c0 7e 07 0f 00 2d e0 9f 48 00 fb f4 <c3> 48 c7 c7 80 08 7f 85 e8 d1 d3 3e fe eb de 66 66 2e 0f 1f 84 00 RSP: 0018:ffff888100dbfdf0 EFLAGS: 00000242 RAX: 0000000000000001 RBX: ffffffff84ecbd48 RCX: 1ffffffff0afe110 RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff835cc9bc RBP: 0000000000000005 R08: 0000000000000001 R09: ffff88881dec4ac3 R10: ffffed1103bd8958 R11: 0000017d0ca571c9 R12: 0000000000000005 R13: ffffffff84f024e0 R14: 0000000000000000 R15: dffffc0000000000 ? default_idle_call+0xcc/0x450 default_idle_call+0xec/0x450 do_idle+0x394/0x450 ? arch_cpu_idle_exit+0x40/0x40 ? do_idle+0x17/0x450 cpu_startup_entry+0x19/0x20 start_secondary+0x221/0x2b0 ? set_cpu_sibling_map+0x2070/0x2070 secondary_startup_64_no_verify+0xcd/0xdb </TASK> Allocated by task 49502: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 kvmalloc_node+0x48/0xe0 mlx5e_bulk_async_init+0x35/0x110 [mlx5_core] mlx5e_tls_priv_tx_list_cleanup+0x84/0x3e0 [mlx5_core] mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core] mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core] mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core] mlx5e_suspend+0xdb/0x140 [mlx5_core] mlx5e_remove+0x89/0x190 [mlx5_core] auxiliary_bus_remove+0x52/0x70 device_release_driver_internal+0x40f/0x650 driver_detach+0xc1/0x180 bus_remove_driver+0x125/0x2f0 auxiliary_driver_unregister+0x16/0x50 mlx5e_cleanup+0x26/0x30 [mlx5_core] cleanup+0xc/0x4e [mlx5_core] __x64_sys_delete_module+0x2b5/0x450 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Freed by task 49502: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_set_free_info+0x20/0x30 ____kasan_slab_free+0x11d/0x1b0 kfree+0x1ba/0x520 mlx5e_tls_priv_tx_list_cleanup+0x2e7/0x3e0 [mlx5_core] mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core] mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core] mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core] mlx5e_suspend+0xdb/0x140 [mlx5_core] mlx5e_remove+0x89/0x190 [mlx5_core] auxiliary_bus_remove+0x52/0x70 device_release_driver_internal+0x40f/0x650 driver_detach+0xc1/0x180 bus_remove_driver+0x125/0x2f0 auxiliary_driver_unregister+0x16/0x50 mlx5e_cleanup+0x26/0x30 [mlx5_core] cleanup+0xc/0x4e [mlx5_core] __x64_sys_delete_module+0x2b5/0x450 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: e355477e ("net/mlx5: Make mlx5_cmd_exec_cb() a safe API") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-8-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Saeed Mahameed authored
mlx5 SQs must select the timestamp format explicitly according to the active clock mode, select the current active timestamp mode so ASO SQ create will succeed. This fixes the following error prints when trying to create ipsec ASO SQ while the timestamp format is real time mode. mlx5_cmd_out_err:778:(pid 34874): CREATE_SQ(0x904) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xd61c0b), err(-22) mlx5_aso_create_sq:285:(pid 34874): Failed to open aso wq sq, err=-22 mlx5e_ipsec_init:436:(pid 34874): IPSec initialization failed, -22 Fixes: cdd04f4d ("net/mlx5: Add support to create SQ and CQ for ASO") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reported-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-7-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paul Blakey authored
Currently encap slow path rules just forward to software without setting the chain id miss register, so driver doesn't restore the chain, and packets hitting this rule will restart from tc chain 0 instead of continuing to the chain the encap rule was on. Fix this by setting the chain id miss register to the chain id mapping. Fixes: 8f1e0b97 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-6-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Aya Levin authored
When tx_port_ts is set, the driver diverts all UPD traffic over PTP port to a dedicated PTP-SQ. The SKBs are cached until the wire-CQE arrives. When the packet size is greater then MTU, the firmware might drop it and the packet won't be transmitted to the wire, hence the wire-CQE won't reach the driver. In this case the SKBs are accumulated in the SKB fifo. Add room check to consider the PTP-SQ SKB fifo, when the SKB fifo is full, driver stops the queue resulting in a TX timeout. Devlink TX-reporter can recover from it. Fixes: 1880bc4e ("net/mlx5e: Add TX port timestamp support") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-5-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rongwei Liu authored
When 2nd flow rules arrives, it will merge together with the 1st one if matcher criteria is the same. If merge fails, driver will rollback the merge contents, and reject the 2nd rule. At rollback stage, matcher can't be disconnected unconditionally, otherise the 1st rule can't be hit anymore. Add logic to check if the matcher should be disconnected or not. Fixes: cc2295cd ("net/mlx5: DR, Improve steering for empty or RX/TX-only matchers") Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-4-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Moshe Shemesh authored
After firmware reset driver should verify firmware already enabled CRS and became responsive to pci config cycles before restoring pci state. Fix that by waiting till device_id is readable through PCI again. Fixes: eabe8e5e ("net/mlx5: Handle sync reset now event") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-3-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Hyong Youb Kim authored
An offloaded SA stops receiving after about 2^32 + replay_window packets. For example, when SA reaches <seq-hi 0x1, seq 0x2c>, all subsequent packets get dropped with SA-icv-failure (integrity_failed). To reproduce the bug: - ConnectX-6 Dx with crypto enabled (FW 22.30.1004) - ipsec.conf: nic-offload = yes replay-window = 32 esn = yes salifetime=24h - Run netperf for a long time to send more than 2^32 packets netperf -H <device-under-test> -t TCP_STREAM -l 20000 When 2^32 + replay_window packets are received, the replay window moves from the 2nd half of subspace (overlap=1) to the 1st half (overlap=0). The driver then updates the 'esn' value in NIC (i.e. seq_hi) as follows. seq_hi = xfrm_replay_seqhi(seq_bottom) new esn in NIC = seq_hi + 1 The +1 increment is wrong, as seq_hi already contains the correct seq_hi. For example, when seq_hi=1, the driver actually tells NIC to use seq_hi=2 (esn). This incorrect esn value causes all subsequent packets to fail integrity checks (SA-icv-failure). So, do not increment. Fixes: cb010083 ("net/mlx5: IPSec, Add support for ESN") Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-2-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Zhengchao Shao says: ==================== fix some issues in netdevsim driver When strace tool is used to perform memory injection, memory leaks and files not removed issues are found. Fix them. ==================== Link: https://lore.kernel.org/r/20221026014642.116261-1-shaozhengchao@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Zhengchao Shao authored
Remove dir in nsim_dev_debugfs_init() when creating ports dir failed. Otherwise, the netdevsim device will not be created next time. Kernel reports an error: debugfs: Directory 'netdevsim1' with parent 'netdevsim' already present! Fixes: ab1d0cc0 ("netdevsim: change debugfs tree topology") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Zhengchao Shao authored
If some items in nsim_dev_resources_register() fail, memory leak will occur. The following is the memory leak information. unreferenced object 0xffff888074c02600 (size 128): comm "echo", pid 8159, jiffies 4294945184 (age 493.530s) hex dump (first 32 bytes): 40 47 ea 89 ff ff ff ff 01 00 00 00 00 00 00 00 @G.............. ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ backtrace: [<0000000011a31c98>] kmalloc_trace+0x22/0x60 [<0000000027384c69>] devl_resource_register+0x144/0x4e0 [<00000000a16db248>] nsim_drv_probe+0x37a/0x1260 [<000000007d1f448c>] really_probe+0x20b/0xb10 [<00000000c416848a>] __driver_probe_device+0x1b3/0x4a0 [<00000000077e0351>] driver_probe_device+0x49/0x140 [<0000000054f2465a>] __device_attach_driver+0x18c/0x2a0 [<000000008538f359>] bus_for_each_drv+0x151/0x1d0 [<0000000038e09747>] __device_attach+0x1c9/0x4e0 [<00000000dd86e533>] bus_probe_device+0x1d5/0x280 [<00000000839bea35>] device_add+0xae0/0x1cb0 [<000000009c2abf46>] new_device_store+0x3b6/0x5f0 [<00000000fb823d7f>] bus_attr_store+0x72/0xa0 [<000000007acc4295>] sysfs_kf_write+0x106/0x160 [<000000005f50cb4d>] kernfs_fop_write_iter+0x3a8/0x5a0 [<0000000075eb41bf>] vfs_write+0x8f0/0xc80 Fixes: 37923ed6 ("netdevsim: Add simple FIB resource controller via devlink") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Zhengchao Shao authored
If device_register() failed in nsim_bus_dev_new(), the value of reference in nsim_bus_dev->dev is 1. obj->name in nsim_bus_dev->dev will not be released. unreferenced object 0xffff88810352c480 (size 16): comm "echo", pid 5691, jiffies 4294945921 (age 133.270s) hex dump (first 16 bytes): 6e 65 74 64 65 76 73 69 6d 31 00 00 00 00 00 00 netdevsim1...... backtrace: [<000000005e2e5e26>] __kmalloc_node_track_caller+0x3a/0xb0 [<0000000094ca4fc8>] kvasprintf+0xc3/0x160 [<00000000aad09bcc>] kvasprintf_const+0x55/0x180 [<000000009bac868d>] kobject_set_name_vargs+0x56/0x150 [<000000007c1a5d70>] dev_set_name+0xbb/0xf0 [<00000000ad0d126b>] device_add+0x1f8/0x1cb0 [<00000000c222ae24>] new_device_store+0x3b6/0x5e0 [<0000000043593421>] bus_attr_store+0x72/0xa0 [<00000000cbb1833a>] sysfs_kf_write+0x106/0x160 [<00000000d0dedb8a>] kernfs_fop_write_iter+0x3a8/0x5a0 [<00000000770b66e2>] vfs_write+0x8f0/0xc80 [<0000000078bb39be>] ksys_write+0x106/0x210 [<00000000005e55a4>] do_syscall_64+0x35/0x80 [<00000000eaa40bbc>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: 40e4fe4c ("netdevsim: move device registration and related code to bus.c") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20221026015405.128795-1-shaozhengchao@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Merge tag 'linux-can-fixes-for-6.1-20221027' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2022-10-27 Anssi Hannula fixes the use of the completions in the kvaser_usb driver. Biju Das contributes 2 patches for the rcar_canfd driver. A IRQ storm that can be triggered by high CAN bus load and channel specific IRQ handlers are fixed. Yang Yingliang fixes the j1939 transport protocol by moving a kfree_skb() out of a spin_lock_irqsave protected section. * tag 'linux-can-fixes-for-6.1-20221027' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: j1939: transport: j1939_session_skb_drop_old(): spin_unlock_irqrestore() before kfree_skb() can: rcar_canfd: fix channel specific IRQ handling for RZ/G2L can: rcar_canfd: rcar_canfd_handle_global_receive(): fix IRQ storm on global FIFO receive can: kvaser_usb: Fix possible completions during init_completion ==================== Link: https://lore.kernel.org/r/20221027114356.1939821-1-mkl@pengutronix.deSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rafał Miłecki authored
Queueing packets doesn't guarantee their transmission. Update TX stats after hardware confirms consuming submitted data. This also fixes a possible race and NULL dereference. bcm4908_enet_start_xmit() could try to access skb after freeing it in the bcm4908_enet_poll_tx(). Reported-by: Florian Fainelli <f.fainelli@gmail.com> Fixes: 4feffead ("net: broadcom: bcm4908enet: add BCM4908 controller driver") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221027112430.8696-1-zajec5@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Nicolas Dichtel says: ==================== ip: rework the fix for dflt addr selection for connected nexthop" This series reworks the fix that is reverted in the second commit. As Julian explained, nhc_scope is related to nhc_gw, it's not the scope of the route. ==================== Link: https://lore.kernel.org/r/20221020100952.8748-1-nicolas.dichtel@6wind.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nicolas Dichtel authored
As explained by Julian, fib_nh_scope is related to fib_nh_gw4, but fib_info_update_nhc_saddr() needs the scope of the route, which is the scope "before" fib_nh_scope, ie fib_nh_scope - 1. This patch fixes the problem described in commit 747c1430 ("ip: fix dflt addr selection for connected nexthop"). Fixes: 597cfe4f ("nexthop: Add support for IPv4 nexthops") Link: https://lore.kernel.org/netdev/6c8a44ba-c2d5-cdf-c5c7-5baf97cba38@ssi.bg/Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nicolas Dichtel authored
This reverts commit 747c1430. As explained by Julian, nhc_scope is related to nhc_gw, not to the route. Revert the original patch. The initial problem is fixed differently in the next commit. Link: https://lore.kernel.org/netdev/6c8a44ba-c2d5-cdf-c5c7-5baf97cba38@ssi.bg/Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nicolas Dichtel authored
This reverts commit eb55dc09. The patch that introduces this bug is reverted right after this one. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
During review of previous change another thing came up - we should limit the use of validation workarounds to old commands. Don't list the workarounds one by one, as we're rejecting all existing ones. We can deal with the masking in the unlikely event that new flag is added. Link: https://lore.kernel.org/all/6ba9f727e555fd376623a298d5d305ad408c3d47.camel@sipsolutions.net/ Link: https://lore.kernel.org/r/20221026001524.1892202-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Florian Fainelli authored
Avoid the PHY library call unnecessarily into the suspend/resume functions by setting phydev->mac_managed_pm to true. The SYSTEMPORT driver essentially does exactly what mdio_bus_phy_resume() does by calling phy_resume(). Fixes: fba863b8 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221025234201.2549360-1-f.fainelli@gmail.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Yang Yingliang authored
It is not allowed to call kfree_skb() from hardware interrupt context or with interrupts being disabled. The skb is unlinked from the queue, so it can be freed after spin_unlock_irqrestore(). Fixes: 9d71dd0c ("can: add support of SAE J1939 protocol") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/all/20221027091237.2290111-1-yangyingliang@huawei.com Cc: stable@vger.kernel.org [mkl: adjust subject] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Yang Yingliang authored
If of_device_register() returns error, the of node and the name allocated in dev_set_name() is leaked, call put_device() to give up the reference that was set in device_initialize(), so that of node is put in logical_port_release() and the name is freed in kobject_cleanup(). Fixes: 1acf2318 ("ehea: dynamic add / remove port") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221025130011.1071357-1-yangyingliang@huawei.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Aaron Conole says: ==================== openvswitch: syzbot splat fix and introduce selftest Syzbot recently caught a splat when dropping features from openvswitch datapaths that are in-use. The WARN() call is definitely too large a hammer for the situation, so change to pr_warn. Second patch in the series introduces a new selftest suite which can help show that an issue is fixed. This change might be more suited to net-next tree, so it has been separated out as an additional patch and can be either applied to either tree based on preference. ==================== Link: https://lore.kernel.org/r/20221025105018.466157-1-aconole@redhat.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Aaron Conole authored
Previous commit resolves a WARN splat that can be difficult to reproduce, but with the ovs-dpctl.py utility, it can be trivial. Introduce a test case which creates a DP, and then downgrades the feature set. This will include a utility 'ovs-dpctl.py' that can be extended to do additional tests and diagnostics. Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Aaron Conole authored
As noted by Paolo Abeni, pr_warn doesn't generate any splat and can still preserve the warning to the user that feature downgrade occurred. We likely cannot introduce other kinds of checks / enforcement here because syzbot can generate different genl versions to the datapath. Reported-by: syzbot+31cde0bef4bbf8ba2d86@syzkaller.appspotmail.com Fixes: 44da5ae5 ("openvswitch: Drop user features if old user space attempted to create datapath") Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Aaron Conole <aconole@redhat.com> Acked-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Marc Kleine-Budde authored
Biju Das <biju.das.jz@bp.renesas.com> says: This patch series fixes the below issues in R-Car CAN-FD driver. 1) Race condition in CAN driver under heavy CAN load condition with both channels enabled results in IRQ storm on global FIFO receive IRQ line. 2) Add channel specific TX interrupts handling for RZ/G2L SoC as it has separate IRQ lines for each TX. changes since v1: https://lore.kernel.org/all/20221022081503.1051257-1-biju.das.jz@bp.renesas.com * Added check for IRQ active and enabled before handling the IRQ on a particular channel. Link: https://lore.kernel.org/all/20221025155657.1426948-1-biju.das.jz@bp.renesas.com [mkl: adjust message, add link, take only patches 1 + 2, upstream 3 via can-next] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Biju Das authored
RZ/G2L has separate channel specific IRQs for transmit and error interrupts. But the IRQ handler processes both channels, even if there no interrupt occurred on one of the channels. This patch fixes the issue by passing a channel specific context parameter instead of global one for the IRQ register and the IRQ handler, it just handles the channel which is triggered the interrupt. Fixes: 76e9353a ("can: rcar_canfd: Add support for RZ/G2L family") Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/all/20221025155657.1426948-3-biju.das.jz@bp.renesas.com Cc: stable@vger.kernel.org [mkl: adjust commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Biju Das authored
We are seeing an IRQ storm on the global receive IRQ line under heavy CAN bus load conditions with both CAN channels enabled. Conditions: The global receive IRQ line is shared between can0 and can1, either of the channels can trigger interrupt while the other channel's IRQ line is disabled (RFIE). When global a receive IRQ interrupt occurs, we mask the interrupt in the IRQ handler. Clearing and unmasking of the interrupt is happening in rx_poll(). There is a race condition where rx_poll() unmasks the interrupt, but the next IRQ handler does not mask the IRQ due to NAPIF_STATE_MISSED flag (e.g.: can0 RX FIFO interrupt is disabled and can1 is triggering RX interrupt, the delay in rx_poll() processing results in setting NAPIF_STATE_MISSED flag) leading to an IRQ storm. This patch fixes the issue by checking IRQ active and enabled before handling the IRQ on a particular channel. Fixes: dd3bd23e ("can: rcar_canfd: Add Renesas R-Car CAN FD driver") Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/all/20221025155657.1426948-2-biju.das.jz@bp.renesas.com Cc: stable@vger.kernel.org [mkl: adjust commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Anssi Hannula authored
kvaser_usb uses completions to signal when a response event is received for outgoing commands. However, it uses init_completion() to reinitialize the start_comp and stop_comp completions before sending the start/stop commands. In case the device sends the corresponding response just before the actual command is sent, complete() may be called concurrently with init_completion() which is not safe. This might be triggerable even with a properly functioning device by stopping the interface (CMD_STOP_CHIP) just after it goes bus-off (which also causes the driver to send CMD_STOP_CHIP when restart-ms is off), but that was not tested. Fix the issue by using reinit_completion() instead. Fixes: 080f40a6 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Tested-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-2-extja@kvaser.com Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Kunihiko Hayashi authored
The phylib callback is called after MAC driver's own resume callback is called. For AVE driver, after resuming immediately, PHY state machine is in PHY_NOLINK because there is a time lag from link-down to link-up due to autoneg. The result is WARN_ON() dump in mdio_bus_phy_resume(). Since ave_resume() itself calls phy_resume(), AVE driver should manage PHY PM. To indicate that MAC driver manages PHY PM, set phydev->mac_managed_pm to true to avoid the unnecessary phylib call and add missing phy_init_hw() to ave_resume(). Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Fixes: fba863b8 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM") Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Link: https://lore.kernel.org/r/20221024072227.24769-1-hayashi.kunihiko@socionext.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Juergen Borleis authored
Using 'ethtool -d […]' on an i.MX6UL leads to a kernel crash: Unhandled fault: external abort on non-linefetch (0x1008) at […] due to this SoC has less registers in its FEC implementation compared to other i.MX6 variants. Thus, a run-time decision is required to avoid access to non-existing registers. Fixes: a51d3ab5 ("net: fec: use a more proper compatible string for i.MX6UL type device") Signed-off-by: Juergen Borleis <jbe@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20221024080552.21004-1-jbe@pengutronix.deSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-