- 12 Oct, 2021 15 commits
-
-
Maxim Mikityanskiy authored
Commit 846d6da1 ("net/mlx5e: Fix division by 0 in mlx5e_select_queue") makes mlx5e_build_nic_params assign a non-zero initial value to priv->num_tc_x_num_ch, so that mlx5e_select_queue doesn't fail with division by 0 if called before the first activation of channels. However, the initialization flow of representors doesn't call mlx5e_build_nic_params, so this bug can still happen with representors. This commit fixes the bug by adding the missing assignment to mlx5e_build_rep_params. Fixes: 846d6da1 ("net/mlx5e: Fix division by 0 in mlx5e_select_queue") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Aya Levin authored
Due to current HW arch limitations, RX-FCS (scattering FCS frame field to software) and RX-port-timestamp (improved timestamp accuracy on the receive side) can't work together. RX-port-timestamp is not controlled by the user and it is enabled by default when supported by the HW/FW. This patch sets RX-port-timestamp opposite to RX-FCS configuration. Fixes: 102722fc ("net/mlx5e: Add support for RXFCS feature flag") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Saeed Mahameed authored
Before this patch, mlx5 representors advertised the NETIF_F_VLAN_CHALLENGED bit, this could lead to missing features when using reps with vxlan/bridge and maybe other virtual interfaces, when such interfaces inherit this bit and block vlan usage in their topology. Example: $ip link add dev bridge type bridge # add representor interface to the bridge $ip link set dev pf0hpf master $ip link add link bridge name vlan10 type vlan id 10 protocol 802.1q Error: 8021q: VLANs not supported on device. Reps are perfectly capable of handling vlan traffic, although they don't implement vlan_{add,kill}_vid ndos, hence, remove NETIF_F_VLAN_CHALLENGED advertisement. Fixes: cb67b832 ("net/mlx5e: Introduce SRIOV VF representors") Reported-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com>
-
Valentine Fatiev authored
Prior to this patch in case mlx5_core_destroy_cq() failed it returns without completing all destroy operations and that leads to memory leak. Instead, complete the destroy flow before return error. Also move mlx5_debug_cq_remove() to the beginning of mlx5_core_destroy_cq() to be symmetrical with mlx5_core_create_cq(). kmemleak complains on: unreferenced object 0xc000000038625100 (size 64): comm "ethtool", pid 28301, jiffies 4298062946 (age 785.380s) hex dump (first 32 bytes): 60 01 48 94 00 00 00 c0 b8 05 34 c3 00 00 00 c0 `.H.......4..... 02 00 00 00 00 00 00 00 00 db 7d c1 00 00 00 c0 ..........}..... backtrace: [<000000009e8643cb>] add_res_tree+0xd0/0x270 [mlx5_core] [<00000000e7cb8e6c>] mlx5_debug_cq_add+0x5c/0xc0 [mlx5_core] [<000000002a12918f>] mlx5_core_create_cq+0x1d0/0x2d0 [mlx5_core] [<00000000cef0a696>] mlx5e_create_cq+0x210/0x3f0 [mlx5_core] [<000000009c642c26>] mlx5e_open_cq+0xb4/0x130 [mlx5_core] [<0000000058dfa578>] mlx5e_ptp_open+0x7f4/0xe10 [mlx5_core] [<0000000081839561>] mlx5e_open_channels+0x9cc/0x13e0 [mlx5_core] [<0000000009cf05d4>] mlx5e_switch_priv_channels+0xa4/0x230 [mlx5_core] [<0000000042bbedd8>] mlx5e_safe_switch_params+0x14c/0x300 [mlx5_core] [<0000000004bc9db8>] set_pflag_tx_port_ts+0x9c/0x160 [mlx5_core] [<00000000a0553443>] mlx5e_set_priv_flags+0xd0/0x1b0 [mlx5_core] [<00000000a8f3d84b>] ethnl_set_privflags+0x234/0x2d0 [<00000000fd27f27c>] genl_family_rcv_msg_doit+0x108/0x1d0 [<00000000f495e2bb>] genl_family_rcv_msg+0xe4/0x1f0 [<00000000646c5c2c>] genl_rcv_msg+0x78/0x120 [<00000000d53e384e>] netlink_rcv_skb+0x74/0x1a0 Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Valentine Fatiev <valentinef@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Tariq Toukan authored
Do not allow configurations of MQPRIO channel mode that do not fully define and utilize the channels txqs. Fixes: ec60c458 ("net/mlx5e: Support MQPRIO channel mode") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
Currently, bridge cleanup is calling to cancel_delayed_work(). When this function is finished, there is a chance that the delayed work is still running. Also, the delayed work is queueing itself. As a result, we might execute the delayed work after the bridge cleanup have finished and hit a null-ptr oops[1]. Fix it by using cancel_delayed_work_sync(), which is waiting until the work is done and will cancel the queue work. [1] [ 8202.143043 ] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 8202.144438 ] #PF: supervisor write access in kernel mode [ 8202.145476 ] #PF: error_code(0x0002) - not-present page [ 8202.146520 ] PGD 0 P4D 0 [ 8202.147126 ] Oops: 0002 [#1] SMP NOPTI [ 8202.147899 ] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.14.0-rc6_for_upstream_min_debug_2021_08_25_16_06 #1 [ 8202.149741 ] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 8202.151908 ] RIP: 0010:_raw_spin_lock+0xc/0x20 [ 8202.156234 ] RSP: 0018:ffff88846f885ea0 EFLAGS: 00010046 [ 8202.157289 ] RAX: 0000000000000000 RBX: ffff88846f880000 RCX: 0000000000000000 [ 8202.158731 ] RDX: 0000000000000001 RSI: ffff8881004000c8 RDI: 0000000000000000 [ 8202.160177 ] RBP: ffff8881fe684978 R08: ffff888100140000 R09: ffffffff824455b8 [ 8202.161569 ] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [ 8202.163004 ] R13: 0000000000000012 R14: 0000000000000200 R15: ffff88812992d000 [ 8202.164018 ] FS: 0000000000000000(0000) GS:ffff88846f880000(0000) knlGS:0000000000000000 [ 8202.164960 ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8202.165634 ] CR2: 0000000000000000 CR3: 0000000108cac004 CR4: 0000000000370ea0 [ 8202.166450 ] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8202.167807 ] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8202.168852 ] Call Trace: [ 8202.169421 ] <IRQ> [ 8202.169792 ] __queue_work+0xf2/0x3d0 [ 8202.170481 ] ? queue_work_node+0x40/0x40 [ 8202.171270 ] call_timer_fn+0x2b/0x100 [ 8202.171932 ] __run_timers.part.0+0x152/0x220 [ 8202.172717 ] ? __hrtimer_run_queues+0x171/0x290 [ 8202.173526 ] ? kvm_clock_get_cycles+0xd/0x10 [ 8202.174232 ] ? ktime_get+0x35/0x90 [ 8202.174943 ] run_timer_softirq+0x26/0x50 [ 8202.175745 ] __do_softirq+0xc7/0x271 [ 8202.176373 ] irq_exit_rcu+0x93/0xb0 [ 8202.176983 ] sysvec_apic_timer_interrupt+0x72/0x90 [ 8202.177755 ] </IRQ> [ 8202.178245 ] asm_sysvec_apic_timer_interrupt+0x12/0x20 Fixes: c636a0f0 ("net/mlx5: Bridge, dynamic entry ageing") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Jacob Keller authored
Commit 4dd0d5c3 ("ice: add lock around Tx timestamp tracker flush") added a lock around the Tx timestamp tracker flow which is used to cleanup any left over SKBs and prepare for device removal. This lock is problematic because it is being held around a call to ice_clear_phy_tstamp. The clear function takes a mutex to send a PHY write command to firmware. This could lead to a deadlock if the mutex actually sleeps, and causes the following warning on a kernel with preemption debugging enabled: [ 715.419426] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:573 [ 715.427900] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3100, name: rmmod [ 715.435652] INFO: lockdep is turned off. [ 715.439591] Preemption disabled at: [ 715.439594] [<0000000000000000>] 0x0 [ 715.446678] CPU: 52 PID: 3100 Comm: rmmod Tainted: G W OE 5.15.0-rc4+ #42 bdd7ec3018e725f159ca0d372ce8c2c0e784891c [ 715.458058] Hardware name: Intel Corporation S2600STQ/S2600STQ, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020 [ 715.468483] Call Trace: [ 715.470940] dump_stack_lvl+0x6a/0x9a [ 715.474613] ___might_sleep.cold+0x224/0x26a [ 715.478895] __mutex_lock+0xb3/0x1440 [ 715.482569] ? stack_depot_save+0x378/0x500 [ 715.486763] ? ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.494979] ? kfree+0xc1/0x520 [ 715.498128] ? mutex_lock_io_nested+0x12a0/0x12a0 [ 715.502837] ? kasan_set_free_info+0x20/0x30 [ 715.507110] ? __kasan_slab_free+0x10b/0x140 [ 715.511385] ? slab_free_freelist_hook+0xc7/0x220 [ 715.516092] ? kfree+0xc1/0x520 [ 715.519235] ? ice_deinit_lag+0x16c/0x220 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.527359] ? ice_remove+0x1cf/0x6a0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.535133] ? pci_device_remove+0xab/0x1d0 [ 715.539318] ? __device_release_driver+0x35b/0x690 [ 715.544110] ? driver_detach+0x214/0x2f0 [ 715.548035] ? bus_remove_driver+0x11d/0x2f0 [ 715.552309] ? pci_unregister_driver+0x26/0x250 [ 715.556840] ? ice_module_exit+0xc/0x2f [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.564799] ? __do_sys_delete_module.constprop.0+0x2d8/0x4e0 [ 715.570554] ? do_syscall_64+0x3b/0x90 [ 715.574303] ? entry_SYSCALL_64_after_hwframe+0x44/0xae [ 715.579529] ? start_flush_work+0x542/0x8f0 [ 715.583719] ? ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.591923] ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.599960] ? wait_for_completion_io+0x250/0x250 [ 715.604662] ? lock_acquire+0x196/0x200 [ 715.608504] ? do_raw_spin_trylock+0xa5/0x160 [ 715.612864] ice_sbq_rw_reg+0x1e6/0x2f0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.620813] ? ice_reset+0x130/0x130 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.628497] ? __debug_check_no_obj_freed+0x1e8/0x3c0 [ 715.633550] ? trace_hardirqs_on+0x1c/0x130 [ 715.637748] ice_write_phy_reg_e810+0x70/0xf0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.646220] ? do_raw_spin_trylock+0xa5/0x160 [ 715.650581] ? ice_ptp_release+0x910/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.658797] ? ice_ptp_release+0x255/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.667013] ice_clear_phy_tstamp+0x2c/0x110 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.675403] ice_ptp_release+0x408/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.683440] ice_remove+0x560/0x6a0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.691037] ? _raw_spin_unlock_irqrestore+0x46/0x73 [ 715.696005] pci_device_remove+0xab/0x1d0 [ 715.700018] __device_release_driver+0x35b/0x690 [ 715.704637] driver_detach+0x214/0x2f0 [ 715.708389] bus_remove_driver+0x11d/0x2f0 [ 715.712489] pci_unregister_driver+0x26/0x250 [ 715.716857] ice_module_exit+0xc/0x2f [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d] [ 715.724637] __do_sys_delete_module.constprop.0+0x2d8/0x4e0 [ 715.730210] ? free_module+0x6d0/0x6d0 [ 715.733963] ? task_work_run+0xe1/0x170 [ 715.737803] ? exit_to_user_mode_loop+0x17f/0x1d0 [ 715.742509] ? rcu_read_lock_sched_held+0x12/0x80 [ 715.747215] ? trace_hardirqs_on+0x1c/0x130 [ 715.751401] do_syscall_64+0x3b/0x90 [ 715.754981] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 715.760033] RIP: 0033:0x7f4dfe59000b [ 715.763612] Code: 73 01 c3 48 8b 0d 6d 1e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 1e 0c 00 f7 d8 64 89 01 48 [ 715.782357] RSP: 002b:00007ffe8c891708 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ 715.789923] RAX: ffffffffffffffda RBX: 00005558a20468b0 RCX: 00007f4dfe59000b [ 715.797054] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005558a2046918 [ 715.804189] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 715.811319] R10: 00007f4dfe603ac0 R11: 0000000000000206 R12: 00007ffe8c891940 [ 715.818455] R13: 00007ffe8c8920a3 R14: 00005558a20462a0 R15: 00005558a20468b0 Notice that this is the only case where we use the lock in this way. In the cleanup kthread and work kthread the lock is only taken around the bit accesses. This was done intentionally to avoid this kind of issue. The way the lock is used, we only protect ordering of bit sets vs bit clears. The Tx writers in the hot path don't need to be protected against the entire kthread loop. The Tx queues threads only need to ensure that they do not re-use an index that is currently in use. The cleanup loop does not need to block all new set bits, since it will re-queue itself if new timestamps are present. Fix the tracker flow so that it uses the same flow as the standard cleanup thread. In addition, ensure the in_use bitmap actually gets cleared properly. This fixes the warning and also avoids the potential deadlock that might have occurred otherwise. Fixes: 4dd0d5c3 ("ice: add lock around Tx timestamp tracker flush") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Justin Iurman says: ==================== Correct the IOAM behavior for undefined trace type bits (@Jakub @David: there will be a conflict for #2 when merging net->net-next, due to commit [1]. The conflict is only 5-10 lines for #2 (#1 should be fine) inside the file tools/testing/selftests/net/ioam6.sh, so quite short though possibly ugly. Sorry for that, I didn't expect to post this one... Had I known, I'd have made the opposite.) Modify both the input and output behaviors regarding the trace type when one of the undefined bits is set. The goal is to keep the interoperability when new fields (aka new bits inside the range 12-21) will be defined. The draft [2] says the following: --------------------------------------------------------------- "Bit 12-21 Undefined. These values are available for future assignment in the IOAM Trace-Type Registry (Section 8.2). Every future node data field corresponding to one of these bits MUST be 4-octets long. An IOAM encapsulating node MUST set the value of each undefined bit to 0. If an IOAM transit node receives a packet with one or more of these bits set to 1, it MUST either: 1. Add corresponding node data filled with the reserved value 0xFFFFFFFF, after the node data fields for the IOAM-Trace-Type bits defined above, such that the total node data added by this node in units of 4-octets is equal to NodeLen, or 2. Not add any node data fields to the packet, even for the IOAM-Trace-Type bits defined above." --------------------------------------------------------------- The output behavior has been modified to respect the fact that "an IOAM encap node MUST set the value of each undefined bit to 0" (i.e., undefined bits can't be set anymore). As for the input behavior, current implementation is based on the second choice (i.e., "not add any data fields to the packet [...]"). With this solution, any interoperability is lost (i.e., if a new bit is defined, then an "old" kernel implementation wouldn't fill IOAM data when such new bit is set inside the trace type). The input behavior is therefore relaxed and these undefined bits are now allowed to be set. It is only possible thanks to the sentence "every future node data field corresponding to one of these bits MUST be 4-octets long". Indeed, the default empty value (the one for 4-octet fields) is inserted whenever an undefined bit is set. [1] cfbe9b00 [2] https://datatracker.ietf.org/doc/html/draft-ietf-ippm-ioam-data#section-5.4.1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Justin Iurman authored
The output behavior for undefined bits is now directly tested inside the bash script. Trying to set an undefined bit should be refused. The input behavior for undefined bits has been removed due to the fact that we would need another sender allowed to set undefined bits. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Justin Iurman authored
The check for undefined bits in the trace type is moved from the input side to the output side, while the input side is relaxed and now inserts default empty values when an undefined bit is set. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arun Ramadoss authored
When the ksz module is installed and removed using rmmod, kernel crashes with null pointer dereferrence error. During rmmod, ksz_switch_remove function tries to cancel the mib_read_workqueue using cancel_delayed_work_sync routine and unregister switch from dsa. During dsa_unregister_switch it calls ksz_mac_link_down, which in turn reschedules the workqueue since mib_interval is non-zero. Due to which queue executed after mib_interval and it tries to access dp->slave. But the slave is unregistered in the ksz_switch_remove function. Hence kernel crashes. To avoid this crash, before canceling the workqueue, resetted the mib_interval to 0. v1 -> v2: -Removed the if condition in ksz_mib_read_work Fixes: 469b390e ("net: dsa: microchip: use delayed_work instead of timer + work") Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vegard Nossum authored
Fix the following build/link errors by adding a dependency on CRYPTO, CRYPTO_HASH, CRYPTO_SHA256 and CRC32: ld: drivers/net/usb/r8152.o: in function `rtl8152_fw_verify_checksum': r8152.c:(.text+0x2b2a): undefined reference to `crypto_alloc_shash' ld: r8152.c:(.text+0x2bed): undefined reference to `crypto_shash_digest' ld: r8152.c:(.text+0x2c50): undefined reference to `crypto_destroy_tfm' ld: drivers/net/usb/r8152.o: in function `_rtl8152_set_rx_mode': r8152.c:(.text+0xdcb0): undefined reference to `crc32_le' Fixes: 9370f2d0 ("r8152: support request_firmware for RTL8153") Fixes: ac718b69 ("net/usb: new driver for RTL8152") Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Maarten Zanders authored
mv88e6xxx_port_ppu_updates() interpretes data in the PORT_STS register incorrectly for internal ports (ie no PPU). In these cases, the PHY_DETECT bit indicates link status. This results in forcing the MAC state whenever the PHY link goes down which is not intended. As a side effect, LED's configured to show link status stay lit even though the physical link is down. Add a check in mac_link_down and mac_link_up to see if it concerns an external port and only then, look at PPU status. Fixes: 5d5b231d (net: dsa: mv88e6xxx: use PHY_DETECT in mac_link_up/mac_link_down) Reported-by: Maarten Zanders <m.zanders@televic.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Maarten Zanders <maarten.zanders@mind.be> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wan Jiabing authored
Fix the following coccicheck warning: drivers/net/ethernet/mscc/ocelot.c:474:duplicated argument to & or | drivers/net/ethernet/mscc/ocelot.c:476:duplicated argument to & or | drivers/net/ethernet/mscc/ocelot_net.c:1627:duplicated argument to & or | These DEV_CLOCK_CFG_MAC_TX_RST are duplicate here. Here should be DEV_CLOCK_CFG_MAC_RX_RST. Fixes: e6e12df6 ("net: mscc: ocelot: convert to phylink") Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Boyd authored
Then name of this protocol changed in commit 94531cfc ("af_unix: Add unix_stream_proto for sockmap") because that commit added stream support to the af_unix protocol. Renaming the existing protocol makes a ChromeOS protocol test[1] fail now that the name has changed in /proc/net/protocols from "UNIX" to "UNIX-DGRAM". Let's put the name back to how it was while keeping the stream protocol as "UNIX-STREAM" so that the procfs interface doesn't change. This fixes the test and maintains backwards compatibility in proc. Cc: Jiang Wang <jiang.wang@bytedance.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Cong Wang <cong.wang@bytedance.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Dmitry Osipenko <digetx@gmail.com> Link: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/supported_protocols.go;l=50;drc=e8b1c3f94cb40a054f4aa1ef1aff61e75dc38f18 [1] Fixes: 94531cfc ("af_unix: Add unix_stream_proto for sockmap") Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 09 Oct, 2021 7 commits
-
-
Xuan Zhuo authored
commit 12628565 ("Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net") accidentally reverted the effect of commit 1a802423 ("virtio-net: fix for skb_over_panic inside big mode") on drivers/net/virtio_net.c As a result, users of crosvm (which is using large packet mode) are experiencing crashes with 5.14-rc1 and above that do not occur with 5.13. Crash trace: [ 61.346677] skbuff: skb_over_panic: text:ffffffff881ae2c7 len:3762 put:3762 head:ffff8a5ec8c22000 data:ffff8a5ec8c22010 tail:0xec2 end:0xec0 dev:<NULL> [ 61.369192] kernel BUG at net/core/skbuff.c:111! [ 61.372840] invalid opcode: 0000 [#1] SMP PTI [ 61.374892] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.14.0-rc1 linux-v5.14-rc1-for-mesa-ci.tar.bz2 #1 [ 61.376450] Hardware name: ChromiumOS crosvm, BIOS 0 .. [ 61.393635] Call Trace: [ 61.394127] <IRQ> [ 61.394488] skb_put.cold+0x10/0x10 [ 61.395095] page_to_skb+0xf7/0x410 [ 61.395689] receive_buf+0x81/0x1660 [ 61.396228] ? netif_receive_skb_list_internal+0x1ad/0x2b0 [ 61.397180] ? napi_gro_flush+0x97/0xe0 [ 61.397896] ? detach_buf_split+0x67/0x120 [ 61.398573] virtnet_poll+0x2cf/0x420 [ 61.399197] __napi_poll+0x25/0x150 [ 61.399764] net_rx_action+0x22f/0x280 [ 61.400394] __do_softirq+0xba/0x257 [ 61.401012] irq_exit_rcu+0x8e/0xb0 [ 61.401618] common_interrupt+0x7b/0xa0 [ 61.402270] </IRQ> See https://lore.kernel.org/r/5edaa2b7c2fe4abd0347b8454b2ac032b6694e2c.camel%40collabora.com for the report. Apply the original 1a802423 ("virtio-net: fix for skb_over_panic inside big mode") again, the original logic still holds: In virtio-net's large packet mode, there is a hole in the space behind buf. hdr_padded_len - hdr_len We must take this into account when calculating tailroom. Cc: Greg KH <gregkh@linuxfoundation.org> Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Fixes: 12628565 ("Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reported-by: Corentin Noël <corentin.noel@collabora.com> Tested-by: Corentin Noël <corentin.noel@collabora.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
In case a PHY device was probed thus in the PHY_READY state, but not configured and with no network device attached yet, we should not be trying to shut it down because it has been brought back into reset by phy_device_reset() towards the end of phy_probe() and anyway we have not configured the PHY yet. Fixes: e2f016cf ("net: phy: add a shutdown procedure") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
chongjiapeng authored
The error code is missing in this code scenario, add the error code '-EINVAL' to the return value 'rc'. Eliminate the follow smatch warning: drivers/net/ethernet/qlogic/qed/qed_main.c:1298 qed_slowpath_start() warn: missing error code 'rc'. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Fixes: d51e4af5 ("qed: aRFS infrastructure support") Signed-off-by: chongjiapeng <jiapeng.chong@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
It was a documented fact that ds->ops->change_tag_protocol() offered rtnetlink mutex protection to the switch driver, since there was an ASSERT_RTNL right before the call in dsa_switch_change_tag_proto() (initiated from sysfs). The blamed commit introduced another call path for ds->ops->change_tag_protocol() which does not hold the rtnl_mutex. This is: dsa_tree_setup -> dsa_tree_setup_switches -> dsa_switch_setup -> dsa_switch_setup_tag_protocol -> ds->ops->change_tag_protocol() -> dsa_port_setup -> dsa_slave_create -> register_netdevice(slave_dev) -> dsa_tree_setup_master -> dsa_master_setup -> dev->dsa_ptr = cpu_dp The reason why the rtnl_mutex is held in the sysfs call path is to ensure that, once the master and all the DSA interfaces are down (which is required so that no packets flow), they remain down during the tagging protocol change. The above calling order illustrates the fact that it should not be risky to change the initial tagging protocol to the one specified in the device tree at the given time: - packets cannot enter the dsa_switch_rcv() packet type handler since netdev_uses_dsa() for the master will not yet return true, since dev->dsa_ptr has not yet been populated - packets cannot enter the dsa_slave_xmit() function because no DSA interface has yet been registered So from the DSA core's perspective, holding the rtnl_mutex is indeed not necessary. Yet, drivers may need to do things which need rtnl_mutex protection. For example: felix_set_tag_protocol -> felix_setup_tag_8021q -> dsa_tag_8021q_register -> dsa_tag_8021q_setup -> dsa_tag_8021q_port_setup -> vlan_vid_add -> ASSERT_RTNL These drivers do not really have a choice to take the rtnl_mutex themselves, since in the sysfs case, the rtnl_mutex is already held. Fixes: deff7107 ("net: dsa: Allow default tag protocol to be overridden from DT") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zheyu Ma authored
The driver can call card->isac.release() function from an atomic context. Fix this by calling this function after releasing the lock. The following log reveals it: [ 44.168226 ] BUG: sleeping function called from invalid context at kernel/workqueue.c:3018 [ 44.168941 ] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 5475, name: modprobe [ 44.169574 ] INFO: lockdep is turned off. [ 44.169899 ] irq event stamp: 0 [ 44.170160 ] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 44.170627 ] hardirqs last disabled at (0): [<ffffffff814209ed>] copy_process+0x132d/0x3e00 [ 44.171240 ] softirqs last enabled at (0): [<ffffffff81420a1a>] copy_process+0x135a/0x3e00 [ 44.171852 ] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 44.172318 ] Preemption disabled at: [ 44.172320 ] [<ffffffffa009b0a9>] nj_release+0x69/0x500 [netjet] [ 44.174441 ] Call Trace: [ 44.174630 ] dump_stack_lvl+0xa8/0xd1 [ 44.174912 ] dump_stack+0x15/0x17 [ 44.175166 ] ___might_sleep+0x3a2/0x510 [ 44.175459 ] ? nj_release+0x69/0x500 [netjet] [ 44.175791 ] __might_sleep+0x82/0xe0 [ 44.176063 ] ? start_flush_work+0x20/0x7b0 [ 44.176375 ] start_flush_work+0x33/0x7b0 [ 44.176672 ] ? trace_irq_enable_rcuidle+0x85/0x170 [ 44.177034 ] ? kasan_quarantine_put+0xaa/0x1f0 [ 44.177372 ] ? kasan_quarantine_put+0xaa/0x1f0 [ 44.177711 ] __flush_work+0x11a/0x1a0 [ 44.177991 ] ? flush_work+0x20/0x20 [ 44.178257 ] ? lock_release+0x13c/0x8f0 [ 44.178550 ] ? __kasan_check_write+0x14/0x20 [ 44.178872 ] ? do_raw_spin_lock+0x148/0x360 [ 44.179187 ] ? read_lock_is_recursive+0x20/0x20 [ 44.179530 ] ? __kasan_check_read+0x11/0x20 [ 44.179846 ] ? do_raw_spin_unlock+0x55/0x900 [ 44.180168 ] ? ____kasan_slab_free+0x116/0x140 [ 44.180505 ] ? _raw_spin_unlock_irqrestore+0x41/0x60 [ 44.180878 ] ? skb_queue_purge+0x1a3/0x1c0 [ 44.181189 ] ? kfree+0x13e/0x290 [ 44.181438 ] flush_work+0x17/0x20 [ 44.181695 ] mISDN_freedchannel+0xe8/0x100 [ 44.182006 ] isac_release+0x210/0x260 [mISDNipac] [ 44.182366 ] nj_release+0xf6/0x500 [netjet] [ 44.182685 ] nj_remove+0x48/0x70 [netjet] [ 44.182989 ] pci_device_remove+0xa9/0x250 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
Bridging, and possibly other upper stack gizmos, adds the lower device's netdev->dev_addr to its own uc list, and then requests it be deleted when the upper bridge device is removed. This delete request also happens with the bridging vlan_filtering is enabled and then disabled. Bonding has a similar behavior with the uc list, but since it also uses set_mac to manage netdev->dev_addr, it doesn't have the same the failure case. Because we store our netdev->dev_addr in our uc list, we need to ignore the delete request from dev_uc_sync so as to not lose the address and all hope of communicating. Note that ndo_set_mac_address is expressly changing netdev->dev_addr, so no limitation is set there. Fixes: 2a654540 ("ionic: Add Rx filter and rx_mode ndo support") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haiyang Zhang authored
Fix error handling in mana_create_rxq() when cq->gdma_id >= gc->max_num_cqs. Fixes: ca9c54d2 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://lore.kernel.org/r/1633698691-31721-1-git-send-email-haiyangz@microsoft.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 08 Oct, 2021 15 commits
-
-
Xiaolong Huang authored
The cmtp_add_connection() would add a cmtp session to a controller and run a kernel thread to process cmtp. __module_get(THIS_MODULE); session->task = kthread_run(cmtp_session, session, "kcmtpd_ctr_%d", session->num); During this process, the kernel thread would call detach_capi_ctr() to detach a register controller. if the controller was not attached yet, detach_capi_ctr() would trigger an array-index-out-bounds bug. [ 46.866069][ T6479] UBSAN: array-index-out-of-bounds in drivers/isdn/capi/kcapi.c:483:21 [ 46.867196][ T6479] index -1 is out of range for type 'capi_ctr *[32]' [ 46.867982][ T6479] CPU: 1 PID: 6479 Comm: kcmtpd_ctr_0 Not tainted 5.15.0-rc2+ #8 [ 46.869002][ T6479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [ 46.870107][ T6479] Call Trace: [ 46.870473][ T6479] dump_stack_lvl+0x57/0x7d [ 46.870974][ T6479] ubsan_epilogue+0x5/0x40 [ 46.871458][ T6479] __ubsan_handle_out_of_bounds.cold+0x43/0x48 [ 46.872135][ T6479] detach_capi_ctr+0x64/0xc0 [ 46.872639][ T6479] cmtp_session+0x5c8/0x5d0 [ 46.873131][ T6479] ? __init_waitqueue_head+0x60/0x60 [ 46.873712][ T6479] ? cmtp_add_msgpart+0x120/0x120 [ 46.874256][ T6479] kthread+0x147/0x170 [ 46.874709][ T6479] ? set_kthread_struct+0x40/0x40 [ 46.875248][ T6479] ret_from_fork+0x1f/0x30 [ 46.875773][ T6479] Signed-off-by: Xiaolong Huang <butterflyhuangxx@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211008065830.305057-1-butterflyhuangxx@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
Introduction of lockless subqueues broke the class statistics. Before the change stats were accumulated in `bstats' and `qstats' on the stack which was then copied to struct gnet_dump. After the change the `bstats' and `qstats' are initialized to 0 and never updated, yet still fed to gnet_dump. The code updates the global qdisc->cpu_bstats and qdisc->cpu_qstats instead, clobbering them. Most likely a copy-paste error from the code in mqprio_dump(). __gnet_stats_copy_basic() and __gnet_stats_copy_queue() accumulate the values for per-CPU case but for global stats they overwrite the value, so only stats from the last loop iteration / tc end up in sch->[bq]stats. Use the on-stack [bq]stats variables again and add the stats manually in the global case. Fixes: ce679e8d ("net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mqprio") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> https://lore.kernel.org/all/20211007175000.2334713-2-bigeasy@linutronix.de/Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Vladimir Oltean says: ==================== DSA bridge TX forwarding offload fixes - part 1 This is part 1 of a series of fixes to the bridge TX forwarding offload feature introduced for v5.15. Sadly, the other fixes are so intrusive that they cannot be reasonably be sent to the "net" tree, as they also include API changes. So they are left as part 2 for net-next. ==================== Link: https://lore.kernel.org/r/20211007164711.2897238-1-vladimir.oltean@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
Similar to commit 6087175b ("net: dsa: mt7530: use independent VLAN learning on VLAN-unaware bridges"), software forwarding between an unoffloaded LAG port (a bonding interface with an unsupported policy) and a mv88e6xxx user port directly under a bridge is broken. We adopt the same strategy, which is to make the standalone ports not find any ATU entry learned on a bridge port. Theory: the mv88e6xxx ATU is looked up by FID and MAC address. There are as many FIDs as VIDs (4096). The FID is derived from the VID when possible (the VTU maps a VID to a FID), with a fallback to the port based default FID value when not (802.1Q Mode is disabled on the port, or the classified VID isn't present in the VTU). The mv88e6xxx driver makes the following use of FIDs and VIDs: - the port's DefaultVID (to which untagged & pvid-tagged packets get classified) is 0 and is absent from the VTU, so this kind of packets is processed in FID 0, the default FID assigned by mv88e6xxx_setup_port. - every time a bridge VLAN is created, mv88e6xxx_port_vlan_join() -> mv88e6xxx_atu_new() associates a FID with that VID which increases linearly starting from 1. Like this: bridge vlan add dev lan0 vid 100 # FID 1 bridge vlan add dev lan1 vid 100 # still FID 1 bridge vlan add dev lan2 vid 1024 # FID 2 The FID allocation made by the driver is sub-optimal for the following reasons: (a) A standalone port has a DefaultPVID of 0 and a default FID of 0 too. A VLAN-unaware bridged port has a DefaultPVID of 0 and a default FID of 0 too. The difference is that the bridged ports may learn ATU entries, while the standalone port has the requirement that it must not, and must not find them either. Standalone ports must not use the same FID as ports belonging to a bridge. All standalone ports can use the same FID, since the ATU will never have an entry in that FID. (b) Multiple VLAN-unaware bridges will all use a DefaultPVID of 0 and a default FID of 0 on all their ports. The FDBs will not be isolated between these bridges. Every VLAN-unaware bridge must use the same FID on all its ports, different from the FID of other bridge ports. (c) Each bridge VLAN uses a unique FID which is useful for Independent VLAN Learning, but the same VLAN ID on multiple VLAN-aware bridges will result in the same FID being used by mv88e6xxx_atu_new(). The correct behavior is for VLAN 1 in br0 to have a different FID compared to VLAN 1 in br1. This patch cannot fix all the above. Traditionally the DSA framework did not care about this, and the reality is that DSA core involvement is needed for the aforementioned issues to be solved. The only thing we can solve here is an issue which does not require API changes, and that is issue (a), aka use a different FID for standalone ports vs ports under VLAN-unaware bridges. The first step is deciding what VID and FID to use for standalone ports, and what VID and FID for bridged ports. The 0/0 pair for standalone ports is what they used up till now, let's keep using that. For bridged ports, there are 2 cases: - VLAN-aware ports will never end up using the port default FID, because packets will always be classified to a VID in the VTU or dropped otherwise. The FID is the one associated with the VID in the VTU. - On VLAN-unaware ports, we _could_ leave their DefaultVID (pvid) at zero (just as in the case of standalone ports), and just change the port's default FID from 0 to a different number (say 1). However, Tobias points out that there is one more requirement to cater to: cross-chip bridging. The Marvell DSA header does not carry the FID in it, only the VID. So once a packet crosses a DSA link, if it has a VID of zero it will get classified to the default FID of that cascade port. Relying on a port default FID for upstream cascade ports results in contradictions: a default FID of 0 breaks ATU isolation of bridged ports on the downstream switch, a default FID of 1 breaks standalone ports on the downstream switch. So not only must standalone ports have different FIDs compared to bridged ports, they must also have different DefaultVID values. IEEE 802.1Q defines two reserved VID values: 0 and 4095. So we simply choose 4095 as the DefaultVID of ports belonging to VLAN-unaware bridges, and VID 4095 maps to FID 1. For the xmit operation to look up the same ATU database, we need to put VID 4095 in DSA tags sent to ports belonging to VLAN-unaware bridges too. All shared ports are configured to map this VID to the bridging FID, because they are members of that VLAN in the VTU. Shared ports don't need to have 802.1QMode enabled in any way, they always parse the VID from the DSA header, they don't need to look at the 802.1Q header. We install VID 4095 to the VTU in mv88e6xxx_setup_port(), with the mention that mv88e6xxx_vtu_setup() which was located right below that call was flushing the VTU so those entries wouldn't be preserved. So we need to relocate the VTU flushing prior to the port initialization during ->setup(). Also note that this is why it is safe to assume that VID 4095 will get associated with FID 1: the user ports haven't been created, so there is no avenue for the user to create a bridge VLAN which could otherwise race with the creation of another FID which would otherwise use up the non-reserved FID value of 1. [ Currently mv88e6xxx_port_vlan_join() doesn't have the option of specifying a preferred FID, it always calls mv88e6xxx_atu_new(). ] mv88e6xxx_port_db_load_purge() is the function to access the ATU for FDB/MDB entries, and it used to determine the FID to use for VLAN-unaware FDB entries (VID=0) using mv88e6xxx_port_get_fid(). But the driver only called mv88e6xxx_port_set_fid() once, during probe, so no surprises, the port FID was always 0, the call to get_fid() was redundant. As much as I would have wanted to not touch that code, the logic is broken when we add a new FID which is not the port-based default. Now the port-based default FID only corresponds to standalone ports, and FDB/MDB entries belong to the bridging service. So while in the future, when the DSA API will support FDB isolation, we will have to figure out the FID based on the bridge number, for now there's a single bridging FID, so hardcode that. Lastly, the tagger needs to check, when it is transmitting a VLAN untagged skb, whether it is sending it towards a bridged or a standalone port. When we see it is bridged we assume the bridge is VLAN-unaware. Not because it cannot be VLAN-aware but: - if we are transmitting from a VLAN-aware bridge we are likely doing so using TX forwarding offload. That code path guarantees that skbs have a vlan hwaccel tag in them, so we would not enter the "else" branch of the "if (skb->protocol == htons(ETH_P_8021Q))" condition. - if we are transmitting on behalf of a VLAN-aware bridge but with no TX forwarding offload (no PVT support, out of space in the PVT, whatever), we would indeed be transmitting with VLAN 4095 instead of the bridge device's pvid. However we would be injecting a "From CPU" frame, and the switch won't learn from that - it only learns from "Forward" frames. So it is inconsequential for address learning. And VLAN 4095 is absolutely enough for the frame to exit the switch, since we never remove that VLAN from any port. Fixes: 57e661aa ("net: dsa: mv88e6xxx: Link aggregation support") Reported-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The VLAN support in mv88e6xxx has a loaded history. Commit 2ea7a679 ("net: dsa: Don't add vlans when vlan filtering is disabled") noticed some issues with VLAN and decided the best way to deal with them was to make the DSA core ignore VLANs added by the bridge while VLAN awareness is turned off. Those issues were never explained, just presented as "at least one corner case". That approach had problems of its own, presented by commit 54a0ed0d ("net: dsa: provide an option for drivers to always receive bridge VLANs") for the DSA core, followed by commit 1fb74191 ("net: dsa: mv88e6xxx: fix vlan setup") which applied ds->configure_vlan_while_not_filtering = true for mv88e6xxx in particular. We still don't know what corner case Andrew saw when he wrote commit 2ea7a679 ("net: dsa: Don't add vlans when vlan filtering is disabled"), but Tobias now reports that when we use TX forwarding offload, pinging an external station from the bridge device is broken if the front-facing DSA user port has flooding turned off. The full description is in the link below, but for short, when a mv88e6xxx port is under a VLAN-unaware bridge, it inherits that bridge's pvid. So packets ingressing a user port will be classified to e.g. VID 1 (assuming that value for the bridge_default_pvid), whereas when tag_dsa.c xmits towards a user port, it always sends packets using a VID of 0 if that port is standalone or under a VLAN-unaware bridge - or at least it did so prior to commit d82f8ab0 ("net: dsa: tag_dsa: offload the bridge forwarding process"). In any case, when there is a conversation between the CPU and a station connected to a user port, the station's MAC address is learned in VID 1 but the CPU tries to transmit through VID 0. The packets reach the intended station, but via flooding and not by virtue of matching the existing ATU entry. DSA has established (and enforced in other drivers: sja1105, felix, mt7530) that a VLAN-unaware port should use a private pvid, and not inherit the one from the bridge. The bridge's pvid should only be inherited when that bridge is VLAN-aware, so all state transitions need to be handled. On the other hand, all bridge VLANs should sit in the VTU starting with the moment when the bridge offloads them via switchdev, they are just not used. This solves the problem that Tobias sees because packets ingressing on VLAN-unaware user ports now get classified to VID 0, which is also the VID used by tag_dsa.c on xmit. Fixes: d82f8ab0 ("net: dsa: tag_dsa: offload the bridge forwarding process") Link: https://patchwork.kernel.org/project/netdevbpf/patch/20211003222312.284175-2-vladimir.oltean@nxp.com/#24491503Reported-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The present code is structured this way due to an incomplete thought process. In Documentation/networking/switchdev.rst we document that if a bridge is VLAN-unaware, then the presence or lack of a pvid on a bridge port (or on the bridge itself, for that matter) should not affect the ability to receive and transmit tagged or untagged packets. If the bridge on behalf of which we are sending this packet is VLAN-aware, then the TX forwarding offload API ensures that the skb will be VLAN-tagged (if the packet was sent by user space as untagged, it will get transmitted town to the driver as tagged with the bridge device's pvid). But if the bridge is VLAN-unaware, it may or may not be VLAN-tagged. In fact the logic to insert the bridge's PVID came from the idea that we should emulate what is being done in the VLAN-aware case. But we shouldn't. It appears that injecting packets using a VLAN ID of 0 serves the purpose of forwarding the packets to the egress port with no VLAN tag added or stripped by the hardware, and no filtering being performed. So we can simply remove the superfluous logic. One reason why this logic is broken is that when CONFIG_BRIDGE_VLAN_FILTERING=n, we call br_vlan_get_pvid_rcu() but that returns an error and we do error out, dropping all packets on xmit. Not really smart. This is also an issue when the user deletes the bridge pvid: $ bridge vlan del dev br0 vid 1 self As mentioned, in both cases, packets should still flow freely, and they do just that on any net device where the bridge is not offloaded, but on mv88e6xxx they don't. Fixes: d82f8ab0 ("net: dsa: tag_dsa: offload the bridge forwarding process") Reported-by: Andrew Lunn <andrew@lunn.ch> Link: https://patchwork.kernel.org/project/netdevbpf/patch/20211003155141.2241314-1-andrew@lunn.ch/ Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210928233708.1246774-1-vladimir.oltean@nxp.com/Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The dp->bridge_num is zero-based, with -1 being the encoding for an invalid value. But dsa_bridge_num_put used to check for an invalid value by comparing bridge_num with 0, which is of course incorrect. The result is that the bridge_num will never get cleared by dsa_bridge_num_put, and further port joins to other bridges will get a bridge_num larger than the previous one, and once all the available bridges with TX forwarding offload supported by the hardware get exhausted, the TX forwarding offload feature is simply disabled. In the case of sja1105, 7 iterations of the loop below are enough to exhaust the TX forwarding offload bits, and further bridge joins operate without that feature. ip link add br0 type bridge vlan_filtering 1 while :; do ip link set sw0p2 master br0 && sleep 1 ip link set sw0p2 nomaster && sleep 1 done This issue is enough of an indication that having the dp->bridge_num invalid encoding be a negative number is prone to bugs, so this will be changed to a one-based value, with the dp->bridge_num of zero being the indication of no bridge. However, that is material for net-next. Fixes: f5e165e7 ("net: dsa: track unique bridge numbers across all DSA switch trees") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Lin Ma authored
The nci_core_conn_close_rsp_packet() function will release the conn_info with given conn_id. However, it needs to set the rf_conn_info to NULL to prevent other routines like nci_rf_intf_activated_ntf_packet() to trigger the UAF. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Lin Ma <linma@zju.edu.cn> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Karsten Graul authored
Commit 8f3d65c1 ("net/smc: fix wait on already cleared link") introduced link refcounting to avoid waits on already cleared links. This patch extents and improves the refcounting to cover all remaining possible cases for this kind of error situation. Fixes: 15e1b99a ("net/smc: no WR buffer wait for terminating link group") Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge branch 'stmmac-regression-fix' Herve Codina says: ==================== net: stmmac: fix regression on SPEAr3xx SOC The ethernet driver used on old SPEAr3xx soc was previously supported on old kernel. Some regressions were introduced during the different updates leading to a broken driver for this soc. This series fixes these regressions and brings back ethernet on SPEAr3xx. Tested on a SPEAr320 board. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herve Codina authored
On SPEAr3xx, ethernet driver is not compatible with the SPEAr600 one. Indeed, SPEAr3xx uses an earlier version of this IP (v3.40) and needs some driver tuning compare to SPEAr600. The v3.40 IP support was added to stmmac driver and this patch fixes this issue and use the correct compatible string for SPEAr3xx Signed-off-by: Herve Codina <herve.codina@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herve Codina authored
dwmac 3.40a is an old ip version that can be found on SPEAr3xx soc. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herve Codina authored
dwmac 3.40a is an old ip version that can be found on SPEAr3xx soc. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herve Codina authored
Some old IPs do not provide the hardware feature register. On these IPs, this register is read 0x00000000. In old driver version, this feature was handled but a regression came with the commit f10a6a35 ("stmmac: rework get_hw_feature function"). Indeed, this commit removes the return value in dma->get_hw_feature(). This return value was used to indicate the validity of retrieved information and used later on in stmmac_hw_init() to override priv->plat data if this hardware feature were valid. This patch restores the return code in ->get_hw_feature() in order to indicate the hardware feature validity and override priv->plat data only if this hardware feature is valid. Fixes: f10a6a35 ("stmmac: rework get_hw_feature function") Signed-off-by: Herve Codina <herve.codina@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
recvmsg() can enter an infinite loop if the caller provides the MSG_WAITALL, the data present in the receive queue is not sufficient to fulfill the request, and no more data is received by the peer. When the above happens, mptcp_wait_data() will always return with no wait, as the MPTCP_DATA_READY flag checked by such function is set and never cleared in such code path. Leveraging the above syzbot was able to trigger an RCU stall: rcu: INFO: rcu_preempt self-detected stall on CPU rcu: 0-...!: (10499 ticks this GP) idle=0af/1/0x4000000000000000 softirq=10678/10678 fqs=1 (t=10500 jiffies g=13089 q=109) rcu: rcu_preempt kthread starved for 10497 jiffies! g13089 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1 rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. rcu: RCU grace-period kthread stack dump: task:rcu_preempt state:R running task stack:28696 pid: 14 ppid: 2 flags:0x00004000 Call Trace: context_switch kernel/sched/core.c:4955 [inline] __schedule+0x940/0x26f0 kernel/sched/core.c:6236 schedule+0xd3/0x270 kernel/sched/core.c:6315 schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881 rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1955 rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2128 kthread+0x405/0x4f0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 rcu: Stack dump where RCU GP kthread last ran: Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 8510 Comm: syz-executor827 Not tainted 5.15.0-rc2-next-20210920-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:bytes_is_nonzero mm/kasan/generic.c:84 [inline] RIP: 0010:memory_is_nonzero mm/kasan/generic.c:102 [inline] RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:128 [inline] RIP: 0010:memory_is_poisoned mm/kasan/generic.c:159 [inline] RIP: 0010:check_region_inline mm/kasan/generic.c:180 [inline] RIP: 0010:kasan_check_range+0xc8/0x180 mm/kasan/generic.c:189 Code: 38 00 74 ed 48 8d 50 08 eb 09 48 83 c0 01 48 39 d0 74 7a 80 38 00 74 f2 48 89 c2 b8 01 00 00 00 48 85 d2 75 56 5b 5d 41 5c c3 <48> 85 d2 74 5e 48 01 ea eb 09 48 83 c0 01 48 39 d0 74 50 80 38 00 RSP: 0018:ffffc9000cd676c8 EFLAGS: 00000283 RAX: ffffed100e9a110e RBX: ffffed100e9a110f RCX: ffffffff88ea062a RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888074d08870 RBP: ffffed100e9a110e R08: 0000000000000001 R09: ffff888074d08877 R10: ffffed100e9a110e R11: 0000000000000000 R12: ffff888074d08000 R13: ffff888074d08000 R14: ffff888074d08088 R15: ffff888074d08000 FS: 0000555556d8e300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 S: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000180 CR3: 0000000068909000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: instrument_atomic_read_write include/linux/instrumented.h:101 [inline] test_and_clear_bit include/asm-generic/bitops/instrumented-atomic.h:83 [inline] mptcp_release_cb+0x14a/0x210 net/mptcp/protocol.c:3016 release_sock+0xb4/0x1b0 net/core/sock.c:3204 mptcp_wait_data net/mptcp/protocol.c:1770 [inline] mptcp_recvmsg+0xfd1/0x27b0 net/mptcp/protocol.c:2080 inet6_recvmsg+0x11b/0x5e0 net/ipv6/af_inet6.c:659 sock_recvmsg_nosec net/socket.c:944 [inline] ____sys_recvmsg+0x527/0x600 net/socket.c:2626 ___sys_recvmsg+0x127/0x200 net/socket.c:2670 do_recvmmsg+0x24d/0x6d0 net/socket.c:2764 __sys_recvmmsg net/socket.c:2843 [inline] __do_sys_recvmmsg net/socket.c:2866 [inline] __se_sys_recvmmsg net/socket.c:2859 [inline] __x64_sys_recvmmsg+0x20b/0x260 net/socket.c:2859 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fc200d2dc39 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc5758e5a8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fc200d2dc39 RDX: 0000000000000002 RSI: 00000000200017c0 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000f0b5ff R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000003 R13: 00007ffc5758e5d0 R14: 00007ffc5758e5c0 R15: 0000000000000003 Fix the issue by replacing the MPTCP_DATA_READY bit with direct inspection of the msk receive queue. Reported-and-tested-by: syzbot+3360da629681aa0d22fe@syzkaller.appspotmail.com Fixes: 7a6a6cbc ("mptcp: recvmsg() can drain data from multiple subflow") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Oct, 2021 3 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds authored
Pull nfsd fixes from Chuck Lever: "Bug fixes for NFSD error handling paths" * tag 'nfsd-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Keep existing listeners on portlist error SUNRPC: fix sign error causing rpcsec_gss drops nfsd: Fix a warning for nfsd_file_close_inode nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero nfsd: fix error handling of register_pernet_subsys() in init_nfsd()
-
git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds authored
Pull ARM SoC fixes from Arnd Bergmann: "This is a larger than normal update for Arm SoC specific code, most of it in device trees, but also drivers and the omap and at91/sama7 platforms: - There are four new entries to the MAINTAINERS file: Sven Peter and Alyssa Rosenzweig for Apple M1, Romain Perier for Mstar/sigmastar, and Vignesh Raghavendra for TI K3 - Build fixes to address randconfig warnings in sharpsl, dove, omap1, and qcom platforms as well as the scmi and op-tee subsystems - Regression fixes for missing CONFIG_FB and other options for several defconfigs - Several bug fixes for the newly added Microchip SAMA7 platform, mostly regarding power management - Missing SMP barriers to protect accesses to SCMI virtio device - Regression fixes for TI OMAP, including a boot-time hang on am335x. - Lots of bug fixes for NXP i.MX, mostly addressing incorrect settings in devicetree files, and one revert for broken suspend. - Fixes for ARM Juno/Vexpress devicetree files, addressing a couple of schema warnings. - Regression fixes for qualcomm SoC specific drivers and devicetree files, reverting an mdt_loader change and at least pastially reverting some of the 5.15 DTS changes, plus some minor bugfixes" * tag 'armsoc-fixes-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (64 commits) MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer firmware: arm_scmi: Add proper barriers to scmi virtio device firmware: arm_scmi: Simplify spinlocks in virtio transport ARM: dts: omap3430-sdp: Fix NAND device node bus: ti-sysc: Use CLKDM_NOAUTO for dra7 dcan1 for errata i893 ARM: sharpsl_param: work around -Wstringop-overread warning ARM: defconfig: gemini: Restore framebuffer ARM: dove: mark 'putc' as inline ARM: omap1: move omap15xx local bus handling to usb.c MAINTAINERS: Add Vignesh to TI K3 platform maintainership arm64: dts: imx8m*-venice-gw7902: fix M2_RST# gpio ARM: imx6: disable the GIC CPU interface before calling stby-poweroff sequence arm64: dts: ls1028a: fix eSDHC2 node arm64: dts: imx8mm-kontron-n801x-som: do not allow to switch off buck2 ARM: dts: at91: sama7g5ek: to not touch slew-rate for SDMMC pins ARM: dts: at91: sama7g5ek: use proper slew-rate settings for GMACs ARM: at91: pm: preload base address of controllers in tlb ARM: at91: pm: group constants and addresses loading ARM: dts: at91: sama7g5ek: add suspend voltage for ddr3l rail ...
-
https://github.com/AsahiLinux/linuxArnd Bergmann authored
Apple SoC fixes for 5.15; just two MAINTAINERS updates. - MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer - MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer * tag 'asahi-soc-fixes-5.15' of https://github.com/AsahiLinux/linux: MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer Link: https://lore.kernel.org/r/a50a9015-0e62-c451-4d0d-668233b35b85@marcan.stSigned-off-by: Arnd Bergmann <arnd@arndb.de>
-