- 12 Jan, 2023 1 commit
-
-
Jie Wang authored
Currently, it used old rss size to get current tc mode. As a result, the rss size is updated, but the tc mode is still configured based on the old rss size. So this patch fixes it by using the new rss size in both process. Fixes: 93969dc1 ("net: hns3: refactor VF rss init APIs with new common rss init APIs") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Hao Lan <lanhao@huawei.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230110115359.10163-1-lanhao@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 11 Jan, 2023 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
mlx5-fixes-2023-01-09
-
Herbert Xu authored
The total cork length created by ip6_append_data includes extension headers, so we must exclude them when comparing them against the IPV6_CHECKSUM offset which does not include extension headers. Reported-by: Kyle Zeng <zengyhkyle@gmail.com> Fixes: 357b40a1 ("[IPV6]: IPV6_CHECKSUM socket option can corrupt kernel memory") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Clément Léger authored
If ptp was not enabled due to missing IRQ for instance, lan966x_ptp_deinit() will dereference NULL pointers. Fixes: d0964594 ("net: lan966x: Add support for ptp clocks") Signed-off-by: Clément Léger <clement.leger@bootlin.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queueJakub Kicinski authored
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-01-09 (ice) This series contains updates to ice driver only. Jiasheng Jiang frees allocated cmd_buf if write_buf allocation failed to prevent memory leak. Yuan Can adds check, and proper cleanup, of gnss_tty_port allocation call to avoid memory leaks. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Add check for kzalloc ice: Fix potential memory leak in ice_gnss_tty_write() ==================== Link: https://lore.kernel.org/r/20230109225358.3478060-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Frederick Lawler authored
While experimenting with applying noqueue to a classful queue discipline, we discovered a NULL pointer dereference in the __dev_queue_xmit() path that generates a kernel OOPS: # dev=enp0s5 # tc qdisc replace dev $dev root handle 1: htb default 1 # tc class add dev $dev parent 1: classid 1:1 htb rate 10mbit # tc qdisc add dev $dev parent 1:1 handle 10: noqueue # ping -I $dev -w 1 -c 1 1.1.1.1 [ 2.172856] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 2.173217] #PF: supervisor instruction fetch in kernel mode ... [ 2.178451] Call Trace: [ 2.178577] <TASK> [ 2.178686] htb_enqueue+0x1c8/0x370 [ 2.178880] dev_qdisc_enqueue+0x15/0x90 [ 2.179093] __dev_queue_xmit+0x798/0xd00 [ 2.179305] ? _raw_write_lock_bh+0xe/0x30 [ 2.179522] ? __local_bh_enable_ip+0x32/0x70 [ 2.179759] ? ___neigh_create+0x610/0x840 [ 2.179968] ? eth_header+0x21/0xc0 [ 2.180144] ip_finish_output2+0x15e/0x4f0 [ 2.180348] ? dst_output+0x30/0x30 [ 2.180525] ip_push_pending_frames+0x9d/0xb0 [ 2.180739] raw_sendmsg+0x601/0xcb0 [ 2.180916] ? _raw_spin_trylock+0xe/0x50 [ 2.181112] ? _raw_spin_unlock_irqrestore+0x16/0x30 [ 2.181354] ? get_page_from_freelist+0xcd6/0xdf0 [ 2.181594] ? sock_sendmsg+0x56/0x60 [ 2.181781] sock_sendmsg+0x56/0x60 [ 2.181958] __sys_sendto+0xf7/0x160 [ 2.182139] ? handle_mm_fault+0x6e/0x1d0 [ 2.182366] ? do_user_addr_fault+0x1e1/0x660 [ 2.182627] __x64_sys_sendto+0x1b/0x30 [ 2.182881] do_syscall_64+0x38/0x90 [ 2.183085] entry_SYSCALL_64_after_hwframe+0x63/0xcd ... [ 2.187402] </TASK> Previously in commit d66d6c31 ("net: sched: register noqueue qdisc"), NULL was set for the noqueue discipline on noqueue init so that __dev_queue_xmit() falls through for the noqueue case. This also sets a bypass of the enqueue NULL check in the register_qdisc() function for the struct noqueue_disc_ops. Classful queue disciplines make it past the NULL check in __dev_queue_xmit() because the discipline is set to htb (in this case), and then in the call to __dev_xmit_skb(), it calls into htb_enqueue() which grabs a leaf node for a class and then calls qdisc_enqueue() by passing in a queue discipline which assumes ->enqueue() is not set to NULL. Fix this by not allowing classes to be assigned to the noqueue discipline. Linux TC Notes states that classes cannot be set to the noqueue discipline. [1] Let's enforce that here. Links: 1. https://linux-tc-notes.sourceforge.net/tc/doc/sch_noqueue.txt Fixes: d66d6c31 ("net: sched: register noqueue qdisc") Cc: stable@vger.kernel.org Signed-off-by: Frederick Lawler <fred@cloudflare.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20230109163906.706000-1-fred@cloudflare.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 10 Jan, 2023 23 commits
-
-
Hariprasad Kelam authored
resources allocated like mcam entries to support the Ntuple feature and hash tables for the tc feature are not getting freed in driver unbind. This patch fixes the issue. Fixes: 2da48943 ("octeontx2-pf: devlink params support to set mcam entry count") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Link: https://lore.kernel.org/r/20230109061325.21395-1-hkelam@marvell.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Guillaume Nault says: ==================== selftests/net: Isolate l2_tos_ttl_inherit.sh in its own netns. l2_tos_ttl_inherit.sh uses a veth pair to run its tests, but only one of the veth interfaces runs in a dedicated netns. The other one remains in the initial namespace where the existing network configuration can interfere with the setup used for the tests. Isolate both veth devices in their own netns and ensure everything gets cleaned up when the script exits. Link: https://lore.kernel.org/netdev/924f1062-ab59-9b88-3b43-c44e73a30387@alu.unizg.hr/ ==================== Link: https://lore.kernel.org/r/cover.1673191942.git.gnault@redhat.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Guillaume Nault authored
Use 'set -e' and an exit handler to stop the script if a command fails and ensure the test environment is cleaned up in any case. Also, handle the case where the script is interrupted by SIGINT. The only command that's expected to fail is 'wait $ping_pid', since it's killed by the script. Handle this case with '|| true' to make it play well with 'set -e'. Finally, return the Kselftest SKIP code (4) when the script breaks because of an environment problem or a command line failure. The 0 and 1 return codes should now reliably indicate that all tests have been run (0: all tests run and passed, 1: all tests run but at least one failed, 4: test script didn't run completely). Fixes: b690842d ("selftests/net: test l2 tunnel TOS/TTL inheriting") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Guillaume Nault authored
This selftest currently runs half in the current namespace and half in a netns of its own. Therefore, the test can fail if the current namespace is already configured with incompatible parameters (for example if it already has a veth0 interface). Adapt the script to put both ends of the veth pair in their own netns. Now veth0 is created in NS0 instead of the current namespace, while veth1 is set up in NS1 (instead of the 'testing' netns). The user visible netns names are randomised to minimise the risk of conflicts with already existing namespaces. The cleanup() function doesn't need to remove the virtual interface anymore: deleting NS0 and NS1 automatically removes the virtual interfaces they contained. We can remove $ns, which was only used to run ip commands in the 'testing' netns (let's use the builtin "-netns" option instead). However, we still need a similar functionality as ping and tcpdump now need to run in NS0. So we now have $RUN_NS0 for that. Fixes: b690842d ("selftests/net: test l2 tunnel TOS/TTL inheriting") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Guillaume Nault authored
The ping command can run before DAD completes. In that case, ping may fail and break the selftest. We don't need DAD here since we're working on isolated device pairs. Fixes: b690842d ("selftests/net: test l2 tunnel TOS/TTL inheriting") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Emeel Hakim authored
Upon updating MAC security entity (SecY) in hw offload path, the macsec security association (SA) initialization routine is called. In case of extended packet number (epn) is enabled the salt and ssci attributes are retrieved using the MACsec driver rx_sa context which is unavailable when updating a SecY property such as encoding-sa hence the null dereference. Fix by using the provided SA to set those attributes. Fixes: 4411a6c0 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Emeel Hakim authored
Currently when macsec offload is set with extended packet number (epn) enabled, the driver wrongly deduce the short secure channel identifier (ssci) from the salt instead of the stand alone ssci attribute as it should, consequently creating a mismatch between the kernel and driver's ssci values. Fix by using the ssci value from the relevant attribute. Fixes: 4411a6c0 ("net/mlx5e: Support MACsec offload extended packet number (EPN)") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
When a capability is set via port function caps callbacks, a memcpy() is performed in which the source and the target are the same address, e.g.: the copy is redundant. Hence, Remove it. Discovered by Coverity. Fixes: 7db98396 ("net/mlx5: E-Switch, Implement devlink port function cmds to control RoCE") Fixes: e5b9642a ("net/mlx5: E-Switch, Implement devlink port function cmds to control migratable") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Gavin Li authored
Previously, encap rules with gbp option would be offloaded by mistake but driver does not support gbp option offload. To fix this issue, check if the encap rule has gbp option and don't offload the rule Fixes: d8f9dfae ("net: sched: allow flower to match vxlan options") Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Rahul Rameshbabu authored
.max_adj of ptp_clock_info acts as an absolute value for the amount in ppb that can be set for a single call of .adjfine. This means that a single call to .getfine cannot be greater than .max_adj or less than -(.max_adj). Provides correct value for max frequency adjustment value supported by devices. Fixes: 3d8c38af ("net/mlx5e: Add PTP Hardware Clock (PHC) support") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Aya Levin authored
When updating statistics driver queries the vport's counters. On fail, add error path releasing the allocated buffer avoiding memory leak. Fixes: 64b68e36 ("net/mlx5: Refactor and expand rep vport stat group") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Oz Shlomo authored
The offending commit removed the support for all packet rate metering. Restore the pkt rate metering support by removing the restriction. Fixes: 3603f266 ("net/mlx5e: TC, allow meter jump control action") Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Oz Shlomo authored
The post meter table only matches on reg_c5. As such, the inner/outer match levels are irrelevant for the match critieria. The cited patch only sets the outer criteria to none, thus setting the inner match level for encapsulated packets. This caused rules with police action on tunnel devices to not find an existing flow group for the match criteria, thus failing to offload the rule. Set both the inner and outer match levels to none for post_meter rules. Fixes: 0d8c38d4 ("net/mlx5e: TC, init post meter rules with branching attributes") Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Dragos Tatulea authored
The current code always does the accounting using the stats from the parent interface (linked in the rq). This doesn't work when there are child interfaces configured. Fix this behavior by always using the stats from the child interface priv. This will also work for parent only interfaces: the child (netdev) and parent netdev (rq->netdev) will point to the same thing. Fixes: be98737a ("net/mlx5e: Use dynamic per-channel allocations in stats") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Dragos Tatulea authored
A user is able to configure an arbitrary number of rx queues when creating an interface via netlink. This doesn't work for child PKEY interfaces because the child interface uses the parent receive channels. Although the child shares the parent's receive channels, the number of rx queues is important for the channel_stats array: the parent's rx channel index is used to access the child's channel_stats. So the array has to be at least as large as the parent's rx queue size for the counting to work correctly and to prevent out of bound accesses. This patch checks for the mentioned scenario and returns an error when trying to create the interface. The error is propagated to the user. Fixes: be98737a ("net/mlx5e: Use dynamic per-channel allocations in stats") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Dragos Tatulea authored
PKEY sub interfaces share the receive queues with the parent interface. While setting the sub interface queue count is not supported, it is currently possible to change the number of queues of the parent interface. Thus we can end up with inconsistent queue sizes between the parent and its sub interfaces. This change disallows setting the queue count on the parent interface when sub interfaces are present. This is achieved by introducing an explicit reference to the parent netdev in the mlx5i_priv of the child interface. An additional counter is also required on the parent side to detect when sub interfaces are attached and for proper cleanup. The rtnl lock is taken during the ethtool op and the sub interface ndo_init/uninit ops. There is no race here around counting the sub interfaces, reading the sub interfaces and setting the number of channels. The ASSERT_RTNL was added to document that. Fixes: be98737a ("net/mlx5e: Use dynamic per-channel allocations in stats") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Roy Novich authored
The native NIC port net device instance is being used as Uplink representor. While changing profiles private resources are not available, fix features ndo does not check if the netdev is present. Add driver protection to verify private resources are ready. Fixes: 7a9fb35e ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Moshe Shemesh authored
Command may fail while driver is reloading and can't accept FW commands till command interface is reinitialized. Such command failure is being logged to command stats. This results in NULL pointer access as command stats structure is being freed and reallocated during mlx5 devlink reload (see kernel log below). Fix it by making command stats statically allocated on driver probe. Kernel log: [ 2394.808802] BUG: unable to handle kernel paging request at 000000000002a9c0 [ 2394.810610] PGD 0 P4D 0 [ 2394.811811] Oops: 0002 [#1] SMP NOPTI ... [ 2394.815482] RIP: 0010:native_queued_spin_lock_slowpath+0x183/0x1d0 ... [ 2394.829505] Call Trace: [ 2394.830667] _raw_spin_lock_irq+0x23/0x26 [ 2394.831858] cmd_status_err+0x55/0x110 [mlx5_core] [ 2394.833020] mlx5_access_reg+0xe7/0x150 [mlx5_core] [ 2394.834175] mlx5_query_port_ptys+0x78/0xa0 [mlx5_core] [ 2394.835337] mlx5e_ethtool_get_link_ksettings+0x74/0x590 [mlx5_core] [ 2394.836454] ? kmem_cache_alloc_trace+0x140/0x1c0 [ 2394.837562] __rh_call_get_link_ksettings+0x33/0x100 [ 2394.838663] ? __rtnl_unlock+0x25/0x50 [ 2394.839755] __ethtool_get_link_ksettings+0x72/0x150 [ 2394.840862] duplex_show+0x6e/0xc0 [ 2394.841963] dev_attr_show+0x1c/0x40 [ 2394.843048] sysfs_kf_seq_show+0x9b/0x100 [ 2394.844123] seq_read+0x153/0x410 [ 2394.845187] vfs_read+0x91/0x140 [ 2394.846226] ksys_read+0x4f/0xb0 [ 2394.847234] do_syscall_64+0x5b/0x1a0 [ 2394.848228] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 34f46ae0 ("net/mlx5: Add command failures data to debugfs") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Ariel Levkovich authored
When offloading TC NIC rule which has mod_hdr action, the mod_hdr actions list is freed upon mod_hdr allocation. In the new format of handling multi table actions and CT in particular, the mod_hdr actions list is still relevant when setting the pre and post rules and therefore, freeing the list may cause adding rules which don't set the FTE_ID. Therefore, the mod_hdr actions list needs to be kept for the pre/post flows as well and should be left for these handler to be freed. Fixes: 8300f225 ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Ariel Levkovich authored
Fix attr pointer validity checks after it was already dereferenced. Fixes: cb0d54cb ("net/mlx5e: Fix wrong source vport matching on tunnel rule") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Yevgeny Kliteynik authored
If the kernel configuration asks the compiler to check frame limit of 1K, dr_rule_create_rule_nic exceed this limit: "stack frame size (1184) exceeds limit (1024)" Fixing this issue by checking configured frame limit and using the optimization STE array only for cases with the usual 2K (or larger) stack size warning. Fixes: b9b81e1e ("net/mlx5: DR, For short chains of STEs, avoid allocating ste_arr dynamically") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
-
Heiner Kallweit authored
This reverts commit 42666b2c. This chip version seems to be very rare, but it exits in consumer devices, see linked report. Link: https://stackoverflow.com/questions/75049473/cant-setup-a-wired-network-in-archlinux-fresh-install Fixes: 42666b2c ("r8169: disable detection of chip version 36") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/42e9674c-d5d0-a65a-f578-e5c74f244739@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
The 'TCA_MPLS_LABEL' attribute is of 'NLA_U32' type, but has a validation type of 'NLA_VALIDATE_FUNCTION'. This is an invalid combination according to the comment above 'struct nla_policy': " Meaning of `validate' field, use via NLA_POLICY_VALIDATE_FN: NLA_BINARY Validation function called for the attribute. All other Unused - but note that it's a union " This can trigger the warning [1] in nla_get_range_unsigned() when validation of the attribute fails. Despite being of 'NLA_U32' type, the associated 'min'/'max' fields in the policy are negative as they are aliased by the 'validate' field. Fix by changing the attribute type to 'NLA_BINARY' which is consistent with the above comment and all other users of NLA_POLICY_VALIDATE_FN(). As a result, move the length validation to the validation function. No regressions in MPLS tests: # ./tdc.py -f tc-tests/actions/mpls.json [...] # echo $? 0 [1] WARNING: CPU: 0 PID: 17743 at lib/nlattr.c:118 nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117 Modules linked in: CPU: 0 PID: 17743 Comm: syz-executor.0 Not tainted 6.1.0-rc8 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014 RIP: 0010:nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117 [...] Call Trace: <TASK> __netlink_policy_dump_write_attr+0x23d/0x990 net/netlink/policy.c:310 netlink_policy_dump_write_attr+0x22/0x30 net/netlink/policy.c:411 netlink_ack_tlv_fill net/netlink/af_netlink.c:2454 [inline] netlink_ack+0x546/0x760 net/netlink/af_netlink.c:2506 netlink_rcv_skb+0x1b7/0x240 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:6109 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0x38f/0x500 net/socket.c:2482 ___sys_sendmsg net/socket.c:2536 [inline] __sys_sendmsg+0x197/0x230 net/socket.c:2565 __do_sys_sendmsg net/socket.c:2574 [inline] __se_sys_sendmsg net/socket.c:2572 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2572 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Link: https://lore.kernel.org/netdev/CAO4mrfdmjvRUNbDyP0R03_DrD_eFCLCguz6OxZ2TYRSv0K9gxA@mail.gmail.com/ Fixes: 2a2ea508 ("net: sched: add mpls manipulation actions to TC") Reported-by: Wei Chen <harperchen1110@gmail.com> Tested-by: Wei Chen <harperchen1110@gmail.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230107171004.608436-1-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 09 Jan, 2023 9 commits
-
-
Jiasheng Jiang authored
Add the check for the return value of kzalloc in order to avoid NULL pointer dereference. Moreover, use the goto-label to share the clean code. Fixes: d6b98c8d ("ice: add write functionality for GNSS TTY") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Yuan Can authored
The ice_gnss_tty_write() return directly if the write_buf alloc failed, leaking the cmd_buf. Fix by free cmd_buf if write_buf alloc failed. Fixes: d6b98c8d ("ice: add write functionality for GNSS TTY") Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Mirsad Goran Todorovac authored
Adjust size parameter in connect() to match the type of the parameter, to fix "No such file or directory" error in selftests/net/af_unix/ test_oob_unix.c:127. The existing code happens to work provided that the autogenerated pathname is shorter than sizeof (struct sockaddr), which is why it hasn't been noticed earlier. Visible from the trace excerpt: bind(3, {sa_family=AF_UNIX, sun_path="unix_oob_453059"}, 110) = 0 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa6a6577a10) = 453060 [pid <child>] connect(6, {sa_family=AF_UNIX, sun_path="unix_oob_45305"}, 16) = -1 ENOENT (No such file or directory) BUG: The filename is trimmed to sizeof (struct sockaddr). Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Cc: Florian Westphal <fw@strlen.de> Reviewed-by: Florian Westphal <fw@strlen.de> Fixes: 314001f0 ("af_unix: Add OOB support") Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Horatiu Vultur authored
The blamed commit implemented the vcap_operations to allow to add an entry in the TCAM. One of the callbacks is to validate the supported keysets. If the TCAM lookup was not enabled, then this will return failure so no entries could be added. This doesn't make much sense, as you can enable at a later point the TCAM. Therefore change it such to allow entries in TCAM even it is not enabled. Fixes: 4426b78c ("net: lan966x: Add port keyset config and callback interface") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Jaroslav reported a recent throughput regression with virtio_net caused by blamed commit. It is unclear if DODGY GSO packets coming from user space can be accepted by GRO engine in the future with minimal changes, and if there is any expected gain from it. In the meantime, make sure to detect and flush DODGY packets. Fixes: 5eddb249 ("gro: add support of (hw)gro packets to gro stack") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-and-bisected-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Cc: Coco Li <lixiaoyan@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Minsuk Kang authored
Fix a use-after-free that occurs in hcd when in_urb sent from pn533_usb_send_frame() is completed earlier than out_urb. Its callback frees the skb data in pn533_send_async_complete() that is used as a transfer buffer of out_urb. Wait before sending in_urb until the callback of out_urb is called. To modify the callback of out_urb alone, separate the complete function of out_urb and ack_urb. Found by a modified version of syzkaller. BUG: KASAN: use-after-free in dummy_timer Call Trace: memcpy (mm/kasan/shadow.c:65) dummy_perform_transfer (drivers/usb/gadget/udc/dummy_hcd.c:1352) transfer (drivers/usb/gadget/udc/dummy_hcd.c:1453) dummy_timer (drivers/usb/gadget/udc/dummy_hcd.c:1972) arch_static_branch (arch/x86/include/asm/jump_label.h:27) static_key_false (include/linux/jump_label.h:207) timer_expire_exit (include/trace/events/timer.h:127) call_timer_fn (kernel/time/timer.c:1475) expire_timers (kernel/time/timer.c:1519) __run_timers (kernel/time/timer.c:1790) run_timer_softirq (kernel/time/timer.c:1803) Fixes: c46ee386 ("NFC: pn533: add NXP pn533 nfc device driver") Signed-off-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kees Cook authored
Zero-length arrays are deprecated[1]. Replace struct mlxsw_sp_nexthop_group_info's "nexthops" 0-length array with a flexible array. Detected with GCC 13, using -fstrict-flex-arrays=3: drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c: In function 'mlxsw_sp_nexthop_group_hash_obj': drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:3278:38: warning: array subscript i is outside array bounds of 'struct mlxsw_sp_nexthop[0]' [-Warray-bounds=] 3278 | val ^= jhash(&nh->ifindex, sizeof(nh->ifindex), seed); | ^~~~~~~~~~~~ drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:2954:33: note: while referencing 'nexthops' 2954 | struct mlxsw_sp_nexthop nexthops[0]; | ^~~~~~~~ [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays Cc: Ido Schimmel <idosch@nvidia.com> Cc: Petr Machata <petrm@nvidia.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Tested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Commit b310de78 ("net: ipa: add IPA v4.7 support") was merged despite an unresolved comment made by Konrad Dybcio. Konrad observed that the IMEM region specified for IPA v4.7 did not match that used downstream for the SM7225 SoC. In "lagoon.dtsi" present in a Sony Xperia source tree, a ipa_smmu_ap node was defined with a "qcom,additional-mapping" property that defined the IPA IMEM area starting at offset 0x146a8000 (not 0x146a9000 that was committed). The IPA v4.7 target system used for testing uses the SM7225 SoC, so we'll adhere what the downstream code specifies is the address of the IMEM region used for IPA. Link: https://lore.kernel.org/linux-arm-msm/20221208211529.757669-1-elder@linaro.org Fixes: b310de78 ("net: ipa: add IPA v4.7 support") Tested-by: Luca Weiss <luca.weiss@fairphone.com> Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan T. Ivanov authored
The introduction of support for Apple board types inadvertently changed the precedence order, causing hybrid SMBIOS+DT platforms to look up the firmware using the DMI information instead of the device tree compatible to generate the board type. Revert back to the old behavior, as affected platforms use firmwares named after the DT compatible. Fixes: 7682de8b ("wifi: brcmfmac: of: Fetch Apple properties") [1] https://bugzilla.opensuse.org/show_bug.cgi?id=1206697#c13 Cc: stable@vger.kernel.org Signed-off-by: Ivan T. Ivanov <iivanov@suse.de> Reviewed-by: Hector Martin <marcan@marcan.st> Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com> Tested-by: Peter Robinson <pbrobinson@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Jan, 2023 2 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fsDavid S. Miller authored
David Howells says: ==================== rxrpc: Fix race between call connection, data transmit and call disconnect Here are patches to fix an oops[1] caused by a race between call connection, initial packet transmission and call disconnection which results in something like: kernel BUG at net/rxrpc/peer_object.c:413! when the syzbot test is run. The problem is that the connection procedure is effectively split across two threads and can get expanded by taking an interrupt, thereby adding the call to the peer error distribution list *after* it has been disconnected (say by the rxrpc socket shutting down). The easiest solution is to look at the fourth set of I/O thread conversion/SACK table expansion patches that didn't get applied[2] and take from it those patches that move call connection and disconnection into the I/O thread. Moving these things into the I/O thread means that the sequencing is managed by all being done in the same thread - and the race can no longer happen. This is preferable to introducing an extra lock as adding an extra lock would make the I/O thread have to wait for the app thread in yet another place. The changes can be considered as a number of logical parts: (1) Move all of the call state changes into the I/O thread. (2) Make client connection ID space per-local endpoint so that the I/O thread doesn't need locks to access it. (3) Move actual abort generation into the I/O thread and clean it up. If sendmsg or recvmsg want to cause an abort, they have to delegate it. (4) Offload the setting up of the security context on a connection to the thread of one of the apps that's starting a call. We don't want to be doing any sort of crypto in the I/O thread. (5) Connect calls (ie. assign them to channel slots on connections) in the I/O thread. Calls are set up by sendmsg/kafs and passed to the I/O thread to connect. Connections are allocated in the I/O thread after this. (6) Disconnect calls in the I/O thread. I've also added a patch for an unrelated bug that cropped up during testing, whereby a race can occur between an incoming call and socket shutdown. Note that whilst this fixes the original syzbot bug, another bug may get triggered if this one is fixed: INFO: rcu detected stall in corrupted rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P5792 } 2657 jiffies s: 2825 root: 0x0/T rcu: blocking rcu_node structures (internal RCU debug): It doesn't look this should be anything to do with rxrpc, though, as I've tested an additional patch[3] that removes practically all the RCU usage from rxrpc and it still occurs. It seems likely that it is being caused by something in the tunnelling setup that the syzbot test does, but there's not enough info to go on. It also seems unlikely to be anything to do with the afs driver as the test doesn't use that. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Howells authored
An incoming call can race with rxrpc socket destruction, leading to a leaked call. This may result in an oops when the call timer eventually expires: BUG: kernel NULL pointer dereference, address: 0000000000000874 RIP: 0010:_raw_spin_lock_irqsave+0x2a/0x50 Call Trace: <IRQ> try_to_wake_up+0x59/0x550 ? __local_bh_enable_ip+0x37/0x80 ? rxrpc_poke_call+0x52/0x110 [rxrpc] ? rxrpc_poke_call+0x110/0x110 [rxrpc] ? rxrpc_poke_call+0x110/0x110 [rxrpc] call_timer_fn+0x24/0x120 with a warning in the kernel log looking something like: rxrpc: Call 00000000ba5e571a still in use (1,SvAwtACK,1061d,0)! incurred during rmmod of rxrpc. The 1061d is the call flags: RECVMSG_READ_ALL, RX_HEARD, BEGAN_RX_TIMER, RX_LAST, EXPOSED, IS_SERVICE, RELEASED but no DISCONNECTED flag (0x800), so it's an incoming (service) call and it's still connected. The race appears to be that: (1) rxrpc_new_incoming_call() consults the service struct, checks sk_state and allocates a call - then pauses, possibly for an interrupt. (2) rxrpc_release_sock() sets RXRPC_CLOSE, nulls the service pointer, discards the prealloc and releases all calls attached to the socket. (3) rxrpc_new_incoming_call() resumes, launching the new call, including its timer and attaching it to the socket. Fix this by read-locking local->services_lock to access the AF_RXRPC socket providing the service rather than RCU in rxrpc_new_incoming_call(). There's no real need to use RCU here as local->services_lock is only write-locked by the socket side in two places: when binding and when shutting down. Fixes: 5e6ef4f1 ("rxrpc: Make the I/O thread take over the call and local processor work") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org
-