- 26 Apr, 2024 37 commits
-
-
Jason Xing authored
Adjust the parameter and support passing reason of reset which is for now NOT_SPECIFIED. No functional changes. Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Jason Xing authored
Add a new standalone file for the easy future extension to support both active reset and passive reset in the TCP/DCCP/MPTCP protocols. This patch only does the preparations for reset reason mechanism, nothing else changes.

The reset reasons are divided into three parts:
1) reuse drop reasons for passive reset in TCP
2) our own independent reasons which aren't relying on other reasons at all
3) reuse MP_TCPRST option for MPTCP

The benefits of a standalone reset reason are listed here:
1) it can cover more than one case, such as reset reasons in MPTCP, active reset reasons.
2) people can easily/quickly understand and maintain this mechanism.
3) we get unified format of output with prefix stripped.
4) more new reset reasons are on the way ...

I will implement the basic code of active/passive reset reason in those three protocols, which is not complete at this moment. For passive reset part in TCP, I only introduce the NO_SOCKET common case which could be set as an example.

After this series is applied, it will open a new gate to let other people contribute more reasons into it :)

Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Song Yoong Siang authored
This patch adds support to per-packet Tx hardware timestamp request to AF_XDP zero-copy packet via XDP Tx metadata framework. Please note that user needs to enable Tx HW timestamp capability via igc_ioctl() with SIOCSHWTSTAMP cmd before sending xsk Tx hardware timestamp request.

Same as implementation in RX timestamp XDP hints kfunc metadata, Timer 0 (adjustable clock) is used in xsk Tx hardware timestamp. i225/i226 have four sets of timestamping registers. *skb and *xsk_tx_buffer pointers are used to indicate whether the timestamping register is already occupied.

Furthermore, a boolean variable named xsk_pending_ts is used to hold the transmit completion until the tx hardware timestamp is ready. This is because, for i225/i226, the timestamp notification event comes some time after the transmit completion event. The driver will retrigger hardware irq to clean the packet after retrieving the tx hardware timestamp.

Besides, xsk_meta is added into struct igc_tx_timestamp_request as a hook to the metadata location of the transmit packet. When the Tx timestamp interrupt is fired, the interrupt handler will copy the value of Tx hwts into metadata location via xsk_tx_metadata_complete().

This patch is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel ADL-S platform. Below are the test steps and results.

Test Step 1: Run xdp_hw_metadata app
  ./xdp_hw_metadata <iface> > /dev/shm/result.log

Test Step 2: Enable Tx hardware timestamp
  hwstamp_ctl -i <iface> -t 1 -r 1

Test Step 3: Run ptp4l and phc2sys for time synchronization

Test Step 4: Generate UDP packets with 1ms interval for 10s
  trafgen --dev <iface> '{eth(da=<addr>), udp(dp=9091)}' -t 1ms -n 10000

Test Step 5: Rerun Step 1-3 with 10s iperf3 as background traffic

Test Step 6: Rerun Step 1-4 with 10s iperf3 as background traffic

Based on iperf3 results below, the impact of holding tx completion to throughput is not observable.

Result of last UDP packet (no. 10000) in Step 4:
  poll: 1 (0) skip=99 fail=0 redir=10000
  xsk_ring_cons__peek: 1
  0x5640a37972d0: rx_desc[9999]->addr=f2110 addr=f2110 comp_addr=f2110 EoP
  rx_hash: 0x2049BE1D with RSS type:0x1
  HW RX-time: 1679819246792971268 (sec:1679819246.7930) delta to User RX-time sec:0.0000 (14.990 usec)
  XDP RX-time: 1679819246792981987 (sec:1679819246.7930) delta to User RX-time sec:0.0000 (4.271 usec)
  No rx_vlan_tci or rx_vlan_proto, err=-95
  0x5640a37972d0: ping-pong with csum=ab19 (want 315b) csum_start=34 csum_offset=6
  0x5640a37972d0: complete tx idx=9999 addr=f010
  HW TX-complete-time: 1679819246793036971 (sec:1679819246.7930) delta to User TX-complete-time sec:0.0001 (77.656 usec)
  XDP RX-time: 1679819246792981987 (sec:1679819246.7930) delta to User TX-complete-time sec:0.0001 (132.640 usec)
  HW RX-time: 1679819246792971268 (sec:1679819246.7930) delta to HW TX-complete-time sec:0.0001 (65.703 usec)
  0x5640a37972d0: complete rx idx=10127 addr=f2110

Result of iperf3 without tx hwts request in step 5:
  [ ID] Interval Transfer Bitrate Retr
  [ 5] 0.00-10.00 sec 2.74 GBytes 2.36 Gbits/sec 0 sender
  [ 5] 0.00-10.05 sec 2.74 GBytes 2.34 Gbits/sec receiver

Result of iperf3 running parallel with trafgen command in step 6:
  [ ID] Interval Transfer Bitrate Retr
  [ 5] 0.00-10.00 sec 2.74 GBytes 2.36 Gbits/sec 0 sender
  [ 5] 0.00-10.04 sec 2.74 GBytes 2.34 Gbits/sec receiver

Co-developed-by: Lai Peter Jun Ann <jun.ann.lai@intel.com> Signed-off-by: Lai Peter Jun Ann <jun.ann.lai@intel.com> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240424210256.3440903-1-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Jiri Pirko says:

====================
selftests: virtio_net: introduce initial testing infrastructure

This patchset aims at introducing very basic initial infrastructure for virtio_net testing, namely it focuses on virtio feature testing.

The first patch adds support for debugfs for virtio devices, allowing user to filter features to pretend to be driver that is not capable of the filtered feature.

Example:
  $ cat /sys/bus/virtio/devices/virtio0/features
  1110010111111111111101010000110010000000100000000000000000000000
  $ echo "5" >/sys/kernel/debug/virtio/virtio0/filter_feature_add
  $ cat /sys/kernel/debug/virtio/virtio0/filter_features
  5
  $ echo "virtio0" > /sys/bus/virtio/drivers/virtio_net/unbind
  $ echo "virtio0" > /sys/bus/virtio/drivers/virtio_net/bind
  $ cat /sys/bus/virtio/devices/virtio0/features
  1110000111111111111101010000110010000000100000000000000000000000

Leverage that in the last patch that lays ground for virtio_net selftests testing, including very basic F_MAC feature test.

To run this, do:
  $ make -C tools/testing/selftests/ TARGETS=drivers/net/virtio_net/ run_tests

It is assumed, as with lot of other selftests in the net group, that there are netdevices connected back-to-back. In this case, two virtio_net devices connected back to back. If you use "tap" qemu netdevice type, to configure this loop on a hypervisor, one may use this script:
  DEV1="$1"
  DEV2="$2"

  sudo tc qdisc add dev $DEV1 clsact
  sudo tc qdisc add dev $DEV2 clsact
  sudo tc filter add dev $DEV1 ingress protocol all pref 1 matchall action mirred egress redirect dev $DEV2
  sudo tc filter add dev $DEV2 ingress protocol all pref 1 matchall action mirred egress redirect dev $DEV1
  sudo ip link set $DEV1 up
  sudo ip link set $DEV2 up

Another possibility is to use virtme-ng like this:
  $ vng --network=loop
or directly:
  $ vng --network=loop -- make -C tools/testing/selftests/ TARGETS=drivers/net/virtio_net/ run_tests

"loop" network type will take care of creating two "hubport" qemu netdevs putting them into a single hub.

To do it manually with qemu, pass following command line options:
  -nic hubport,hubid=1,id=nd0,model=virtio-net-pci
  -nic hubport,hubid=1,id=nd1,model=virtio-net-pci
====================

Link: https://lore.kernel.org/r/20240424104049.3935572-1-jiri@resnulli.us Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Jiri Pirko authored
Introduce initial tests for virtio_net driver. Focus on feature testing leveraging previously introduced debugfs feature filtering infrastructure. Add very basic ping and F_MAC feature tests. To run this, do: $ make -C tools/testing/selftests/ TARGETS=drivers/net/virtio_net/ run_tests Run it on a system with 2 virtio_net devices connected back-to-back on the hypervisor. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Tested-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Jiri Pirko authored
The existing setup_wait*() helper family checks that the interface is up. Introduce wait_for_dev() to wait for the netdevice to appear, for example after the test script does a manual device bind. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
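A minimal sketch of what waiting for a netdevice to appear could look like. This is not the helper added by the patch: the function name wait_for_netdev, the 10 second default timeout, and the SYSFS_NET override (used so the function can be tried against a scratch directory) are all assumptions made for illustration.

```shell
# Hypothetical sketch; the real wait_for_dev() helper may work differently.
# SYSFS_NET is an assumed override for trying this outside real sysfs.
wait_for_netdev() {
	local dev=$1
	local timeout=${2:-10}	# seconds; the default is an assumption
	local elapsed=0

	# A registered netdevice shows up as an entry under /sys/class/net.
	while [ ! -e "${SYSFS_NET:-/sys/class/net}/$dev" ]; do
		if [ "$elapsed" -ge "$timeout" ]; then
			return 1	# device never appeared
		fi
		sleep 1
		elapsed=$((elapsed + 1))
	done
	return 0
}
```

A test script would call something like this right after rebinding the driver and before any check that the link is up, since the latter presumes the device already exists.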
-
Jiri Pirko authored
Add a helper to be used to check if the netdevice is backed by specified driver. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
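One way such a check could be sketched, assuming the driver is identified through the device/driver symlink in sysfs. The function names netdev_driver_name and netdev_has_driver and the SYSFS_NET override are hypothetical and are not taken from the patch.

```shell
# Hypothetical sketch, not the helper from the patch.
# Prints the driver name behind a netdevice, based on the sysfs symlink.
netdev_driver_name() {
	local link="${SYSFS_NET:-/sys/class/net}/$1/device/driver"

	# Purely virtual devices have no device/driver link.
	[ -L "$link" ] || return 1
	basename "$(readlink "$link")"
}

# Succeeds only if netdevice $1 is backed by driver $2.
netdev_has_driver() {
	[ "$(netdev_driver_name "$1")" = "$2" ]
}
```

For example, `netdev_has_driver eth0 virtio_net || exit 1` would let a test bail out early when run against the wrong kind of device.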
-
Jiri Pirko authored
Allow driver tests to work without specifying the netdevice names. Introduce a possibility to search for available netdevices according to set driver name. Allow test to specify the name by setting NETIF_FIND_DRIVER variable. Note that user overrides this either by passing netdevice names on the command line or by declaring NETIFS array in custom forwarding.config configuration file. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
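The search by driver name might look roughly like the sketch below. Only NETIF_FIND_DRIVER and NETIFS come from the commit message; the function name netdevs_by_driver, the SYSFS_NET override, and the sysfs-based lookup are assumptions, and the actual library code may differ.

```shell
# Hypothetical sketch: list every netdevice backed by the given driver.
# SYSFS_NET is an assumed override so this can run on a scratch directory.
netdevs_by_driver() {
	local driver=$1
	local path

	for path in "${SYSFS_NET:-/sys/class/net}"/*; do
		# Skip devices without a driver link (for example loopback).
		[ -L "$path/device/driver" ] || continue
		if [ "$(basename "$(readlink "$path/device/driver")")" = "$driver" ]; then
			basename "$path"
		fi
	done
}
```

A test that sets NETIF_FIND_DRIVER=virtio_net could then take the first two names printed by `netdevs_by_driver "$NETIF_FIND_DRIVER"`, unless the user already supplied names on the command line or through NETIFS.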
-
Jiri Pirko authored
Currently there is no way for user to set what features the driver should obey or not; it is hard wired in the code.

In order to be able to debug the device behavior in case some feature is disabled, introduce a debugfs infrastructure with couple of files allowing user to see what features the device advertises and to set filter for features used by driver.

Example:
  $ cat /sys/bus/virtio/devices/virtio0/features
  1110010111111111111101010000110010000000100000000000000000000000
  $ echo "5" >/sys/kernel/debug/virtio/virtio0/filter_feature_add
  $ cat /sys/kernel/debug/virtio/virtio0/filter_features
  5
  $ echo "virtio0" > /sys/bus/virtio/drivers/virtio_net/unbind
  $ echo "virtio0" > /sys/bus/virtio/drivers/virtio_net/bind
  $ cat /sys/bus/virtio/devices/virtio0/features
  1110000111111111111101010000110010000000100000000000000000000000

Note that sysfs "features" now already exists, this patch does not touch it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
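The two feature strings in the example are hard to compare by eye. The sketch below prints the positions at which they differ, counting from 0 at the leftmost character, which appears to be how the example numbers feature bits (filtering "5" clears the sixth character). The function name feature_diff is hypothetical.

```shell
# Hypothetical helper: print each 0-based position where two feature
# strings (as read from the sysfs "features" file) differ.
feature_diff() {
	awk -v a="$1" -v b="$2" 'BEGIN {
		n = length(a)
		for (i = 1; i <= n; i++)
			if (substr(a, i, 1) != substr(b, i, 1))
				printf "%d\n", i - 1
	}'
}
```

Run on the strings captured before and after the rebind in the example, this prints only 5, which matches the bit written to filter_feature_add.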
-
Paolo Abeni authored
Lukasz Majewski says: ==================== net: hsr: Add support for HSR-SAN (RedBOX) This patch set provides v6 of HSR-SAN (RedBOX) as well as hsr_redbox.sh test script. The most straightforward way to test those patches is to use buildroot (2024.02.01) to create rootfs and QEMU based environment to run x86_64 Linux. Then one shall run hsr_redbox.sh and hsr_ping.sh from tools/testing/selftests/net/hsr. ==================== Link: https://lore.kernel.org/r/20240423124908.2073400-1-lukma@denx.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Lukasz Majewski authored
This patch adds hsr_redbox.sh script to test if HSR-SAN mode of operation works correctly. Signed-off-by: Lukasz Majewski <lukma@denx.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Lukasz Majewski authored
Current code checks if the ping command output matches a hardcoded pattern: "10 packets transmitted, 10 received, 0% packet loss,". Such an approach works only with one ping program version (the one for which this test was originally written). This patch addresses the problem that arises when a ping with a different summary output, like "10 packets transmitted, 10 packets received, 0% packet", is used to run this test, for example the one from busybox (as the test system runs in QEMU with rootfs created with buildroot). The fix is to modify the handling of the ping command output so that it is agnostic to the ping version used on the platform. Signed-off-by: Lukasz Majewski <lukma@denx.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
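One possible way to make such a check independent of the ping flavour is to compare the transmitted and received counters instead of matching the whole summary line. The sketch below is only an illustration of that idea, not the change made by the patch; the function name ping_summary_ok is hypothetical.

```shell
# Hypothetical sketch: succeed only if the ping summary reports that
# every transmitted packet was received. Accepts both summary styles
# quoted above ("10 received" and "10 packets received").
ping_summary_ok() {
	printf '%s\n' "$1" | awk '
		/packets transmitted/ {
			tx = $1
			for (i = 2; i <= NF; i++) {
				if ($i ~ /^received/) {
					# Step over the optional word "packets".
					rx = ($(i - 1) == "packets") ? $(i - 2) : $(i - 1)
				}
			}
			seen = 1
		}
		END { exit !(seen && tx > 0 && tx == rx) }
	'
}
```

A test could capture the ping output into a variable and call `ping_summary_ok "$out"`; any loss, or a missing summary line, makes the function return non-zero.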
-
Lukasz Majewski authored
Some of the code already present in the hsr_ping.sh test program can be moved to a separate script file, so it can be reused by other HSR functionality (like HSR-SAN) tests. Signed-off-by: Lukasz Majewski <lukma@denx.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Lukasz Majewski authored
Some parts (like netns creation and cleanup) of hsr_ping.sh script are already implemented in ../lib.sh common script, so can be replaced by it. Signed-off-by: Lukasz Majewski <lukma@denx.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Lukasz Majewski authored
Introduce RedBox support (HSR-SAN to be more precise) for HSR networks. Following traffic reduction optimizations have been implemented:
- Do not send HSR supervisory frames to Port C (interlink)
- Do not forward to HSR ring frames addressed to Port C
- Do not forward to Port C frames from HSR ring
- Do not send duplicate HSR frame to HSR ring when destination is Port C

The corresponding patch to modify iproute2 sources has already been sent: https://lore.kernel.org/netdev/20240308145729.490863-1-lukma@denx.de/T/

Testing procedure (veth and netns):
-----------------------------------
One shall run: linux-vanila/tools/testing/selftests/net/hsr/hsr_redbox.sh
(Detailed description of the setup one can find in the test script file).

Testing procedure (real hardware):
----------------------------------
The EVB-KSZ9477 has been used for testing on net-next branch (SHA1: 5fc68320).

Ports 4/5 were used for SW managed HSR (hsr1) as first hsr0 for ports 1/2 (with HW offloading for ksz9477) was created. Port 3 has been used as interlink port (single USB-ETH dongle).

Configuration - RedBox (EVB-KSZ9477):
  ip link set lan1 down;ip link set lan2 down
  ip link add name hsr0 type hsr slave1 lan1 slave2 lan2 supervision 45 version 1
  ip link add name hsr1 type hsr slave1 lan4 slave2 lan5 interlink lan3 supervision 45 version 1
  ip link set lan4 up;ip link set lan5 up
  ip link set lan3 up
  ip addr add 192.168.0.11/24 dev hsr1
  ip link set hsr1 up

Configuration - DAN-H (EVB-KSZ9477):
  ip link set lan1 down;ip link set lan2 down
  ip link add name hsr0 type hsr slave1 lan1 slave2 lan2 supervision 45 version 1
  ip link add name hsr1 type hsr slave1 lan4 slave2 lan5 supervision 45 version 1
  ip link set lan4 up;ip link set lan5 up
  ip addr add 192.168.0.12/24 dev hsr1
  ip link set hsr1 up

This approach uses only SW based HSR devices (hsr1).

  --------------          -----------------       ------------
  DAN-H  Port5 | <------> | Port5         |       |
         Port4 | <------> | Port4   Port3 | <---> | PC
               |          | (RedBox)      |       | (USB-ETH)
  EVB-KSZ9477  |          | EVB-KSZ9477   |       |
  --------------          -----------------       ------------

Signed-off-by: Lukasz Majewski <lukma@denx.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Davide Caratti authored
Xiumei and Christoph reported the following lockdep splat, complaining of the qdisc root lock being taken twice:

  ============================================
  WARNING: possible recursive locking detected
  6.7.0-rc3+ #598 Not tainted
  --------------------------------------------
  swapper/2/0 is trying to acquire lock:
  ffff888177190110 (&sch->q.lock){+.-.}-{2:2}, at: __dev_queue_xmit+0x1560/0x2e70

  but task is already holding lock:
  ffff88811995a110 (&sch->q.lock){+.-.}-{2:2}, at: __dev_queue_xmit+0x1560/0x2e70

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(&sch->q.lock);
    lock(&sch->q.lock);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  5 locks held by swapper/2/0:
   #0: ffff888135a09d98 ((&in_dev->mr_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x11a/0x510
   #1: ffffffffaaee5260 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x2c0/0x1ed0
   #2: ffffffffaaee5200 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x209/0x2e70
   #3: ffff88811995a110 (&sch->q.lock){+.-.}-{2:2}, at: __dev_queue_xmit+0x1560/0x2e70
   #4: ffffffffaaee5200 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x209/0x2e70

  stack backtrace:
  CPU: 2 PID: 0 Comm: swapper/2 Not tainted 6.7.0-rc3+ #598
  Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014
  Call Trace:
   <IRQ>
   dump_stack_lvl+0x4a/0x80
   __lock_acquire+0xfdd/0x3150
   lock_acquire+0x1ca/0x540
   _raw_spin_lock+0x34/0x80
   __dev_queue_xmit+0x1560/0x2e70
   tcf_mirred_act+0x82e/0x1260 [act_mirred]
   tcf_action_exec+0x161/0x480
   tcf_classify+0x689/0x1170
   prio_enqueue+0x316/0x660 [sch_prio]
   dev_qdisc_enqueue+0x46/0x220
   __dev_queue_xmit+0x1615/0x2e70
   ip_finish_output2+0x1218/0x1ed0
   __ip_finish_output+0x8b3/0x1350
   ip_output+0x163/0x4e0
   igmp_ifc_timer_expire+0x44b/0x930
   call_timer_fn+0x1a2/0x510
   run_timer_softirq+0x54d/0x11a0
   __do_softirq+0x1b3/0x88f
   irq_exit_rcu+0x18f/0x1e0
   sysvec_apic_timer_interrupt+0x6f/0x90
   </IRQ>

This happens when TC does a mirred egress redirect from the root qdisc of device A to the root qdisc of device B. As long as these two locks aren't protecting the same qdisc, they can be acquired in chain: add a per-qdisc lockdep key to silence false warnings. This dynamic key should safely replace the static key we have in sch_htb: it was added to allow enqueueing to the device "direct qdisc" while still holding the qdisc root lock.

v2: don't use static keys anymore in HTB direct qdiscs (thanks Eric Dumazet)

CC: Maxim Mikityanskiy <maxim@isovalent.com> CC: Xiumei Mu <xmu@redhat.com> Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/451 Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/7dc06d6158f72053cf877a82e2a7a5bd23692faa.1713448007.git.dcaratti@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Jakub Kicinski authored
Tony Nguyen says:

====================
net: intel: start The Great Code Dedup + Page Pool for iavf

Alexander Lobakin says:

Here's a two-shot: introduce {,Intel} Ethernet common library (libeth and libie) and switch iavf to Page Pool. Details are in the commit messages; here's a summary:

Not a secret there's a ton of code duplication between two and more Intel ethernet modules. Before introducing new changes, which would need to be copied over again, start decoupling the already existing duplicate functionality into a new module, which will be shared between several Intel Ethernet drivers. The first name that came to my mind was "libie" -- "Intel Ethernet common library". Also this sounds like "lovelie" (-> one word, no "lib I E" pls) and can be expanded as "lib Internet Explorer" :P

The "generic", pure-software part is placed separately, so that it can be easily reused in any driver by any vendor without linking to the Intel pre-200G guts. In a few words, it's something any modern driver does the same way, but nobody moved it level up (yet).

The series is only the beginning. From now on, adding every new feature or doing any good driver refactoring will remove much more lines than add for quite some time. There's a basic roadmap with some deduplications planned already, not speaking of that touching every line now asks: "can I share this?". The final destination is very ambitious: have only one unified driver for at least i40e, ice, iavf, and idpf with a struct ops for each generation. That's never gonna happen, right? But you still can at least try.

PP conversion for iavf lands within the same series as these two are tied closely. libie will support Page Pool model only, so that a driver can't use much of the lib until it's converted. iavf is only the example, the rest will eventually be converted soon on a per-driver basis. That is when it gets really interesting. Stay tech.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  MAINTAINERS: add entry for libeth and libie
  iavf: switch to Page Pool
  iavf: pack iavf_ring more efficiently
  libeth: add Rx buffer management
  page_pool: add DMA-sync-for-CPU inline helper
  page_pool: constify some read-only function arguments
  slab: introduce kvmalloc_array_node() and kvcalloc_node()
  iavf: drop page splitting and recycling
  iavf: kill "legacy-rx" for good
  net: intel: introduce {, Intel} Ethernet common library
====================

Link: https://lore.kernel.org/r/20240424203559.3420468-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Asbjørn Sloth Tønnesen says: ==================== net: lan966x: flower: validate control flags This series adds flower control flags validation to the lan966x driver, and changes it from assuming that it handles all control flags, to instead reject rules if they have masked any unknown/unsupported control flags. v1: https://lore.kernel.org/netdev/20240423102720.228728-1-ast@fiberby.net/ ==================== Link: https://lore.kernel.org/r/20240424125347.461995-1-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Asbjørn Sloth Tønnesen authored
Use flow_rule_is_supp_control_flags() to reject filters with unsupported control flags. In case any unsupported control flags are masked, flow_rule_is_supp_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20240424125347.461995-4-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Asbjørn Sloth Tønnesen authored
Rename goto label, as the error message is specific to the fragment flags. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20240424125347.461995-3-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Asbjørn Sloth Tønnesen authored
Define extack locally, to reduce line lengths and aid future users. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20240424125347.461995-2-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Asbjørn Sloth Tønnesen says: ==================== net: sparx5: flower: validate control flags This series adds flower control flags validation to the sparx5 driver, and changes it from assuming that it handles all control flags, to instead reject rules if they have masked any unknown/unsupported control flags. Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Tested-by: Daniel Machon <daniel.machon@microchip.com> v1: https://lore.kernel.org/netdev/20240423102728.228765-1-ast@fiberby.net/ ==================== Link: https://lore.kernel.org/r/20240424121632.459022-1-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Asbjørn Sloth Tønnesen authored
Use flow_rule_is_supp_control_flags() to reject filters with unsupported control flags. In case any unsupported control flags are masked, flow_rule_is_supp_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Tested-by: Daniel Machon <daniel.machon@microchip.com> Link: https://lore.kernel.org/r/20240424121632.459022-5-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Asbjørn Sloth Tønnesen authored
Remove goto, as it's only used once, and the error message is specific to that context. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Tested-by: Daniel Machon <daniel.machon@microchip.com> Link: https://lore.kernel.org/r/20240424121632.459022-4-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Asbjørn Sloth Tønnesen authored
Define extack locally, to reduce line lengths and aid future users. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Tested-by: Daniel Machon <daniel.machon@microchip.com> Link: https://lore.kernel.org/r/20240424121632.459022-3-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Asbjørn Sloth Tønnesen authored
The fragment lookup should only be performed when at least one of the fragment flags is set. This change was deliberately not included in commit 68aba004 ("net: sparx5: flower: fix fragment flags handling") as it's only needed for future proofing the code, since "mask" is currently only set in conjunction with the fragment flags. (The 3rd flag FLOW_DIS_ENCAPSULATION is only used with "key") Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Tested-by: Daniel Machon <daniel.machon@microchip.com> Link: https://lore.kernel.org/r/20240424121632.459022-2-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Breno Leitao authored
Embedding net_device into structures prohibits the usage of flexible arrays in the net_device structure. For more details, see the discussion at [1]. Un-embed the net_device from the private struct by converting it into a pointer. Then leverage the new alloc_netdev_dummy() helper to allocate and initialize dummy devices. [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240424161108.3397057-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Simon Horman says: ==================== net: microchip: Correct spelling in comments Correct spelling in comments in Microchip drivers. Flagged by codespell. v1: https://lore.kernel.org/r/20240419-lan743x-confirm-v1-0-2a087617a3e5@kernel.org ==================== Link: https://lore.kernel.org/r/20240424-lan743x-confirm-v2-0-f0480542e39f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Simon Horman authored
Correct spelling in comments, as flagged by codespell. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Link: https://lore.kernel.org/r/20240424-lan743x-confirm-v2-4-f0480542e39f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Simon Horman authored
Correct spelling in comments, as flagged by codespell. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Link: https://lore.kernel.org/r/20240424-lan743x-confirm-v2-3-f0480542e39f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Simon Horman authored
Correct spelling in comments, as flagged by codespell. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20240424-lan743x-confirm-v2-2-f0480542e39f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Simon Horman authored
Correct spelling in comments, as flagged by codespell. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Link: https://lore.kernel.org/r/20240424-lan743x-confirm-v2-1-f0480542e39f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Hayes Wang authored
Someone complained that the message appears continuously. This occurs because the device is woken from UPS mode, and the driver re-loads the firmware. When the device enters runtime suspend and the cable is unplugged, the device enters UPS mode. If a runtime resume occurs and the device is woken from UPS mode, the driver has to re-load the firmware, which causes the message. If someone wakes the device continuously, the message is shown continuously, too. Use dev_dbg to avoid it. Note that the function could be called before register_netdev(), so I don't use netif_info() or netif_dbg(). Signed-off-by: Hayes Wang <hayeswang@realtek.com> Link: https://lore.kernel.org/r/20240424084532.159649-1-hayeswang@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ma Ke authored
usbnet_get_endpoints() can fail, so check its return value instead of ignoring it. Signed-off-by: Ma Ke <make_ruc2021@163.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240424065634.1870027-1-make_ruc2021@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Daniel Golle authored
Add quirk for ATS SFP-GE-T 1000Base-TX module. This copper module comes with broken TX_FAULT indicator which must be ignored for it to work. Co-authored-by: Josef Schlehofer <pepe.schlehofer@gmail.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> [ rebased on top of net-next ] Signed-off-by: Marek Behún <kabel@kernel.org> Link: https://lore.kernel.org/r/20240423090025.29231-1-kabel@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Marek Behún authored
Enhance the quirk for Fibrestore 2.5G copper SFP module. The original commit e27aca37 ("net: sfp: add quirk for FS's 2.5G copper SFP") introducing the quirk says that the PHY is inaccessible, but that is not true. The module uses Rollball protocol to talk to the PHY, and needs a 4 second wait before probing it, same as FS 10G module. The PHY inside the module is Realtek RTL8221B-VB-CG PHY. The realtek driver recently gained support to set it up via clause 45 accesses. Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240423085039.26957-2-kabel@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Marek Behún authored
Update the comment for the Fibrestore SFP-10G-T module: since commit e9301af3 ("net: sfp: fix PHY discovery for FS SFP-10G-T module") we also do a 4 second wait before probing the PHY. Fixes: e9301af3 ("net: sfp: fix PHY discovery for FS SFP-10G-T module") Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240423085039.26957-1-kabel@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 25 Apr, 2024 3 commits
-
-
Eric Dumazet authored
I had failures with pmtu.sh selftests lately, with netns dismantles firing ref_tracking alerts [1].

After much debugging, I found that some queued rcu callbacks were delayed by minutes, because of CONFIG_RCU_LAZY=y option.

Joel Fernandes had a similar issue in the past, fixed with commit 483c26ff ("net: Use call_rcu_hurry() for dst_release()")

In this commit, I make sure nexthop_free_rcu() and free_fib_info_rcu() are not delayed too much because they both can release device references.

tools/testing/selftests/net/pmtu.sh no longer fails.

Traces were:

  [ 968.179860] ref_tracker: veth_A-R1@00000000d0ff3fe2 has 3/5 users at
   dst_alloc+0x76/0x160
   ip6_dst_alloc+0x25/0x80
   ip6_pol_route+0x2a8/0x450
   ip6_pol_route_output+0x1f/0x30
   fib6_rule_lookup+0x163/0x270
   ip6_route_output_flags+0xda/0x190
   ip6_dst_lookup_tail.constprop.0+0x1d0/0x260
   ip6_dst_lookup_flow+0x47/0xa0
   udp_tunnel6_dst_lookup+0x158/0x210
   vxlan_xmit_one+0x4c2/0x1550 [vxlan]
   vxlan_xmit+0x52d/0x14f0 [vxlan]
   dev_hard_start_xmit+0x7b/0x1e0
   __dev_queue_xmit+0x20b/0xe40
   ip6_finish_output2+0x2ea/0x6e0
   ip6_finish_output+0x143/0x320
   ip6_output+0x74/0x140

  [ 968.179860] ref_tracker: veth_A-R1@00000000d0ff3fe2 has 1/5 users at
   netdev_get_by_index+0xc0/0xe0
   fib6_nh_init+0x1a9/0xa90
   rtm_new_nexthop+0x6fa/0x1580
   rtnetlink_rcv_msg+0x155/0x3e0
   netlink_rcv_skb+0x61/0x110
   rtnetlink_rcv+0x19/0x20
   netlink_unicast+0x23f/0x380
   netlink_sendmsg+0x1fc/0x430
   ____sys_sendmsg+0x2ef/0x320
   ___sys_sendmsg+0x86/0xd0
   __sys_sendmsg+0x67/0xc0
   __x64_sys_sendmsg+0x21/0x30
   x64_sys_call+0x252/0x2030
   do_syscall_64+0x6c/0x190
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

  [ 968.179860] ref_tracker: veth_A-R1@00000000d0ff3fe2 has 1/5 users at
   ipv6_add_dev+0x136/0x530
   addrconf_notify+0x19d/0x770
   notifier_call_chain+0x65/0xd0
   raw_notifier_call_chain+0x1a/0x20
   call_netdevice_notifiers_info+0x54/0x90
   register_netdevice+0x61e/0x790
   veth_newlink+0x230/0x440
   __rtnl_newlink+0x7d2/0xaa0
   rtnl_newlink+0x4c/0x70
   rtnetlink_rcv_msg+0x155/0x3e0
   netlink_rcv_skb+0x61/0x110
   rtnetlink_rcv+0x19/0x20
   netlink_unicast+0x23f/0x380
   netlink_sendmsg+0x1fc/0x430
   ____sys_sendmsg+0x2ef/0x320
   ___sys_sendmsg+0x86/0xd0
  ....
  [ 1079.316024] ? show_regs+0x68/0x80
  [ 1079.316087] ? __warn+0x8c/0x140
  [ 1079.316103] ? ref_tracker_free+0x1a0/0x270
  [ 1079.316117] ? report_bug+0x196/0x1c0
  [ 1079.316135] ? handle_bug+0x42/0x80
  [ 1079.316149] ? exc_invalid_op+0x1c/0x70
  [ 1079.316162] ? asm_exc_invalid_op+0x1f/0x30
  [ 1079.316193] ? ref_tracker_free+0x1a0/0x270
  [ 1079.316208] ? _raw_spin_unlock+0x1a/0x40
  [ 1079.316222] ? free_unref_page+0x126/0x1a0
  [ 1079.316239] ? destroy_large_folio+0x69/0x90
  [ 1079.316251] ? __folio_put+0x99/0xd0
  [ 1079.316276] dst_dev_put+0x69/0xd0
  [ 1079.316308] fib6_nh_release_dsts.part.0+0x3d/0x80
  [ 1079.316327] fib6_nh_release+0x45/0x70
  [ 1079.316340] nexthop_free_rcu+0x131/0x170
  [ 1079.316356] rcu_do_batch+0x1ee/0x820
  [ 1079.316370] ? rcu_do_batch+0x179/0x820
  [ 1079.316388] rcu_core+0x1aa/0x4d0
  [ 1079.316405] rcu_core_si+0x12/0x20
  [ 1079.316417] __do_softirq+0x13a/0x3dc
  [ 1079.316435] __irq_exit_rcu+0xa3/0x110
  [ 1079.316449] irq_exit_rcu+0x12/0x30
  [ 1079.316462] sysvec_apic_timer_interrupt+0x5b/0xe0
  [ 1079.316474] asm_sysvec_apic_timer_interrupt+0x1f/0x30
  [ 1079.316569] RIP: 0033:0x7f06b65c63f0

Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240423205408.39632-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski authored
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/ti/icssg/icssg_prueth.c

net/mac80211/chan.c
  89884459 ("wifi: mac80211: fix idle calculation with multi-link")
  87f55002 ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()")
https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/

net/unix/garbage.c
  1971d13f ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().")
  4090fa37 ("af_unix: Replace garbage collection algorithm.")

drivers/net/ethernet/ti/icssg/icssg_prueth.c
drivers/net/ethernet/ti/icssg/icssg_common.c
  4dcd0e83 ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()")
  e2dc7bfd ("net: ti: icssg-prueth: Move common functions into a separate file")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
While testing TCP performance with latest trees, I saw suspect SOCKET_BACKLOG drops.

tcp_add_backlog() computes its limit with:

    limit = (u32)READ_ONCE(sk->sk_rcvbuf) +
            (u32)(READ_ONCE(sk->sk_sndbuf) >> 1);
    limit += 64 * 1024;

This does not take into account that sk->sk_backlog.len is reset only at the very end of __release_sock().

Both sk->sk_backlog.len and sk->sk_rmem_alloc could reach sk_rcvbuf in normal conditions.

We should double sk->sk_rcvbuf contribution in the formula to absorb bubbles in the backlog, which happen more often for very fast flows.

This change maintains decent protection against abuses.

Fixes: c377411f ("net: sk_add_backlog() take rmem_alloc into account") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240423125620.3309458-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
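The effect on the limit can be checked with plain shell arithmetic. This sketch assumes the fix does nothing more than double the sk_rcvbuf term, as the description states; the function names and the buffer sizes used below are illustrative and not taken from the patch, and any clamping the kernel code may apply is not modelled.

```shell
# Hypothetical helpers mirroring the arithmetic only (values in bytes).
# $1 = sk_rcvbuf, $2 = sk_sndbuf

# Limit as quoted in the commit message.
backlog_limit_old() {
	echo $(( $1 + ($2 >> 1) + 64 * 1024 ))
}

# Limit with the sk_rcvbuf contribution doubled (assumed form of the fix).
backlog_limit_new() {
	echo $(( ($1 << 1) + ($2 >> 1) + 64 * 1024 ))
}
```

With an illustrative 131072 byte receive buffer and a 16384 byte send buffer, `backlog_limit_old 131072 16384` prints 204800 and `backlog_limit_new 131072 16384` prints 335872, so the extra headroom equals exactly one sk_rcvbuf.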
-