- 01 Dec, 2022 20 commits
-
-
David Howells authored
Split the packet input handler to make the softirq side just dump the received packet into the local endpoint receive queue and then call the remainder of the input function from the I/O thread. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Create a per-local receive queue to which, in a future patch, all incoming packets will be directed and an I/O thread that will process those packets and perform all transmission of packets. Destruction of the local endpoint is also moved from the local processor work item (which will be absorbed) to the thread. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Split the code that handles packet reception in softirq mode as a prelude to moving all the packet processing beyond routing to the appropriate call and setting up of a new call out into process context. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Currently, rxrpc gives the connection's work item a ref on the connection when it queues it - and this is called from the timer expiration function. The problem comes when queue_work() fails (ie. the work item is already queued): the timer routine must put the ref - but this may cause the cleanup code to run. This has the unfortunate effect that the cleanup code may then be run in softirq context - which means that any spinlocks it might need to touch have to be guarded to disable softirqs (ie. they need a "_bh" suffix). (1) Don't give a ref to the work item. (2) Simplify handling of service connections by adding a separate active count so that the refcount isn't also used for this. (3) Connection destruction for both client and service connections can then be cleaned up by putting rxrpc_put_connection() out of line and making a tidy progression through the destruction code (offloaded to a workqueue if put from softirq or processor function context). The RCU part of the cleanup then only deals with the freeing at the end. (4) Make rxrpc_queue_conn() return immediately if it sees the active count is -1 rather then queuing the connection. (5) Make sure that the cleanup routine waits for the work item to complete. (6) Stash the rxrpc_net pointer in the conn struct so that the rcu free routine can use it, even if the local endpoint has been freed. Unfortunately, neither the timer nor the work item can simply get around the problem by just using refcount_inc_not_zero() as the waits would still have to be done, and there would still be the possibility of having to put the ref in the expiration function. Note the connection work item is mostly going to go away with the main event work being transferred to the I/O thread, so the wait in (6) will become obsolete. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Currently, rxrpc gives the call timer a ref on the call when it starts it and this is passed along to the workqueue by the timer expiration function. The problem comes when queue_work() fails (ie. the work item is already queued): the timer routine must put the ref - but this may cause the cleanup code to run. This has the unfortunate effect that the cleanup code may then be run in softirq context - which means that any spinlocks it might need to touch have to be guarded to disable softirqs (ie. they need a "_bh" suffix). Fix this by: (1) Don't give a ref to the timer. (2) Making the expiration function not do anything if the refcount is 0. Note that this is more of an optimisation. (3) Make sure that the cleanup routine waits for timer to complete. However, this has a consequence that timer cannot give a ref to the work item. Therefore the following fixes are also necessary: (4) Don't give a ref to the work item. (5) Make the work item return asap if it sees the ref count is 0. (6) Make sure that the cleanup routine waits for the work item to complete. Unfortunately, neither the timer nor the work item can simply get around the problem by just using refcount_inc_not_zero() as the waits would still have to be done, and there would still be the possibility of having to put the ref in the expiration function. Note the call work item is going to go away with the work being transferred to the I/O thread, so the wait in (6) will become obsolete. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the sk_buff tracepoint. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Add a tracepoint for the rxrpc_bundle refcounting. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the rxrpc_call tracepoint Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the rxrpc_conn tracepoint Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the rxrpc_peer tracepoint Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
In rxrpc tracing, use enums to generate lists of points of interest rather than __builtin_return_address() for the rxrpc_local tracepoint Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Extract the code from a received rx ABORT packet much earlier and in a single place and harmonise the responses to malformed ABORT packets. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Remove the rxrpc_conn_parameters struct from the rxrpc_connection and rxrpc_bundle structs and emplace the members directly. These are going to get filled in from the rxrpc_call struct in future. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Remove the _net() and knet() debugging macros in favour of tracepoints. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Remove the kproto() and _proto() debugging macros in preference to using tracepoints for this. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
We should not now see duplicate packets in the recvmsg_queue. At one point, jumbo packets that overlapped with already queued data would be added to the queue and dealt with in recvmsg rather than in the softirq input code, but now jumbo packets are split/cloned before being processed by the input code and the subpackets can be discarded individually. So remove the recvmsg-side code for handling this. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
rxrpc_kernel_call_is_complete() has been removed, so remove its declaration too. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
When retransmitting a packet, rxrpc_resend() shouldn't be attaching a ref to the call to the txbuf as that pins the call and prevents the call from clearing the packet buffer. Signed-off-by: David Howells <dhowells@redhat.com> Fixes: d57a3a15 ("rxrpc: Save last ACK's SACK table rather than marking txbufs") cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
David Howells authored
Implement an in-kernel rxperf server to allow kernel-based rxrpc services to be tested directly, unlike with AFS where they're accessed by the fileserver when the latter decides it wants to. This is implemented as a module that, if loaded, opens UDP port 7009 (afs3-rmtsys) and listens on it for incoming calls. Calls can be generated using the rxperf command shipped with OpenAFS, for example. Changes ======= ver #2) - Use min_t() instead of min(). Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: Jakub Kicinski <kuba@kernel.org>
-
David Howells authored
Fix the following checker warning: ../net/rxrpc/key.c:692:9: error: subtraction of different types can't work (different address spaces) Checker is wrong in this case, but cast the pointers to unsigned long to avoid the warning. Whilst we're at it, reduce the assertions to WARN_ON() and return an error. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
-
- 30 Nov, 2022 15 commits
-
-
Willem de Bruijn authored
Test NIC hardware checksum offload: - Rx + Tx - IPv4 + IPv6 - TCP + UDP Optional features: - zero checksum 0xFFFF - checksum disable 0x0000 - transport encap headers - randomization See file header for detailed comments. Expected results differ depending on NIC features: - CHECKSUM_UNNECESSARY vs CHECKSUM_COMPLETE - NETIF_F_HW_CSUM (csum_start/csum_off) vs NETIF_F_IP(V6)_CSUM Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20221128140210.553391-1-willemdebruijn.kernel@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextJakub Kicinski authored
Steffen Klassert says: ==================== ipsec-next 2022-11-26 1) Remove redundant variable in esp6. From Colin Ian King. 2) Update x->lastused for every packet. It was used only for outgoing mobile IPv6 packets, but showed to be usefull to check if the a SA is still in use in general. From Antony Antony. 3) Remove unused variable in xfrm_byidx_resize. From Leon Romanovsky. 4) Finalize extack support for xfrm. From Sabrina Dubroca. * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: add extack to xfrm_set_spdinfo xfrm: add extack to xfrm_alloc_userspi xfrm: add extack to xfrm_do_migrate xfrm: add extack to xfrm_new_ae and xfrm_replay_verify_len xfrm: add extack to xfrm_del_sa xfrm: add extack to xfrm_add_sa_expire xfrm: a few coding style clean ups xfrm: Remove not-used total variable xfrm: update x->lastused for every packet esp6: remove redundant variable err ==================== Link: https://lore.kernel.org/r/20221126110303.1859238-1-steffen.klassert@secunet.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Maxime Chevallier says: ==================== net: pcs: altera-tse: simplify and clean-up the driver This small series does a bit of code cleanup in the altera TSE pcs driver, removing unused register definitions, handling 1000BaseX speed configuration correctly according to the datasheet, and making use of proper poll_timeout helpers. No functional change is introduced. ==================== Link: https://lore.kernel.org/r/20221125131801.64234-1-maxime.chevallier@bootlin.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Maxime Chevallier authored
remove unused register definitions, left from the split with the altera-tse mac driver. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Maxime Chevallier authored
When disabling the SGMII mode bit, the PCS defaults to 1000BaseX mode. In that mode, we don't need to set the speed since it's always 1000Mbps. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Maxime Chevallier authored
Software resets on the TSE PCS don't clear registers, but rather reset all internal state machines regarding AN, comma detection and encoding/decoding. Use read_poll_timeout to wait for the reset to clear instead of manually polling the register. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Matthieu Baerts says: ==================== mptcp: MSG_FASTOPEN and TFO listener side support Before this series, only the initiator of a connection was able to combine both TCP FastOpen and MPTCP when using TCP_FASTOPEN_CONNECT socket option. These new patches here add (in theory) the full support of TFO with MPTCP, which means: - MSG_FASTOPEN sendmsg flag support (patch 1/8) - TFO support for the listener side (patches 2-5/8) - TCP_FASTOPEN socket option (patch 6/8) - TCP_FASTOPEN_KEY socket option (patch 7/8) To support TFO for the server side, a few preparation patches are needed (patches 2 to 5/8). Some of them were inspired by a previous work from Benjamin Hesmans. Note that TFO support with MPTCP has been validated with selftests (patch 8/8) but also with Packetdrill tests running with a modified but still very WIP version supporting MPTCP. Both the modified tool and the tests are available online: https://github.com/multipath-tcp/packetdrill/ ==================== Link: https://lore.kernel.org/r/20221125222958.958636-1-matthieu.baerts@tessares.netSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Dmytro Shytyi authored
This patch first adds TFO support in mptcp_connect.c. This can be enabled via a new option: -o MPTFO. Once enabled, the TCP_FASTOPEN socket option is enabled for the server side and a sendto() with MSG_FASTOPEN is used instead of a connect() for the client side. Note that the first SYN has a limit of bytes it can carry. In other words, it is allowed to send less data than the provided one. We then need to track more status info to properly allow the next sendmsg() starting from the next part of the data to send the rest. Also in TFO scenarios, we need to completely spool the partially xmitted buffer -- and account for that -- before starting sendfile/mmap xmit, otherwise the relevant tests will fail. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Matthieu Baerts authored
The goal of this socket option is to set different keys per listener, see commit 1fba70e5 ("tcp: socket option to set TCP fast open key") for more details about this socket option. The only thing to do here with MPTCP is to relay the request to the first subflow like it is already done for the other TCP_FASTOPEN* socket options. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Dmytro Shytyi authored
The TCP_FASTOPEN socket option is one way for the application to tell the kernel TFO support has to be enabled for the listener socket. The only thing to do here with MPTCP is to relay the request to the first subflow like it is already done for the other TCP_FASTOPEN* socket options. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Dmytro Shytyi authored
The send_synack() needs to be overridden for MPTCP to support TFO for two reasons: - There is not be enough space in the TCP options if the TFO cookie has to be added in the SYN+ACK with other options: MSS (4), SACK OK (2), Timestamps (10), Window Scale (3+1), TFO (10+2), MP_CAPABLE (12). MPTCPv1 specs -- RFC 8684, section B.1 [1] -- suggest to drop the TCP timestamps option in this case. - The data received in the SYN has to be handled: the SKB can be dequeued from the subflow sk and transferred to the MPTCP sk. Counters need to be updated accordingly and the application can be notified at the end because some bytes have been received. [1] https://www.rfc-editor.org/rfc/rfc8684.html#section-b.1Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Dmytro Shytyi authored
With fastopen in place, the first subflow socket is created before the MPC handshake completes, and we need to properly initialize the sequence numbers at MPC ACK reception. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Currently the initial ack sequence is generated on demand whenever it's requested and the remote key is handy. The relevant code is scattered in different places and can lead to multiple, unneeded, crypto operations. This change consolidates the ack sequence generation code in a single helper, storing the sequence number at the subflow level. The above additionally saves a few conditional in fast-path and will simplify the upcoming fast-open implementation. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Currently in the receive path we don't need to discriminate between MPC SYN, MPC SYN-ACK and MPC ACK, but soon the fastopen code will need that info to properly track the fully established status. Track the exact MPC suboption type into the receive opt bitmap. No functional change intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Dmytro Shytyi authored
Since commit 54f1944e ("mptcp: factor out mptcp_connect()"), all the infrastructure is now in place to support the MSG_FASTOPEN flag, we just need to call into the fastopen path in mptcp_sendmsg(). Co-developed-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 29 Nov, 2022 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski authored
tools/lib/bpf/ringbuf.c 927cbb47 ("libbpf: Handle size overflow for ringbuf mmap") b486d19a ("libbpf: checkpatch: Fixed code alignments in ringbuf.c") https://lore.kernel.org/all/20221121122707.44d1446a@canb.auug.org.au/Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and wifi. Current release - new code bugs: - eth: mlx5e: - use kvfree() in mlx5e_accel_fs_tcp_create() - MACsec, fix RX data path 16 RX security channel limit - MACsec, fix memory leak when MACsec device is deleted - MACsec, fix update Rx secure channel active field - MACsec, fix add Rx security association (SA) rule memory leak Previous releases - regressions: - wifi: cfg80211: don't allow multi-BSSID in S1G - stmmac: set MAC's flow control register to reflect current settings - eth: mlx5: - E-switch, fix duplicate lag creation - fix use-after-free when reverting termination table Previous releases - always broken: - ipv4: fix route deletion when nexthop info is not specified - bpf: fix a local storage BPF map bug where the value's spin lock field can get initialized incorrectly - tipc: re-fetch skb cb after tipc_msg_validate - wifi: wilc1000: fix Information Element parsing - packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE - sctp: fix memory leak in sctp_stream_outq_migrate() - can: can327: fix potential skb leak when netdev is down - can: add number of missing netdev freeing on error paths - aquantia: do not purge addresses when setting the number of rings - wwan: iosm: - fix incorrect skb length leading to truncated packet - fix crash in peek throughput test due to skb UAF" * tag 'net-6.1-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits) net: ethernet: renesas: ravb: Fix promiscuous mode after system resumed MAINTAINERS: Update maintainer list for chelsio drivers ionic: update MAINTAINERS entry sctp: fix memory leak in sctp_stream_outq_migrate() packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE net/mlx5: Lag, Fix for loop when checking lag Revert "net/mlx5e: MACsec, remove replay window size limitation in offload path" net: marvell: prestera: Fix a NULL vs IS_ERR() check in some functions net: tun: Fix use-after-free in tun_detach() net: mdiobus: fix unbalanced node reference count net: hsr: Fix potential use-after-free tipc: re-fetch skb cb after tipc_msg_validate mptcp: fix sleep in atomic at close time mptcp: don't orphan ssk in mptcp_close() dsa: lan9303: Correct stat name ipv4: Fix route deletion when nexthop info is not specified net: wwan: iosm: fix incorrect skb length net: wwan: iosm: fix crash in peek throughput test net: wwan: iosm: fix dma_alloc_coherent incompatible pointer type net: wwan: iosm: fix kernel test robot reported error ...
-
Yuan Can authored
As the nla_nest_start() may fail with NULL returned, the return value should be checked. Note that this is not a real bug, nothing will break here. The next nla_put() will fail as well and we'll bail (and nla_nest_cancel() can handle NULL). But we keep getting those "fixes" so whatever. Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20221129013934.55184-1-yuancan@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yoshihiro Shimoda authored
After system resumed on some environment board, the promiscuous mode is disabled because the SoC turned off. So, call ravb_set_rx_mode() in the ravb_resume() to fix the issue. Reported-by: Tho Vu <tho.vu.wh@renesas.com> Fixes: 0184165b ("ravb: add sleep PM suspend/resume support") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20221128065604.1864391-1-yoshihiro.shimoda.uh@renesas.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ayush Sawal authored
This updates the maintainers for chelsio inline crypto drivers and chelsio crypto drivers. Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Link: https://lore.kernel.org/r/20221128231348.8225-1-ayush.sawal@chelsio.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-