Commit e33200bc authored by David S. Miller

Merge branch 'tls-offload-netdev-and-mlx5-support'

Boris Pismenny says:

====================
TLS offload, netdev & MLX5 support

The following series provides TLS TX inline crypto offload.

v1->v2:
   - Added IS_ENABLED(CONFIG_TLS_DEVICE) and a STATIC_KEY for icsk_clean_acked
   - File license fix
   - Fix spelling, comment by DaveW
   - Move memory allocations out of tls_set_device_offload and other misc fixes,
	comments by Kiril.

v2->v3:
   - Reversed xmas tree where needed and style fixes
   - Removed the need for skb_page_frag_refill, per Eric's comment
   - IPv6 dependency fixes

v3->v4:
   - Remove "inline" from functions in C files
   - Make clean_acked_data_enabled a static variable and add enable/disable functions to control it.
   - Remove unnecessary variable initialization mentioned by ShannonN
   - Rebase over TLS RX
   - Refactor the tls_software_fallback to reduce the number of variables mentioned by KirilT

v4->v5:
   - Add missing CONFIG_TLS_DEVICE

v5->v6:
   - Move changes to the software implementation into a separate patch
   - Fix some checkpatch warnings
   - GPL export the enable/disable clean_acked_data functions

v6->v7:
   - Use the dst_entry to obtain the netdev in dev_get_by_index
   - Remove the IPv6 patch since it is redundant now

v7->v8:
   - Fix a merge conflict in mlx5 header

v8->v9:
   - Fix false -Wmaybe-uninitialized warning
   - Fix empty space at the end of new files

v9->v10:
   - Remove default "n" in net/Kconfig

This series adds a generic infrastructure to offload TLS crypto to
network devices. It enables the kernel TLS socket to skip encryption
and authentication operations on the transmit side of the data path,
leaving those computationally expensive operations to the NIC.

The NIC offload infrastructure builds TLS records and pushes them to the
TCP layer just like the SW KTLS implementation, using the same API.
TCP segmentation is mostly unaffected. Currently the only exception is
that we prevent mixed SKBs where only part of the payload requires
offload. In the future we are likely to add a similar restriction
following a change cipher spec record.
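
For reference, the user-facing API really is the same one: an
application enables kTLS TX as usual, and the kernel transparently takes
the device offload path when the egress netdev advertises
NETIF_F_HW_TLS_TX. A minimal userspace sketch (TLS 1.2, AES-GCM-128;
error handling and key material omitted; the numeric fallbacks for
SOL_TLS and TCP_ULP are assumed from the uapi headers):

  #include <linux/tls.h>
  #include <netinet/tcp.h>
  #include <sys/socket.h>

  #ifndef SOL_TLS
  #define SOL_TLS 282	/* value from include/linux/socket.h */
  #endif
  #ifndef TCP_ULP
  #define TCP_ULP 31	/* value from include/uapi/linux/tcp.h */
  #endif

  static int enable_ktls_tx(int fd, struct tls12_crypto_info_aes_gcm_128 *ci)
  {
  	/* caller fills ci->key/iv/salt/rec_seq from its TLS handshake */
  	ci->info.version = TLS_1_2_VERSION;
  	ci->info.cipher_type = TLS_CIPHER_AES_GCM_128;

  	/* attach the TLS ULP, then install the TX crypto state */
  	if (setsockopt(fd, SOL_TCP, TCP_ULP, "tls", sizeof("tls")))
  		return -1;
  	return setsockopt(fd, SOL_TLS, TLS_TX, ci, sizeof(*ci));
  }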

The notable differences between the SW KTLS and NIC offloaded TLS
implementations are as follows:
1. The offloaded implementation builds "plaintext TLS records": these
records contain plaintext instead of ciphertext and placeholder bytes
instead of authentication tags.
2. The offloaded implementation maintains a mapping from TCP sequence
number to TLS records. Thus, given a TCP SKB sent from a NIC offloaded
TLS socket, we can use the TLS NIC offload infrastructure to obtain
enough context to encrypt the payload of the SKB (see the lookup sketch
below).
A TLS record is released when the last byte of the record is acked;
this is done through the new icsk_clean_acked callback.
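
As an illustration of point 2, a minimal sketch of the lookup a driver
performs, modeled on mlx5e_tls_get_sync_data() in the mlx5 patch below
(tls_get_record() and the record helpers come from the offload
infrastructure added by this series; error handling trimmed):

  #include <net/tls.h>

  static int lookup_record(struct tls_offload_context *ctx, u32 tcp_seq,
  			 u64 *rcd_sn)
  {
  	struct tls_record_info *record;
  	unsigned long flags;
  	int ret = -EINVAL;

  	spin_lock_irqsave(&ctx->lock, flags);
  	record = tls_get_record(ctx, tcp_seq, rcd_sn);
  	if (record) {
  		/* tls_record_start_seq(record)..record->end_seq bound the
  		 * record; *rcd_sn is the TLS record sequence number needed
  		 * to rebuild the per-record nonce.
  		 */
  		ret = 0;
  	}
  	spin_unlock_irqrestore(&ctx->lock, flags);
  	return ret;	/* NULL record: last byte already acked and freed */
  }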

The infrastructure should be extensible to support various NIC offload
implementations. However, it is currently written with the
implementation below in mind:
The NIC assumes that packets from each offloaded stream are sent as
plaintext and in-order. It keeps track of the TLS records in the TCP
stream. When a packet marked for offload is transmitted, the NIC
encrypts the payload in-place and puts authentication tags in the
relevant place holders.
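
In the mlx5 implementation below, "marked for offload" concretely means
an 8-byte metadata header inserted after the Ethernet header (see
struct mlx5e_tls_metadata and mlx5e_tls_add_metadata() in tls_rxtx.c);
schematically, and assuming the hardware strips the metadata on TX:

  /* TX packet layout for an offloaded stream:
   *
   *   [ MACs | ethertype 0x8CE4 | 1B syndrome + 3B swid | first_seq |
   *     original ethertype | IP | TCP | TLS plaintext + tag placeholder ]
   *
   * The device recognizes ethertype 0x8CE4, consumes the metadata,
   * encrypts the TLS payload in place and fills the tag placeholder.
   */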

The responsibility for handling out-of-order packets (e.g. TCP
retransmissions, qdisc drops) falls on the netdev driver.

The netdev driver keeps track of the expected TCP SN from the NIC's
perspective.  If the next packet to transmit matches the expected TCP
SN, the driver advances the expected TCP SN, and transmits the packet
with TLS offload indication.

If the next packet to transmit does not match the expected TCP SN, the
driver calls the TLS layer to obtain the TLS record that includes the
TCP sequence number of the packet. Using this TLS record, the driver
posts a work entry on the transmit queue to reconstruct the NIC TLS
state required for the offload of the out-of-order packet. It updates
the expected TCP SN accordingly and transmits the now in-order packet.
The same queue is used for packet transmission and TLS context
reconstruction to avoid the need for flushing the transmit queue before
issuing the context reconstruction request.
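
Putting the two cases together, a hedged sketch of the per-packet driver
decision described above; it mirrors mlx5e_tls_handle_tx_skb() in the
mlx5 patch below, and struct tx_ring, skb_payload_len(),
mark_for_offload() and post_resync_wqe() are hypothetical stand-ins for
the device-specific pieces:

  static struct sk_buff *tls_tx_decide(struct tx_ring *sq, struct sk_buff *skb)
  {
  	u32 skb_seq = ntohl(tcp_hdr(skb)->seq);

  	if (likely(skb_seq == sq->expected_tcp_sn)) {
  		/* in order: advance the expected SN, mark and send */
  		sq->expected_tcp_sn = skb_seq + skb_payload_len(skb);
  		return mark_for_offload(skb);
  	}
  	/* out of order: look up the TLS record covering skb_seq and post
  	 * a context-reconstruction work entry on the same queue, ahead
  	 * of the packet itself, then resume in-order processing
  	 */
  	post_resync_wqe(sq, skb_seq);
  	sq->expected_tcp_sn = skb_seq + skb_payload_len(skb);
  	return mark_for_offload(skb);
  }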

The expected TCP SN is accessed without a lock, under the assumption
that TCP doesn't transmit SKBs of the same socket from different TX
queues concurrently.

If packets are rerouted to a different netdevice, then a software
fallback routine handles encryption.

Paper: https://www.netdevconf.org/1.2/papers/netdevconf-TLS.pdf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
parents 1a1f4a28 f9c8141f
@@ -9037,26 +9037,17 @@ W: http://www.mellanox.com
Q: http://patchwork.ozlabs.org/project/netdev/list/
F: drivers/net/ethernet/mellanox/mlx5/core/en_*
MELLANOX ETHERNET INNOVA DRIVER
M: Ilan Tayari <ilant@mellanox.com>
R: Boris Pismenny <borisp@mellanox.com>
MELLANOX ETHERNET INNOVA DRIVERS
M: Boris Pismenny <borisp@mellanox.com>
L: netdev@vger.kernel.org
S: Supported
W: http://www.mellanox.com
Q: http://patchwork.ozlabs.org/project/netdev/list/
F: drivers/net/ethernet/mellanox/mlx5/core/en_accel/*
F: drivers/net/ethernet/mellanox/mlx5/core/accel/*
F: drivers/net/ethernet/mellanox/mlx5/core/fpga/*
F: include/linux/mlx5/mlx5_ifc_fpga.h
MELLANOX ETHERNET INNOVA IPSEC DRIVER
M: Ilan Tayari <ilant@mellanox.com>
R: Boris Pismenny <borisp@mellanox.com>
L: netdev@vger.kernel.org
S: Supported
W: http://www.mellanox.com
Q: http://patchwork.ozlabs.org/project/netdev/list/
F: drivers/net/ethernet/mellanox/mlx5/core/en_ipsec/*
F: drivers/net/ethernet/mellanox/mlx5/core/ipsec*
MELLANOX ETHERNET SWITCH DRIVERS
M: Jiri Pirko <jiri@mellanox.com>
M: Ido Schimmel <idosch@mellanox.com>
@@ -9848,7 +9839,7 @@ F: net/netfilter/xt_CONNSECMARK.c
F: net/netfilter/xt_SECMARK.c
NETWORKING [TLS]
M: Ilya Lesokhin <ilyal@mellanox.com>
M: Boris Pismenny <borisp@mellanox.com>
M: Aviad Yehezkel <aviadye@mellanox.com>
M: Dave Watson <davejwatson@fb.com>
L: netdev@vger.kernel.org
@@ -86,3 +86,14 @@ config MLX5_EN_IPSEC
Build support for IPsec cryptography-offload acceleration in the NIC.
Note: Support for hardware with this capability needs to be selected
for this option to become available.
config MLX5_EN_TLS
bool "TLS cryptography-offload accelaration"
depends on MLX5_CORE_EN
depends on TLS_DEVICE
depends on MLX5_ACCEL
default n
---help---
Build support for TLS cryptography-offload acceleration in the NIC.
Note: Support for hardware with this capability needs to be selected
for this option to become available.
@@ -8,10 +8,10 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
fs_counters.o rl.o lag.o dev.o wq.o lib/gid.o lib/clock.o \
diag/fs_tracepoint.o
mlx5_core-$(CONFIG_MLX5_ACCEL) += accel/ipsec.o
mlx5_core-$(CONFIG_MLX5_ACCEL) += accel/ipsec.o accel/tls.o
mlx5_core-$(CONFIG_MLX5_FPGA) += fpga/cmd.o fpga/core.o fpga/conn.o fpga/sdk.o \
fpga/ipsec.o
fpga/ipsec.o fpga/tls.o
mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
en_tx.o en_rx.o en_dim.o en_txrx.o en_stats.o vxlan.o \
@@ -28,4 +28,6 @@ mlx5_core-$(CONFIG_MLX5_CORE_IPOIB) += ipoib/ipoib.o ipoib/ethtool.o ipoib/ipoib
mlx5_core-$(CONFIG_MLX5_EN_IPSEC) += en_accel/ipsec.o en_accel/ipsec_rxtx.o \
en_accel/ipsec_stats.o
mlx5_core-$(CONFIG_MLX5_EN_TLS) += en_accel/tls.o en_accel/tls_rxtx.o en_accel/tls_stats.o
CFLAGS_tracepoint.o := -I$(src)
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#include <linux/mlx5/device.h>
#include "accel/tls.h"
#include "mlx5_core.h"
#include "fpga/tls.h"
int mlx5_accel_tls_add_tx_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid)
{
return mlx5_fpga_tls_add_tx_flow(mdev, flow, crypto_info,
start_offload_tcp_sn, p_swid);
}
void mlx5_accel_tls_del_tx_flow(struct mlx5_core_dev *mdev, u32 swid)
{
mlx5_fpga_tls_del_tx_flow(mdev, swid, GFP_KERNEL);
}
bool mlx5_accel_is_tls_device(struct mlx5_core_dev *mdev)
{
return mlx5_fpga_is_tls_device(mdev);
}
u32 mlx5_accel_tls_device_caps(struct mlx5_core_dev *mdev)
{
return mlx5_fpga_tls_device_caps(mdev);
}
int mlx5_accel_tls_init(struct mlx5_core_dev *mdev)
{
return mlx5_fpga_tls_init(mdev);
}
void mlx5_accel_tls_cleanup(struct mlx5_core_dev *mdev)
{
mlx5_fpga_tls_cleanup(mdev);
}
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#ifndef __MLX5_ACCEL_TLS_H__
#define __MLX5_ACCEL_TLS_H__
#include <linux/mlx5/driver.h>
#include <linux/tls.h>
#ifdef CONFIG_MLX5_ACCEL
enum {
MLX5_ACCEL_TLS_TX = BIT(0),
MLX5_ACCEL_TLS_RX = BIT(1),
MLX5_ACCEL_TLS_V12 = BIT(2),
MLX5_ACCEL_TLS_V13 = BIT(3),
MLX5_ACCEL_TLS_LRO = BIT(4),
MLX5_ACCEL_TLS_IPV6 = BIT(5),
MLX5_ACCEL_TLS_AES_GCM128 = BIT(30),
MLX5_ACCEL_TLS_AES_GCM256 = BIT(31),
};
struct mlx5_ifc_tls_flow_bits {
u8 src_port[0x10];
u8 dst_port[0x10];
union mlx5_ifc_ipv6_layout_ipv4_layout_auto_bits src_ipv4_src_ipv6;
union mlx5_ifc_ipv6_layout_ipv4_layout_auto_bits dst_ipv4_dst_ipv6;
u8 ipv6[0x1];
u8 direction_sx[0x1];
u8 reserved_at_2[0x1e];
};
int mlx5_accel_tls_add_tx_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid);
void mlx5_accel_tls_del_tx_flow(struct mlx5_core_dev *mdev, u32 swid);
bool mlx5_accel_is_tls_device(struct mlx5_core_dev *mdev);
u32 mlx5_accel_tls_device_caps(struct mlx5_core_dev *mdev);
int mlx5_accel_tls_init(struct mlx5_core_dev *mdev);
void mlx5_accel_tls_cleanup(struct mlx5_core_dev *mdev);
#else
static inline int
mlx5_accel_tls_add_tx_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid) { return 0; }
static inline void mlx5_accel_tls_del_tx_flow(struct mlx5_core_dev *mdev, u32 swid) { }
static inline bool mlx5_accel_is_tls_device(struct mlx5_core_dev *mdev) { return false; }
static inline u32 mlx5_accel_tls_device_caps(struct mlx5_core_dev *mdev) { return 0; }
static inline int mlx5_accel_tls_init(struct mlx5_core_dev *mdev) { return 0; }
static inline void mlx5_accel_tls_cleanup(struct mlx5_core_dev *mdev) { }
#endif
#endif /* __MLX5_ACCEL_TLS_H__ */
@@ -55,6 +55,9 @@
struct page_pool;
#define MLX5E_METADATA_ETHER_TYPE (0x8CE4)
#define MLX5E_METADATA_ETHER_LEN 8
#define MLX5_SET_CFG(p, f, v) MLX5_SET(create_flow_group_in, p, f, v)
#define MLX5E_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN)
@@ -332,6 +335,7 @@ enum {
MLX5E_SQ_STATE_RECOVERING,
MLX5E_SQ_STATE_IPSEC,
MLX5E_SQ_STATE_AM,
MLX5E_SQ_STATE_TLS,
};
struct mlx5e_sq_wqe_info {
@@ -797,6 +801,9 @@ struct mlx5e_priv {
#ifdef CONFIG_MLX5_EN_IPSEC
struct mlx5e_ipsec *ipsec;
#endif
#ifdef CONFIG_MLX5_EN_TLS
struct mlx5e_tls *tls;
#endif
};
struct mlx5e_profile {
@@ -827,6 +834,8 @@ void mlx5e_build_ptys2ethtool_map(void);
u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
void *accel_priv, select_queue_fallback_t fallback);
netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
struct mlx5e_tx_wqe *wqe, u16 pi);
void mlx5e_completion_event(struct mlx5_core_cq *mcq);
void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
@@ -942,6 +951,18 @@ static inline bool mlx5e_tunnel_inner_ft_supported(struct mlx5_core_dev *mdev)
MLX5_CAP_FLOWTABLE_NIC_RX(mdev, ft_field_support.inner_ip_version));
}
static inline void mlx5e_sq_fetch_wqe(struct mlx5e_txqsq *sq,
struct mlx5e_tx_wqe **wqe,
u16 *pi)
{
struct mlx5_wq_cyc *wq;
wq = &sq->wq;
*pi = sq->pc & wq->sz_m1;
*wqe = mlx5_wq_cyc_get_wqe(wq, *pi);
memset(*wqe, 0, sizeof(**wqe));
}
static inline
struct mlx5e_tx_wqe *mlx5e_post_nop(struct mlx5_wq_cyc *wq, u32 sqn, u16 *pc)
{
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#ifndef __MLX5E_EN_ACCEL_H__
#define __MLX5E_EN_ACCEL_H__
#ifdef CONFIG_MLX5_ACCEL
#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include "en_accel/ipsec_rxtx.h"
#include "en_accel/tls_rxtx.h"
#include "en.h"
static inline struct sk_buff *mlx5e_accel_handle_tx(struct sk_buff *skb,
struct mlx5e_txqsq *sq,
struct net_device *dev,
struct mlx5e_tx_wqe **wqe,
u16 *pi)
{
#ifdef CONFIG_MLX5_EN_TLS
if (sq->state & BIT(MLX5E_SQ_STATE_TLS)) {
skb = mlx5e_tls_handle_tx_skb(dev, sq, skb, wqe, pi);
if (unlikely(!skb))
return NULL;
}
#endif
#ifdef CONFIG_MLX5_EN_IPSEC
if (sq->state & BIT(MLX5E_SQ_STATE_IPSEC)) {
skb = mlx5e_ipsec_handle_tx_skb(dev, *wqe, skb);
if (unlikely(!skb))
return NULL;
}
#endif
return skb;
}
#endif /* CONFIG_MLX5_ACCEL */
#endif /* __MLX5E_EN_ACCEL_H__ */
@@ -45,9 +45,6 @@
#define MLX5E_IPSEC_SADB_RX_BITS 10
#define MLX5E_IPSEC_ESN_SCOPE_MID 0x80000000L
#define MLX5E_METADATA_ETHER_TYPE (0x8CE4)
#define MLX5E_METADATA_ETHER_LEN 8
struct mlx5e_priv;
struct mlx5e_ipsec_sw_stats {
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#include <linux/netdevice.h>
#include <net/ipv6.h>
#include "en_accel/tls.h"
#include "accel/tls.h"
static void mlx5e_tls_set_ipv4_flow(void *flow, struct sock *sk)
{
struct inet_sock *inet = inet_sk(sk);
MLX5_SET(tls_flow, flow, ipv6, 0);
memcpy(MLX5_ADDR_OF(tls_flow, flow, dst_ipv4_dst_ipv6.ipv4_layout.ipv4),
&inet->inet_daddr, MLX5_FLD_SZ_BYTES(ipv4_layout, ipv4));
memcpy(MLX5_ADDR_OF(tls_flow, flow, src_ipv4_src_ipv6.ipv4_layout.ipv4),
&inet->inet_rcv_saddr, MLX5_FLD_SZ_BYTES(ipv4_layout, ipv4));
}
#if IS_ENABLED(CONFIG_IPV6)
static void mlx5e_tls_set_ipv6_flow(void *flow, struct sock *sk)
{
struct ipv6_pinfo *np = inet6_sk(sk);
MLX5_SET(tls_flow, flow, ipv6, 1);
memcpy(MLX5_ADDR_OF(tls_flow, flow, dst_ipv4_dst_ipv6.ipv6_layout.ipv6),
&sk->sk_v6_daddr, MLX5_FLD_SZ_BYTES(ipv6_layout, ipv6));
memcpy(MLX5_ADDR_OF(tls_flow, flow, src_ipv4_src_ipv6.ipv6_layout.ipv6),
&np->saddr, MLX5_FLD_SZ_BYTES(ipv6_layout, ipv6));
}
#endif
static void mlx5e_tls_set_flow_tcp_ports(void *flow, struct sock *sk)
{
struct inet_sock *inet = inet_sk(sk);
memcpy(MLX5_ADDR_OF(tls_flow, flow, src_port), &inet->inet_sport,
MLX5_FLD_SZ_BYTES(tls_flow, src_port));
memcpy(MLX5_ADDR_OF(tls_flow, flow, dst_port), &inet->inet_dport,
MLX5_FLD_SZ_BYTES(tls_flow, dst_port));
}
static int mlx5e_tls_set_flow(void *flow, struct sock *sk, u32 caps)
{
switch (sk->sk_family) {
case AF_INET:
mlx5e_tls_set_ipv4_flow(flow, sk);
break;
#if IS_ENABLED(CONFIG_IPV6)
case AF_INET6:
if (!sk->sk_ipv6only &&
ipv6_addr_type(&sk->sk_v6_daddr) == IPV6_ADDR_MAPPED) {
mlx5e_tls_set_ipv4_flow(flow, sk);
break;
}
if (!(caps & MLX5_ACCEL_TLS_IPV6))
goto error_out;
mlx5e_tls_set_ipv6_flow(flow, sk);
break;
#endif
default:
goto error_out;
}
mlx5e_tls_set_flow_tcp_ports(flow, sk);
return 0;
error_out:
return -EINVAL;
}
static int mlx5e_tls_add(struct net_device *netdev, struct sock *sk,
enum tls_offload_ctx_dir direction,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn)
{
struct mlx5e_priv *priv = netdev_priv(netdev);
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct mlx5_core_dev *mdev = priv->mdev;
u32 caps = mlx5_accel_tls_device_caps(mdev);
int ret = -ENOMEM;
void *flow;
if (direction != TLS_OFFLOAD_CTX_DIR_TX)
return -EINVAL;
flow = kzalloc(MLX5_ST_SZ_BYTES(tls_flow), GFP_KERNEL);
if (!flow)
return ret;
ret = mlx5e_tls_set_flow(flow, sk, caps);
if (ret)
goto free_flow;
if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
struct mlx5e_tls_offload_context *tx_ctx =
mlx5e_get_tls_tx_context(tls_ctx);
u32 swid;
ret = mlx5_accel_tls_add_tx_flow(mdev, flow, crypto_info,
start_offload_tcp_sn, &swid);
if (ret < 0)
goto free_flow;
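/* Keep the swid in network byte order; it is OR-ed directly into the
 * packet metadata on TX (see mlx5e_tls_add_metadata() in tls_rxtx.c).
 */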
tx_ctx->swid = htonl(swid);
tx_ctx->expected_seq = start_offload_tcp_sn;
}
return 0;
free_flow:
kfree(flow);
return ret;
}
static void mlx5e_tls_del(struct net_device *netdev,
struct tls_context *tls_ctx,
enum tls_offload_ctx_dir direction)
{
struct mlx5e_priv *priv = netdev_priv(netdev);
if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
u32 swid = ntohl(mlx5e_get_tls_tx_context(tls_ctx)->swid);
mlx5_accel_tls_del_tx_flow(priv->mdev, swid);
} else {
netdev_err(netdev, "unsupported direction %d\n", direction);
}
}
static const struct tlsdev_ops mlx5e_tls_ops = {
.tls_dev_add = mlx5e_tls_add,
.tls_dev_del = mlx5e_tls_del,
};
void mlx5e_tls_build_netdev(struct mlx5e_priv *priv)
{
struct net_device *netdev = priv->netdev;
if (!mlx5_accel_is_tls_device(priv->mdev))
return;
netdev->features |= NETIF_F_HW_TLS_TX;
netdev->hw_features |= NETIF_F_HW_TLS_TX;
netdev->tlsdev_ops = &mlx5e_tls_ops;
}
int mlx5e_tls_init(struct mlx5e_priv *priv)
{
struct mlx5e_tls *tls = kzalloc(sizeof(*tls), GFP_KERNEL);
if (!tls)
return -ENOMEM;
priv->tls = tls;
return 0;
}
void mlx5e_tls_cleanup(struct mlx5e_priv *priv)
{
struct mlx5e_tls *tls = priv->tls;
if (!tls)
return;
kfree(tls);
priv->tls = NULL;
}
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#ifndef __MLX5E_TLS_H__
#define __MLX5E_TLS_H__
#ifdef CONFIG_MLX5_EN_TLS
#include <net/tls.h>
#include "en.h"
struct mlx5e_tls_sw_stats {
atomic64_t tx_tls_drop_metadata;
atomic64_t tx_tls_drop_resync_alloc;
atomic64_t tx_tls_drop_no_sync_data;
atomic64_t tx_tls_drop_bypass_required;
};
struct mlx5e_tls {
struct mlx5e_tls_sw_stats sw_stats;
};
struct mlx5e_tls_offload_context {
struct tls_offload_context base;
u32 expected_seq;
__be32 swid;
};
static inline struct mlx5e_tls_offload_context *
mlx5e_get_tls_tx_context(struct tls_context *tls_ctx)
{
BUILD_BUG_ON(sizeof(struct mlx5e_tls_offload_context) >
TLS_OFFLOAD_CONTEXT_SIZE);
return container_of(tls_offload_ctx(tls_ctx),
struct mlx5e_tls_offload_context,
base);
}
void mlx5e_tls_build_netdev(struct mlx5e_priv *priv);
int mlx5e_tls_init(struct mlx5e_priv *priv);
void mlx5e_tls_cleanup(struct mlx5e_priv *priv);
int mlx5e_tls_get_count(struct mlx5e_priv *priv);
int mlx5e_tls_get_strings(struct mlx5e_priv *priv, uint8_t *data);
int mlx5e_tls_get_stats(struct mlx5e_priv *priv, u64 *data);
#else
static inline void mlx5e_tls_build_netdev(struct mlx5e_priv *priv) { }
static inline int mlx5e_tls_init(struct mlx5e_priv *priv) { return 0; }
static inline void mlx5e_tls_cleanup(struct mlx5e_priv *priv) { }
static inline int mlx5e_tls_get_count(struct mlx5e_priv *priv) { return 0; }
static inline int mlx5e_tls_get_strings(struct mlx5e_priv *priv, uint8_t *data) { return 0; }
static inline int mlx5e_tls_get_stats(struct mlx5e_priv *priv, u64 *data) { return 0; }
#endif
#endif /* __MLX5E_TLS_H__ */
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#include "en_accel/tls.h"
#include "en_accel/tls_rxtx.h"
#define SYNDROME_OFFLOAD_REQUIRED 32
#define SYNDROME_SYNC 33
struct sync_info {
u64 rcd_sn;
s32 sync_len;
int nr_frags;
skb_frag_t frags[MAX_SKB_FRAGS];
};
struct mlx5e_tls_metadata {
/* One byte of syndrome followed by 3 bytes of swid */
__be32 syndrome_swid;
__be16 first_seq;
/* packet type ID field */
__be16 ethertype;
} __packed;
static int mlx5e_tls_add_metadata(struct sk_buff *skb, __be32 swid)
{
struct mlx5e_tls_metadata *pet;
struct ethhdr *eth;
if (skb_cow_head(skb, sizeof(struct mlx5e_tls_metadata)))
return -ENOMEM;
eth = (struct ethhdr *)skb_push(skb, sizeof(struct mlx5e_tls_metadata));
skb->mac_header -= sizeof(struct mlx5e_tls_metadata);
pet = (struct mlx5e_tls_metadata *)(eth + 1);
memmove(skb->data, skb->data + sizeof(struct mlx5e_tls_metadata),
2 * ETH_ALEN);
eth->h_proto = cpu_to_be16(MLX5E_METADATA_ETHER_TYPE);
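/* swid is __be32 with its top byte zero (swid < 2^24, see SWID_END in
 * fpga/tls.c), so OR-ing in the syndrome yields one byte of syndrome
 * followed by three bytes of swid.
 */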
pet->syndrome_swid = htonl(SYNDROME_OFFLOAD_REQUIRED << 24) | swid;
return 0;
}
static int mlx5e_tls_get_sync_data(struct mlx5e_tls_offload_context *context,
u32 tcp_seq, struct sync_info *info)
{
int remaining, i = 0, ret = -EINVAL;
struct tls_record_info *record;
unsigned long flags;
s32 sync_size;
spin_lock_irqsave(&context->base.lock, flags);
record = tls_get_record(&context->base, tcp_seq, &info->rcd_sn);
if (unlikely(!record))
goto out;
sync_size = tcp_seq - tls_record_start_seq(record);
info->sync_len = sync_size;
if (unlikely(sync_size < 0)) {
if (tls_record_is_start_marker(record))
goto done;
goto out;
}
remaining = sync_size;
while (remaining > 0) {
info->frags[i] = record->frags[i];
__skb_frag_ref(&info->frags[i]);
remaining -= skb_frag_size(&info->frags[i]);
if (remaining < 0)
skb_frag_size_add(&info->frags[i], remaining);
i++;
}
info->nr_frags = i;
done:
ret = 0;
out:
spin_unlock_irqrestore(&context->base.lock, flags);
return ret;
}
static void mlx5e_tls_complete_sync_skb(struct sk_buff *skb,
struct sk_buff *nskb, u32 tcp_seq,
int headln, __be64 rcd_sn)
{
struct mlx5e_tls_metadata *pet;
u8 syndrome = SYNDROME_SYNC;
struct iphdr *iph;
struct tcphdr *th;
int data_len, mss;
nskb->dev = skb->dev;
skb_reset_mac_header(nskb);
skb_set_network_header(nskb, skb_network_offset(skb));
skb_set_transport_header(nskb, skb_transport_offset(skb));
memcpy(nskb->data, skb->data, headln);
memcpy(nskb->data + headln, &rcd_sn, sizeof(rcd_sn));
iph = ip_hdr(nskb);
iph->tot_len = htons(nskb->len - skb_network_offset(nskb));
th = tcp_hdr(nskb);
data_len = nskb->len - headln;
tcp_seq -= data_len;
th->seq = htonl(tcp_seq);
mss = nskb->dev->mtu - (headln - skb_network_offset(nskb));
skb_shinfo(nskb)->gso_size = 0;
if (data_len > mss) {
skb_shinfo(nskb)->gso_size = mss;
skb_shinfo(nskb)->gso_segs = DIV_ROUND_UP(data_len, mss);
}
skb_shinfo(nskb)->gso_type = skb_shinfo(skb)->gso_type;
pet = (struct mlx5e_tls_metadata *)(nskb->data + sizeof(struct ethhdr));
memcpy(pet, &syndrome, sizeof(syndrome));
pet->first_seq = htons(tcp_seq);
/* MLX5 devices don't care about the checksum partial start, offset
* and pseudo header
*/
nskb->ip_summed = CHECKSUM_PARTIAL;
nskb->xmit_more = 1;
nskb->queue_mapping = skb->queue_mapping;
}
static struct sk_buff *
mlx5e_tls_handle_ooo(struct mlx5e_tls_offload_context *context,
struct mlx5e_txqsq *sq, struct sk_buff *skb,
struct mlx5e_tx_wqe **wqe,
u16 *pi,
struct mlx5e_tls *tls)
{
u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);
struct sync_info info;
struct sk_buff *nskb;
int linear_len = 0;
int headln;
int i;
sq->stats.tls_ooo++;
if (mlx5e_tls_get_sync_data(context, tcp_seq, &info)) {
/* We might get here if a retransmission reaches the driver
* after the relevant record is acked.
* It should be safe to drop the packet in this case
*/
atomic64_inc(&tls->sw_stats.tx_tls_drop_no_sync_data);
goto err_out;
}
if (unlikely(info.sync_len < 0)) {
u32 payload;
headln = skb_transport_offset(skb) + tcp_hdrlen(skb);
payload = skb->len - headln;
if (likely(payload <= -info.sync_len))
/* SKB payload doesn't require offload
*/
return skb;
atomic64_inc(&tls->sw_stats.tx_tls_drop_bypass_required);
goto err_out;
}
if (unlikely(mlx5e_tls_add_metadata(skb, context->swid))) {
atomic64_inc(&tls->sw_stats.tx_tls_drop_metadata);
goto err_out;
}
headln = skb_transport_offset(skb) + tcp_hdrlen(skb);
linear_len += headln + sizeof(info.rcd_sn);
nskb = alloc_skb(linear_len, GFP_ATOMIC);
if (unlikely(!nskb)) {
atomic64_inc(&tls->sw_stats.tx_tls_drop_resync_alloc);
goto err_out;
}
context->expected_seq = tcp_seq + skb->len - headln;
skb_put(nskb, linear_len);
for (i = 0; i < info.nr_frags; i++)
skb_shinfo(nskb)->frags[i] = info.frags[i];
skb_shinfo(nskb)->nr_frags = info.nr_frags;
nskb->data_len = info.sync_len;
nskb->len += info.sync_len;
sq->stats.tls_resync_bytes += nskb->len;
mlx5e_tls_complete_sync_skb(skb, nskb, tcp_seq, headln,
cpu_to_be64(info.rcd_sn));
mlx5e_sq_xmit(sq, nskb, *wqe, *pi);
mlx5e_sq_fetch_wqe(sq, wqe, pi);
return skb;
err_out:
dev_kfree_skb_any(skb);
return NULL;
}
struct sk_buff *mlx5e_tls_handle_tx_skb(struct net_device *netdev,
struct mlx5e_txqsq *sq,
struct sk_buff *skb,
struct mlx5e_tx_wqe **wqe,
u16 *pi)
{
struct mlx5e_priv *priv = netdev_priv(netdev);
struct mlx5e_tls_offload_context *context;
struct tls_context *tls_ctx;
u32 expected_seq;
int datalen;
u32 skb_seq;
if (!skb->sk || !tls_is_sk_tx_device_offloaded(skb->sk))
goto out;
datalen = skb->len - (skb_transport_offset(skb) + tcp_hdrlen(skb));
if (!datalen)
goto out;
tls_ctx = tls_get_ctx(skb->sk);
if (unlikely(tls_ctx->netdev != netdev))
goto out;
skb_seq = ntohl(tcp_hdr(skb)->seq);
context = mlx5e_get_tls_tx_context(tls_ctx);
expected_seq = context->expected_seq;
if (unlikely(expected_seq != skb_seq)) {
skb = mlx5e_tls_handle_ooo(context, sq, skb, wqe, pi, priv->tls);
goto out;
}
if (unlikely(mlx5e_tls_add_metadata(skb, context->swid))) {
atomic64_inc(&priv->tls->sw_stats.tx_tls_drop_metadata);
dev_kfree_skb_any(skb);
skb = NULL;
goto out;
}
context->expected_seq = skb_seq + datalen;
out:
return skb;
}
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#ifndef __MLX5E_TLS_RXTX_H__
#define __MLX5E_TLS_RXTX_H__
#ifdef CONFIG_MLX5_EN_TLS
#include <linux/skbuff.h>
#include "en.h"
struct sk_buff *mlx5e_tls_handle_tx_skb(struct net_device *netdev,
struct mlx5e_txqsq *sq,
struct sk_buff *skb,
struct mlx5e_tx_wqe **wqe,
u16 *pi);
#endif /* CONFIG_MLX5_EN_TLS */
#endif /* __MLX5E_TLS_RXTX_H__ */
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#include <linux/ethtool.h>
#include <net/sock.h>
#include "en.h"
#include "accel/tls.h"
#include "fpga/sdk.h"
#include "en_accel/tls.h"
static const struct counter_desc mlx5e_tls_sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_tls_sw_stats, tx_tls_drop_metadata) },
{ MLX5E_DECLARE_STAT(struct mlx5e_tls_sw_stats, tx_tls_drop_resync_alloc) },
{ MLX5E_DECLARE_STAT(struct mlx5e_tls_sw_stats, tx_tls_drop_no_sync_data) },
{ MLX5E_DECLARE_STAT(struct mlx5e_tls_sw_stats, tx_tls_drop_bypass_required) },
};
#define MLX5E_READ_CTR_ATOMIC64(ptr, dsc, i) \
atomic64_read((atomic64_t *)((char *)(ptr) + (dsc)[i].offset))
#define NUM_TLS_SW_COUNTERS ARRAY_SIZE(mlx5e_tls_sw_stats_desc)
int mlx5e_tls_get_count(struct mlx5e_priv *priv)
{
if (!priv->tls)
return 0;
return NUM_TLS_SW_COUNTERS;
}
int mlx5e_tls_get_strings(struct mlx5e_priv *priv, uint8_t *data)
{
unsigned int i, idx = 0;
if (!priv->tls)
return 0;
for (i = 0; i < NUM_TLS_SW_COUNTERS; i++)
strcpy(data + (idx++) * ETH_GSTRING_LEN,
mlx5e_tls_sw_stats_desc[i].format);
return NUM_TLS_SW_COUNTERS;
}
int mlx5e_tls_get_stats(struct mlx5e_priv *priv, u64 *data)
{
int i, idx = 0;
if (!priv->tls)
return 0;
for (i = 0; i < NUM_TLS_SW_COUNTERS; i++)
data[idx++] =
MLX5E_READ_CTR_ATOMIC64(&priv->tls->sw_stats,
mlx5e_tls_sw_stats_desc, i);
return NUM_TLS_SW_COUNTERS;
}
@@ -42,7 +42,9 @@
#include "en_rep.h"
#include "en_accel/ipsec.h"
#include "en_accel/ipsec_rxtx.h"
#include "en_accel/tls.h"
#include "accel/ipsec.h"
#include "accel/tls.h"
#include "vxlan.h"
struct mlx5e_rq_param {
@@ -1014,6 +1016,8 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
INIT_WORK(&sq->recover.recover_work, mlx5e_sq_recover);
if (MLX5_IPSEC_DEV(c->priv->mdev))
set_bit(MLX5E_SQ_STATE_IPSEC, &sq->state);
if (mlx5_accel_is_tls_device(c->priv->mdev))
set_bit(MLX5E_SQ_STATE_TLS, &sq->state);
param->wq.db_numa_node = cpu_to_node(c->cpu);
err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
@@ -4376,6 +4380,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
#endif
mlx5e_ipsec_build_netdev(priv);
mlx5e_tls_build_netdev(priv);
}
static void mlx5e_create_q_counters(struct mlx5e_priv *priv)
@@ -4417,12 +4422,16 @@ static void mlx5e_nic_init(struct mlx5_core_dev *mdev,
err = mlx5e_ipsec_init(priv);
if (err)
mlx5_core_err(mdev, "IPSec initialization failed, %d\n", err);
err = mlx5e_tls_init(priv);
if (err)
mlx5_core_err(mdev, "TLS initialization failed, %d\n", err);
mlx5e_build_nic_netdev(netdev);
mlx5e_vxlan_init(priv);
}
static void mlx5e_nic_cleanup(struct mlx5e_priv *priv)
{
mlx5e_tls_cleanup(priv);
mlx5e_ipsec_cleanup(priv);
mlx5e_vxlan_cleanup(priv);
}
@@ -32,6 +32,7 @@
#include "en.h"
#include "en_accel/ipsec.h"
#include "en_accel/tls.h"
static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_packets) },
@@ -43,6 +44,12 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tso_inner_packets) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tso_inner_bytes) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_added_vlan_packets) },
#ifdef CONFIG_MLX5_EN_TLS
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tls_ooo) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tls_resync_bytes) },
#endif
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_lro_packets) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_lro_bytes) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_removed_vlan_packets) },
@@ -161,6 +168,10 @@ static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->tx_csum_partial_inner += sq_stats->csum_partial_inner;
s->tx_csum_none += sq_stats->csum_none;
s->tx_csum_partial += sq_stats->csum_partial;
#ifdef CONFIG_MLX5_EN_TLS
s->tx_tls_ooo += sq_stats->tls_ooo;
s->tx_tls_resync_bytes += sq_stats->tls_resync_bytes;
#endif
}
}
@@ -1065,6 +1076,22 @@ static void mlx5e_grp_ipsec_update_stats(struct mlx5e_priv *priv)
mlx5e_ipsec_update_stats(priv);
}
static int mlx5e_grp_tls_get_num_stats(struct mlx5e_priv *priv)
{
return mlx5e_tls_get_count(priv);
}
static int mlx5e_grp_tls_fill_strings(struct mlx5e_priv *priv, u8 *data,
int idx)
{
return idx + mlx5e_tls_get_strings(priv, data + idx * ETH_GSTRING_LEN);
}
static int mlx5e_grp_tls_fill_stats(struct mlx5e_priv *priv, u64 *data, int idx)
{
return idx + mlx5e_tls_get_stats(priv, data + idx);
}
static const struct counter_desc rq_stats_desc[] = {
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, packets) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, bytes) },
@@ -1267,6 +1294,11 @@ const struct mlx5e_stats_grp mlx5e_stats_grps[] = {
.fill_stats = mlx5e_grp_ipsec_fill_stats,
.update_stats = mlx5e_grp_ipsec_update_stats,
},
{
.get_num_stats = mlx5e_grp_tls_get_num_stats,
.fill_strings = mlx5e_grp_tls_fill_strings,
.fill_stats = mlx5e_grp_tls_fill_stats,
},
{
.get_num_stats = mlx5e_grp_channels_get_num_stats,
.fill_strings = mlx5e_grp_channels_fill_strings,
@@ -93,6 +93,11 @@ struct mlx5e_sw_stats {
u64 rx_cache_waive;
u64 ch_eq_rearm;
#ifdef CONFIG_MLX5_EN_TLS
u64 tx_tls_ooo;
u64 tx_tls_resync_bytes;
#endif
/* Special handling counters */
u64 link_down_events_phy;
};
@@ -194,6 +199,10 @@ struct mlx5e_sq_stats {
u64 csum_partial_inner;
u64 added_vlan_packets;
u64 nop;
#ifdef CONFIG_MLX5_EN_TLS
u64 tls_ooo;
u64 tls_resync_bytes;
#endif
/* less likely accessed in data path */
u64 csum_none;
u64 stopped;
@@ -35,12 +35,21 @@
#include <net/dsfield.h>
#include "en.h"
#include "ipoib/ipoib.h"
#include "en_accel/ipsec_rxtx.h"
#include "en_accel/en_accel.h"
#include "lib/clock.h"
#define MLX5E_SQ_NOPS_ROOM MLX5_SEND_WQE_MAX_WQEBBS
#ifndef CONFIG_MLX5_EN_TLS
#define MLX5E_SQ_STOP_ROOM (MLX5_SEND_WQE_MAX_WQEBBS +\
MLX5E_SQ_NOPS_ROOM)
#else
/* TLS offload requires MLX5E_SQ_STOP_ROOM to have
* enough room for a resync SKB, a normal SKB and a NOP
*/
#define MLX5E_SQ_STOP_ROOM (2 * MLX5_SEND_WQE_MAX_WQEBBS +\
MLX5E_SQ_NOPS_ROOM)
#endif
static inline void mlx5e_tx_dma_unmap(struct device *pdev,
struct mlx5e_sq_dma *dma)
@@ -325,8 +334,8 @@ mlx5e_txwqe_complete(struct mlx5e_txqsq *sq, struct sk_buff *skb,
}
}
static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
struct mlx5e_tx_wqe *wqe, u16 pi)
netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
struct mlx5e_tx_wqe *wqe, u16 pi)
{
struct mlx5e_tx_wqe_info *wi = &sq->db.wqe_info[pi];
@@ -399,21 +408,19 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct mlx5e_txqsq *sq = priv->txq2sq[skb_get_queue_mapping(skb)];
struct mlx5_wq_cyc *wq = &sq->wq;
u16 pi = sq->pc & wq->sz_m1;
struct mlx5e_tx_wqe *wqe = mlx5_wq_cyc_get_wqe(wq, pi);
struct mlx5e_tx_wqe *wqe;
struct mlx5e_txqsq *sq;
u16 pi;
memset(wqe, 0, sizeof(*wqe));
sq = priv->txq2sq[skb_get_queue_mapping(skb)];
mlx5e_sq_fetch_wqe(sq, &wqe, &pi);
#ifdef CONFIG_MLX5_EN_IPSEC
if (sq->state & BIT(MLX5E_SQ_STATE_IPSEC)) {
skb = mlx5e_ipsec_handle_tx_skb(dev, wqe, skb);
if (unlikely(!skb))
return NETDEV_TX_OK;
}
#ifdef CONFIG_MLX5_ACCEL
/* might send skbs and update wqe and pi */
skb = mlx5e_accel_handle_tx(skb, sq, dev, &wqe, &pi);
if (unlikely(!skb))
return NETDEV_TX_OK;
#endif
return mlx5e_sq_xmit(sq, skb, wqe, pi);
}
@@ -53,6 +53,7 @@ struct mlx5_fpga_device {
} conn_res;
struct mlx5_fpga_ipsec *ipsec;
struct mlx5_fpga_tls *tls;
};
#define mlx5_fpga_dbg(__adev, format, ...) \
@@ -43,9 +43,6 @@
#include "fpga/sdk.h"
#include "fpga/core.h"
#define SBU_QP_QUEUE_SIZE 8
#define MLX5_FPGA_IPSEC_CMD_TIMEOUT_MSEC (60 * 1000)
enum mlx5_fpga_ipsec_cmd_status {
MLX5_FPGA_IPSEC_CMD_PENDING,
MLX5_FPGA_IPSEC_CMD_SEND_FAIL,
@@ -258,7 +255,7 @@ static int mlx5_fpga_ipsec_cmd_wait(void *ctx)
{
struct mlx5_fpga_ipsec_cmd_context *context = ctx;
unsigned long timeout =
msecs_to_jiffies(MLX5_FPGA_IPSEC_CMD_TIMEOUT_MSEC);
msecs_to_jiffies(MLX5_FPGA_CMD_TIMEOUT_MSEC);
int res;
res = wait_for_completion_timeout(&context->complete, timeout);
@@ -41,6 +41,8 @@
* DOC: Innova SDK
* This header defines the in-kernel API for Innova FPGA client drivers.
*/
#define SBU_QP_QUEUE_SIZE 8
#define MLX5_FPGA_CMD_TIMEOUT_MSEC (60 * 1000)
enum mlx5_fpga_access_type {
MLX5_FPGA_ACCESS_TYPE_I2C = 0x0,
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#include <linux/mlx5/device.h>
#include "fpga/tls.h"
#include "fpga/cmd.h"
#include "fpga/sdk.h"
#include "fpga/core.h"
#include "accel/tls.h"
struct mlx5_fpga_tls_command_context;
typedef void (*mlx5_fpga_tls_command_complete)
(struct mlx5_fpga_conn *conn, struct mlx5_fpga_device *fdev,
struct mlx5_fpga_tls_command_context *ctx,
struct mlx5_fpga_dma_buf *resp);
struct mlx5_fpga_tls_command_context {
struct list_head list;
/* There is no guarantee on the order between the TX completion
* and the command response.
* The TX completion is going to touch cmd->buf even in
* the case of successful transmission.
* So instead of requiring separate allocations for cmd
* and cmd->buf we've decided to use a reference counter
*/
refcount_t ref;
struct mlx5_fpga_dma_buf buf;
mlx5_fpga_tls_command_complete complete;
};
static void
mlx5_fpga_tls_put_command_ctx(struct mlx5_fpga_tls_command_context *ctx)
{
if (refcount_dec_and_test(&ctx->ref))
kfree(ctx);
}
static void mlx5_fpga_tls_cmd_complete(struct mlx5_fpga_device *fdev,
struct mlx5_fpga_dma_buf *resp)
{
struct mlx5_fpga_conn *conn = fdev->tls->conn;
struct mlx5_fpga_tls_command_context *ctx;
struct mlx5_fpga_tls *tls = fdev->tls;
unsigned long flags;
spin_lock_irqsave(&tls->pending_cmds_lock, flags);
ctx = list_first_entry(&tls->pending_cmds,
struct mlx5_fpga_tls_command_context, list);
list_del(&ctx->list);
spin_unlock_irqrestore(&tls->pending_cmds_lock, flags);
ctx->complete(conn, fdev, ctx, resp);
}
static void mlx5_fpga_cmd_send_complete(struct mlx5_fpga_conn *conn,
struct mlx5_fpga_device *fdev,
struct mlx5_fpga_dma_buf *buf,
u8 status)
{
struct mlx5_fpga_tls_command_context *ctx =
container_of(buf, struct mlx5_fpga_tls_command_context, buf);
mlx5_fpga_tls_put_command_ctx(ctx);
if (unlikely(status))
mlx5_fpga_tls_cmd_complete(fdev, NULL);
}
static void mlx5_fpga_tls_cmd_send(struct mlx5_fpga_device *fdev,
struct mlx5_fpga_tls_command_context *cmd,
mlx5_fpga_tls_command_complete complete)
{
struct mlx5_fpga_tls *tls = fdev->tls;
unsigned long flags;
int ret;
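/* Two references: one dropped by the send completion, the other by the
 * response handler (see the struct comment above).
 */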
refcount_set(&cmd->ref, 2);
cmd->complete = complete;
cmd->buf.complete = mlx5_fpga_cmd_send_complete;
spin_lock_irqsave(&tls->pending_cmds_lock, flags);
/* mlx5_fpga_sbu_conn_sendmsg is called under pending_cmds_lock
* to make sure commands are inserted to the tls->pending_cmds list
* and the command QP in the same order.
*/
ret = mlx5_fpga_sbu_conn_sendmsg(tls->conn, &cmd->buf);
if (likely(!ret))
list_add_tail(&cmd->list, &tls->pending_cmds);
else
complete(tls->conn, fdev, cmd, NULL);
spin_unlock_irqrestore(&tls->pending_cmds_lock, flags);
}
/* Start of context identifiers range (inclusive) */
#define SWID_START 0
/* End of context identifiers range (exclusive) */
#define SWID_END BIT(24)
static int mlx5_fpga_tls_alloc_swid(struct idr *idr, spinlock_t *idr_spinlock,
void *ptr)
{
int ret;
/* TLS metadata format is 1 byte for syndrome followed
* by 3 bytes of swid (software ID)
* swid must not exceed 3 bytes.
* See tls_rxtx.c:insert_pet() for details
*/
BUILD_BUG_ON((SWID_END - 1) & 0xFF000000);
idr_preload(GFP_KERNEL);
spin_lock_irq(idr_spinlock);
ret = idr_alloc(idr, ptr, SWID_START, SWID_END, GFP_ATOMIC);
spin_unlock_irq(idr_spinlock);
idr_preload_end();
return ret;
}
static void mlx5_fpga_tls_release_swid(struct idr *idr,
spinlock_t *idr_spinlock, u32 swid)
{
unsigned long flags;
spin_lock_irqsave(idr_spinlock, flags);
idr_remove(idr, swid);
spin_unlock_irqrestore(idr_spinlock, flags);
}
struct mlx5_teardown_stream_context {
struct mlx5_fpga_tls_command_context cmd;
u32 swid;
};
static void
mlx5_fpga_tls_teardown_completion(struct mlx5_fpga_conn *conn,
struct mlx5_fpga_device *fdev,
struct mlx5_fpga_tls_command_context *cmd,
struct mlx5_fpga_dma_buf *resp)
{
struct mlx5_teardown_stream_context *ctx =
container_of(cmd, struct mlx5_teardown_stream_context, cmd);
if (resp) {
u32 syndrome = MLX5_GET(tls_resp, resp->sg[0].data, syndrome);
if (syndrome)
mlx5_fpga_err(fdev,
"Teardown stream failed with syndrome = %d",
syndrome);
else
mlx5_fpga_tls_release_swid(&fdev->tls->tx_idr,
&fdev->tls->idr_spinlock,
ctx->swid);
}
mlx5_fpga_tls_put_command_ctx(cmd);
}
static void mlx5_fpga_tls_flow_to_cmd(void *flow, void *cmd)
{
memcpy(MLX5_ADDR_OF(tls_cmd, cmd, src_port), flow,
MLX5_BYTE_OFF(tls_flow, ipv6));
MLX5_SET(tls_cmd, cmd, ipv6, MLX5_GET(tls_flow, flow, ipv6));
MLX5_SET(tls_cmd, cmd, direction_sx,
MLX5_GET(tls_flow, flow, direction_sx));
}
void mlx5_fpga_tls_send_teardown_cmd(struct mlx5_core_dev *mdev, void *flow,
u32 swid, gfp_t flags)
{
struct mlx5_teardown_stream_context *ctx;
struct mlx5_fpga_dma_buf *buf;
void *cmd;
ctx = kzalloc(sizeof(*ctx) + MLX5_TLS_COMMAND_SIZE, flags);
if (!ctx)
return;
buf = &ctx->cmd.buf;
cmd = (ctx + 1);
MLX5_SET(tls_cmd, cmd, command_type, CMD_TEARDOWN_STREAM);
MLX5_SET(tls_cmd, cmd, swid, swid);
mlx5_fpga_tls_flow_to_cmd(flow, cmd);
kfree(flow);
buf->sg[0].data = cmd;
buf->sg[0].size = MLX5_TLS_COMMAND_SIZE;
ctx->swid = swid;
mlx5_fpga_tls_cmd_send(mdev->fpga, &ctx->cmd,
mlx5_fpga_tls_teardown_completion);
}
void mlx5_fpga_tls_del_tx_flow(struct mlx5_core_dev *mdev, u32 swid,
gfp_t flags)
{
struct mlx5_fpga_tls *tls = mdev->fpga->tls;
void *flow;
rcu_read_lock();
flow = idr_find(&tls->tx_idr, swid);
rcu_read_unlock();
if (!flow) {
mlx5_fpga_err(mdev->fpga, "No flow information for swid %u\n",
swid);
return;
}
mlx5_fpga_tls_send_teardown_cmd(mdev, flow, swid, flags);
}
enum mlx5_fpga_setup_stream_status {
MLX5_FPGA_CMD_PENDING,
MLX5_FPGA_CMD_SEND_FAILED,
MLX5_FPGA_CMD_RESPONSE_RECEIVED,
MLX5_FPGA_CMD_ABANDONED,
};
struct mlx5_setup_stream_context {
struct mlx5_fpga_tls_command_context cmd;
atomic_t status;
u32 syndrome;
struct completion comp;
};
static void
mlx5_fpga_tls_setup_completion(struct mlx5_fpga_conn *conn,
struct mlx5_fpga_device *fdev,
struct mlx5_fpga_tls_command_context *cmd,
struct mlx5_fpga_dma_buf *resp)
{
struct mlx5_setup_stream_context *ctx =
container_of(cmd, struct mlx5_setup_stream_context, cmd);
int status = MLX5_FPGA_CMD_SEND_FAILED;
void *tls_cmd = ctx + 1;
/* If we failed to send the command, resp == NULL */
if (resp) {
ctx->syndrome = MLX5_GET(tls_resp, resp->sg[0].data, syndrome);
status = MLX5_FPGA_CMD_RESPONSE_RECEIVED;
}
status = atomic_xchg_release(&ctx->status, status);
if (likely(status != MLX5_FPGA_CMD_ABANDONED)) {
complete(&ctx->comp);
return;
}
mlx5_fpga_err(fdev, "Command was abandoned, syndrome = %u\n",
ctx->syndrome);
if (!ctx->syndrome) {
/* The process was killed while waiting for the context to be
* added, and the add completed successfully.
* We need to destroy the HW context, and we can't reuse
* the command context because we might not have received
* the tx completion yet.
*/
mlx5_fpga_tls_del_tx_flow(fdev->mdev,
MLX5_GET(tls_cmd, tls_cmd, swid),
GFP_ATOMIC);
}
mlx5_fpga_tls_put_command_ctx(cmd);
}
static int mlx5_fpga_tls_setup_stream_cmd(struct mlx5_core_dev *mdev,
struct mlx5_setup_stream_context *ctx)
{
struct mlx5_fpga_dma_buf *buf;
void *cmd = ctx + 1;
int status, ret = 0;
buf = &ctx->cmd.buf;
buf->sg[0].data = cmd;
buf->sg[0].size = MLX5_TLS_COMMAND_SIZE;
MLX5_SET(tls_cmd, cmd, command_type, CMD_SETUP_STREAM);
init_completion(&ctx->comp);
atomic_set(&ctx->status, MLX5_FPGA_CMD_PENDING);
ctx->syndrome = -1;
mlx5_fpga_tls_cmd_send(mdev->fpga, &ctx->cmd,
mlx5_fpga_tls_setup_completion);
wait_for_completion_killable(&ctx->comp);
status = atomic_xchg_acquire(&ctx->status, MLX5_FPGA_CMD_ABANDONED);
if (unlikely(status == MLX5_FPGA_CMD_PENDING))
/* ctx is going to be released in mlx5_fpga_tls_setup_completion */
return -EINTR;
if (unlikely(ctx->syndrome))
ret = -ENOMEM;
mlx5_fpga_tls_put_command_ctx(&ctx->cmd);
return ret;
}
static void mlx5_fpga_tls_hw_qp_recv_cb(void *cb_arg,
struct mlx5_fpga_dma_buf *buf)
{
struct mlx5_fpga_device *fdev = (struct mlx5_fpga_device *)cb_arg;
mlx5_fpga_tls_cmd_complete(fdev, buf);
}
bool mlx5_fpga_is_tls_device(struct mlx5_core_dev *mdev)
{
if (!mdev->fpga || !MLX5_CAP_GEN(mdev, fpga))
return false;
if (MLX5_CAP_FPGA(mdev, ieee_vendor_id) !=
MLX5_FPGA_CAP_SANDBOX_VENDOR_ID_MLNX)
return false;
if (MLX5_CAP_FPGA(mdev, sandbox_product_id) !=
MLX5_FPGA_CAP_SANDBOX_PRODUCT_ID_TLS)
return false;
if (MLX5_CAP_FPGA(mdev, sandbox_product_version) != 0)
return false;
return true;
}
static int mlx5_fpga_tls_get_caps(struct mlx5_fpga_device *fdev,
u32 *p_caps)
{
int err, cap_size = MLX5_ST_SZ_BYTES(tls_extended_cap);
u32 caps = 0;
void *buf;
buf = kzalloc(cap_size, GFP_KERNEL);
if (!buf)
return -ENOMEM;
err = mlx5_fpga_get_sbu_caps(fdev, cap_size, buf);
if (err)
goto out;
if (MLX5_GET(tls_extended_cap, buf, tx))
caps |= MLX5_ACCEL_TLS_TX;
if (MLX5_GET(tls_extended_cap, buf, rx))
caps |= MLX5_ACCEL_TLS_RX;
if (MLX5_GET(tls_extended_cap, buf, tls_v12))
caps |= MLX5_ACCEL_TLS_V12;
if (MLX5_GET(tls_extended_cap, buf, tls_v13))
caps |= MLX5_ACCEL_TLS_V13;
if (MLX5_GET(tls_extended_cap, buf, lro))
caps |= MLX5_ACCEL_TLS_LRO;
if (MLX5_GET(tls_extended_cap, buf, ipv6))
caps |= MLX5_ACCEL_TLS_IPV6;
if (MLX5_GET(tls_extended_cap, buf, aes_gcm_128))
caps |= MLX5_ACCEL_TLS_AES_GCM128;
if (MLX5_GET(tls_extended_cap, buf, aes_gcm_256))
caps |= MLX5_ACCEL_TLS_AES_GCM256;
*p_caps = caps;
err = 0;
out:
kfree(buf);
return err;
}
int mlx5_fpga_tls_init(struct mlx5_core_dev *mdev)
{
struct mlx5_fpga_device *fdev = mdev->fpga;
struct mlx5_fpga_conn_attr init_attr = {0};
struct mlx5_fpga_conn *conn;
struct mlx5_fpga_tls *tls;
int err = 0;
if (!mlx5_fpga_is_tls_device(mdev) || !fdev)
return 0;
tls = kzalloc(sizeof(*tls), GFP_KERNEL);
if (!tls)
return -ENOMEM;
err = mlx5_fpga_tls_get_caps(fdev, &tls->caps);
if (err)
goto error;
if (!(tls->caps & (MLX5_ACCEL_TLS_TX | MLX5_ACCEL_TLS_V12 |
MLX5_ACCEL_TLS_AES_GCM128))) {
err = -ENOTSUPP;
goto error;
}
init_attr.rx_size = SBU_QP_QUEUE_SIZE;
init_attr.tx_size = SBU_QP_QUEUE_SIZE;
init_attr.recv_cb = mlx5_fpga_tls_hw_qp_recv_cb;
init_attr.cb_arg = fdev;
conn = mlx5_fpga_sbu_conn_create(fdev, &init_attr);
if (IS_ERR(conn)) {
err = PTR_ERR(conn);
mlx5_fpga_err(fdev, "Error creating TLS command connection %d\n",
err);
goto error;
}
tls->conn = conn;
spin_lock_init(&tls->pending_cmds_lock);
INIT_LIST_HEAD(&tls->pending_cmds);
idr_init(&tls->tx_idr);
spin_lock_init(&tls->idr_spinlock);
fdev->tls = tls;
return 0;
error:
kfree(tls);
return err;
}
void mlx5_fpga_tls_cleanup(struct mlx5_core_dev *mdev)
{
struct mlx5_fpga_device *fdev = mdev->fpga;
if (!fdev || !fdev->tls)
return;
mlx5_fpga_sbu_conn_destroy(fdev->tls->conn);
kfree(fdev->tls);
fdev->tls = NULL;
}
static void mlx5_fpga_tls_set_aes_gcm128_ctx(void *cmd,
struct tls_crypto_info *info,
__be64 *rcd_sn)
{
struct tls12_crypto_info_aes_gcm_128 *crypto_info =
(struct tls12_crypto_info_aes_gcm_128 *)info;
memcpy(MLX5_ADDR_OF(tls_cmd, cmd, tls_rcd_sn), crypto_info->rec_seq,
TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);
memcpy(MLX5_ADDR_OF(tls_cmd, cmd, tls_implicit_iv),
crypto_info->salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
memcpy(MLX5_ADDR_OF(tls_cmd, cmd, encryption_key),
crypto_info->key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
/* in AES-GCM 128 we need to write the key twice */
memcpy(MLX5_ADDR_OF(tls_cmd, cmd, encryption_key) +
TLS_CIPHER_AES_GCM_128_KEY_SIZE,
crypto_info->key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
MLX5_SET(tls_cmd, cmd, alg, MLX5_TLS_ALG_AES_GCM_128);
}
static int mlx5_fpga_tls_set_key_material(void *cmd, u32 caps,
struct tls_crypto_info *crypto_info)
{
__be64 rcd_sn;
switch (crypto_info->cipher_type) {
case TLS_CIPHER_AES_GCM_128:
if (!(caps & MLX5_ACCEL_TLS_AES_GCM128))
return -EINVAL;
mlx5_fpga_tls_set_aes_gcm128_ctx(cmd, crypto_info, &rcd_sn);
break;
default:
return -EINVAL;
}
return 0;
}
static int mlx5_fpga_tls_add_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info, u32 swid,
u32 tcp_sn)
{
u32 caps = mlx5_fpga_tls_device_caps(mdev);
struct mlx5_setup_stream_context *ctx;
int ret = -ENOMEM;
size_t cmd_size;
void *cmd;
cmd_size = MLX5_TLS_COMMAND_SIZE + sizeof(*ctx);
ctx = kzalloc(cmd_size, GFP_KERNEL);
if (!ctx)
goto out;
cmd = ctx + 1;
ret = mlx5_fpga_tls_set_key_material(cmd, caps, crypto_info);
if (ret)
goto free_ctx;
mlx5_fpga_tls_flow_to_cmd(flow, cmd);
MLX5_SET(tls_cmd, cmd, swid, swid);
MLX5_SET(tls_cmd, cmd, tcp_sn, tcp_sn);
return mlx5_fpga_tls_setup_stream_cmd(mdev, ctx);
free_ctx:
kfree(ctx);
out:
return ret;
}
int mlx5_fpga_tls_add_tx_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid)
{
struct mlx5_fpga_tls *tls = mdev->fpga->tls;
int ret = -ENOMEM;
u32 swid;
ret = mlx5_fpga_tls_alloc_swid(&tls->tx_idr, &tls->idr_spinlock, flow);
if (ret < 0)
return ret;
swid = ret;
MLX5_SET(tls_flow, flow, direction_sx, 1);
ret = mlx5_fpga_tls_add_flow(mdev, flow, crypto_info, swid,
start_offload_tcp_sn);
if (ret && ret != -EINTR)
goto free_swid;
*p_swid = swid;
return 0;
free_swid:
mlx5_fpga_tls_release_swid(&tls->tx_idr, &tls->idr_spinlock, swid);
return ret;
}
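For illustration, the netdev driver side (mlx5e TLS support, added elsewhere in this series via the accel-layer wrapper) would reach this function from its tls_dev_add callback. A minimal sketch; the mydrv_* helpers and the flow-buffer construction are hypothetical stand-ins:
static int mydrv_tls_add(struct net_device *netdev, struct sock *sk,
			 enum tls_offload_ctx_dir direction,
			 struct tls_crypto_info *crypto_info,
			 u32 start_offload_tcp_sn)
{
	struct mlx5_core_dev *mdev = mydrv_mdev(netdev); /* hypothetical */
	void *flow;
	u32 swid;
	int err;
	if (direction != TLS_OFFLOAD_CTX_DIR_TX)
		return -EOPNOTSUPP;
	flow = mydrv_build_flow(sk); /* hypothetical: 5-tuple into a tls_flow buffer */
	if (!flow)
		return -ENOMEM;
	err = mlx5_fpga_tls_add_tx_flow(mdev, flow, crypto_info,
					start_offload_tcp_sn, &swid);
	if (!err)
		mydrv_save_swid(sk, swid); /* hypothetical: stash swid for xmit */
	return err;
}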
/*
* Copyright (c) 2018 Mellanox Technologies. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*
*/
#ifndef __MLX5_FPGA_TLS_H__
#define __MLX5_FPGA_TLS_H__
#include <linux/mlx5/driver.h>
#include <net/tls.h>
#include "fpga/core.h"
struct mlx5_fpga_tls {
struct list_head pending_cmds;
spinlock_t pending_cmds_lock; /* Protects pending_cmds */
u32 caps;
struct mlx5_fpga_conn *conn;
struct idr tx_idr;
spinlock_t idr_spinlock; /* protects the IDR */
};
int mlx5_fpga_tls_add_tx_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid);
void mlx5_fpga_tls_del_tx_flow(struct mlx5_core_dev *mdev, u32 swid,
gfp_t flags);
bool mlx5_fpga_is_tls_device(struct mlx5_core_dev *mdev);
int mlx5_fpga_tls_init(struct mlx5_core_dev *mdev);
void mlx5_fpga_tls_cleanup(struct mlx5_core_dev *mdev);
static inline u32 mlx5_fpga_tls_device_caps(struct mlx5_core_dev *mdev)
{
return mdev->fpga->tls->caps;
}
#endif /* __MLX5_FPGA_TLS_H__ */
......@@ -60,6 +60,7 @@
#include "fpga/core.h"
#include "fpga/ipsec.h"
#include "accel/ipsec.h"
#include "accel/tls.h"
#include "lib/clock.h"
MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>");
......@@ -1190,6 +1191,12 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
goto err_ipsec_start;
}
err = mlx5_accel_tls_init(dev);
if (err) {
dev_err(&pdev->dev, "TLS device start failed %d\n", err);
goto err_tls_start;
}
err = mlx5_init_fs(dev);
if (err) {
dev_err(&pdev->dev, "Failed to init flow steering\n");
......@@ -1231,6 +1238,9 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
mlx5_cleanup_fs(dev);
err_fs:
mlx5_accel_tls_cleanup(dev);
err_tls_start:
mlx5_accel_ipsec_cleanup(dev);
err_ipsec_start:
......@@ -1306,6 +1316,7 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
mlx5_sriov_detach(dev);
mlx5_cleanup_fs(dev);
mlx5_accel_ipsec_cleanup(dev);
mlx5_accel_tls_cleanup(dev);
mlx5_fpga_device_stop(dev);
mlx5_irq_clear_affinity_hints(dev);
free_comp_eqs(dev);
......
......@@ -356,22 +356,6 @@ struct mlx5_ifc_odp_per_transport_service_cap_bits {
u8 reserved_at_6[0x1a];
};
struct mlx5_ifc_ipv4_layout_bits {
u8 reserved_at_0[0x60];
u8 ipv4[0x20];
};
struct mlx5_ifc_ipv6_layout_bits {
u8 ipv6[16][0x8];
};
union mlx5_ifc_ipv6_layout_ipv4_layout_auto_bits {
struct mlx5_ifc_ipv6_layout_bits ipv6_layout;
struct mlx5_ifc_ipv4_layout_bits ipv4_layout;
u8 reserved_at_0[0x80];
};
struct mlx5_ifc_fte_match_set_lyr_2_4_bits {
u8 smac_47_16[0x20];
......
......@@ -32,12 +32,29 @@
#ifndef MLX5_IFC_FPGA_H
#define MLX5_IFC_FPGA_H
struct mlx5_ifc_ipv4_layout_bits {
u8 reserved_at_0[0x60];
u8 ipv4[0x20];
};
struct mlx5_ifc_ipv6_layout_bits {
u8 ipv6[16][0x8];
};
union mlx5_ifc_ipv6_layout_ipv4_layout_auto_bits {
struct mlx5_ifc_ipv6_layout_bits ipv6_layout;
struct mlx5_ifc_ipv4_layout_bits ipv4_layout;
u8 reserved_at_0[0x80];
};
enum {
MLX5_FPGA_CAP_SANDBOX_VENDOR_ID_MLNX = 0x2c9,
};
enum {
MLX5_FPGA_CAP_SANDBOX_PRODUCT_ID_IPSEC = 0x2,
MLX5_FPGA_CAP_SANDBOX_PRODUCT_ID_TLS = 0x3,
};
struct mlx5_ifc_fpga_shell_caps_bits {
......@@ -370,6 +387,27 @@ struct mlx5_ifc_fpga_destroy_qp_out_bits {
u8 reserved_at_40[0x40];
};
struct mlx5_ifc_tls_extended_cap_bits {
u8 aes_gcm_128[0x1];
u8 aes_gcm_256[0x1];
u8 reserved_at_2[0x1e];
u8 reserved_at_20[0x20];
u8 context_capacity_total[0x20];
u8 context_capacity_rx[0x20];
u8 context_capacity_tx[0x20];
u8 reserved_at_a0[0x10];
u8 tls_counter_size[0x10];
u8 tls_counters_addr_low[0x20];
u8 tls_counters_addr_high[0x20];
u8 rx[0x1];
u8 tx[0x1];
u8 tls_v12[0x1];
u8 tls_v13[0x1];
u8 lro[0x1];
u8 ipv6[0x1];
u8 reserved_at_106[0x1a];
};
struct mlx5_ifc_ipsec_extended_cap_bits {
u8 encapsulation[0x20];
......@@ -519,4 +557,43 @@ struct mlx5_ifc_fpga_ipsec_sa {
__be16 reserved2;
} __packed;
enum fpga_tls_cmds {
CMD_SETUP_STREAM = 0x1001,
CMD_TEARDOWN_STREAM = 0x1002,
};
#define MLX5_TLS_1_2 (0)
#define MLX5_TLS_ALG_AES_GCM_128 (0)
#define MLX5_TLS_ALG_AES_GCM_256 (1)
struct mlx5_ifc_tls_cmd_bits {
u8 command_type[0x20];
u8 ipv6[0x1];
u8 direction_sx[0x1];
u8 tls_version[0x2];
u8 reserved[0x1c];
u8 swid[0x20];
u8 src_port[0x10];
u8 dst_port[0x10];
union mlx5_ifc_ipv6_layout_ipv4_layout_auto_bits src_ipv4_src_ipv6;
union mlx5_ifc_ipv6_layout_ipv4_layout_auto_bits dst_ipv4_dst_ipv6;
u8 tls_rcd_sn[0x40];
u8 tcp_sn[0x20];
u8 tls_implicit_iv[0x20];
u8 tls_xor_iv[0x40];
u8 encryption_key[0x100];
u8 alg[0x4];
u8 reserved2[0x1c];
u8 reserved3[0x4a0];
};
struct mlx5_ifc_tls_resp_bits {
u8 syndrome[0x20];
u8 stream_id[0x20];
u8 reserved[0x40];
};
#define MLX5_TLS_COMMAND_SIZE (0x100)
#endif /* MLX5_IFC_FPGA_H */
......@@ -78,6 +78,7 @@ enum {
NETIF_F_HW_ESP_BIT, /* Hardware ESP transformation offload */
NETIF_F_HW_ESP_TX_CSUM_BIT, /* ESP with TX checksum offload */
NETIF_F_RX_UDP_TUNNEL_PORT_BIT, /* Offload of RX port for UDP tunnels */
NETIF_F_HW_TLS_TX_BIT, /* Hardware TLS TX offload */
NETIF_F_GRO_HW_BIT, /* Hardware Generic receive offload */
NETIF_F_HW_TLS_RECORD_BIT, /* Offload TLS record */
......@@ -149,6 +150,7 @@ enum {
#define NETIF_F_RX_UDP_TUNNEL_PORT __NETIF_F(RX_UDP_TUNNEL_PORT)
#define NETIF_F_HW_TLS_RECORD __NETIF_F(HW_TLS_RECORD)
#define NETIF_F_GSO_UDP_L4 __NETIF_F(GSO_UDP_L4)
#define NETIF_F_HW_TLS_TX __NETIF_F(HW_TLS_TX)
#define for_each_netdev_feature(mask_addr, bit) \
for_each_set_bit(bit, (unsigned long *)mask_addr, NETDEV_FEATURE_COUNT)
......
......@@ -865,6 +865,26 @@ struct xfrmdev_ops {
};
#endif
#if IS_ENABLED(CONFIG_TLS_DEVICE)
enum tls_offload_ctx_dir {
TLS_OFFLOAD_CTX_DIR_RX,
TLS_OFFLOAD_CTX_DIR_TX,
};
struct tls_crypto_info;
struct tls_context;
struct tlsdev_ops {
int (*tls_dev_add)(struct net_device *netdev, struct sock *sk,
enum tls_offload_ctx_dir direction,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn);
void (*tls_dev_del)(struct net_device *netdev,
struct tls_context *ctx,
enum tls_offload_ctx_dir direction);
};
#endif
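A driver opts in by populating both callbacks and advertising the feature bit; a minimal sketch (the mydrv_* callbacks are hypothetical):
static const struct tlsdev_ops mydrv_tlsdev_ops = {
	.tls_dev_add = mydrv_tls_add,
	.tls_dev_del = mydrv_tls_del,
};
/* at probe time, before register_netdev(): */
netdev->tlsdev_ops = &mydrv_tlsdev_ops;
netdev->features |= NETIF_F_HW_TLS_TX;
netdev->hw_features |= NETIF_F_HW_TLS_TX;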
struct dev_ifalias {
struct rcu_head rcuhead;
char ifalias[];
......@@ -1750,6 +1770,10 @@ struct net_device {
const struct xfrmdev_ops *xfrmdev_ops;
#endif
#if IS_ENABLED(CONFIG_TLS_DEVICE)
const struct tlsdev_ops *tlsdev_ops;
#endif
const struct header_ops *header_ops;
unsigned int flags;
......
......@@ -1034,6 +1034,7 @@ static inline struct sk_buff *alloc_skb_fclone(unsigned int size,
struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src);
int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask);
struct sk_buff *skb_clone(struct sk_buff *skb, gfp_t priority);
void skb_copy_header(struct sk_buff *new, const struct sk_buff *old);
struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t priority);
struct sk_buff *__pskb_copy_fclone(struct sk_buff *skb, int headroom,
gfp_t gfp_mask, bool fclone);
......
......@@ -77,6 +77,7 @@ struct inet_connection_sock_af_ops {
* @icsk_af_ops Operations which are AF_INET{4,6} specific
* @icsk_ulp_ops Pluggable ULP control hook
* @icsk_ulp_data ULP private data
* @icsk_clean_acked Clean acked data hook
* @icsk_listen_portaddr_node hash to the portaddr listener hashtable
* @icsk_ca_state: Congestion control state
* @icsk_retransmits: Number of unrecovered [RTO] timeouts
......@@ -102,6 +103,7 @@ struct inet_connection_sock {
const struct inet_connection_sock_af_ops *icsk_af_ops;
const struct tcp_ulp_ops *icsk_ulp_ops;
void *icsk_ulp_data;
void (*icsk_clean_acked)(struct sock *sk, u32 acked_seq);
struct hlist_node icsk_listen_portaddr_node;
unsigned int (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
__u8 icsk_ca_state:6,
......
......@@ -481,6 +481,11 @@ struct sock {
void (*sk_error_report)(struct sock *sk);
int (*sk_backlog_rcv)(struct sock *sk,
struct sk_buff *skb);
#ifdef CONFIG_SOCK_VALIDATE_XMIT
struct sk_buff* (*sk_validate_xmit_skb)(struct sock *sk,
struct net_device *dev,
struct sk_buff *skb);
#endif
void (*sk_destruct)(struct sock *sk);
struct sock_reuseport __rcu *sk_reuseport_cb;
struct rcu_head sk_rcu;
......@@ -2332,6 +2337,22 @@ static inline bool sk_fullsock(const struct sock *sk)
return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
}
/* Checks if this SKB belongs to an HW offloaded socket
* and whether any SW fallbacks are required based on dev.
*/
static inline struct sk_buff *sk_validate_xmit_skb(struct sk_buff *skb,
struct net_device *dev)
{
#ifdef CONFIG_SOCK_VALIDATE_XMIT
struct sock *sk = skb->sk;
if (sk && sk_fullsock(sk) && sk->sk_validate_xmit_skb)
skb = sk->sk_validate_xmit_skb(sk, dev, skb);
#endif
return skb;
}
/* This helper checks if a socket is a LISTEN or NEW_SYN_RECV
* SYNACK messages can be attached to either ones (depending on SYNCOOKIE)
*/
......
......@@ -2105,4 +2105,12 @@ static inline bool tcp_bpf_ca_needs_ecn(struct sock *sk)
#if IS_ENABLED(CONFIG_SMC)
extern struct static_key_false tcp_have_smc;
#endif
#if IS_ENABLED(CONFIG_TLS_DEVICE)
void clean_acked_data_enable(struct inet_connection_sock *icsk,
void (*cad)(struct sock *sk, u32 ack_seq));
void clean_acked_data_disable(struct inet_connection_sock *icsk);
#endif
#endif /* _TCP_H */
......@@ -83,21 +83,10 @@ struct tls_device {
void (*unhash)(struct tls_device *device, struct sock *sk);
};
struct tls_sw_context {
struct tls_sw_context_tx {
struct crypto_aead *aead_send;
struct crypto_aead *aead_recv;
struct crypto_wait async_wait;
/* Receive context */
struct strparser strp;
void (*saved_data_ready)(struct sock *sk);
unsigned int (*sk_poll)(struct file *file, struct socket *sock,
struct poll_table_struct *wait);
struct sk_buff *recv_pkt;
u8 control;
bool decrypted;
/* Sending context */
char aad_space[TLS_AAD_SPACE_SIZE];
unsigned int sg_plaintext_size;
......@@ -114,6 +103,50 @@ struct tls_sw_context {
struct scatterlist sg_aead_out[2];
};
struct tls_sw_context_rx {
struct crypto_aead *aead_recv;
struct crypto_wait async_wait;
struct strparser strp;
void (*saved_data_ready)(struct sock *sk);
unsigned int (*sk_poll)(struct file *file, struct socket *sock,
struct poll_table_struct *wait);
struct sk_buff *recv_pkt;
u8 control;
bool decrypted;
};
struct tls_record_info {
struct list_head list;
u32 end_seq;
int len;
int num_frags;
skb_frag_t frags[MAX_SKB_FRAGS];
};
struct tls_offload_context {
struct crypto_aead *aead_send;
spinlock_t lock; /* protects records list */
struct list_head records_list;
struct tls_record_info *open_record;
struct tls_record_info *retransmit_hint;
u64 hint_record_sn;
u64 unacked_record_sn;
struct scatterlist sg_tx_data[MAX_SKB_FRAGS];
void (*sk_destruct)(struct sock *sk);
u8 driver_state[];
/* The TLS layer reserves room for driver-specific state.
* Currently the belief is that there is not enough
* driver-specific state to justify another layer of indirection.
*/
#define TLS_DRIVER_STATE_SIZE (max_t(size_t, 8, sizeof(void *)))
};
#define TLS_OFFLOAD_CONTEXT_SIZE \
(ALIGN(sizeof(struct tls_offload_context), sizeof(void *)) + \
TLS_DRIVER_STATE_SIZE)
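A sketch of how a driver is expected to use driver_state, assuming its per-stream handle fits in TLS_DRIVER_STATE_SIZE (struct mydrv_tls_tx is hypothetical):
struct mydrv_tls_tx {	/* hypothetical per-stream driver handle */
	u32 swid;
};
static struct mydrv_tls_tx *mydrv_tls_tx(struct tls_offload_context *ctx)
{
	BUILD_BUG_ON(sizeof(struct mydrv_tls_tx) > TLS_DRIVER_STATE_SIZE);
	return (struct mydrv_tls_tx *)ctx->driver_state;
}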
enum {
TLS_PENDING_CLOSED_RECORD
};
......@@ -138,9 +171,15 @@ struct tls_context {
struct tls12_crypto_info_aes_gcm_128 crypto_recv_aes_gcm_128;
};
void *priv_ctx;
struct list_head list;
struct net_device *netdev;
refcount_t refcount;
void *priv_ctx_tx;
void *priv_ctx_rx;
u8 conf:3;
u8 tx_conf:3;
u8 rx_conf:3;
struct cipher_context tx;
struct cipher_context rx;
......@@ -177,7 +216,8 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
int tls_sw_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags);
void tls_sw_close(struct sock *sk, long timeout);
void tls_sw_free_resources(struct sock *sk);
void tls_sw_free_resources_tx(struct sock *sk);
void tls_sw_free_resources_rx(struct sock *sk);
int tls_sw_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
int nonblock, int flags, int *addr_len);
unsigned int tls_sw_poll(struct file *file, struct socket *sock,
......@@ -186,9 +226,28 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
struct pipe_inode_info *pipe,
size_t len, unsigned int flags);
void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
void tls_icsk_clean_acked(struct sock *sk);
int tls_set_device_offload(struct sock *sk, struct tls_context *ctx);
int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
int tls_device_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags);
void tls_device_sk_destruct(struct sock *sk);
void tls_device_init(void);
void tls_device_cleanup(void);
struct tls_record_info *tls_get_record(struct tls_offload_context *context,
u32 seq, u64 *p_record_sn);
static inline bool tls_record_is_start_marker(struct tls_record_info *rec)
{
return rec->len == 0;
}
static inline u32 tls_record_start_seq(struct tls_record_info *rec)
{
return rec->end_seq - rec->len;
}
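/* Worked example: a record with end_seq == 1000 and len == 100 covers
* TCP sequence numbers [900, 1000), so tls_record_start_seq() returns
* 900 and a retransmitted segment starting at seq 950 falls inside it;
* the zero-length start marker record never matches real payload.
*/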
void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
int tls_push_sg(struct sock *sk, struct tls_context *ctx,
struct scatterlist *sg, u16 first_offset,
int flags);
......@@ -225,6 +284,13 @@ static inline bool tls_is_pending_open_record(struct tls_context *tls_ctx)
return tls_ctx->pending_open_record_frags;
}
static inline bool tls_is_sk_tx_device_offloaded(struct sock *sk)
{
return sk_fullsock(sk) &&
/* matches smp_store_release in tls_set_device_offload */
smp_load_acquire(&sk->sk_destruct) == &tls_device_sk_destruct;
}
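A driver's xmit path would use this test to pick the inline-crypto path; a sketch with hypothetical mydrv naming:
static netdev_tx_t mydrv_xmit(struct sk_buff *skb, struct net_device *dev)
{
	if (skb->sk && tls_is_sk_tx_device_offloaded(skb->sk)) {
		struct tls_context *tls_ctx = tls_get_ctx(skb->sk);
		struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
		/* mark the descriptor for inline TLS using the handle kept
		 * in ctx->driver_state, resyncing via tls_get_record() when
		 * the TCP sequence number is not the one the HW expects
		 */
	}
	/* ... regular descriptor setup and doorbell ... */
	return NETDEV_TX_OK;
}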
static inline void tls_err_abort(struct sock *sk, int err)
{
sk->sk_err = err;
......@@ -297,16 +363,22 @@ static inline struct tls_context *tls_get_ctx(const struct sock *sk)
return icsk->icsk_ulp_data;
}
static inline struct tls_sw_context *tls_sw_ctx(
static inline struct tls_sw_context_rx *tls_sw_ctx_rx(
const struct tls_context *tls_ctx)
{
return (struct tls_sw_context *)tls_ctx->priv_ctx;
return (struct tls_sw_context_rx *)tls_ctx->priv_ctx_rx;
}
static inline struct tls_sw_context_tx *tls_sw_ctx_tx(
const struct tls_context *tls_ctx)
{
return (struct tls_sw_context_tx *)tls_ctx->priv_ctx_tx;
}
static inline struct tls_offload_context *tls_offload_ctx(
const struct tls_context *tls_ctx)
{
return (struct tls_offload_context *)tls_ctx->priv_ctx;
return (struct tls_offload_context *)tls_ctx->priv_ctx_tx;
}
int tls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
......@@ -314,4 +386,12 @@ int tls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
void tls_register_device(struct tls_device *device);
void tls_unregister_device(struct tls_device *device);
struct sk_buff *tls_validate_xmit_skb(struct sock *sk,
struct net_device *dev,
struct sk_buff *skb);
int tls_sw_fallback_init(struct sock *sk,
struct tls_offload_context *offload_ctx,
struct tls_crypto_info *crypto_info);
#endif /* _TLS_OFFLOAD_H */
......@@ -407,6 +407,9 @@ config GRO_CELLS
bool
default n
config SOCK_VALIDATE_XMIT
bool
config NET_DEVLINK
tristate "Network physical/parent device Netlink interface"
help
......
......@@ -3112,6 +3112,10 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
if (unlikely(!skb))
goto out_null;
skb = sk_validate_xmit_skb(skb, dev);
if (unlikely(!skb))
goto out_null;
if (netif_needs_gso(skb, features)) {
struct sk_buff *segs;
......
......@@ -110,6 +110,7 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
[NETIF_F_HW_ESP_TX_CSUM_BIT] = "esp-tx-csum-hw-offload",
[NETIF_F_RX_UDP_TUNNEL_PORT_BIT] = "rx-udp_tunnel-port-offload",
[NETIF_F_HW_TLS_RECORD_BIT] = "tls-hw-record",
[NETIF_F_HW_TLS_TX_BIT] = "tls-hw-tx-offload",
};
static const char
......
......@@ -1305,7 +1305,7 @@ static void skb_headers_offset_update(struct sk_buff *skb, int off)
skb->inner_mac_header += off;
}
static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
void skb_copy_header(struct sk_buff *new, const struct sk_buff *old)
{
__copy_skb_header(new, old);
......@@ -1313,6 +1313,7 @@ static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
skb_shinfo(new)->gso_segs = skb_shinfo(old)->gso_segs;
skb_shinfo(new)->gso_type = skb_shinfo(old)->gso_type;
}
EXPORT_SYMBOL(skb_copy_header);
static inline int skb_alloc_rx_flag(const struct sk_buff *skb)
{
......@@ -1355,7 +1356,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
copy_skb_header(n, skb);
skb_copy_header(n, skb);
return n;
}
EXPORT_SYMBOL(skb_copy);
......@@ -1419,7 +1420,7 @@ struct sk_buff *__pskb_copy_fclone(struct sk_buff *skb, int headroom,
skb_clone_fraglist(n);
}
copy_skb_header(n, skb);
skb_copy_header(n, skb);
out:
return n;
}
......@@ -1599,7 +1600,7 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
BUG_ON(skb_copy_bits(skb, -head_copy_len, n->head + head_copy_off,
skb->len + head_copy_len));
copy_skb_header(n, skb);
skb_copy_header(n, skb);
skb_headers_offset_update(n, newheadroom - oldheadroom);
......
......@@ -111,6 +111,25 @@ int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
#define REXMIT_LOST 1 /* retransmit packets marked lost */
#define REXMIT_NEW 2 /* FRTO-style transmit of unsent/new packets */
#if IS_ENABLED(CONFIG_TLS_DEVICE)
static DEFINE_STATIC_KEY_FALSE(clean_acked_data_enabled);
void clean_acked_data_enable(struct inet_connection_sock *icsk,
void (*cad)(struct sock *sk, u32 ack_seq))
{
icsk->icsk_clean_acked = cad;
static_branch_inc(&clean_acked_data_enabled);
}
EXPORT_SYMBOL_GPL(clean_acked_data_enable);
void clean_acked_data_disable(struct inet_connection_sock *icsk)
{
static_branch_dec(&clean_acked_data_enabled);
icsk->icsk_clean_acked = NULL;
}
EXPORT_SYMBOL_GPL(clean_acked_data_disable);
#endif
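A sketch of the intended usage from a ULP's init and release paths (my_clean_acked is a hypothetical callback; tls_device.c wires up tls_icsk_clean_acked the same way):
static void my_clean_acked(struct sock *sk, u32 acked_seq)
{
	/* release per-record bookkeeping for data fully below acked_seq */
}
/* on setup: */
clean_acked_data_enable(inet_csk(sk), my_clean_acked);
/* on teardown: */
clean_acked_data_disable(inet_csk(sk));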
static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb,
unsigned int len)
{
......@@ -3560,6 +3579,12 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
if (after(ack, prior_snd_una)) {
flag |= FLAG_SND_UNA_ADVANCED;
icsk->icsk_retransmits = 0;
#if IS_ENABLED(CONFIG_TLS_DEVICE)
if (static_branch_unlikely(&clean_acked_data_enabled))
if (icsk->icsk_clean_acked)
icsk->icsk_clean_acked(sk, ack);
#endif
}
prior_fack = tcp_is_sack(tp) ? tcp_highest_sack_seq(tp) : tp->snd_una;
......
......@@ -14,3 +14,13 @@ config TLS
encryption handling of the TLS protocol to be done in-kernel.
If unsure, say N.
config TLS_DEVICE
bool "Transport Layer Security HW offload"
depends on TLS
select SOCK_VALIDATE_XMIT
default n
help
Enable kernel support for HW offload of the TLS protocol.
If unsure, say N.
......@@ -5,3 +5,5 @@
obj-$(CONFIG_TLS) += tls.o
tls-y := tls_main.o tls_sw.o
tls-$(CONFIG_TLS_DEVICE) += tls_device.o tls_device_fallback.o
/* Copyright (c) 2018, Mellanox Technologies All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
#include <crypto/aead.h>
#include <linux/highmem.h>
#include <linux/module.h>
#include <linux/netdevice.h>
#include <net/dst.h>
#include <net/inet_connection_sock.h>
#include <net/tcp.h>
#include <net/tls.h>
/* device_offload_lock is used to synchronize tls_dev_add
* against NETDEV_DOWN notifications.
*/
static DECLARE_RWSEM(device_offload_lock);
static void tls_device_gc_task(struct work_struct *work);
static DECLARE_WORK(tls_device_gc_work, tls_device_gc_task);
static LIST_HEAD(tls_device_gc_list);
static LIST_HEAD(tls_device_list);
static DEFINE_SPINLOCK(tls_device_lock);
static void tls_device_free_ctx(struct tls_context *ctx)
{
struct tls_offload_context *offload_ctx = tls_offload_ctx(ctx);
kfree(offload_ctx);
kfree(ctx);
}
static void tls_device_gc_task(struct work_struct *work)
{
struct tls_context *ctx, *tmp;
unsigned long flags;
LIST_HEAD(gc_list);
spin_lock_irqsave(&tls_device_lock, flags);
list_splice_init(&tls_device_gc_list, &gc_list);
spin_unlock_irqrestore(&tls_device_lock, flags);
list_for_each_entry_safe(ctx, tmp, &gc_list, list) {
struct net_device *netdev = ctx->netdev;
if (netdev) {
netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
TLS_OFFLOAD_CTX_DIR_TX);
dev_put(netdev);
}
list_del(&ctx->list);
tls_device_free_ctx(ctx);
}
}
static void tls_device_queue_ctx_destruction(struct tls_context *ctx)
{
unsigned long flags;
spin_lock_irqsave(&tls_device_lock, flags);
list_move_tail(&ctx->list, &tls_device_gc_list);
/* schedule_work inside the spinlock
* to make sure tls_device_down waits for that work.
*/
schedule_work(&tls_device_gc_work);
spin_unlock_irqrestore(&tls_device_lock, flags);
}
/* We assume that the socket is already connected */
static struct net_device *get_netdev_for_sock(struct sock *sk)
{
struct dst_entry *dst = sk_dst_get(sk);
struct net_device *netdev = NULL;
if (likely(dst)) {
netdev = dst->dev;
dev_hold(netdev);
}
dst_release(dst);
return netdev;
}
static void destroy_record(struct tls_record_info *record)
{
int nr_frags = record->num_frags;
skb_frag_t *frag;
while (nr_frags-- > 0) {
frag = &record->frags[nr_frags];
__skb_frag_unref(frag);
}
kfree(record);
}
static void delete_all_records(struct tls_offload_context *offload_ctx)
{
struct tls_record_info *info, *temp;
list_for_each_entry_safe(info, temp, &offload_ctx->records_list, list) {
list_del(&info->list);
destroy_record(info);
}
offload_ctx->retransmit_hint = NULL;
}
static void tls_icsk_clean_acked(struct sock *sk, u32 acked_seq)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_record_info *info, *temp;
struct tls_offload_context *ctx;
u64 deleted_records = 0;
unsigned long flags;
if (!tls_ctx)
return;
ctx = tls_offload_ctx(tls_ctx);
spin_lock_irqsave(&ctx->lock, flags);
info = ctx->retransmit_hint;
if (info && !before(acked_seq, info->end_seq)) {
ctx->retransmit_hint = NULL;
list_del(&info->list);
destroy_record(info);
deleted_records++;
}
list_for_each_entry_safe(info, temp, &ctx->records_list, list) {
if (before(acked_seq, info->end_seq))
break;
list_del(&info->list);
destroy_record(info);
deleted_records++;
}
ctx->unacked_record_sn += deleted_records;
spin_unlock_irqrestore(&ctx->lock, flags);
}
/* At this point, there should be no references on this
* socket and no in-flight SKBs associated with this
* socket, so it is safe to free all the resources.
*/
void tls_device_sk_destruct(struct sock *sk)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
if (ctx->open_record)
destroy_record(ctx->open_record);
delete_all_records(ctx);
crypto_free_aead(ctx->aead_send);
ctx->sk_destruct(sk);
clean_acked_data_disable(inet_csk(sk));
if (refcount_dec_and_test(&tls_ctx->refcount))
tls_device_queue_ctx_destruction(tls_ctx);
}
EXPORT_SYMBOL(tls_device_sk_destruct);
static void tls_append_frag(struct tls_record_info *record,
struct page_frag *pfrag,
int size)
{
skb_frag_t *frag;
frag = &record->frags[record->num_frags - 1];
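/* extend the tail fragment if the new chunk is contiguous with it
* in the page frag; otherwise start a new fragment
*/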
if (frag->page.p == pfrag->page &&
frag->page_offset + frag->size == pfrag->offset) {
frag->size += size;
} else {
++frag;
frag->page.p = pfrag->page;
frag->page_offset = pfrag->offset;
frag->size = size;
++record->num_frags;
get_page(pfrag->page);
}
pfrag->offset += size;
record->len += size;
}
static int tls_push_record(struct sock *sk,
struct tls_context *ctx,
struct tls_offload_context *offload_ctx,
struct tls_record_info *record,
struct page_frag *pfrag,
int flags,
unsigned char record_type)
{
struct tcp_sock *tp = tcp_sk(sk);
struct page_frag dummy_tag_frag;
skb_frag_t *frag;
int i;
/* fill prepend */
frag = &record->frags[0];
tls_fill_prepend(ctx,
skb_frag_address(frag),
record->len - ctx->tx.prepend_size,
record_type);
/* HW doesn't care about the data in the tag, because it fills it. */
dummy_tag_frag.page = skb_frag_page(frag);
dummy_tag_frag.offset = 0;
tls_append_frag(record, &dummy_tag_frag, ctx->tx.tag_size);
record->end_seq = tp->write_seq + record->len;
spin_lock_irq(&offload_ctx->lock);
list_add_tail(&record->list, &offload_ctx->records_list);
spin_unlock_irq(&offload_ctx->lock);
offload_ctx->open_record = NULL;
set_bit(TLS_PENDING_CLOSED_RECORD, &ctx->flags);
tls_advance_record_sn(sk, &ctx->tx);
for (i = 0; i < record->num_frags; i++) {
frag = &record->frags[i];
sg_unmark_end(&offload_ctx->sg_tx_data[i]);
sg_set_page(&offload_ctx->sg_tx_data[i], skb_frag_page(frag),
frag->size, frag->page_offset);
sk_mem_charge(sk, frag->size);
get_page(skb_frag_page(frag));
}
sg_mark_end(&offload_ctx->sg_tx_data[record->num_frags - 1]);
/* all ready, send */
return tls_push_sg(sk, ctx, offload_ctx->sg_tx_data, 0, flags);
}
static int tls_create_new_record(struct tls_offload_context *offload_ctx,
struct page_frag *pfrag,
size_t prepend_size)
{
struct tls_record_info *record;
skb_frag_t *frag;
record = kmalloc(sizeof(*record), GFP_KERNEL);
if (!record)
return -ENOMEM;
frag = &record->frags[0];
__skb_frag_set_page(frag, pfrag->page);
frag->page_offset = pfrag->offset;
skb_frag_size_set(frag, prepend_size);
get_page(pfrag->page);
pfrag->offset += prepend_size;
record->num_frags = 1;
record->len = prepend_size;
offload_ctx->open_record = record;
return 0;
}
static int tls_do_allocation(struct sock *sk,
struct tls_offload_context *offload_ctx,
struct page_frag *pfrag,
size_t prepend_size)
{
int ret;
if (!offload_ctx->open_record) {
if (unlikely(!skb_page_frag_refill(prepend_size, pfrag,
sk->sk_allocation))) {
sk->sk_prot->enter_memory_pressure(sk);
sk_stream_moderate_sndbuf(sk);
return -ENOMEM;
}
ret = tls_create_new_record(offload_ctx, pfrag, prepend_size);
if (ret)
return ret;
if (pfrag->size > pfrag->offset)
return 0;
}
if (!sk_page_frag_refill(sk, pfrag))
return -ENOMEM;
return 0;
}
static int tls_push_data(struct sock *sk,
struct iov_iter *msg_iter,
size_t size, int flags,
unsigned char record_type)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE);
struct tls_record_info *record = ctx->open_record;
struct page_frag *pfrag;
size_t orig_size = size;
u32 max_open_record_len;
int copy, rc = 0;
bool done = false;
long timeo;
if (flags &
~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST))
return -ENOTSUPP;
if (sk->sk_err)
return -sk->sk_err;
timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
rc = tls_complete_pending_work(sk, tls_ctx, flags, &timeo);
if (rc < 0)
return rc;
pfrag = sk_page_frag(sk);
/* TLS_HEADER_SIZE is not counted as part of the TLS record, and
* we need to leave room for an authentication tag.
*/
max_open_record_len = TLS_MAX_PAYLOAD_SIZE +
tls_ctx->tx.prepend_size;
do {
rc = tls_do_allocation(sk, ctx, pfrag,
tls_ctx->tx.prepend_size);
if (rc) {
rc = sk_stream_wait_memory(sk, &timeo);
if (!rc)
continue;
record = ctx->open_record;
if (!record)
break;
handle_error:
if (record_type != TLS_RECORD_TYPE_DATA) {
/* avoid sending a partial record with type != application_data */
size = orig_size;
destroy_record(record);
ctx->open_record = NULL;
} else if (record->len > tls_ctx->tx.prepend_size) {
goto last_record;
}
break;
}
record = ctx->open_record;
copy = min_t(size_t, size, (pfrag->size - pfrag->offset));
copy = min_t(size_t, copy, (max_open_record_len - record->len));
if (copy_from_iter_nocache(page_address(pfrag->page) +
pfrag->offset,
copy, msg_iter) != copy) {
rc = -EFAULT;
goto handle_error;
}
tls_append_frag(record, pfrag, copy);
size -= copy;
if (!size) {
last_record:
tls_push_record_flags = flags;
if (more) {
tls_ctx->pending_open_record_frags =
record->num_frags;
break;
}
done = true;
}
if (done || record->len >= max_open_record_len ||
(record->num_frags >= MAX_SKB_FRAGS - 1)) {
rc = tls_push_record(sk,
tls_ctx,
ctx,
record,
pfrag,
tls_push_record_flags,
record_type);
if (rc < 0)
break;
}
} while (!done);
if (orig_size - size > 0)
rc = orig_size - size;
return rc;
}
int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
{
unsigned char record_type = TLS_RECORD_TYPE_DATA;
int rc;
lock_sock(sk);
if (unlikely(msg->msg_controllen)) {
rc = tls_proccess_cmsg(sk, msg, &record_type);
if (rc)
goto out;
}
rc = tls_push_data(sk, &msg->msg_iter, size,
msg->msg_flags, record_type);
out:
release_sock(sk);
return rc;
}
int tls_device_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags)
{
struct iov_iter msg_iter;
char *kaddr = kmap(page);
struct kvec iov;
int rc;
if (flags & MSG_SENDPAGE_NOTLAST)
flags |= MSG_MORE;
lock_sock(sk);
if (flags & MSG_OOB) {
rc = -ENOTSUPP;
goto out;
}
iov.iov_base = kaddr + offset;
iov.iov_len = size;
iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, &iov, 1, size);
rc = tls_push_data(sk, &msg_iter, size,
flags, TLS_RECORD_TYPE_DATA);
kunmap(page);
out:
release_sock(sk);
return rc;
}
struct tls_record_info *tls_get_record(struct tls_offload_context *context,
u32 seq, u64 *p_record_sn)
{
u64 record_sn = context->hint_record_sn;
struct tls_record_info *info;
info = context->retransmit_hint;
if (!info ||
before(seq, info->end_seq - info->len)) {
/* if retransmit_hint is irrelevant, start
* from the beginning of the list
*/
info = list_first_entry(&context->records_list,
struct tls_record_info, list);
record_sn = context->unacked_record_sn;
}
list_for_each_entry_from(info, &context->records_list, list) {
if (before(seq, info->end_seq)) {
if (!context->retransmit_hint ||
after(info->end_seq,
context->retransmit_hint->end_seq)) {
context->hint_record_sn = record_sn;
context->retransmit_hint = info;
}
*p_record_sn = record_sn;
return info;
}
record_sn++;
}
return NULL;
}
EXPORT_SYMBOL(tls_get_record);
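For illustration, a driver mapping a retransmitted skb back to its record might do the following under ctx->lock (a sketch; mydrv_resync is a hypothetical device-specific step):
struct tls_record_info *rec;
unsigned long flags;
u64 rcd_sn;
spin_lock_irqsave(&ctx->lock, flags);
rec = tls_get_record(ctx, ntohl(tcp_hdr(skb)->seq), &rcd_sn);
if (rec) {
	/* re-key the HW context: the record starts at TCP seq
	 * tls_record_start_seq(rec) with record sequence rcd_sn
	 */
	mydrv_resync(dev, tls_record_start_seq(rec), rcd_sn); /* hypothetical */
}
spin_unlock_irqrestore(&ctx->lock, flags);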
static int tls_device_push_pending_record(struct sock *sk, int flags)
{
struct iov_iter msg_iter;
iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, NULL, 0, 0);
return tls_push_data(sk, &msg_iter, 0, flags, TLS_RECORD_TYPE_DATA);
}
int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
{
u16 nonce_size, tag_size, iv_size, rec_seq_size;
struct tls_record_info *start_marker_record;
struct tls_offload_context *offload_ctx;
struct tls_crypto_info *crypto_info;
struct net_device *netdev;
char *iv, *rec_seq;
struct sk_buff *skb;
int rc = -EINVAL;
__be64 rcd_sn;
if (!ctx)
goto out;
if (ctx->priv_ctx_tx) {
rc = -EEXIST;
goto out;
}
start_marker_record = kmalloc(sizeof(*start_marker_record), GFP_KERNEL);
if (!start_marker_record) {
rc = -ENOMEM;
goto out;
}
offload_ctx = kzalloc(TLS_OFFLOAD_CONTEXT_SIZE, GFP_KERNEL);
if (!offload_ctx) {
rc = -ENOMEM;
goto free_marker_record;
}
crypto_info = &ctx->crypto_send;
switch (crypto_info->cipher_type) {
case TLS_CIPHER_AES_GCM_128:
nonce_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE;
iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
iv = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->iv;
rec_seq_size = TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE;
rec_seq =
((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->rec_seq;
break;
default:
rc = -EINVAL;
goto free_offload_ctx;
}
ctx->tx.prepend_size = TLS_HEADER_SIZE + nonce_size;
ctx->tx.tag_size = tag_size;
ctx->tx.overhead_size = ctx->tx.prepend_size + ctx->tx.tag_size;
ctx->tx.iv_size = iv_size;
ctx->tx.iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE,
GFP_KERNEL);
if (!ctx->tx.iv) {
rc = -ENOMEM;
goto free_offload_ctx;
}
memcpy(ctx->tx.iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, iv, iv_size);
ctx->tx.rec_seq_size = rec_seq_size;
ctx->tx.rec_seq = kmalloc(rec_seq_size, GFP_KERNEL);
if (!ctx->tx.rec_seq) {
rc = -ENOMEM;
goto free_iv;
}
memcpy(ctx->tx.rec_seq, rec_seq, rec_seq_size);
rc = tls_sw_fallback_init(sk, offload_ctx, crypto_info);
if (rc)
goto free_rec_seq;
/* start at rec_seq - 1 to account for the start marker record */
memcpy(&rcd_sn, ctx->tx.rec_seq, sizeof(rcd_sn));
offload_ctx->unacked_record_sn = be64_to_cpu(rcd_sn) - 1;
start_marker_record->end_seq = tcp_sk(sk)->write_seq;
start_marker_record->len = 0;
start_marker_record->num_frags = 0;
INIT_LIST_HEAD(&offload_ctx->records_list);
list_add_tail(&start_marker_record->list, &offload_ctx->records_list);
spin_lock_init(&offload_ctx->lock);
clean_acked_data_enable(inet_csk(sk), &tls_icsk_clean_acked);
ctx->push_pending_record = tls_device_push_pending_record;
offload_ctx->sk_destruct = sk->sk_destruct;
/* TLS offload is greatly simplified if we don't send
* SKBs where only part of the payload needs to be encrypted.
* So mark the last skb in the write queue as end of record.
*/
skb = tcp_write_queue_tail(sk);
if (skb)
TCP_SKB_CB(skb)->eor = 1;
refcount_set(&ctx->refcount, 1);
/* We support starting offload on multiple sockets
* concurrently, so we only need a read lock here.
* This lock must precede get_netdev_for_sock to prevent races between
* NETDEV_DOWN and setsockopt.
*/
down_read(&device_offload_lock);
netdev = get_netdev_for_sock(sk);
if (!netdev) {
pr_err_ratelimited("%s: netdev not found\n", __func__);
rc = -EINVAL;
goto release_lock;
}
if (!(netdev->features & NETIF_F_HW_TLS_TX)) {
rc = -ENOTSUPP;
goto release_netdev;
}
/* Avoid offloading if the device is down
* We don't want to offload new flows after
* the NETDEV_DOWN event
*/
if (!(netdev->flags & IFF_UP)) {
rc = -EINVAL;
goto release_netdev;
}
ctx->priv_ctx_tx = offload_ctx;
rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_TX,
&ctx->crypto_send,
tcp_sk(sk)->write_seq);
if (rc)
goto release_netdev;
ctx->netdev = netdev;
spin_lock_irq(&tls_device_lock);
list_add_tail(&ctx->list, &tls_device_list);
spin_unlock_irq(&tls_device_lock);
sk->sk_validate_xmit_skb = tls_validate_xmit_skb;
/* following this assignment tls_is_sk_tx_device_offloaded
* will return true and the context might be accessed
* by the netdev's xmit function.
*/
smp_store_release(&sk->sk_destruct,
&tls_device_sk_destruct);
up_read(&device_offload_lock);
goto out;
release_netdev:
dev_put(netdev);
release_lock:
up_read(&device_offload_lock);
clean_acked_data_disable(inet_csk(sk));
crypto_free_aead(offload_ctx->aead_send);
free_rec_seq:
kfree(ctx->tx.rec_seq);
free_iv:
kfree(ctx->tx.iv);
free_offload_ctx:
kfree(offload_ctx);
ctx->priv_ctx_tx = NULL;
free_marker_record:
kfree(start_marker_record);
out:
return rc;
}
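Nothing in this path is visible to userspace: the standard kTLS setsockopt sequence lands here whenever the egress device advertises NETIF_F_HW_TLS_TX. A sketch of that sequence (SOL_TLS is 282; define it if your libc headers lack it):
#include <linux/tls.h>
#include <netinet/tcp.h>
#ifndef SOL_TLS
#define SOL_TLS 282
#endif
struct tls12_crypto_info_aes_gcm_128 ci = {
	.info.version = TLS_1_2_VERSION,
	.info.cipher_type = TLS_CIPHER_AES_GCM_128,
};
/* fill ci.key/ci.iv/ci.salt/ci.rec_seq from the TLS handshake */
setsockopt(fd, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));
setsockopt(fd, SOL_TLS, TLS_TX, &ci, sizeof(ci));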
static int tls_device_down(struct net_device *netdev)
{
struct tls_context *ctx, *tmp;
unsigned long flags;
LIST_HEAD(list);
/* Request a write lock to block new offload attempts */
down_write(&device_offload_lock);
spin_lock_irqsave(&tls_device_lock, flags);
list_for_each_entry_safe(ctx, tmp, &tls_device_list, list) {
if (ctx->netdev != netdev ||
!refcount_inc_not_zero(&ctx->refcount))
continue;
list_move(&ctx->list, &list);
}
spin_unlock_irqrestore(&tls_device_lock, flags);
list_for_each_entry_safe(ctx, tmp, &list, list) {
netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
TLS_OFFLOAD_CTX_DIR_TX);
ctx->netdev = NULL;
dev_put(netdev);
list_del_init(&ctx->list);
if (refcount_dec_and_test(&ctx->refcount))
tls_device_free_ctx(ctx);
}
up_write(&device_offload_lock);
flush_work(&tls_device_gc_work);
return NOTIFY_DONE;
}
static int tls_dev_event(struct notifier_block *this, unsigned long event,
void *ptr)
{
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
if (!(dev->features & NETIF_F_HW_TLS_TX))
return NOTIFY_DONE;
switch (event) {
case NETDEV_REGISTER:
case NETDEV_FEAT_CHANGE:
if (dev->tlsdev_ops &&
dev->tlsdev_ops->tls_dev_add &&
dev->tlsdev_ops->tls_dev_del)
return NOTIFY_DONE;
else
return NOTIFY_BAD;
case NETDEV_DOWN:
return tls_device_down(dev);
}
return NOTIFY_DONE;
}
static struct notifier_block tls_dev_notifier = {
.notifier_call = tls_dev_event,
};
void __init tls_device_init(void)
{
register_netdevice_notifier(&tls_dev_notifier);
}
void __exit tls_device_cleanup(void)
{
unregister_netdevice_notifier(&tls_dev_notifier);
flush_work(&tls_device_gc_work);
}
/* Copyright (c) 2018, Mellanox Technologies All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
* General Public License (GPL) Version 2, available from the file
* COPYING in the main directory of this source tree, or the
* OpenIB.org BSD license below:
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
* conditions are met:
*
* - Redistributions of source code must retain the above
* copyright notice, this list of conditions and the following
* disclaimer.
*
* - Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
#include <net/tls.h>
#include <crypto/aead.h>
#include <crypto/scatterwalk.h>
#include <net/ip6_checksum.h>
static void chain_to_walk(struct scatterlist *sg, struct scatter_walk *walk)
{
struct scatterlist *src = walk->sg;
int diff = walk->offset - src->offset;
sg_set_page(sg, sg_page(src),
src->length - diff, walk->offset);
scatterwalk_crypto_chain(sg, sg_next(src), 0, 2);
}
static int tls_enc_record(struct aead_request *aead_req,
struct crypto_aead *aead, char *aad,
char *iv, __be64 rcd_sn,
struct scatter_walk *in,
struct scatter_walk *out, int *in_len)
{
unsigned char buf[TLS_HEADER_SIZE + TLS_CIPHER_AES_GCM_128_IV_SIZE];
struct scatterlist sg_in[3];
struct scatterlist sg_out[3];
u16 len;
int rc;
len = min_t(int, *in_len, ARRAY_SIZE(buf));
scatterwalk_copychunks(buf, in, len, 0);
scatterwalk_copychunks(buf, out, len, 1);
*in_len -= len;
if (!*in_len)
return 0;
scatterwalk_pagedone(in, 0, 1);
scatterwalk_pagedone(out, 1, 1);
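/* bytes 3-4 of the TLS header hold the 16-bit record length */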
len = buf[4] | (buf[3] << 8);
len -= TLS_CIPHER_AES_GCM_128_IV_SIZE;
tls_make_aad(aad, len - TLS_CIPHER_AES_GCM_128_TAG_SIZE,
(char *)&rcd_sn, sizeof(rcd_sn), buf[0]);
memcpy(iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, buf + TLS_HEADER_SIZE,
TLS_CIPHER_AES_GCM_128_IV_SIZE);
sg_init_table(sg_in, ARRAY_SIZE(sg_in));
sg_init_table(sg_out, ARRAY_SIZE(sg_out));
sg_set_buf(sg_in, aad, TLS_AAD_SPACE_SIZE);
sg_set_buf(sg_out, aad, TLS_AAD_SPACE_SIZE);
chain_to_walk(sg_in + 1, in);
chain_to_walk(sg_out + 1, out);
*in_len -= len;
if (*in_len < 0) {
*in_len += TLS_CIPHER_AES_GCM_128_TAG_SIZE;
/* the input buffer doesn't contain the entire record;
* trim len accordingly. The resulting authentication tag
* will contain garbage, but we don't care, so we won't
* include any of it in the output skb.
* Note that we assume the output buffer length
* is larger than the input buffer length + tag size.
*/
if (*in_len < 0)
len += *in_len;
*in_len = 0;
}
if (*in_len) {
scatterwalk_copychunks(NULL, in, len, 2);
scatterwalk_pagedone(in, 0, 1);
scatterwalk_copychunks(NULL, out, len, 2);
scatterwalk_pagedone(out, 1, 1);
}
len -= TLS_CIPHER_AES_GCM_128_TAG_SIZE;
aead_request_set_crypt(aead_req, sg_in, sg_out, len, iv);
rc = crypto_aead_encrypt(aead_req);
return rc;
}
static void tls_init_aead_request(struct aead_request *aead_req,
struct crypto_aead *aead)
{
aead_request_set_tfm(aead_req, aead);
aead_request_set_ad(aead_req, TLS_AAD_SPACE_SIZE);
}
static struct aead_request *tls_alloc_aead_request(struct crypto_aead *aead,
gfp_t flags)
{
unsigned int req_size = sizeof(struct aead_request) +
crypto_aead_reqsize(aead);
struct aead_request *aead_req;
aead_req = kzalloc(req_size, flags);
if (aead_req)
tls_init_aead_request(aead_req, aead);
return aead_req;
}
static int tls_enc_records(struct aead_request *aead_req,
struct crypto_aead *aead, struct scatterlist *sg_in,
struct scatterlist *sg_out, char *aad, char *iv,
u64 rcd_sn, int len)
{
struct scatter_walk out, in;
int rc;
scatterwalk_start(&in, sg_in);
scatterwalk_start(&out, sg_out);
do {
rc = tls_enc_record(aead_req, aead, aad, iv,
cpu_to_be64(rcd_sn), &in, &out, &len);
rcd_sn++;
} while (rc == 0 && len);
scatterwalk_done(&in, 0, 0);
scatterwalk_done(&out, 1, 0);
return rc;
}
/* Can't use icsk->icsk_af_ops->send_check here because the IP addresses
* might have been changed by NAT.
*/
static void update_chksum(struct sk_buff *skb, int headln)
{
struct tcphdr *th = tcp_hdr(skb);
int datalen = skb->len - headln;
const struct ipv6hdr *ipv6h;
const struct iphdr *iph;
/* We only changed the payload so if we are using partial we don't
* need to update anything.
*/
if (likely(skb->ip_summed == CHECKSUM_PARTIAL))
return;
skb->ip_summed = CHECKSUM_PARTIAL;
skb->csum_start = skb_transport_header(skb) - skb->head;
skb->csum_offset = offsetof(struct tcphdr, check);
if (skb->sk->sk_family == AF_INET6) {
ipv6h = ipv6_hdr(skb);
th->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
datalen, IPPROTO_TCP, 0);
} else {
iph = ip_hdr(skb);
th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, datalen,
IPPROTO_TCP, 0);
}
}
static void complete_skb(struct sk_buff *nskb, struct sk_buff *skb, int headln)
{
skb_copy_header(nskb, skb);
skb_put(nskb, skb->len);
memcpy(nskb->data, skb->data, headln);
update_chksum(nskb, headln);
nskb->destructor = skb->destructor;
nskb->sk = skb->sk;
skb->destructor = NULL;
skb->sk = NULL;
refcount_add(nskb->truesize - skb->truesize,
&nskb->sk->sk_wmem_alloc);
}
/* This function may be called after the user socket is already
* closed, so make sure we don't use anything freed during
* tls_sk_proto_close here.
*/
static int fill_sg_in(struct scatterlist *sg_in,
struct sk_buff *skb,
struct tls_offload_context *ctx,
u64 *rcd_sn,
s32 *sync_size,
int *resync_sgs)
{
int tcp_payload_offset = skb_transport_offset(skb) + tcp_hdrlen(skb);
int payload_len = skb->len - tcp_payload_offset;
u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);
struct tls_record_info *record;
unsigned long flags;
int remaining;
int i;
spin_lock_irqsave(&ctx->lock, flags);
record = tls_get_record(ctx, tcp_seq, rcd_sn);
if (!record) {
spin_unlock_irqrestore(&ctx->lock, flags);
WARN(1, "Record not found for seq %u\n", tcp_seq);
return -EINVAL;
}
*sync_size = tcp_seq - tls_record_start_seq(record);
if (*sync_size < 0) {
int is_start_marker = tls_record_is_start_marker(record);
spin_unlock_irqrestore(&ctx->lock, flags);
/* This should only occur if the relevant record was
* already acked. In that case it should be ok
* to drop the packet and avoid retransmission.
*
* There is a corner case where the packet contains
* both an acked and a non-acked record.
* We currently don't handle that case and rely
* on TCP to retransmit a packet that doesn't contain
* already acked payload.
*/
if (!is_start_marker)
*sync_size = 0;
return -EINVAL;
}
remaining = *sync_size;
for (i = 0; remaining > 0; i++) {
skb_frag_t *frag = &record->frags[i];
__skb_frag_ref(frag);
sg_set_page(sg_in + i, skb_frag_page(frag),
skb_frag_size(frag), frag->page_offset);
remaining -= skb_frag_size(frag);
if (remaining < 0)
sg_in[i].length += remaining;
}
*resync_sgs = i;
spin_unlock_irqrestore(&ctx->lock, flags);
if (skb_to_sgvec(skb, &sg_in[i], tcp_payload_offset, payload_len) < 0)
return -EINVAL;
return 0;
}
static void fill_sg_out(struct scatterlist sg_out[3], void *buf,
struct tls_context *tls_ctx,
struct sk_buff *nskb,
int tcp_payload_offset,
int payload_len,
int sync_size,
void *dummy_buf)
{
sg_set_buf(&sg_out[0], dummy_buf, sync_size);
sg_set_buf(&sg_out[1], nskb->data + tcp_payload_offset, payload_len);
/* Add room for authentication tag produced by crypto */
dummy_buf += sync_size;
sg_set_buf(&sg_out[2], dummy_buf, TLS_CIPHER_AES_GCM_128_TAG_SIZE);
}
static struct sk_buff *tls_enc_skb(struct tls_context *tls_ctx,
struct scatterlist sg_out[3],
struct scatterlist *sg_in,
struct sk_buff *skb,
s32 sync_size, u64 rcd_sn)
{
int tcp_payload_offset = skb_transport_offset(skb) + tcp_hdrlen(skb);
struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
int payload_len = skb->len - tcp_payload_offset;
void *buf, *iv, *aad, *dummy_buf;
struct aead_request *aead_req;
struct sk_buff *nskb = NULL;
int buf_len;
aead_req = tls_alloc_aead_request(ctx->aead_send, GFP_ATOMIC);
if (!aead_req)
return NULL;
buf_len = TLS_CIPHER_AES_GCM_128_SALT_SIZE +
TLS_CIPHER_AES_GCM_128_IV_SIZE +
TLS_AAD_SPACE_SIZE +
sync_size +
TLS_CIPHER_AES_GCM_128_TAG_SIZE;
buf = kmalloc(buf_len, GFP_ATOMIC);
if (!buf)
goto free_req;
iv = buf;
memcpy(iv, tls_ctx->crypto_send_aes_gcm_128.salt,
TLS_CIPHER_AES_GCM_128_SALT_SIZE);
aad = buf + TLS_CIPHER_AES_GCM_128_SALT_SIZE +
TLS_CIPHER_AES_GCM_128_IV_SIZE;
dummy_buf = aad + TLS_AAD_SPACE_SIZE;
nskb = alloc_skb(skb_headroom(skb) + skb->len, GFP_ATOMIC);
if (!nskb)
goto free_buf;
skb_reserve(nskb, skb_headroom(skb));
fill_sg_out(sg_out, buf, tls_ctx, nskb, tcp_payload_offset,
payload_len, sync_size, dummy_buf);
if (tls_enc_records(aead_req, ctx->aead_send, sg_in, sg_out, aad, iv,
rcd_sn, sync_size + payload_len) < 0)
goto free_nskb;
complete_skb(nskb, skb, tcp_payload_offset);
/* validate_xmit_skb_list assumes that if the skb wasn't segmented,
* nskb->prev will point to the skb itself
*/
nskb->prev = nskb;
free_buf:
kfree(buf);
free_req:
kfree(aead_req);
return nskb;
free_nskb:
kfree_skb(nskb);
nskb = NULL;
goto free_buf;
}
static struct sk_buff *tls_sw_fallback(struct sock *sk, struct sk_buff *skb)
{
int tcp_payload_offset = skb_transport_offset(skb) + tcp_hdrlen(skb);
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
int payload_len = skb->len - tcp_payload_offset;
struct scatterlist *sg_in, sg_out[3];
struct sk_buff *nskb = NULL;
int sg_in_max_elements;
int resync_sgs = 0;
s32 sync_size = 0;
u64 rcd_sn;
/* worst case is:
* MAX_SKB_FRAGS in tls_record_info
* MAX_SKB_FRAGS + 1 in SKB head and frags.
*/
sg_in_max_elements = 2 * MAX_SKB_FRAGS + 1;
if (!payload_len)
return skb;
sg_in = kmalloc_array(sg_in_max_elements, sizeof(*sg_in), GFP_ATOMIC);
if (!sg_in)
goto free_orig;
sg_init_table(sg_in, sg_in_max_elements);
sg_init_table(sg_out, ARRAY_SIZE(sg_out));
if (fill_sg_in(sg_in, skb, ctx, &rcd_sn, &sync_size, &resync_sgs)) {
/* bypass packets before kernel TLS socket option was set */
if (sync_size < 0 && payload_len <= -sync_size)
nskb = skb_get(skb);
goto put_sg;
}
nskb = tls_enc_skb(tls_ctx, sg_out, sg_in, skb, sync_size, rcd_sn);
put_sg:
while (resync_sgs)
put_page(sg_page(&sg_in[--resync_sgs]));
kfree(sg_in);
free_orig:
kfree_skb(skb);
return nskb;
}
struct sk_buff *tls_validate_xmit_skb(struct sock *sk,
struct net_device *dev,
struct sk_buff *skb)
{
if (dev == tls_get_ctx(sk)->netdev)
return skb;
return tls_sw_fallback(sk, skb);
}
int tls_sw_fallback_init(struct sock *sk,
struct tls_offload_context *offload_ctx,
struct tls_crypto_info *crypto_info)
{
const u8 *key;
int rc;
offload_ctx->aead_send =
crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
if (IS_ERR(offload_ctx->aead_send)) {
rc = PTR_ERR(offload_ctx->aead_send);
pr_err_ratelimited("crypto_alloc_aead failed rc=%d\n", rc);
offload_ctx->aead_send = NULL;
goto err_out;
}
key = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->key;
rc = crypto_aead_setkey(offload_ctx->aead_send, key,
TLS_CIPHER_AES_GCM_128_KEY_SIZE);
if (rc)
goto free_aead;
rc = crypto_aead_setauthsize(offload_ctx->aead_send,
TLS_CIPHER_AES_GCM_128_TAG_SIZE);
if (rc)
goto free_aead;
return 0;
free_aead:
crypto_free_aead(offload_ctx->aead_send);
err_out:
return rc;
}
......@@ -51,12 +51,12 @@ enum {
TLSV6,
TLS_NUM_PROTS,
};
enum {
TLS_BASE,
TLS_SW_TX,
TLS_SW_RX,
TLS_SW_RXTX,
TLS_SW,
#ifdef CONFIG_TLS_DEVICE
TLS_HW,
#endif
TLS_HW_RECORD,
TLS_NUM_CONFIG,
};
......@@ -65,14 +65,14 @@ static struct proto *saved_tcpv6_prot;
static DEFINE_MUTEX(tcpv6_prot_mutex);
static LIST_HEAD(device_list);
static DEFINE_MUTEX(device_mutex);
static struct proto tls_prots[TLS_NUM_PROTS][TLS_NUM_CONFIG];
static struct proto tls_prots[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CONFIG];
static struct proto_ops tls_sw_proto_ops;
static inline void update_sk_prot(struct sock *sk, struct tls_context *ctx)
static void update_sk_prot(struct sock *sk, struct tls_context *ctx)
{
int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
sk->sk_prot = &tls_prots[ip_ver][ctx->conf];
sk->sk_prot = &tls_prots[ip_ver][ctx->tx_conf][ctx->rx_conf];
}
int wait_on_pending_writer(struct sock *sk, long *timeo)
......@@ -245,10 +245,10 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
lock_sock(sk);
sk_proto_close = ctx->sk_proto_close;
if (ctx->conf == TLS_HW_RECORD)
if (ctx->tx_conf == TLS_HW_RECORD && ctx->rx_conf == TLS_HW_RECORD)
goto skip_tx_cleanup;
if (ctx->conf == TLS_BASE) {
if (ctx->tx_conf == TLS_BASE && ctx->rx_conf == TLS_BASE) {
kfree(ctx);
ctx = NULL;
goto skip_tx_cleanup;
......@@ -270,15 +270,26 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
}
}
kfree(ctx->tx.rec_seq);
kfree(ctx->tx.iv);
kfree(ctx->rx.rec_seq);
kfree(ctx->rx.iv);
/* We need these for tls_sw_fallback handling of other packets */
if (ctx->tx_conf == TLS_SW) {
kfree(ctx->tx.rec_seq);
kfree(ctx->tx.iv);
tls_sw_free_resources_tx(sk);
}
if (ctx->conf == TLS_SW_TX ||
ctx->conf == TLS_SW_RX ||
ctx->conf == TLS_SW_RXTX) {
tls_sw_free_resources(sk);
if (ctx->rx_conf == TLS_SW) {
kfree(ctx->rx.rec_seq);
kfree(ctx->rx.iv);
tls_sw_free_resources_rx(sk);
}
#ifdef CONFIG_TLS_DEVICE
if (ctx->tx_conf != TLS_HW) {
#else
{
#endif
kfree(ctx);
ctx = NULL;
}
skip_tx_cleanup:
......@@ -287,7 +298,8 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
/* free ctx for TLS_HW_RECORD, used by tcp_set_state
* for sk->sk_prot->unhash [tls_hw_unhash]
*/
if (ctx && ctx->conf == TLS_HW_RECORD)
if (ctx && ctx->tx_conf == TLS_HW_RECORD &&
ctx->rx_conf == TLS_HW_RECORD)
kfree(ctx);
}
......@@ -441,25 +453,29 @@ static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
goto err_crypto_info;
}
/* currently SW is default, we will have ethtool in future */
if (tx) {
rc = tls_set_sw_offload(sk, ctx, 1);
if (ctx->conf == TLS_SW_RX)
conf = TLS_SW_RXTX;
else
conf = TLS_SW_TX;
#ifdef CONFIG_TLS_DEVICE
rc = tls_set_device_offload(sk, ctx);
conf = TLS_HW;
if (rc) {
#else
{
#endif
rc = tls_set_sw_offload(sk, ctx, 1);
conf = TLS_SW;
}
} else {
rc = tls_set_sw_offload(sk, ctx, 0);
if (ctx->conf == TLS_SW_TX)
conf = TLS_SW_RXTX;
else
conf = TLS_SW_RX;
conf = TLS_SW;
}
if (rc)
goto err_crypto_info;
ctx->conf = conf;
if (tx)
ctx->tx_conf = conf;
else
ctx->rx_conf = conf;
update_sk_prot(sk, ctx);
if (tx) {
ctx->sk_write_space = sk->sk_write_space;
......@@ -535,7 +551,8 @@ static int tls_hw_prot(struct sock *sk)
ctx->hash = sk->sk_prot->hash;
ctx->unhash = sk->sk_prot->unhash;
ctx->sk_proto_close = sk->sk_prot->close;
ctx->conf = TLS_HW_RECORD;
ctx->rx_conf = TLS_HW_RECORD;
ctx->tx_conf = TLS_HW_RECORD;
update_sk_prot(sk, ctx);
rc = 1;
break;
......@@ -579,29 +596,40 @@ static int tls_hw_hash(struct sock *sk)
return err;
}
static void build_protos(struct proto *prot, struct proto *base)
static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
struct proto *base)
{
prot[TLS_BASE] = *base;
prot[TLS_BASE].setsockopt = tls_setsockopt;
prot[TLS_BASE].getsockopt = tls_getsockopt;
prot[TLS_BASE].close = tls_sk_proto_close;
prot[TLS_SW_TX] = prot[TLS_BASE];
prot[TLS_SW_TX].sendmsg = tls_sw_sendmsg;
prot[TLS_SW_TX].sendpage = tls_sw_sendpage;
prot[TLS_SW_RX] = prot[TLS_BASE];
prot[TLS_SW_RX].recvmsg = tls_sw_recvmsg;
prot[TLS_SW_RX].close = tls_sk_proto_close;
prot[TLS_SW_RXTX] = prot[TLS_SW_TX];
prot[TLS_SW_RXTX].recvmsg = tls_sw_recvmsg;
prot[TLS_SW_RXTX].close = tls_sk_proto_close;
prot[TLS_HW_RECORD] = *base;
prot[TLS_HW_RECORD].hash = tls_hw_hash;
prot[TLS_HW_RECORD].unhash = tls_hw_unhash;
prot[TLS_HW_RECORD].close = tls_sk_proto_close;
prot[TLS_BASE][TLS_BASE] = *base;
prot[TLS_BASE][TLS_BASE].setsockopt = tls_setsockopt;
prot[TLS_BASE][TLS_BASE].getsockopt = tls_getsockopt;
prot[TLS_BASE][TLS_BASE].close = tls_sk_proto_close;
prot[TLS_SW][TLS_BASE] = prot[TLS_BASE][TLS_BASE];
prot[TLS_SW][TLS_BASE].sendmsg = tls_sw_sendmsg;
prot[TLS_SW][TLS_BASE].sendpage = tls_sw_sendpage;
prot[TLS_BASE][TLS_SW] = prot[TLS_BASE][TLS_BASE];
prot[TLS_BASE][TLS_SW].recvmsg = tls_sw_recvmsg;
prot[TLS_BASE][TLS_SW].close = tls_sk_proto_close;
prot[TLS_SW][TLS_SW] = prot[TLS_SW][TLS_BASE];
prot[TLS_SW][TLS_SW].recvmsg = tls_sw_recvmsg;
prot[TLS_SW][TLS_SW].close = tls_sk_proto_close;
#ifdef CONFIG_TLS_DEVICE
prot[TLS_HW][TLS_BASE] = prot[TLS_BASE][TLS_BASE];
prot[TLS_HW][TLS_BASE].sendmsg = tls_device_sendmsg;
prot[TLS_HW][TLS_BASE].sendpage = tls_device_sendpage;
prot[TLS_HW][TLS_SW] = prot[TLS_BASE][TLS_SW];
prot[TLS_HW][TLS_SW].sendmsg = tls_device_sendmsg;
prot[TLS_HW][TLS_SW].sendpage = tls_device_sendpage;
#endif
prot[TLS_HW_RECORD][TLS_HW_RECORD] = *base;
prot[TLS_HW_RECORD][TLS_HW_RECORD].hash = tls_hw_hash;
prot[TLS_HW_RECORD][TLS_HW_RECORD].unhash = tls_hw_unhash;
prot[TLS_HW_RECORD][TLS_HW_RECORD].close = tls_sk_proto_close;
}
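Since the proto table is now two-dimensional, mixed TX/RX configurations compose naturally; for example, HW-offloaded TX with SW RX resolves as:
/* e.g. after a TLS_TX setsockopt taking the device path and a
 * TLS_RX setsockopt taking the SW path:
 */
ctx->tx_conf = TLS_HW;
ctx->rx_conf = TLS_SW;
update_sk_prot(sk, ctx);
/* sk->sk_prot == &tls_prots[ip_ver][TLS_HW][TLS_SW]:
 * sendmsg/sendpage from tls_device, recvmsg from tls_sw
 */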
static int tls_init(struct sock *sk)
......@@ -632,7 +660,7 @@ static int tls_init(struct sock *sk)
ctx->getsockopt = sk->sk_prot->getsockopt;
ctx->sk_proto_close = sk->sk_prot->close;
/* Build IPv6 TLS whenever the address of tcpv6_prot changes */
if (ip_ver == TLSV6 &&
unlikely(sk->sk_prot != smp_load_acquire(&saved_tcpv6_prot))) {
mutex_lock(&tcpv6_prot_mutex);
......@@ -643,7 +671,8 @@ static int tls_init(struct sock *sk)
mutex_unlock(&tcpv6_prot_mutex);
}
ctx->conf = TLS_BASE;
ctx->tx_conf = TLS_BASE;
ctx->rx_conf = TLS_BASE;
update_sk_prot(sk, ctx);
out:
return rc;
......@@ -681,6 +710,9 @@ static int __init tls_register(void)
tls_sw_proto_ops.poll = tls_sw_poll;
tls_sw_proto_ops.splice_read = tls_sw_splice_read;
#ifdef CONFIG_TLS_DEVICE
tls_device_init();
#endif
tcp_register_ulp(&tcp_tls_ulp_ops);
return 0;
......@@ -689,6 +721,9 @@ static int __init tls_register(void)
static void __exit tls_unregister(void)
{
tcp_unregister_ulp(&tcp_tls_ulp_ops);
#ifdef CONFIG_TLS_DEVICE
tls_device_cleanup();
#endif
}
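/* Note the teardown order above: the ULP is unregistered before the
 * device-offload state is cleaned up, so no new offloaded sockets can
 * appear mid-cleanup. With CONFIG_TLS_DEVICE off, the calls compile out
 * via the #ifdef; a hypothetical header-side alternative (not what the
 * patch does) would be empty inline stubs:
 */
#ifndef CONFIG_TLS_DEVICE
static inline void tls_device_init(void) {}
static inline void tls_device_cleanup(void) {}
#endif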
module_init(tls_register);
net/tls/tls_sw.c
@@ -52,7 +52,7 @@ static int tls_do_decryption(struct sock *sk,
gfp_t flags)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
struct strp_msg *rxm = strp_msg(skb);
struct aead_request *aead_req;
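/* The mechanical rename from tls_sw_ctx() to tls_sw_ctx_tx()/_rx()
 * throughout this file rests on two small accessors over the split
 * private pointers. Sketch of their assumed shape (they belong in
 * include/net/tls.h, which is not part of these hunks):
 */
static inline struct tls_sw_context_tx *tls_sw_ctx_tx(
		const struct tls_context *tls_ctx)
{
	return (struct tls_sw_context_tx *)tls_ctx->priv_ctx_tx;
}

static inline struct tls_sw_context_rx *tls_sw_ctx_rx(
		const struct tls_context *tls_ctx)
{
	return (struct tls_sw_context_rx *)tls_ctx->priv_ctx_rx;
}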
@@ -122,7 +122,7 @@ static void trim_sg(struct sock *sk, struct scatterlist *sg,
static void trim_both_sgl(struct sock *sk, int target_size)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
trim_sg(sk, ctx->sg_plaintext_data,
&ctx->sg_plaintext_num_elem,
@@ -141,7 +141,7 @@ static void trim_both_sgl(struct sock *sk, int target_size)
static int alloc_encrypted_sg(struct sock *sk, int len)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
int rc = 0;
rc = sk_alloc_sg(sk, len,
@@ -155,7 +155,7 @@ static int alloc_encrypted_sg(struct sock *sk, int len)
static int alloc_plaintext_sg(struct sock *sk, int len)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
int rc = 0;
rc = sk_alloc_sg(sk, len, ctx->sg_plaintext_data, 0,
@@ -181,7 +181,7 @@ static void free_sg(struct sock *sk, struct scatterlist *sg,
static void tls_free_both_sg(struct sock *sk)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
free_sg(sk, ctx->sg_encrypted_data, &ctx->sg_encrypted_num_elem,
&ctx->sg_encrypted_size);
@@ -191,7 +191,7 @@ static void tls_free_both_sg(struct sock *sk)
}
static int tls_do_encryption(struct tls_context *tls_ctx,
struct tls_sw_context *ctx, size_t data_len,
struct tls_sw_context_tx *ctx, size_t data_len,
gfp_t flags)
{
unsigned int req_size = sizeof(struct aead_request) +
@@ -227,7 +227,7 @@ static int tls_push_record(struct sock *sk, int flags,
unsigned char record_type)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
int rc;
sg_mark_end(ctx->sg_plaintext_data + ctx->sg_plaintext_num_elem - 1);
@@ -339,7 +339,7 @@ static int memcopy_from_iter(struct sock *sk, struct iov_iter *from,
int bytes)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
struct scatterlist *sg = ctx->sg_plaintext_data;
int copy, i, rc = 0;
@@ -367,7 +367,7 @@ static int memcopy_from_iter(struct sock *sk, struct iov_iter *from,
int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
int ret = 0;
int required_size;
long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
@@ -522,7 +522,7 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
int ret = 0;
long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
bool eor;
@@ -636,7 +636,7 @@ static struct sk_buff *tls_wait_data(struct sock *sk, int flags,
long timeo, int *err)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
struct sk_buff *skb;
DEFINE_WAIT_FUNC(wait, woken_wake_function);
@@ -674,7 +674,7 @@ static int decrypt_skb(struct sock *sk, struct sk_buff *skb,
struct scatterlist *sgout)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
char iv[TLS_CIPHER_AES_GCM_128_SALT_SIZE + MAX_IV_SIZE];
struct scatterlist sgin_arr[MAX_SKB_FRAGS + 2];
struct scatterlist *sgin = &sgin_arr[0];
@@ -723,7 +723,7 @@ static bool tls_sw_advance_skb(struct sock *sk, struct sk_buff *skb,
unsigned int len)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
struct strp_msg *rxm = strp_msg(skb);
if (len < rxm->full_len) {
@@ -749,7 +749,7 @@ int tls_sw_recvmsg(struct sock *sk,
int *addr_len)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
unsigned char control;
struct strp_msg *rxm;
struct sk_buff *skb;
@@ -869,7 +869,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
size_t len, unsigned int flags)
{
struct tls_context *tls_ctx = tls_get_ctx(sock->sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
struct strp_msg *rxm = NULL;
struct sock *sk = sock->sk;
struct sk_buff *skb;
@@ -922,7 +922,7 @@ unsigned int tls_sw_poll(struct file *file, struct socket *sock,
unsigned int ret;
struct sock *sk = sock->sk;
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
/* Grab POLLOUT and POLLHUP from the underlying socket */
ret = ctx->sk_poll(file, sock, wait);
@@ -938,7 +938,7 @@ unsigned int tls_sw_poll(struct file *file, struct socket *sock,
static int tls_read_size(struct strparser *strp, struct sk_buff *skb)
{
struct tls_context *tls_ctx = tls_get_ctx(strp->sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
char header[tls_ctx->rx.prepend_size];
struct strp_msg *rxm = strp_msg(skb);
size_t cipher_overhead;
@@ -987,7 +987,7 @@ static int tls_read_size(struct strparser *strp, struct sk_buff *skb)
static void tls_queue(struct strparser *strp, struct sk_buff *skb)
{
struct tls_context *tls_ctx = tls_get_ctx(strp->sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
struct strp_msg *rxm;
rxm = strp_msg(skb);
@@ -1003,18 +1003,28 @@ static void tls_queue(struct strparser *strp, struct sk_buff *skb)
static void tls_data_ready(struct sock *sk)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
strp_data_ready(&ctx->strp);
}
void tls_sw_free_resources(struct sock *sk)
void tls_sw_free_resources_tx(struct sock *sk)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
if (ctx->aead_send)
crypto_free_aead(ctx->aead_send);
tls_free_both_sg(sk);
kfree(ctx);
}
void tls_sw_free_resources_rx(struct sock *sk)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
if (ctx->aead_recv) {
if (ctx->recv_pkt) {
kfree_skb(ctx->recv_pkt);
@@ -1030,10 +1040,7 @@ void tls_sw_free_resources(struct sock *sk)
lock_sock(sk);
}
tls_free_both_sg(sk);
kfree(ctx);
kfree(tls_ctx);
}
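/* Resource freeing is now split per direction as well. Hypothetical
 * helper illustrating how the close path is expected to dispatch,
 * assuming tx_conf/rx_conf were set at offload time; the real
 * tls_sk_proto_close() in tls_main.c also frees the per-direction
 * iv/rec_seq and handles TLS_HW teardown.
 */
static void tls_close_sw_part(struct sock *sk, struct tls_context *ctx)
{
	if (ctx->tx_conf == TLS_SW)
		tls_sw_free_resources_tx(sk);
	if (ctx->rx_conf == TLS_SW)
		tls_sw_free_resources_rx(sk);
}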
int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
@@ -1041,7 +1048,8 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
char keyval[TLS_CIPHER_AES_GCM_128_KEY_SIZE];
struct tls_crypto_info *crypto_info;
struct tls12_crypto_info_aes_gcm_128 *gcm_128_info;
struct tls_sw_context *sw_ctx;
struct tls_sw_context_tx *sw_ctx_tx = NULL;
struct tls_sw_context_rx *sw_ctx_rx = NULL;
struct cipher_context *cctx;
struct crypto_aead **aead;
struct strp_callbacks cb;
@@ -1054,27 +1062,32 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
goto out;
}
if (!ctx->priv_ctx) {
sw_ctx = kzalloc(sizeof(*sw_ctx), GFP_KERNEL);
if (!sw_ctx) {
if (tx) {
sw_ctx_tx = kzalloc(sizeof(*sw_ctx_tx), GFP_KERNEL);
if (!sw_ctx_tx) {
rc = -ENOMEM;
goto out;
}
crypto_init_wait(&sw_ctx->async_wait);
crypto_init_wait(&sw_ctx_tx->async_wait);
ctx->priv_ctx_tx = sw_ctx_tx;
} else {
sw_ctx = ctx->priv_ctx;
sw_ctx_rx = kzalloc(sizeof(*sw_ctx_rx), GFP_KERNEL);
if (!sw_ctx_rx) {
rc = -ENOMEM;
goto out;
}
crypto_init_wait(&sw_ctx_rx->async_wait);
ctx->priv_ctx_rx = sw_ctx_rx;
}
ctx->priv_ctx = (struct tls_offload_context *)sw_ctx;
if (tx) {
crypto_info = &ctx->crypto_send;
cctx = &ctx->tx;
aead = &sw_ctx->aead_send;
aead = &sw_ctx_tx->aead_send;
} else {
crypto_info = &ctx->crypto_recv;
cctx = &ctx->rx;
aead = &sw_ctx->aead_recv;
aead = &sw_ctx_rx->aead_recv;
}
switch (crypto_info->cipher_type) {
@@ -1121,22 +1134,24 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
}
memcpy(cctx->rec_seq, rec_seq, rec_seq_size);
if (tx) {
sg_init_table(sw_ctx->sg_encrypted_data,
ARRAY_SIZE(sw_ctx->sg_encrypted_data));
sg_init_table(sw_ctx->sg_plaintext_data,
ARRAY_SIZE(sw_ctx->sg_plaintext_data));
sg_init_table(sw_ctx->sg_aead_in, 2);
sg_set_buf(&sw_ctx->sg_aead_in[0], sw_ctx->aad_space,
sizeof(sw_ctx->aad_space));
sg_unmark_end(&sw_ctx->sg_aead_in[1]);
sg_chain(sw_ctx->sg_aead_in, 2, sw_ctx->sg_plaintext_data);
sg_init_table(sw_ctx->sg_aead_out, 2);
sg_set_buf(&sw_ctx->sg_aead_out[0], sw_ctx->aad_space,
sizeof(sw_ctx->aad_space));
sg_unmark_end(&sw_ctx->sg_aead_out[1]);
sg_chain(sw_ctx->sg_aead_out, 2, sw_ctx->sg_encrypted_data);
if (sw_ctx_tx) {
sg_init_table(sw_ctx_tx->sg_encrypted_data,
ARRAY_SIZE(sw_ctx_tx->sg_encrypted_data));
sg_init_table(sw_ctx_tx->sg_plaintext_data,
ARRAY_SIZE(sw_ctx_tx->sg_plaintext_data));
sg_init_table(sw_ctx_tx->sg_aead_in, 2);
sg_set_buf(&sw_ctx_tx->sg_aead_in[0], sw_ctx_tx->aad_space,
sizeof(sw_ctx_tx->aad_space));
sg_unmark_end(&sw_ctx_tx->sg_aead_in[1]);
sg_chain(sw_ctx_tx->sg_aead_in, 2,
sw_ctx_tx->sg_plaintext_data);
sg_init_table(sw_ctx_tx->sg_aead_out, 2);
sg_set_buf(&sw_ctx_tx->sg_aead_out[0], sw_ctx_tx->aad_space,
sizeof(sw_ctx_tx->aad_space));
sg_unmark_end(&sw_ctx_tx->sg_aead_out[1]);
sg_chain(sw_ctx_tx->sg_aead_out, 2,
sw_ctx_tx->sg_encrypted_data);
}
if (!*aead) {
@@ -1161,22 +1176,22 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
if (rc)
goto free_aead;
if (!tx) {
if (sw_ctx_rx) {
/* Set up strparser */
memset(&cb, 0, sizeof(cb));
cb.rcv_msg = tls_queue;
cb.parse_msg = tls_read_size;
strp_init(&sw_ctx->strp, sk, &cb);
strp_init(&sw_ctx_rx->strp, sk, &cb);
write_lock_bh(&sk->sk_callback_lock);
sw_ctx->saved_data_ready = sk->sk_data_ready;
sw_ctx_rx->saved_data_ready = sk->sk_data_ready;
sk->sk_data_ready = tls_data_ready;
write_unlock_bh(&sk->sk_callback_lock);
sw_ctx->sk_poll = sk->sk_socket->ops->poll;
sw_ctx_rx->sk_poll = sk->sk_socket->ops->poll;
strp_check_rcv(&sw_ctx->strp);
strp_check_rcv(&sw_ctx_rx->strp);
}
goto out;
@@ -1188,11 +1203,16 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
kfree(cctx->rec_seq);
cctx->rec_seq = NULL;
free_iv:
kfree(ctx->tx.iv);
ctx->tx.iv = NULL;
kfree(cctx->iv);
cctx->iv = NULL;
free_priv:
kfree(ctx->priv_ctx);
ctx->priv_ctx = NULL;
if (tx) {
kfree(ctx->priv_ctx_tx);
ctx->priv_ctx_tx = NULL;
} else {
kfree(ctx->priv_ctx_rx);
ctx->priv_ctx_rx = NULL;
}
out:
return rc;
}
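/* User-space view, for orientation: tls_set_sw_offload() is reached via
 * setsockopt() on a TCP socket that has attached the "tls" ULP. Minimal
 * sketch; error handling is trimmed and the key/IV material must come
 * from a real TLS handshake rather than the zeroed placeholders here.
 */
#include <linux/tls.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_TLS
#define SOL_TLS 282	/* matches include/linux/socket.h */
#endif

static int enable_sw_ktls(int fd, int dir /* TLS_TX or TLS_RX */)
{
	struct tls12_crypto_info_aes_gcm_128 ci;

	if (setsockopt(fd, SOL_TCP, TCP_ULP, "tls", sizeof("tls")))
		return -1;

	memset(&ci, 0, sizeof(ci));
	ci.info.version = TLS_1_2_VERSION;
	ci.info.cipher_type = TLS_CIPHER_AES_GCM_128;
	/* ci.iv, ci.key, ci.salt, ci.rec_seq: fill from the handshake */

	return setsockopt(fd, SOL_TLS, dir, &ci, sizeof(ci));
}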