Commits · ba323f6bee1d1e70aed280f8c89ac06959559855 · Kirill Smelkov / linux

29 Jul, 2022 40 commits

dt-bindings: nfc: use spi-peripheral-props.yaml · ba323f6b

Krzysztof Kozlowski authored Jul 27, 2022

Instead of listing directly properties typical for SPI peripherals,
reference the spi-peripheral-props.yaml schema. This allows using all
properties typical for SPI-connected devices, even these which device
bindings author did not tried yet.

Remove the spi-* properties which now come via spi-peripheral-props.yaml
schema, except for the cases when device schema adds some constraints
like maximum frequency.

While changing additionalProperties->unevaluatedProperties, put it in
typical place, just before example DTS.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220727164130.385411-1-krzysztof.kozlowski@linaro.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

ba323f6b

Merge branch 'net-dsa-qca8k-code-split-for-qca8k' · 92b54e09

Jakub Kicinski authored Jul 28, 2022

Christian Marangi says:

====================
net: dsa: qca8k: code split for qca8k

This is needed ad ipq4019 SoC have an internal switch that is
based on qca8k with very minor changes. The general function is equal.

Because of this we split the driver to common and specific code.

As the common function needs to be moved to a different file to be
reused, we had to convert every remaining user of qca8k_read/write/rmw
to regmap variant.
We had also to generilized the special handling for the ethtool_stats
function that makes use of the autocast mib. (ipq4019 will have a
different tagger and use mmio so it could be quicker to use mmio instead
of automib feature)
And we had to convert the regmap read/write to bulk implementation to
drop the special function that makes use of it. This will be compatible
with ipq4019 and at the same time permits normal switch to use the eth
mgmt way to send the entire ATU table read/write in one go.
====================

Link: https://lore.kernel.org/r/20220727113523.19742-1-ansuelsmth@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

92b54e09

net: dsa: qca8k: move read_switch_id function to common code · 9d1bcb1f

Christian Marangi authored Jul 27, 2022

The same function to read the switch id is used by drivers based on
qca8k family switch. Move them to common code to make them accessible
also by other drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

9d1bcb1f

net: dsa: qca8k: move port LAG functions to common code · e9bbf019

Christian Marangi authored Jul 27, 2022

The same port LAG functions are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

e9bbf019

net: dsa: qca8k: move port VLAN functions to common code · c5290f63

Christian Marangi authored Jul 27, 2022

The same port VLAN functions are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
Also drop exposing busy_wait and make it static.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

c5290f63

net: dsa: qca8k: move port mirror functions to common code · 742d37a8

Christian Marangi authored Jul 27, 2022

The same port mirror functions are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

742d37a8

net: dsa: qca8k: move port FDB/MDB function to common code · 2e5bd96e

Christian Marangi authored Jul 27, 2022

The same port FDB/MDB function are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
Also drop bulk read/write functions and make them static
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2e5bd96e

net: dsa: qca8k: move set age/MTU/port enable/disable functions to common code · b3a302b1

Christian Marangi authored Jul 27, 2022

The same set age, MTU and port enable/disable function are used by
driver based on qca8k family switch.
Move them to common code to make them accessible also by other drivers.
While at it also drop unnecessary qca8k_priv cast for void pointers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b3a302b1

net: dsa: qca8k: move bridge functions to common code · fd3cae2f

Christian Marangi authored Jul 27, 2022

The same bridge functions are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
While at it also drop unnecessary qca8k_priv cast for void pointers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

fd3cae2f

net: dsa: qca8k: move port set status/eee/ethtool stats function to common code · 472fcea1

Christian Marangi authored Jul 27, 2022

The same logic to disable/enable port, set eee and get ethtool stats is
used by drivers based on qca8k family switch.
Move it to common code to make it accessible also by other drivers.
While at it also drop unnecessary qca8k_priv cast for void pointers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

472fcea1

net: dsa: qca8k: move mib init function to common code · fce1ec0c

Christian Marangi authored Jul 27, 2022

The same mib function is used by drivers based on qca8k family switch.
Move it to common code to make it accessible also by other drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

fce1ec0c

net: dsa: qca8k: move qca8k bulk read/write helper to common code · 91074644

Christian Marangi authored Jul 27, 2022

The same ATU function are used by drivers based on qca8k family switch.
Move the bulk read/write helper to common code to declare these shared
ATU functions in common code.
These helper will be dropped when regmap correctly support bulk
read/write.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

91074644

net: dsa: qca8k: move qca8k read/write/rmw and reg table to common code · d5f901ea

Christian Marangi authored Jul 27, 2022

The same reg table and read/write/rmw function are used by drivers
based on qca8k family switch.
Move them to common code to make it accessible also by other drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

d5f901ea

net: dsa: qca8k: move mib struct to common code · 027152b8

Christian Marangi authored Jul 27, 2022

The same MIB struct is used by drivers based on qca8k family switch. Move
it to common code to make it accessible also by other drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

027152b8

net: dsa: qca8k: make mib autocast feature optional · 533c64bc

Christian Marangi authored Jul 27, 2022

Some switch may not support mib autocast feature and require the legacy
way of reading the regs directly.
Make the mib autocast feature optional and permit to declare support for
it using match_data struct in a dedicated qca8k_info_ops struct.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

533c64bc

net: dsa: qca8k: cache match data to speed up access · 3bb0844e

Christian Marangi authored Jul 27, 2022

Using of_device_get_match_data is expensive. Cache match data to speed
up access and rework user of match data to use the new cached value.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

3bb0844e

firewire: net: Make use of get_unaligned_be48(), put_unaligned_be48() · 29192a17

Andy Shevchenko authored Jul 26, 2022

Since we have a proper endianness converters for BE 48-bit data use
them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220726144906.5217-1-andriy.shevchenko@linux.intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

29192a17

amt: fix typo in comment · 39befe3a

Ruffalo Lavoisier authored Jul 28, 2022

Correct spelling on 'non-existent' in comment
Signed-off-by: Ruffalo Lavoisier <RuffaloLavoisier@gmail.com>
Link: https://lore.kernel.org/r/20220728032854.151180-1-RuffaloLavoisier@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

39befe3a

mlxsw: core_linecards: Remove duplicated include in core_linecard_dev.c · 707e304d

Yang Li authored Jul 28, 2022

Fix following includecheck warning:
./drivers/net/ethernet/mellanox/mlxsw/core_linecard_dev.c: linux/err.h is included more than once.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220727233801.23781-1-yang.lee@linux.alibaba.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

707e304d

selftests: net: dsa: Add a Makefile which installs the selftests · 6ecf206d

Martin Blumenstingl authored Jul 27, 2022

Add a Makefile which takes care of installing the selftests in
tools/testing/selftests/drivers/net/dsa. This can be used to install all
DSA specific selftests and forwarding.config using the same approach as
for the selftests in tools/testing/selftests/net/forwarding.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220727191642.480279-1-martin.blumenstingl@googlemail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

6ecf206d

Merge branch 'take-devlink-lock-on-mlx4-and-mlx5-callbacks' · 13719a5b

Jakub Kicinski authored Jul 28, 2022

Moshe Shemesh says:

====================
Take devlink lock on mlx4 and mlx5 callbacks

Prepare mlx4 and mlx5 drivers to have all devlink callbacks called with
devlink instance locked. Change mlx4 driver to use devl_ API where
needed to have devlink reload callbacks locked. Change mlx5 driver to
use devl_ API where needed to have devlink reload and devlink health
callbacks locked.

As mlx5 is the only driver which needed changes to enable calling health
callbacks with devlink instance locked, this patchset also removes
DEVLINK_NL_FLAG_NO_LOCK flag from devlink health callbacks.

This patchset will be followed by a patchset that will remove
DEVLINK_NL_FLAG_NO_LOCK flag from devlink and will remove devlink_mutex.
====================

Link: https://lore.kernel.org/r/1659023630-32006-1-git-send-email-moshe@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

13719a5b

devlink: Hold the instance lock in health callbacks · c90005b5

Moshe Shemesh authored Jul 28, 2022

Let the core take the devlink instance lock around health callbacks and
remove the now redundant locking in the drivers.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

c90005b5

net/mlx5: Lock mlx5 devlink health recovery callback · d3dbdc9f

Moshe Shemesh authored Jul 28, 2022

Change devlink instance locks in mlx5 driver to have devlink health
recovery callback locked, while keeping all driver paths which lead to
devl_ API functions called by the driver locked.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

d3dbdc9f

net/mlx4: Lock mlx4 devlink reload callback · 60d7ceea

Moshe Shemesh authored Jul 28, 2022

Change devlink instance locks in mlx4 driver to have devlink reload
callback locked, while keeping all driver paths which leads to devl_ API
functions called by the mlx4 driver locked.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

60d7ceea

net/mlx4: Use devl_ API for devlink port register / unregister · a8c05514

Moshe Shemesh authored Jul 28, 2022

Use devl_ API to call devl_port_register() and devl_port_unregister()
instead of devlink_port_register() and devlink_port_unregister(). Add
devlink instance lock in mlx4 driver paths to these functions.

This will be used by the downstream patch to invoke mlx4 devlink reload
callbacks with devlink lock held.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

a8c05514

net/mlx4: Use devl_ API for devlink region create / destroy · 9cb7e94a

Moshe Shemesh authored Jul 28, 2022

Use devl_ API to call devl_region_create() and devl_region_destroy()
instead of devlink_region_create() and devlink_region_destroy().
Add devlink instance lock in mlx4 driver paths to these functions.

This will be used by the downstream patch to invoke mlx4 devlink reload
callbacks with devlink lock held.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

9cb7e94a

net/mlx5: Lock mlx5 devlink reload callbacks · 84a433a4

Moshe Shemesh authored Jul 28, 2022

Change devlink instance locks in mlx5 driver to have devlink reload
callbacks locked, while keeping all driver paths which lead to devl_ API
functions called by the driver locked.

Add mlx5_load_one_devl_locked() and mlx5_unload_one_devl_locked() which
are used by the paths which are already locked such as devlink reload
callbacks.

This patch makes the driver use devl_ API also for traps register as
these functions are called from the driver paths parallel to reload that
requires locking now.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

84a433a4

net/mlx5: Move fw reset unload to mlx5_fw_reset_complete_reload · c12f4c6a

Moshe Shemesh authored Jul 28, 2022

Refactor fw reset code to have the unload driver part done on
mlx5_fw_reset_complete_reload(), so if it was called by the PF which
initiated the reload fw activate flow, the unload part will be handled
by the mlx5_devlink_reload_fw_activate() callback itself and not by the
reset event work.

This will be used by the downstream patch to invoke devlink reload
callbacks with devlink lock held.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

c12f4c6a

net: devlink: remove region snapshots list dependency on devlink->lock · 2dec18ad

Jiri Pirko authored Jul 28, 2022

After mlx4 driver is converted to do locked reload,
devlink_region_snapshot_create() may be called from both locked and
unlocked context.

Note that in mlx4 region snapshots could be created on any command
failure. That can happen in any flow that involves commands to FW,
which means most of the driver flows.

So resolve this by removing dependency on devlink->lock for region
snapshots list consistency and introduce new mutex to ensure it.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2dec18ad

net: devlink: remove region snapshot ID tracking dependency on devlink->lock · 5502e871

Jiri Pirko authored Jul 28, 2022

After mlx4 driver is converted to do locked reload, functions to get/put
regions snapshot ID may be called from both locked and unlocked context.

So resolve this by removing dependency on devlink->lock for region
snapshot ID tracking by using internal xa_lock() to maintain
shapshot_ids xa_array consistency.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5502e871

Merge branch 'add-framework-for-selftests-in-devlink' · 1515a1b8

Jakub Kicinski authored Jul 28, 2022

Vikas Gupta says:

====================
add framework for selftests in devlink

Add support for selftests in the devlink framework.
Adds a callback .selftests_check and .selftests_run in devlink_ops.
User can add test(s) suite which is subsequently passed to the driver
and driver can opt for running particular tests based on its capabilities.

Patchset adds a flash based test for the bnxt_en driver.
====================

Link: https://lore.kernel.org/r/20220727165721.37959-1-vikas.gupta@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

1515a1b8

bnxt_en: implement callbacks for devlink selftests · 5b6ff128

vikas authored Jul 27, 2022

Add callbacks
=============
.selftest_check: returns true for flash selftest.
.selftest_run: runs a flash selftest.

Also, refactor NVM APIs so that they can be
used with devlink and ethtool both.
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5b6ff128

devlink: introduce framework for selftests · 08f588fa

Vikas Gupta authored Jul 27, 2022

Add a framework for running selftests.
Framework exposes devlink commands and test suite(s) to the user
to execute and query the supported tests by the driver.

Below are new entries in devlink_nl_ops
devlink_nl_cmd_selftests_show_doit/dumpit: To query the supported
selftests by the drivers.
devlink_nl_cmd_selftests_run: To execute selftests. Users can
provide a test mask for executing group tests or standalone tests.

Documentation/networking/devlink/ path is already part of MAINTAINERS &
the new files come under this path. Hence no update needed to the
MAINTAINERS
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

08f588fa

Merge branch 'mlx5e-use-tls-tx-pool-to-improve-connection-rate' · 68be7b82

Jakub Kicinski authored Jul 28, 2022

Tariq Toukan says:

====================
mlx5e use TLS TX pool to improve connection rate

To offload encryption operations, the mlx5 device maintains state and
keeps track of every kTLS device-offloaded connection.  Two HW objects
are used per TX context of a kTLS offloaded connection: a. Transport
interface send (TIS) object, to reach the HW context.  b. Data Encryption
Key (DEK) to perform the crypto operations.

These two objects are created and destroyed per TLS TX context, via FW
commands.  In total, 4 FW commands are issued per TLS TX context, which
seriously limits the connection rate.

In this series, we aim to save creation and destroy of TIS objects by
recycling them.  Upon recycling of a TIS, the HW still needs to be
notified for the re-mapping between a TIS and a context. This is done by
posting WQEs via an SQ, significantly faster API than the FW command
interface.

A pool is used for recycling. The pool dynamically interacts to the load
and connection rate, growing and shrinking accordingly.

Saving the TIS FW commands per context increases connection rate by ~42%,
from 11.6K to 16.5K connections per sec.

Connection rate is still limited by FW bottleneck due to the remaining
per context FW commands (DEK create/destroy). This will soon be addressed
in a followup series.  By combining the two series, the FW bottleneck
will be released, and a significantly higher (about 100K connections per
sec) kTLS TX device-offloaded connection rate is reached.
====================

Link: https://lore.kernel.org/r/20220727094346.10540-1-tariqt@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

68be7b82

net/mlx5e: kTLS, Dynamically re-size TX recycling pool · 624bf099

Tariq Toukan authored Jul 27, 2022

Let the TLS TX recycle pool be more flexible in size, by continuously
and dynamically allocating and releasing HW resources in response to
changes in the connections rate and load.

Allocate and release pool entries in bulks (16). Use a workqueue to
release/allocate in the background. Allocate a new bulk when the pool
size goes lower than the low threshold (1K). Symmetric operation is done
when the pool size gets greater than the upper threshold (4K).

Every idle pool entry holds: 1 TIS, 1 DEK (HW resources), in addition to
~100 bytes in host memory.

Start with an empty pool to minimize memory and HW resources waste for
non-TLS users that have the device-offload TLS enabled.

Upon a new request, in case the pool is empty, do not wait for a whole bulk
allocation to complete.  Instead, trigger an instant allocation of a single
resource to reduce latency.

Performance tests:
Before: 11,684 CPS
After:  16,556 CPS
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

624bf099

net/mlx5e: kTLS, Recycle objects of device-offloaded TLS TX connections · c4dfe704

Tariq Toukan authored Jul 27, 2022

The transport interface send (TIS) object is responsible for performing
all transport related operations of the transmit side.  The ConnectX HW
uses a TIS object to save and access the TLS crypto information and state
of an offloaded TX kTLS connection.

Before this patch, we used to create a new TIS per connection and destroy
it once it’s closed. Every create and destroy of a TIS is a FW command.

Same applies for the private TLS context, where we used to dynamically
allocate and free it per connection.

Resources recycling reduce the impact of the allocation/free operations
and helps speeding up the connection rate.

In this feature we maintain a pool of TX objects and use it to recycle
the resources instead of re-creating them per connection.

A cached TIS popped from the pool is updated to serve the new connection
via the fast-path HW interface, updating the tls static and progress
params. This is a very fast operation, significantly faster than FW
commands.

On recycling, a WQE fence is required after the context params change.
This guarantees that the data is sent after the context has been
successfully updated in hardware, and that the context modification
doesn't interfere with existing traffic.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

c4dfe704

net/mlx5e: kTLS, Take stats out of OOO handler · 23b1cf1e

Tariq Toukan authored Jul 27, 2022

Let the caller of mlx5e_ktls_tx_handle_ooo() take care of updating the
stats, according to the returned value.  As the switch/case blocks are
already there, this change saves unnecessary branches in the handler.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

23b1cf1e

net/mlx5e: kTLS, Introduce TLS-specific create TIS · da6682fa

Tariq Toukan authored Jul 27, 2022

TLS TIS objects have a defined role in mapping and reaching the HW TLS
contexts.  Some standard TIS attributes (like LAG port affinity) are
not relevant for them.

Use a dedicated TLS TIS create function instead of the generic
mlx5e_create_tis.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

da6682fa

net/tls: Multi-threaded calls to TX tls_dev_del · 7adc91e0

Tariq Toukan authored Jul 27, 2022

Multiple TLS device-offloaded contexts can be added in parallel via
concurrent calls to .tls_dev_add, while calls to .tls_dev_del are
sequential in tls_device_gc_task.

This is not a sustainable behavior. This creates a rate gap between add
and del operations (addition rate outperforms the deletion rate).  When
running for enough time, the TLS device resources could get exhausted,
failing to offload new connections.

Replace the single-threaded garbage collector work with a per-context
alternative, so they can be handled on several cores in parallel. Use
a new dedicated destruct workqueue for this.

Tested with mlx5 device:
Before: 22141 add/sec,   103 del/sec
After:  11684 add/sec, 11684 del/sec
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

7adc91e0

net/tls: Perform immediate device ctx cleanup when possible · 113671b2

Tariq Toukan authored Jul 27, 2022

TLS context destructor can be run in atomic context. Cleanup operations
for device-offloaded contexts could require access and interaction with
the device callbacks, which might sleep. Hence, the cleanup of such
contexts must be deferred and completed inside an async work.

For all others, this is not necessary, as cleanup is atomic. Invoke
cleanup immediately for them, avoiding queueing redundant gc work.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

113671b2