Commits · 77788b5bf6becc5ada0da9f99e90c20ea6e77a58 · Kirill Smelkov / linux

16 Jun, 2017 8 commits

net/mlx4_en: Increase default TX ring size · 77788b5b

Tariq Toukan authored Jun 15, 2017

Increase the default TX ring size (from 512 to 1024) to match
the RX ring size.
This gives the XDP TX ring a better chance to keep up with the
rate of its RX ring in case of a high load of XDP_TX actions.

Tested:
The ethtool counter rx_xdp_tx_full used to increase; after applying this
patch it stopped.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net/mlx4_en: Poll XDP TX completion queue in RX NAPI · 6c78511b

Tariq Toukan authored Jun 15, 2017

Instead of having their own NAPIs, XDP TX completion queues get
polled within the corresponding RX NAPI.
This prevents any possible race on TX ring prod/cons indices,
between the context that issues the transmits (RX NAPI) and the
context that handles the completions (was previously done in
a separate NAPI).

This also improves performance, as it decreases the number
of NAPIs running on a CPU, saving the overhead of syncing
and switching between the contexts.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON.

XDP_TX packet rate:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 12.0 Mpps | 13.8 Mpps |  15% |
IPv6 | 12.0 Mpps | 13.8 Mpps |  15% |
-------------------------------------
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net/mlx4_en: Improve XDP xmit function · 36ea7964

Tariq Toukan authored Jun 15, 2017

Several performance improvements in XDP TX datapath,
including:
- Ring a single doorbell for the XDP TX ring per NAPI budget,
  instead of ringing it at a lower threshold (was 8).
  This includes removing the immediate doorbell ringing
  in case of a full TX ring.
- Compiler branch predictor hints.
- Calculate values at compile time rather than at runtime.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON.

XDP_TX packet rate:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 10.3 Mpps | 12.0 Mpps |  17% |
IPv6 | 10.3 Mpps | 12.0 Mpps |  17% |
-------------------------------------
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
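
Note: a rough userspace model of the doorbell batching described in this
commit, not the driver code. It only counts doorbell writes under two
policies: one doorbell per fixed threshold of queued packets (8 in the old
code, per the message above) versus a single doorbell at the end of a poll.
The function names and the model itself are assumptions for illustration.

```c
#include <assert.h>

/* Hypothetical model: one doorbell every `threshold` queued packets,
 * plus a final one for any remainder when the poll ends. */
static unsigned int doorbells_per_threshold(unsigned int pkts,
                                            unsigned int threshold)
{
    assert(threshold > 0);
    return (pkts + threshold - 1) / threshold;
}

/* Hypothetical model: a single doorbell at the end of the poll,
 * provided at least one packet was queued. */
static unsigned int doorbells_per_poll(unsigned int pkts)
{
    return pkts ? 1 : 0;
}
```

Under this model, a poll that forwards 64 packets drops from 8 doorbell
writes to 1.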


net/mlx4_en: Improve stack xmit function · f28186d6

Tariq Toukan authored Jun 15, 2017

Several small code and performance improvements in stack TX datapath,
including:
- Compiler branch predictor hints.
- Minimize variable scope.
- Move tx_info non-inline flow handling to a separate function.
- Calculate data_offset at compile time rather than at runtime
  (for !lso_header_size branch).
- Avoid the ternary operator ("?") when the value can be preset in a
  matching branch.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

Gain is too small to be measurable, no degradation sensed.
Results are similar for IPv4 and IPv6.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
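
Note: a minimal userspace sketch of two of the techniques listed in this
commit, branch predictor hints and presetting a value in the branch that
already tests the condition. The macros mirror the kernel's likely() and
unlikely(); payload_offset() and its parameters are hypothetical and are not
taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's likely()/unlikely() macros,
 * built on the GCC/Clang __builtin_expect builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: returns the payload offset for a packet.
 * The value is preset for the common case and adjusted inside the
 * branch that already tests the condition, instead of using a
 * separate "cond ? a : b" expression later on. */
static size_t payload_offset(size_t lso_header_size, size_t ctrl_size)
{
    size_t offset = ctrl_size;          /* common case: no LSO header */

    if (unlikely(lso_header_size))      /* hint: treated as the rare path */
        offset += lso_header_size;

    return offset;
}
```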


net/mlx4_en: Improve transmit CQ polling · cc26a490

Tariq Toukan authored Jun 15, 2017

Several small performance improvements in TX CQ polling,
including:
- Compiler branch predictor hints.
- Minimize variable scope.
- More accurate check of the CQ type.
- Use boolean instead of int for a binary indication.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

Packet-rate tests for both regular stack and XDP use cases:
No noticeable gain, no degradation.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net/mlx4_en: Improve receive data-path · 9bcee89a

Tariq Toukan authored Jun 15, 2017

Several small performance improvements in RX datapath,
including:
- Compiler branch predictor hints.
- Replace a multiplication with a shift operation.
- Minimize variable scope.
- Write-prefetch for packet header.
- Avoid the ternary operator ("?") when the value can be preset in a
  matching branch.
- Save a branch by updating RX ring doorbell within
  mlx4_en_refill_rx_buffers(), which now returns void.

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON
(enable by ethtool -L <interface> rx 1).

XDP_DROP packet rate:
Same (28.1 Mpps), lower CPU utilization (from ~100% to ~92%).

Drop packets in TC:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 4.14 Mpps | 4.18 Mpps |   1% |
-------------------------------------

XDP_TX packet rate:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 10.1 Mpps | 10.3 Mpps |   2% |
IPv6 | 10.1 Mpps | 10.3 Mpps |   2% |
-------------------------------------
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
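
Note: a small userspace sketch of two items from the list in this commit,
replacing a multiplication with a shift and issuing a write-prefetch. It
assumes a power-of-two stride stored as its log2; desc_offset() and
prefetch_for_write() are hypothetical names, not the driver's functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the descriptor stride is a power of two, stored as
 * its log2, so index * stride can be expressed as a shift. */
static size_t desc_offset(uint32_t index, unsigned int log_stride)
{
    assert(log_stride < 32);
    return (size_t)index << log_stride;
}

/* Hypothetical: hint that the header at `hdr` is about to be written.
 * The second argument (1) requests a prefetch for write on GCC/Clang. */
static void prefetch_for_write(void *hdr)
{
    __builtin_prefetch(hdr, 1);
}
```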


net/mlx4_en: Optimized single ring steering · 4931c6ef

Saeed Mahameed authored Jun 15, 2017

Avoid touching RX QP RSS context when loading with only
one RX ring, to allow optimized A0 RX steering.

Enable by:
- loading mlx4_core with module param: log_num_mgm_entry_size = -6.
- then: ethtool -L <interface> rx 1

Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

XDP_DROP packet rate:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 20.5 Mpps | 28.1 Mpps |  37% |
IPv6 | 18.4 Mpps | 28.1 Mpps |  53% |
-------------------------------------
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
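
Note: the Gain column in these tables is consistent with the relative
increase (after - before) / before, rounded to a whole percent. A short
sketch of that arithmetic follows; gain_percent() is a hypothetical helper
written for this note.

```c
#include <assert.h>

/* Relative gain in percent, rounded to the nearest integer.
 * Intended for after >= before, as in the tables above. */
static int gain_percent(double before, double after)
{
    assert(before > 0.0);
    return (int)((after - before) / before * 100.0 + 0.5);
}
```

gain_percent(20.5, 28.1) gives 37 and gain_percent(18.4, 28.1) gives 53,
matching the IPv4 and IPv6 rows of the XDP_DROP table in this commit.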


net/mlx4_en: Remove unused argument in TX datapath function · cf97050d

Tariq Toukan authored Jun 15, 2017

Remove owner argument, as it is obsolete and unused.
This also saves the overhead of calculating its value in data-path.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


15 Jun, 2017 32 commits

atm: solos-pci: remove useless variable assignments · 1492a3a7

Gustavo A. R. Silva authored Jun 15, 2017

Value assigned to variable _data32_ at lines 1254 and 1257 is
overwritten at line 1260 before it can be used. This makes
such variable assignments useless.

Addresses-Coverity-ID: 1227049
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: assign default CPU port to all ports · e4b77787

Vivien Didelot authored Jun 15, 2017

The current code only assigns the default cpu_dp to all user ports of
the switch to which the CPU port belongs. The user ports of the other
switches of the fabric thus don't have a default CPU port.

This patch fixes this by assigning the cpu_dp of all user ports of all
switches of the fabric when the tree is fully parsed.

Fixes: a29342e7 ("net: dsa: Associate slave network device with CPU port")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Merge branch 'r8152-support-new-chips' · 3715c47b

David S. Miller authored Jun 15, 2017

Hayes Wang says:

====================
r8152: support new chips

These patches are used to support new chips.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


r8152: add byte_enable for ocp_read_word function · d8fbd274

hayeswang authored Jun 15, 2017

Add byte_enable to ocp_read_word() so that it reads only the
desired 2 bytes of data instead of 4 bytes.

This avoids the issue described in
commit b4d99def ("r8152: remove sram_read"). The
original method always reads 4 bytes of data, which may
cause problems when reading the PHY registers.

The new method is supported starting with RTL8153B, but it
does not affect the previous chips. For those chips the
byte_enable bits are reserved, and the hardware ignores
them.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
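
Note: an illustrative sketch only, since the commit message does not show
the register encoding. It assumes the 16-bit word sits in an aligned,
little-endian 32-bit value, and it models byte_enable as a 4-bit mask with
one bit per byte lane. Both helpers are hypothetical; the driver's actual
constants and layout may differ.

```c
#include <assert.h>
#include <stdint.h>

/* 4-byte read approach: fetch the aligned dword, then pick out the
 * wanted 16-bit word. `index` is a byte offset; bit 1 selects the
 * lower or upper half of the dword. */
static uint16_t word_from_dword(uint32_t dword, uint16_t index)
{
    unsigned int shift = (index & 2u) * 8u;   /* 0 or 16 */
    return (uint16_t)(dword >> shift);
}

/* Hypothetical 4-bit byte-enable mask (one bit per byte lane) that
 * selects only the two bytes of the wanted word. */
static uint8_t word_byte_enable(uint16_t index)
{
    return (uint8_t)(0x3u << (index & 2u));
}
```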


r8152: support RTL8153B · 65b82d69

hayeswang authored Jun 15, 2017

This patch supports two new chips for RTL8153B.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


r8152: support new chip 8050 · c27b32c2

hayeswang authored Jun 15, 2017

The settings of the new chip are the same as those of the RTL8152, except that
its product ID is 0x8050.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Merge branch 'ibmvnic-LPM-bug-fixes' · 18b6e795

David S. Miller authored Jun 15, 2017

Thomas Falcon says:

====================
ibmvnic: LPM bug fixes

This series of small patches is meant to resolve a number of
bugs, mostly occurring during an ibmvnic driver reset when
recovering from a logical partition migration (LPM).

The first patch ensures that RX buffer pools are properly
activated following an adapter reset by setting the proper
flag in the pool data structure.

The second patch uses netif_tx_disable to stop TX queues when
closing the device during a reset.

Third, fix a typo that resulted in partial sanitization of
TX/RX descriptor queues following a device reset.

Fourth, remove an ambiguous conditional check that was resulting
in a kernel panic as null RX/TX completion descriptors were being
processed during napi polling while the device is closing.

Finally, fix a condition where the napi polling routine exits
before it has completed its work budget without notifying the
upper network layers. This omission could result in the
napi_disable function sleeping indefinitely under certain conditions.

v2: Attempt to provide a proper cover letter
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


ibmvnic: Exit polling routine correctly during adapter reset · 21ecba6c

Thomas Falcon authored Jun 14, 2017

This patch fixes a bug where, in the case of a device reset,
the polling routine will never complete, causing napi_disable
to sleep indefinitely when attempting to close the device.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


ibmvnic: Remove VNIC_CLOSING check from pending_scrq · 1cf9cc72

Thomas Falcon authored Jun 14, 2017

Fix a kernel panic resulting from data access of a NULL
pointer during device close. The pending_scrq routine is
meant to determine whether there is a valid sub-CRQ message
awaiting processing. When the device is closing, however,
there is a possibility that NULL messages can be processed
because pending_scrq will always return 1 even if there
is no valid message in the queue.

It's not clear what this closing state check was originally
meant to accomplish, so just remove it.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


ibmvnic: Sanitize entire SCRQ buffer on reset · c8b2ad0a

Thomas Falcon authored Jun 14, 2017

Fix a typo so that the entire SCRQ buffer is cleaned.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


ibmvnic: Ensure that TX queues are disabled in __ibmvnic_close · 4c2687a5

Thomas Falcon authored Jun 14, 2017

Use netif_tx_disable to guarantee that TX queues are disabled
when __ibmvnic_close is called by the device reset routine.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


ibmvnic: Activate disabled RX buffer pools on reset · c3e53b9a

Thomas Falcon authored Jun 14, 2017

RX buffer pools are disabled while awaiting a device
reset if firmware indicates that the resource is closed.

This patch fixes a bug where pools were not being
subsequently enabled after the device reset, causing
the device to become inoperable.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


sunvnet: restrict advertized checksum offloads to just IP · 7e9191c5

Shannon Nelson authored Jun 14, 2017

As much as we'd like to play well with others, we really aren't
handling the checksums on non-IP protocol packets very well.  This
is easily seen when trying to do TCP over ipv6 - the checksums are
garbage.

Here we restrict the checksum feature flag to just IP traffic so
that we aren't given work we can't yet do.

Orabug: 26175391, 26259755
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Merge branch 'sched-act_tunnel_key-UDP-checksusm' · 3d8bd78b

David S. Miller authored Jun 15, 2017

Jiri Benc says:

====================
net: sched: act_tunnel_key: UDP checksums

Currently, the tunnel_key tc action does not set TUNNEL_CSUM, thus
transmitting packets with zero UDP checksum. This is inconsistent with how
we treat non-lwt UDP tunnels where the default is to fill in the UDP
checksum. Non-zero UDP checksum is the better default anyway for various
reasons previously discussed.

Make this configurable for the tunnel_key tc action with the default being
non-zero checksum. Saves a lot of surprises especially with IPv6.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


net: sched: act_tunnel_key: make UDP checksum configurable · 86087e17

Jiri Benc authored Jun 14, 2017

Allow a zero UDP checksum to be requested for encapsulated packets. The
attribute is named "NO_CSUM", with a correspondingly inverted meaning, so that
a missing attribute and an attribute set to 0 mean the same thing.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
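
Note: a minimal sketch of why an inverted "NO_CSUM" attribute makes
"missing" and "0" equivalent. It is not the act_tunnel_key parsing code;
udp_csum_requested() and its parameters are hypothetical, and the default of
requesting a checksum is taken from the series description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoder for a "NO_CSUM"-style attribute.
 * present == false models the attribute being absent from the request. */
static bool udp_csum_requested(bool present, uint8_t no_csum)
{
    if (!present)
        return true;        /* absent: default, checksum requested */
    return no_csum == 0;    /* 0: checksum requested; non-zero: zero checksum */
}
```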


net: sched: act_tunnel_key: request UDP checksum by default · 63fe4c39

Jiri Benc authored Jun 14, 2017

There's currently no way to request (outer) UDP checksum with
act_tunnel_key. This is a problem, especially for IPv6. Right now, tunnel_key
action with IPv6 does not work without going through hassles: both sides
have to have udp6zerocsumrx configured on the tunnel interface. This is
obviously not a good solution universally.

It makes more sense to compute the UDP checksum by default even for IPv4.
Just set the default to request the checksum when using act_tunnel_key.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: s2io: remove useless variable in fill_rx_buffers · 9d7cdedd

Gustavo A. R. Silva authored Jun 14, 2017

Remove useless variable rxd_index and code related.

Addresses-Coverity-ID: 1397691
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Merge branch 'dsa-prefix-Global-macros' · 19470306

David S. Miller authored Jun 15, 2017

Vivien Didelot says:

====================
net: dsa: prefix Global macros

This patch series is the 2/3 step of the register definitions cleanup.
It brings no functional changes.

It prefixes and documents all Global (1) registers with MV88E6XXX_G1_
(or a specific model like MV88E6352_G1_STS_PPU_STATE), and prefers a
16-bit hexadecimal representation of the Marvell registers layout.

The next and last patchset will prefix the Global 2 registers.
====================
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: mv88e6xxx: prefix Global Prio and Tag macros · ccba8f3a

Vivien Didelot authored Jun 15, 2017

Prefix and document the remaining Global IP and IEEE Priority and Core
Tag Type registers and give them a clear 16-bit register representation.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: mv88e6xxx: prefix Global Stats macros · 57d1ef38

Vivien Didelot authored Jun 15, 2017

Prefix and document the Global Stats Operation and Counter registers and
give them a clear 16-bit register representation.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: mv88e6xxx: prefix Global Monitor Control macros · 101515c8

Vivien Didelot authored Jun 15, 2017

Prefix and document the Global Monitor Control Register macros
(which became the Global Monitor & MGMT Control Register with 88E6390)
and give a clear 16-bit register representation.

Use __bf_shf to get the shift value at compile time instead of adding
new defined macros for it.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
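
Note: a userspace sketch of the __bf_shf idea mentioned in this commit,
deriving a field's shift from its mask so that no separate shift macro is
needed. BF_SHF mirrors the kernel macro's definition; EXAMPLE_FIELD_MASK and
example_field_prep() are hypothetical and do not correspond to a real
register.

```c
#include <assert.h>

/* Same idea as the kernel's __bf_shf(): index of the lowest set bit
 * of a mask, via the GCC/Clang __builtin_ffsll builtin. */
#define BF_SHF(mask) (__builtin_ffsll(mask) - 1)

/* Hypothetical 16-bit register field occupying bits 7:4. */
#define EXAMPLE_FIELD_MASK 0x00f0

/* Place `val` into the field; the shift folds to a constant when the
 * mask is a compile-time constant. */
static unsigned int example_field_prep(unsigned int val)
{
    return (val << BF_SHF(EXAMPLE_FIELD_MASK)) & EXAMPLE_FIELD_MASK;
}
```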


net: dsa: mv88e6xxx: prefix Global Control macros · d77f4321

Vivien Didelot authored Jun 15, 2017

Prefix and document the Global Control and Control 2 register macros
and give a clear 16-bit register representation.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: mv88e6xxx: prefix Global VTU macros · 7ec60d6e

Vivien Didelot authored Jun 15, 2017

Prefix and document the Global VTU register macros and give a clear
16-bit register representation.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: mv88e6xxx: prefix Global ATU macros · 27c0e600

Vivien Didelot authored Jun 15, 2017

Prefix and document the Global ATU register macros and give a clear
16-bit register representation.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: mv88e6xxx: prefix Global Switch MAC macros · 4b0c4817

Vivien Didelot authored Jun 15, 2017

Prefix and document the Global Switch MAC Address Register macros and
give a clear 16-bit register representation.

At the same time, move mv88e6xxx_g1_set_switch_mac to global1.c, where
it belongs.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: dsa: mv88e6xxx: prefix Global Status macros · 82466921

Vivien Didelot authored Jun 15, 2017

Prefix and document the Global Status Register macros and give a clear
16-bit register representation.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


skbuff: make skb_put_zero() return void · 83ad357d

Johannes Berg authored Jun 14, 2017

It's nicer to return a void pointer, since then there's no need to
cast to any structures. Currently none of the users have
a cast, but a number of future conversions do.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
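
Note: a userspace analogue of the point made in this commit. In C, a
void pointer converts implicitly to any object pointer type, so callers can
assign the result straight to a struct pointer without a cast. struct buf,
struct hdr and buf_put_zero() are hypothetical stand-ins and are not the skb
API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace buffer standing in for an skb's data area. */
struct buf {
    unsigned char data[64];
    size_t len;
};

/* Hypothetical header type a caller might place in the buffer. */
struct hdr {
    unsigned char type;
    unsigned char flags;
};

/* Hypothetical analogue of skb_put_zero(): extend the used area by
 * `len` bytes, zero them, and return a void pointer to the new area. */
static void *buf_put_zero(struct buf *b, size_t len)
{
    void *p;

    assert(b->len + len <= sizeof(b->data));
    p = b->data + b->len;
    memset(p, 0, len);
    b->len += len;
    return p;
}
```

A caller can then write "struct hdr *h = buf_put_zero(&b, sizeof(*h));" with
no cast.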


Merge branch 'net-ktls' · 108ea514

David S. Miller authored Jun 15, 2017

Dave Watson says:

====================
net: kernel TLS

This series adds support for kernel TLS encryption over TCP sockets.
A standard TCP socket is converted to a TLS socket using a setsockopt.
Only symmetric crypto is done in the kernel, as well as TLS record
framing.  The handshake remains in userspace, and the negotiated
cipher keys/iv are provided to the TCP socket.

We implemented support for this API in OpenSSL 1.1.0, the code is
available at https://github.com/Mellanox/tls-openssl/tree/master

It should work with any TLS library with similar modifications,
a test tool using gnutls is here: https://github.com/Mellanox/tls-af_ktls_tool

RFC patch to openssl:
https://mta.openssl.org/pipermail/openssl-dev/2017-June/009384.html

Changes from V2:

* EXPORT_SYMBOL_GPL in patch 1
* Ensure cleanup code always called before sk_stream_kill_queues to
  avoid warnings

Changes from V1:

* EXPORT_SYMBOL_GPL in patch 2
* Add link to OpenSSL patch & gnutls example in documentation patch.
* sk_write_pending check was rolled in to wait_for_memory path,
  avoids special case and fixes lock imbalance issue.
* Unify flag handling for sendmsg/sendfile

Changes from RFC V2:

* Generic ULP (upper layer protocol) framework instead of TLS specific
  setsockopts
* Dropped Mellanox hardware patches, will come as separate series.
  Framework will work for both.

RFC V2:

http://www.mail-archive.com/netdev@vger.kernel.org/msg160317.html

Changes from RFC V1:

* Socket based on changing TCP proto_ops instead of crypto framework
* Merged code with Mellanox's hardware tls offload
* Zerocopy sendmsg support added - sendpage/sendfile is no longer
  necessary for zerocopy optimization

RFC V1:

http://www.mail-archive.com/netdev@vger.kernel.org/msg88021.html

* Socket based on crypto userspace API framework, required two
  sockets in userspace, one encrypted, one unencrypted.

Paper: https://netdevconf.org/1.2/papers/ktls.pdf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


tls: Documentation · 99c195fb

Dave Watson authored Jun 14, 2017

Add documentation for the tcp ULP tls interface.
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


tls: kernel TLS support · 3c4d7559

Dave Watson authored Jun 14, 2017

Software implementation of transport layer security, implemented using ULP
infrastructure.  tcp proto_ops are replaced with tls equivalents of sendmsg and
sendpage.

Only symmetric crypto is done in the kernel, keys are passed by setsockopt
after the handshake is complete.  All control messages are supported via CMSG
data - the actual symmetric encryption is the same, just the message type needs
to be passed separately.

For user API, please see Documentation patch.

Pieces that can be shared between hw and sw implementation
are in tls_main.c
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


tcp: export do_tcp_sendpages and tcp_rate_check_app_limited functions · e3b5616a

Dave Watson authored Jun 14, 2017

Export do_tcp_sendpages and tcp_rate_check_app_limited, since tls will need to
sendpages while the socket is already locked.

tcp_sendpage is exported, but requires the socket lock to not be held already.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


tcp: ULP infrastructure · 734942cc

Dave Watson authored Jun 14, 2017

Add the infrastructure for attaching Upper Layer Protocols (ULPs) over TCP
sockets. Based on a similar infrastructure in tcp_cong.  The idea is that any
ULP can add its own logic by changing the TCP proto_ops structure to its own
methods.

Example usage:

setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));

modules will call:
tcp_register_ulp(&tcp_tls_ulp_ops);

to register/unregister their ulp, with an init function and name.

A list of registered ulps will be returned by tcp_get_available_ulp, which is
hooked up to /proc.  Example:

$ cat /proc/sys/net/ipv4/tcp_available_ulp
tls

There is currently no functionality to remove or chain ULPs, but
it should be possible to add these in the future if needed.
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
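
Note: a small userspace wrapper around the setsockopt() call shown in the
example usage of this commit, with error reporting added. attach_tls_ulp()
is a hypothetical helper name. The fallback definitions are assumptions for
libc headers that predate TCP_ULP; the value 31 is the one used by the
kernel's uapi header.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef SOL_TCP
#define SOL_TCP IPPROTO_TCP
#endif
#ifndef TCP_ULP
#define TCP_ULP 31   /* assumed fallback for older libc headers */
#endif

/* Ask the kernel to attach the "tls" ULP to a TCP socket.
 * Returns 0 on success, or a negative errno value on failure. */
static int attach_tls_ulp(int fd)
{
    if (setsockopt(fd, SOL_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
        return -errno;
    return 0;
}
```

On a kernel without ULP support, or without the tls module available, the
call is expected to fail, and the wrapper returns the negated errno value.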
