Commits · d11786bb9664a5bed47e7839265f49bb26d54a1b · Kirill Smelkov / linux

31 Jul, 2019 4 commits

selftests: mlxsw: Add a test for leftover DSCP rule · d11786bb

Petr Machata authored Jul 31, 2019

Commit dedfde2f ("mlxsw: spectrum_dcb: Configure DSCP map as the last
rule is removed") fixed a problem in mlxsw where last DSCP rule to be
removed remained in effect when DSCP rewrite was applied.

Add a selftest that covers this problem.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d11786bb

selftests: mlxsw: Fix local variable declarations in DSCP tests · 7700476f

Petr Machata authored Jul 31, 2019

These two tests have some problems in the global scope pollution and on
contrary, contain unnecessary local declarations. Fix them.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7700476f

myri10ge: remove unneeded variable · 70841488

Ding Xiang authored Jul 31, 2019

"error" is unneeded,just return 0
Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

70841488

net: ag71xx: Slighly simplify code in 'ag71xx_rings_init()' · a9d41e7b

Christophe JAILLET authored Jul 31, 2019

A few lines above, we have:
   tx_size = BIT(tx->order);

So use 'tx_size' directly to be consistent with the way 'rx->descs_cpu' and
'rx->descs_dma' are computed below.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

a9d41e7b

30 Jul, 2019 22 commits

Merge branch 'net-dsa-ksz-Add-Microchip-KSZ87xx-support' · 5133f36c

David S. Miller authored Jul 30, 2019

Marek Vasut says:

====================
net: dsa: ksz: Add Microchip KSZ87xx support

This series adds support for Microchip KSZ87xx switches, which are
slightly simpler compared to KSZ9xxx .
====================
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

5133f36c

net: dsa: ksz: Add Microchip KSZ8795 DSA driver · e66f840c

Tristram Ha authored Jul 29, 2019

Add Microchip KSZ8795 DSA driver.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e66f840c

net: dsa: ksz: Add KSZ8795 tag code · 016e43a2

Tristram Ha authored Jul 29, 2019

Add DSA tag code for Microchip KSZ8795 switch. The switch is simpler
and the tag is only 1 byte, instead of 2 as is the case with KSZ9477.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

016e43a2

dt-bindings: net: dsa: ksz: document Microchip KSZ87xx family switches · 4c173472

Marek Vasut authored Jul 29, 2019

Document Microchip KSZ87xx family switches. These include
KSZ8765 - 5 port switch
KSZ8794 - 4 port switch
KSZ8795 - 5 port switch
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

4c173472

Merge branch 'vsock-virtio-optimizations-to-increase-the-throughput' · c69e6eaf

David S. Miller authored Jul 30, 2019

Stefano Garzarella says:

====================
vsock/virtio: optimizations to increase the throughput

This series tries to increase the throughput of virtio-vsock with slight
changes.
While I was testing the v2 of this series I discovered an huge use of memory,
so I added patch 1 to mitigate this issue. I put it in this series in order
to better track the performance trends.

v5:
- rebased all patches on net-next
- added Stefan's R-b and Michael's A-b

v4: https://patchwork.kernel.org/cover/11047717
v3: https://patchwork.kernel.org/cover/10970145
v2: https://patchwork.kernel.org/cover/10938743
v1: https://patchwork.kernel.org/cover/10885431

Below are the benchmarks step by step. I used iperf3 [1] modified with VSOCK
support. As Michael suggested in the v1, I booted host and guest with 'nosmap'.

A brief description of patches:
- Patches 1:   limit the memory usage with an extra copy for small packets
- Patches 2+3: reduce the number of credit update messages sent to the
               transmitter
- Patches 4+5: allow the host to split packets on multiple buffers and use
               VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max packet size allowed

                    host -> guest [Gbps]
pkt_size before opt   p 1     p 2+3    p 4+5

32         0.032     0.030    0.048    0.051
64         0.061     0.059    0.108    0.117
128        0.122     0.112    0.227    0.234
256        0.244     0.241    0.418    0.415
512        0.459     0.466    0.847    0.865
1K         0.927     0.919    1.657    1.641
2K         1.884     1.813    3.262    3.269
4K         3.378     3.326    6.044    6.195
8K         5.637     5.676   10.141   11.287
16K        8.250     8.402   15.976   16.736
32K       13.327    13.204   19.013   20.515
64K       21.241    21.341   20.973   21.879
128K      21.851    22.354   21.816   23.203
256K      21.408    21.693   21.846   24.088
512K      21.600    21.899   21.921   24.106

                    guest -> host [Gbps]
pkt_size before opt   p 1     p 2+3    p 4+5

32         0.045     0.046    0.057    0.057
64         0.089     0.091    0.103    0.104
128        0.170     0.179    0.192    0.200
256        0.364     0.351    0.361    0.379
512        0.709     0.699    0.731    0.790
1K         1.399     1.407    1.395    1.427
2K         2.670     2.684    2.745    2.835
4K         5.171     5.199    5.305    5.451
8K         8.442     8.500   10.083    9.941
16K       12.305    12.259   13.519   15.385
32K       11.418    11.150   11.988   24.680
64K       10.778    10.659   11.589   35.273
128K      10.421    10.339   10.939   40.338
256K      10.300     9.719   10.508   36.562
512K       9.833     9.808   10.612   35.979

As Stefan suggested in the v1, I measured also the efficiency in this way:
    efficiency = Mbps / (%CPU_Host + %CPU_Guest)

The '%CPU_Guest' is taken inside the VM. I know that it is not the best way,
but it's provided for free from iperf3 and could be an indication.

        host -> guest efficiency [Mbps / (%CPU_Host + %CPU_Guest)]
pkt_size before opt   p 1     p 2+3    p 4+5

32         0.35      0.45     0.79     1.02
64         0.56      0.80     1.41     1.54
128        1.11      1.52     3.03     3.12
256        2.20      2.16     5.44     5.58
512        4.17      4.18    10.96    11.46
1K         8.30      8.26    20.99    20.89
2K        16.82     16.31    39.76    39.73
4K        30.89     30.79    74.07    75.73
8K        53.74     54.49   124.24   148.91
16K       80.68     83.63   200.21   232.79
32K      132.27    132.52   260.81   357.07
64K      229.82    230.40   300.19   444.18
128K     332.60    329.78   331.51   492.28
256K     331.06    337.22   339.59   511.59
512K     335.58    328.50   331.56   504.56

        guest -> host efficiency [Mbps / (%CPU_Host + %CPU_Guest)]
pkt_size before opt   p 1     p 2+3    p 4+5

32         0.43      0.43     0.53     0.56
64         0.85      0.86     1.04     1.10
128        1.63      1.71     2.07     2.13
256        3.48      3.35     4.02     4.22
512        6.80      6.67     7.97     8.63
1K        13.32     13.31    15.72    15.94
2K        25.79     25.92    30.84    30.98
4K        50.37     50.48    58.79    59.69
8K        95.90     96.15   107.04   110.33
16K      145.80    145.43   143.97   174.70
32K      147.06    144.74   146.02   282.48
64K      145.25    143.99   141.62   406.40
128K     149.34    146.96   147.49   489.34
256K     156.35    149.81   152.21   536.37
512K     151.65    150.74   151.52   519.93

[1] https://github.com/stefano-garzarella/iperf/
====================
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c69e6eaf

vsock/virtio: change the maximum packet size allowed · 0038ff35

Stefano Garzarella authored Jul 30, 2019

Since now we are able to split packets, we can avoid limiting
their sizes to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE.
Instead, we can use VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max
packet size.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0038ff35

vhost/vsock: split packets to send using multiple buffers · 6dbd3e66

Stefano Garzarella authored Jul 30, 2019

If the packets to sent to the guest are bigger than the buffer
available, we can split them, using multiple buffers and fixing
the length in the packet header.
This is safe since virtio-vsock supports only stream sockets.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6dbd3e66

vsock/virtio: fix locking in virtio_transport_inc_tx_pkt() · 9632e9f6

Stefano Garzarella authored Jul 30, 2019

fwd_cnt and last_fwd_cnt are protected by rx_lock, so we should use
the same spinlock also if we are in the TX path.

Move also buf_alloc under the same lock.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9632e9f6

vsock/virtio: reduce credit update messages · b89d882d

Stefano Garzarella authored Jul 30, 2019

In order to reduce the number of credit update messages,
we send them only when the space available seen by the
transmitter is less than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b89d882d

vsock/virtio: limit the memory used per-socket · 473c7391

Stefano Garzarella authored Jul 30, 2019

Since virtio-vsock was introduced, the buffers filled by the host
and pushed to the guest using the vring, are directly queued in
a per-socket list. These buffers are preallocated by the guest
with a fixed size (4 KB).

The maximum amount of memory used by each socket should be
controlled by the credit mechanism.
The default credit available per-socket is 256 KB, but if we use
only 1 byte per packet, the guest can queue up to 262144 of 4 KB
buffers, using up to 1 GB of memory per-socket. In addition, the
guest will continue to fill the vring with new 4 KB free buffers
to avoid starvation of other sockets.

This patch mitigates this issue copying the payload of small
packets (< 128 bytes) into the buffer of last packet queued, in
order to avoid wasting memory.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

473c7391

net: Remove dev_err() usage after platform_get_irq() · d1a55841

Stephen Boyd authored Jul 30, 2019

We don't need dev_err() messages when platform_get_irq() fails now that
platform_get_irq() prints an error message itself when something goes
wrong. Let's remove these prints with a simple semantic patch.

// <smpl>
@@
expression ret;
struct platform_device *E;
@@

ret =
(
platform_get_irq(E, ...)
|
platform_get_irq_byname(E, ...)
);

if ( \( ret < 0 \| ret <= 0 \) )
{
(
-if (ret != -EPROBE_DEFER)
-{ ...
-dev_err(...);
-... }
|
...
-dev_err(...);
)
...
}
// </smpl>

While we're here, remove braces on if statements that only have one
statement (manually).

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d1a55841

Merge branch 'Finish-conversion-of-skb_frag_t-to-bio_vec' · 2d73a6c3

David S. Miller authored Jul 30, 2019

Jonathan Lemon says:

====================
Finish conversion of skb_frag_t to bio_vec

The recent conversion of skb_frag_t to bio_vec did not include
skb_frag's page_offset.  Add accessor functions for this field,
utilize them, and remove the union, restoring the original structure.

v2:
  - rename accessors
  - follow kdoc conventions
====================
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

2d73a6c3

linux: Remove bvec page_offset, use bv_offset · 65c84f14

Jonathan Lemon authored Jul 30, 2019

Now that page_offset is referenced through accessors, remove
the union, and use bv_offset.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

65c84f14

net: Use skb_frag_off accessors · b54c9d5b

Jonathan Lemon authored Jul 30, 2019

Use accessor functions for skb fragment's page_offset instead
of direct references, in preparation for bvec conversion.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b54c9d5b

linux: Add skb_frag_t page_offset accessors · 7240b60c

Jonathan Lemon authored Jul 30, 2019

Add skb_frag_off(), skb_frag_off_add(), skb_frag_off_set(),
and skb_frag_off_copy() accessors for page_offset.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7240b60c

Merge branch 'sctp-clean-up-sctp_connect-function' · 6ca04afb

David S. Miller authored Jul 30, 2019

Xin Long says:

====================
sctp: clean up __sctp_connect function

This patchset is to factor out some common code for
sctp_sendmsg_new_asoc() and __sctp_connect() into 2
new functioins.

v1->v2:
  - add the patch 1/5 to avoid a slab-out-of-bounds warning.
  - add some code comment for the check change in patch 2/5.
  - remove unused 'addrcnt' as Marcelo noticed in patch 3/5.
====================
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6ca04afb

sctp: factor out sctp_connect_add_peer · a64e59c7

Xin Long authored Jul 30, 2019

In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it adds a peer with the other addr into the
asoc after this asoc is created with the 1st addr.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a64e59c7

sctp: factor out sctp_connect_new_asoc · f26f9951

Xin Long authored Jul 30, 2019

In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it creates the asoc and adds a peer with the
1st addr.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f26f9951

sctp: clean up __sctp_connect · dd8378b3

Xin Long authored Jul 30, 2019

__sctp_connect is doing quit similar things as sctp_sendmsg_new_asoc.
To factor out common functions, this patch is to clean up their code
to make them look more similar:

  1. create the asoc and add a peer with the 1st addr.
  2. add peers with the other addrs into this asoc one by one.

while at it, also remove the unused 'addrcnt'.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd8378b3

sctp: check addr_size with sa_family_t size in __sctp_setsockopt_connectx · f40f1177

Xin Long authored Jul 30, 2019

Now __sctp_connect() is called by __sctp_setsockopt_connectx() and
sctp_inet_connect(), the latter has done addr_size check with size
of sa_family_t.

In the next patch to clean up __sctp_connect(), we will remove
addr_size check with size of sa_family_t from __sctp_connect()
for the 1st address.

So before doing that, __sctp_setsockopt_connectx() should do
this check first, as sctp_inet_connect() does.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f40f1177

sctp: only copy the available addr data in sctp_transport_init · 4c31bc6b

Xin Long authored Jul 30, 2019

'addr' passed to sctp_transport_init is not always a whole size
of union sctp_addr, like the path:

  sctp_sendmsg() ->
  sctp_sendmsg_new_asoc() ->
  sctp_assoc_add_peer() ->
  sctp_transport_new() -> sctp_transport_init()

In the next patches, we will also pass the address length of data
only to sctp_assoc_add_peer().

So sctp_transport_init() should copy the only available data from
addr to peer->ipaddr, instead of 'peer->ipaddr = *addr' which may
cause slab-out-of-bounds.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c31bc6b

rxrpc: Fix -Wframe-larger-than= warnings from on-stack crypto · 1db88c53

David Howells authored Jul 30, 2019

rxkad sometimes triggers a warning about oversized stack frames when
building with clang for a 32-bit architecture:

net/rxrpc/rxkad.c:243:12: error: stack frame size of 1088 bytes in function 'rxkad_secure_packet' [-Werror,-Wframe-larger-than=]
net/rxrpc/rxkad.c:501:12: error: stack frame size of 1088 bytes in function 'rxkad_verify_packet' [-Werror,-Wframe-larger-than=]

The problem is the combination of SYNC_SKCIPHER_REQUEST_ON_STACK() in
rxkad_verify_packet()/rxkad_secure_packet() with the relatively large
scatterlist in rxkad_verify_packet_1()/rxkad_secure_packet_encrypt().

The warning does not show up when using gcc, which does not inline the
functions as aggressively, but the problem is still the same.

Allocate the cipher buffers from the slab instead, caching the allocated
packet crypto request memory used for DATA packet crypto in the rxrpc_call
struct.

Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

1db88c53

29 Jul, 2019 14 commits

Merge branch 'bnxt_en-TPA-57500' · 85fd8011

David S. Miller authored Jul 29, 2019

Michael Chan says:

====================
bnxt_en: Add TPA (GRO_HW and LRO) on 57500 chips.

This patchset adds TPA v2 support on the 57500 chips.  TPA v2 is
different from the legacy TPA scheme on older chips and requires major
refactoring and restructuring of the existing TPA logic.  The main
difference is that the new TPA v2 has on-the-fly aggregation buffer
completions before a TPA packet is completed.  The larger aggregation
ID space also requires a new ID mapping logic to make it more
memory efficient.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

85fd8011

bnxt_en: Add PCI IDs for 57500 series NPAR devices. · 49c98421

Michael Chan authored Jul 29, 2019

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

49c98421

bnxt_en: Support all variants of the 5750X chip family. · 1dc88b97

Michael Chan authored Jul 29, 2019

Define the 57508, 57504, and 57502 chip IDs that are all part of the
BNXT_CHIP_P5 family of chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1dc88b97

bnxt_en: Refactor bnxt_init_one() and turn on TPA support on 57500 chips. · 7c380918

Michael Chan authored Jul 29, 2019

With the new TPA feature in the 57500 chips, we need to discover the
feature first before setting up the netdev features. Refactor the
the firmware probe and init logic more cleanly into 2 functions and
and make these calls before setting up the netdev features.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c380918

bnxt_en: Support TPA counters on 57500 chips. · 78e7b866

Michael Chan authored Jul 29, 2019

Support the new expanded TPA v2 counters on 57500 B0 chips for
ethtool -S.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

78e7b866

bnxt_en: Allocate the larger per-ring statistics block for 57500 chips. · 4e748506

Michael Chan authored Jul 29, 2019

The new TPA implemantation has additional TPA counters that extend the
per-ring statistics block.  Allocate the proper size accordingly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4e748506

bnxt_en: Refactor ethtool ring statistics logic. · ee79566e

Michael Chan authored Jul 29, 2019

The current code assumes that the per ring statistics counters are
fixed.  In newer chips that support a newer version of TPA, the
TPA counters are also changed.  Refactor the code by defining these
counter names in arrays so that it is easy to add a new array for
a new set of counters supported by the newer chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ee79566e

bnxt_en: Add hardware GRO setup function for 57500 chips. · 67912c36

Michael Chan authored Jul 29, 2019

Add a more optimized hardware GRO function to setup the SKB on 57500
chips.  Some workaround code is no longer needed on 57500 chips and
the pseudo checksum is also calculated in hardware, so no need to
do the software pseudo checksum in the driver.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

67912c36

bnxt_en: Add TPA ID mapping logic for 57500 chips. · ec4d8e7c

Michael Chan authored Jul 29, 2019

The new TPA feature on 57500 supports a larger number of concurrent TPAs
(up to 1024) divided among the functions.  We need to add some logic to
map the hardware TPA ID to a software index that keeps track of each TPA
in progress.  A 1:1 direct mapping without translation would be too
wasteful as we would have to allocate 1024 TPA structures for each RX
ring on each PCI function.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ec4d8e7c

bnxt_en: Add fast path logic for TPA on 57500 chips. · bfcd8d79

Michael Chan authored Jul 29, 2019

With all the previous refactoring, the TPA fast path can now be
modified slightly to support TPA on the new chips.  The main
difference is that the agg completions are retrieved differently using
the bnxt_get_tpa_agg_p5() function on the new chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bfcd8d79

bnxt_en: Set TPA GRO mode flags on 57500 chips properly. · f45b7b78

Michael Chan authored Jul 29, 2019

On 57500 chips, hardware GRO mode cannot be determined from the TPA
end, so we need to check bp->flags to determine if we are in hardware
GRO mode or not.  Modify bnxt_set_features so that the TPA flags
in bp->flags don't change until the device is closed.  This will ensure
that the fast path can safely rely on bp->flags to determine the
TPA mode.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f45b7b78

bnxt_en: Refactor tunneled hardware GRO logic. · bee5a188

Michael Chan authored Jul 29, 2019

The 2 GRO functions to set up the hardware GRO SKB fields for 2
different hardware chips have practically identical logic for
tunneled packets.  Refactor the logic into a separate bnxt_gro_tunnel()
function that can be used by both functions.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bee5a188

bnxt_en: Handle standalone RX_AGG completions. · 8fe88ce7

Michael Chan authored Jul 29, 2019

On the new 57500 chips, these new RX_AGG completions are not coalesced
at the TPA_END completion.  Handle these by storing them in the
array in the bnxt_tpa_info struct, as they are seen when processing
the CMPL ring.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8fe88ce7

bnxt_en: Expand bnxt_tpa_info struct to support 57500 chips. · 79632e9b

Michael Chan authored Jul 29, 2019

Add an aggregation array to bnxt_tpa_info struct to keep track of the
aggregation completions.  The aggregation completions are not
completed at the TPA_END completion on 57500 chips so we need to
keep track of them.  The array is only allocated on the new chips
when required.  An agg_count field is also added to keep track of the
number of these completions.

The maximum concurrent TPA is now discovered from firmware instead of
the hardcoded 64.  Add a new bp->max_tpa to keep track of maximum
configured TPA.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

79632e9b