Commits · 0c5c3252c43cc935bef05c2211fc7cb32facddf7 · Kirill Smelkov / linux

21 Apr, 2016 40 commits

mlx4: protect mlx4_en_start_port in mlx4_en_restart with rtnl_lock · 0c5c3252

Hannes Frederic Sowa authored Apr 18, 2016

mlx4_en_start_port requires rtnl_lock to be held.
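
The locking pattern can be sketched in userspace as follows. This is only a model: the mock_* names are hypothetical stand-ins, a plain flag replaces the real RTNL mutex, and the check inside mock_start_port() only loosely mirrors what ASSERT_RTNL() verifies in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the RTNL mutex; the kernel uses rtnl_lock()/rtnl_unlock(). */
static bool mock_rtnl_held;

static void mock_rtnl_lock(void)
{
	assert(!mock_rtnl_held);	/* not recursive in this model */
	mock_rtnl_held = true;
}

static void mock_rtnl_unlock(void)
{
	assert(mock_rtnl_held);
	mock_rtnl_held = false;
}

/* Stand-in for a function that requires the lock, such as mlx4_en_start_port(). */
static int mock_start_port(void)
{
	return mock_rtnl_held ? 0 : -1;	/* fail if the caller forgot the lock */
}

/* Restart path: take the lock around the call that needs it. */
static int mock_restart(void)
{
	int err;

	mock_rtnl_lock();
	err = mock_start_port();
	mock_rtnl_unlock();
	return err;
}
```

Calling mock_start_port() directly fails in this model, while mock_restart() succeeds and leaves the lock released.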

Cc: Eugenia Emantayev <eugenia@mellanox.com>
Cc: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c5c3252

fm10k: protect fm10k_open in fm10k_io_resume with rtnl_lock · 41419b93

Hannes Frederic Sowa authored Apr 18, 2016

fm10k_open requires rtnl_lock to be held.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Shannon Nelson <shannon.nelson@intel.com>
Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
Cc: John Ronciak <john.ronciak@intel.com>
Cc: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

41419b93

benet: be_resume needs to protect be_open with rtnl_lock · 08d9910c

Hannes Frederic Sowa authored Apr 18, 2016

be_open calls down to functions which expect rtnl lock to be held.

Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Padmanabh Ratnakar <padmanabh.ratnakar@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

08d9910c

net: Add support for IP ID mangling TSO in cases that require encapsulation · 7f348a60

Alexander Duyck authored Apr 20, 2016

This patch adds support for NETIF_F_TSO_MANGLEID if a given tunnel supports
NETIF_F_TSO.  This way, if needed, a device can later enable TSO with IP ID
mangling, and the tunnels on top of that device can then make use of the
IP ID mangling as well.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7f348a60

Merge branch 'mlx5-next' · 1df845be

David S. Miller authored Apr 21, 2016

Saeed Mahameed says:

====================
Mellanox 100G mlx5 driver receive path optimizations

Changes from V2:
	- Rebased to 46e7b8d8 ("net: dsa: kill circular reference with slave priv")
	- Updated: ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
		* Per Eric Dumazet's comment, we changed the driver memory handling scheme to
		  work with order-0 pages rather than order-5, via split_page().
		* This means that an mlx5e rx skb can now hold one (or more, in case of HW LRO)
		  skb frags, each pointing to a 4K order-0 page, rather than one frag with an order-5 page.
	- Updated: ("net/mlx5e: Add fragmented memory support for RX multi packet WQE")
		* Code refactoring and code reuse due to the split_page() mechanism;
		  the MPWQE and fragmented MPWQE handling now look almost the same
		  and share most of the code.
	- In some cases we see a 2%-3% packet rate degradation in comparison to the order-5 pages approach,
	  due to split_page() CPU consumption, but we still see a 3%-10% improvement in comparison to the
	  current linear SKB approach.
	- We believe that the driver memory scheme is now significantly less vulnerable
	  to the memory DoS attack Eric pointed out.

Changes from V1:
	- Rebased to efde611b ("Merge branch 'nfp-next'")
	- Dropped: ("net/mlx5: Refactor mlx5_core_mr to mkey")
		Already merged into 4.6 from rdma tree.
	- Dropped: ("net/mlx5_core: Add ConnectX-5 to list of supported devices")
		Will be pushed to net as we want it in 4.6 release.
	- Dropped: ("net/mlx5e: Change RX moderation period to be based on CQE")
		Will be pushed in a later series with full software based adaptive moderation.
	- Added: ("net/mlx5e: Delay skb->data access")
		Small trivial optimization.
	- Updated: ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
		Changed Striding RQ defaults to:
			NUM WQEs = 16
			Strides Per WQE = 1024
			Stride Size = 128
	- Updated: ("net/mlx5e: Use napi_alloc_skb for RX SKB allocations")
		Consider the IP packet alignment already done in napi_alloc_skb.

Changes from V0:
	- Fixed a typo in commit message reported by Sergei
	- Align SKB fragments truesize to stride size
	- Use skb_add_rx_frag and remove the use of SKB_TRUESIZE
	- Fix: # MTTs alignment on Power PC
	- Fix: Free original (unaligned) pointer of MTT array
	- Use dev_alloc_pages and dev_alloc_page
	- Extend the stats.buff_alloc_err counter
	- Reform the copying of packet header into skb linear data
	- Add compiler hints for conditional statements
	- Prefetch skb->data prior to copying packet header into it
	- Rework: mlx5e_complete_rx_fragmented_mpwqe
	- Handle SKB fragments before linear data
	- Dropped ("net/mlx5e: Prefetch next RX CQE") for now
	- Added a small patch that adds ConnectX-5 devices to the list of supported devices
	- Rebased to 1cdba550 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next")

This series includes some RX modifications and optimizations for
the mlx5 Ethernet driver.

From Rana, we have one patch that adds support for ConnectX-4
queue counters.

From Tariq, several patches centered around improving RX path
message rate and CPU and memory utilization. Each patch's commit
message lists the performance improvement numbers related to that
specific patch.

In the 2nd patch we used a queue counter to report the "out of buffer"
dropped packet count ("Dropped packets due to lack of software resources").

The 3rd patch modifies the driver's default RSS value to be spread across the
cores of the close NUMA node only, for a better out-of-the-box experience.

In the 4th and 5th patches we used RX multi-packet WQE (Striding RQ)
for better memory utilization, especially when hardware LRO is enabled,
and for a better message rate for small packets.

In the 6th and 7th patches we added a fallback mechanism to use fragmented
memory when allocating large WQE strides fails, using UMR
(User Memory Registration) and ICO (Internal Control Operations) SQs.

In the 8th to 11th patches we made some small modifications which show some
small extra improvements.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1df845be

net/mlx5e: Add ethtool counter for RX buffer allocation failures · 54984407

Tariq Toukan authored Apr 20, 2016

Counts the number of RX buffer allocation failures and shows it
in ethtool statistics.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

54984407

net/mlx5e: Delay skb->data access · e20a0db3

Saeed Mahameed authored Apr 20, 2016

Move mlx5e_handle_csum and eth_type_trans to the end of
mlx5e_build_rx_skb to gain some more time before accessing
skb->data, to reduce cache misses.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e20a0db3

net/mlx5e: Remove redundant barrier · 1bfec316

Tariq Toukan authored Apr 20, 2016

The bit-op operation one line before is an explicit barrier
by itself.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1bfec316

net/mlx5e: Use napi_alloc_skb for RX SKB allocations · c5adb96f

Tariq Toukan authored Apr 20, 2016

Instead of netdev_alloc_skb, we use the napi_alloc_skb function,
which is designed to allocate skbuffs for RX in a
channel-specific NAPI instance and already includes the IP packet alignment.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c5adb96f

net/mlx5e: Add fragmented memory support for RX multi packet WQE · bc77b240

Tariq Toukan authored Apr 20, 2016

If the allocation of a linear (physically contiguous) MPWQE fails,
we allocate a fragmented MPWQE.

This is implemented via the device's UMR (User Memory Registration),
which allows registering multiple memory fragments into ConnectX
hardware as one contiguous buffer.
UMR registration is an asynchronous operation and is done via
ICO SQs.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bc77b240

net/mlx5e: Added ICO SQs · d3c9bc27

Tariq Toukan authored Apr 20, 2016

Added ICO (Internal Control Operations) SQ per channel to be used
for driver internal operations such as memory registration for
fragmented memory and nop requests upon ifconfig up.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d3c9bc27

net/mlx5e: Support RX multi-packet WQE (Striding RQ) · 461017cb

Tariq Toukan authored Apr 20, 2016

Introduce the multi-packet WQE (RX Work Queue Element) feature,
referred to as MPWQE or Striding RQ, in which WQEs are larger
and serve multiple packets each.

Every WQE consists of many strides of the same size. Every received
packet is aligned to the beginning of a stride and is written to
consecutive strides within a WQE.

In the regular approach, each WQE is big enough to serve one received
packet of any size up to the MTU, or up to 64K when device LRO is
enabled, making it very wasteful when dealing with small packets or
when device LRO is enabled.

Thanks to its flexibility, MPWQE allows better memory utilization
(implying improvements in CPU utilization and packet rate), as packets
consume strides according to their size, leaving the rest of
the WQE available for other packets.

MPWQE default configuration:
	Num of WQEs	= 16
	Strides Per WQE = 2048
	Stride Size	= 64 byte

The default WQEs memory footprint went from 1024*mtu (~1.5MB) to
16 * 2048 * 64 = 2MB per ring.
However, HW LRO can now be supported at no additional cost in memory
footprint, and hence we turn it on by default and get even better
performance.
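
The footprint arithmetic and the per-packet stride consumption can be checked with a short sketch. The constants are the defaults quoted above; the helper names are hypothetical, and the sketch assumes a packet occupies whole strides inside a single WQE and ignores any headroom the driver may reserve.

```c
#include <assert.h>
#include <stddef.h>

enum {
	NUM_WQES        = 16,
	STRIDES_PER_WQE = 2048,
	STRIDE_SIZE     = 64,	/* bytes */
};

/* Receive buffer bytes per ring: 16 * 2048 * 64 = 2 MiB. */
static size_t mpwqe_ring_bytes(void)
{
	return (size_t)NUM_WQES * STRIDES_PER_WQE * STRIDE_SIZE;
}

/* Strides consumed by one packet: its length rounded up to whole strides. */
static size_t strides_for_packet(size_t len)
{
	assert(len > 0);
	return (len + STRIDE_SIZE - 1) / STRIDE_SIZE;
}

/* Upper bound on same-sized packets a full ring can hold (assumption:
 * a packet never spans two WQEs). */
static size_t max_packets_in_ring(size_t len)
{
	return (size_t)NUM_WQES * (STRIDES_PER_WQE / strides_for_packet(len));
}
```

Under these assumptions a 64-byte packet takes one stride and a 1500-byte packet takes 24, so small packets leave most of each WQE available for further packets.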

Performance tested on ConnectX4-Lx 50G.
To isolate the feature under test, the numbers below were measured with
HW LRO turned off. We verified that the performance only improves when
LRO is turned back on.

* Netperf single TCP stream:
- BW raised by 10-15% for representative packet sizes:
  default, 64B, 1024B, 1478B, 65536B.

* Netperf multi TCP stream:
- No degradation, line rate reached.

* Pktgen: packet rate raised by 2-10% for traffic of different message
sizes: 64B, 128B, 256B, 1024B, and 1500B.

* Pktgen: packet loss in bursts of small messages (64byte),
single stream:
  | num packets | packet loss before | packet loss after
  |     2K      |        ~1K         |         0
  |     8K      |        ~6K         |         0
  |     16K     |       ~13K         |         0
  |     32K     |       ~28K         |         0
  |     64K     |       ~57K         |       ~24K

This is expected, as the driver can now receive as many small packets (<=64B)
as the total number of strides in the ring (default = 2048 * 16), vs. 1024
(the default ring size, regardless of packet size) before this feature.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

461017cb

net/mlx5e: Use function pointers for RX data path handling · 2f48af12

Tariq Toukan authored Apr 20, 2016

In preparation for Striding RQ feature, which will need its own
RX handlers.
This patch does not change any functionality.
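
A minimal sketch of the dispatch idea, assuming a per-queue handler pointer chosen at setup time. The struct, field, and function names are hypothetical and do not reproduce the driver's actual definitions; the handlers return a marker value only so the example can show which one ran.

```c
#include <assert.h>
#include <stddef.h>

struct rx_queue;

/* Handler type: processes one completion for the given queue. */
typedef int (*rx_handler_t)(struct rx_queue *rq, const void *cqe);

struct rx_queue {
	rx_handler_t handle_rx_cqe;	/* selected once, called per completion */
	unsigned int packets;
};

static int handle_rx_regular(struct rx_queue *rq, const void *cqe)
{
	(void)cqe;
	rq->packets++;
	return 1;	/* marker: regular handler ran */
}

static int handle_rx_striding(struct rx_queue *rq, const void *cqe)
{
	(void)cqe;
	rq->packets++;
	return 2;	/* marker: striding handler ran */
}

/* Pick the handler when the queue is set up, not on every packet. */
static void rx_queue_init(struct rx_queue *rq, int use_striding)
{
	rq->packets = 0;
	rq->handle_rx_cqe = use_striding ? handle_rx_striding : handle_rx_regular;
}

/* The poll loop calls through the pointer without knowing the queue type. */
static int rx_poll_one(struct rx_queue *rq, const void *cqe)
{
	assert(rq->handle_rx_cqe != NULL);
	return rq->handle_rx_cqe(rq, cqe);
}
```

With this layout a new receive mode only needs a new handler and a change in the setup function; the poll loop stays untouched.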
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f48af12

net/mlx5e: Use only close NUMA node for default RSS · d8c9660d

Tariq Toukan authored Apr 20, 2016

Distribute default RSS table uniformly over the rings of the
close NUMA node, instead of all available channels.
This way we enforce the preference of close rings over far ones.
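
The uniform distribution can be sketched as a round-robin fill of the indirection table. This assumes the rings of the close NUMA node are numbered 0..n_close_rings-1, which is a simplification made for the example; the function name and table size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fill the RSS indirection table so that only the first n_close_rings
 * rings (assumed to be the ones on the close NUMA node) receive traffic. */
static void fill_default_rss(unsigned int *indir, size_t size,
			     unsigned int n_close_rings)
{
	assert(n_close_rings > 0);
	for (size_t i = 0; i < size; i++)
		indir[i] = (unsigned int)(i % n_close_rings);
}
```

With 3 close rings and an 8-entry table, the entries become 0, 1, 2, 0, 1, 2, 0, 1, so no entry points at a far ring.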
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d8c9660d

net/mlx5e: Allocate set of queue counters per netdev · 593cf338

Rana Shahout authored Apr 20, 2016

Connect all netdev RQs to this set of queue counters.
Also, add an "rx_out_of_buffer" counter to ethtool,
which indicates RX packet drops due to lack of receive
buffers.
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

593cf338

net/mlx5: Introduce device queue counters · 237cd218

Tariq Toukan authored Apr 20, 2016

A queue counter can collect several statistics for one or more
hardware queues (QPs, RQs, etc.) that the counter is attached to.

For Ethernet it will provide an "out of buffer" counter which
collects the number of all packets that are dropped due to lack
of software buffers.

Here we add device commands to alloc/query/dealloc queue counters.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

237cd218

Merge branch 'bcmsysport-napi-updates' · b8fd789a

David S. Miller authored Apr 21, 2016

Florian Fainelli says:

====================
net: bcmsysport: utilize newer NAPI APIs

These two patches are very analogous to what was already submitted for
BCMGENET and switch the SYSTEMPORT driver to using __napi_schedule_irqoff()
and napi_complete_done() for the RX NAPI context.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b8fd789a

net: bcmsysport: use napi_complete_done() · c82f47ef

Florian Fainelli authored Apr 20, 2016

By using napi_complete_done(), we allow fine tuning of
/sys/class/net/ethX/gro_flush_timeout for higher GRO aggregation
efficiency for a Gbit NIC.

Check commit 24d2e4a5 ("tg3: use napi_complete_done()") for details.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c82f47ef

net: bcmsysport: use __napi_schedule_irqoff() · ba90950c

Florian Fainelli authored Apr 20, 2016

Both bcm_sysport_tx_isr() and bcm_sysport_rx_isr() run in hard irq
context, so we do not need to block irqs again.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba90950c

Merge branch 'nlattr_align' · c57107c7

David S. Miller authored Apr 21, 2016

Nicolas Dichtel says:

====================
libnl: enhance API to ease 64bit alignment for attribute

Here is a proposal to add more helpers in the libnetlink to manage 64-bit
alignment issues.
Note that this series was only tested on x86 by tweaking
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS and adding some traces.

The first patch adds helpers for 64bit alignment and other patches
use them.
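
Why padding is needed can be shown with offsets alone. A netlink attribute header is 4 bytes, so if the write position is already 8-byte aligned, the payload of the next attribute lands at offset +4 and a 64-bit value is misaligned. The sketch below models this with plain offsets, assuming the buffer itself starts on an 8-byte boundary; the function names are hypothetical, loosely following nla_need_padding_for_64bit() mentioned in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ATTR_HDRLEN 4u	/* attribute header: 2-byte length + 2-byte type */

/* True when an empty padding attribute must be emitted first. */
static bool need_padding_for_64bit(size_t tail_offset)
{
	assert(tail_offset % 4 == 0);	/* attributes are 4-byte aligned */
	/* Aligned tail + 4-byte header = misaligned 64-bit payload. */
	return tail_offset % 8 == 0;
}

/* Offset at which the 64-bit payload starts after optional padding. */
static size_t payload_offset_64bit(size_t tail_offset)
{
	if (need_padding_for_64bit(tail_offset))
		tail_offset += ATTR_HDRLEN;	/* header-only padding attribute */
	return tail_offset + ATTR_HDRLEN;	/* header of the real attribute */
}
```

In both cases (tail at an offset of 0 or 4 modulo 8) the payload ends up on an 8-byte boundary. In the kernel this only matters on architectures without efficient unaligned access, which is why the series was tested by changing CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.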

We could also add helpers for nla_put_u64() and its variants if needed.

v1 -> v2:
 - remove patch #1
 - split patch #2 (now #1 and #2)
 - add nla_need_padding_for_64bit()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c57107c7

ip6mr: align RTA_MFC_STATS on 64-bit · 3d6b66c1

Nicolas Dichtel authored Apr 21, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3d6b66c1

ipmr: align RTA_MFC_STATS on 64-bit · a9a08042

Nicolas Dichtel authored Apr 21, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a9a08042

rtnl: use the new API to align IFLA_STATS* · 58414d32

Nicolas Dichtel authored Apr 21, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

58414d32

libnl: add more helpers to align attributes on 64-bit · 089bf1a6

Nicolas Dichtel authored Apr 21, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

089bf1a6

veth: Update features to include all tunnel GSO types · 732912d7

Alexander Duyck authored Apr 19, 2016

This patch adds support for the checksum enabled versions of UDP and GRE
tunnels. With this change we should be able to send and receive GSO frames
of these types over the veth pair without needing to segment the packets.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

732912d7

netdev_features: Fold NETIF_F_ALL_TSO into NETIF_F_GSO_SOFTWARE · b1c20f0b

Alexander Duyck authored Apr 19, 2016

This patch folds NETIF_F_ALL_TSO into the bitmask for NETIF_F_GSO_SOFTWARE.
The idea is to avoid duplication of defines since the only difference
between the two was the GSO_UDP bit.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1c20f0b

geneve: testing the wrong variable in geneve6_build_skb() · 1ba64fac

Dan Carpenter authored Apr 19, 2016

We intended to test "err" and not "skb".

Fixes: aed069df ('ip_tunnel_core: iptunnel_handle_offloads returns int and doesn't free skb')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ba64fac

NLA_BINARY misuse bug in HSR · f9375729

Peter Heise authored Apr 19, 2016

Removed .type field from NLA to do proper length checking.
Reported by Daniel Borkmann and Julia Lawall.
Signed-off-by: Peter Heise <peter.heise@airbus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f9375729

net: use jiffies_to_msecs to replace EXPIRES_IN_MS in inet/sctp_diag · b7de529c

Xin Long authored Apr 19, 2016

The EXPIRES_IN_MS macro comes from net/ipv4/inet_diag.c and dates
back to before jiffies_to_msecs() was introduced.

Now we can remove it and use jiffies_to_msecs().
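
As a rough illustration of the conversion involved, the sketch below turns a timer expiry expressed in ticks into milliseconds. It is a simplified userspace model, not the kernel implementation: SKETCH_HZ is an assumed tick rate that divides 1000 evenly, and both function names are hypothetical.

```c
#include <assert.h>

#define SKETCH_HZ 250UL	/* assumed tick rate; real kernels configure this */

/* Simplified conversion, valid only when SKETCH_HZ divides 1000. */
static unsigned long sketch_jiffies_to_msecs(unsigned long j)
{
	return (1000UL / SKETCH_HZ) * j;
}

/* Milliseconds left until a timer expires, given the current tick count. */
static unsigned long expires_in_ms(unsigned long expires, unsigned long now)
{
	assert(expires >= now);
	return sketch_jiffies_to_msecs(expires - now);
}
```

At the assumed 250 ticks per second each tick is 4 ms, so a timer 100 ticks in the future expires in 400 ms.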
Suggested-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b7de529c

perf, bpf: minimize the size of perf_trace_() tracepoint handler · 85b67bcb

Alexei Starovoitov authored Apr 18, 2016

move trace_call_bpf() into helper function to minimize the size
of perf_trace_*() tracepoint handlers.
    text     data      bss       dec      hex  filename
10541679  5526646  2945024  19013349  1221ee5  vmlinux_before
10509422  5526646  2945024  18981092  121a0e4  vmlinux_after

It may seem that perf_fetch_caller_regs() can also be moved,
but that is incorrect, since ip/sp will be wrong.

bpf+tracepoint performance is not affected, since
perf_swevent_put_recursion_context() is now inlined.
export_symbol_gpl can also be dropped.

No measurable change in normal perf tracepoints.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

85b67bcb

net: dsa: remove tag_protocol from dsa_switch · c60c9840

Vivien Didelot authored Apr 18, 2016

Having the tag protocol in dsa_switch_driver for setup time and in
dsa_switch_tree for runtime is enough. Remove dsa_switch's one.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c60c9840

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 2dabf0c4

David S. Miller authored Apr 21, 2016

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2016-04-20

This series contains updates to fm10k only.

Jacob provides the majority of the changes in this series, starting with the
addition of helper functions to reduce code duplication and the amount
of code indentation.  Fixed the use, or should we say abuse, of the ethtool
stats API, which could result in corrupt memory or misleading statistics
output.  Added the appropriate rtnl_lock() and rtnl_unlock() calls to avoid
RCU warnings during AER events.  As it turns out, the PTP/1588 support
is not working with the current version of the switch management software
and possibly never worked, so support for PTP/1588 is removed for now.
Fixed how error responses from the switch manager after an LPORT_MAP
request are handled; originally they were silently ignored.
Fixed up code documentation to hopefully ease comprehension of the code
and comments.  Fixed a possible NULL pointer dereference after a
kcalloc(): when writing a new default redirection table, we
needed to populate a new RSS table using ethtool_rxfh_indir_default().
We populated this table into a region of memory allocated using kcalloc()
but never checked it for NULL.

Alex adds support for bulk transmit cleanup for fm10k, like he did for
all of our other drivers.

Ngai-Mint fixes a number of issues with the unicast and multicast address
syncs, where an issue would occur when the netdev is pre-configured to
either multicast mode and is enabled for the first time.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

2dabf0c4

fm10k: fix incorrect IPv6 extended header checksum · dc1b4c2b

Jacob Keller authored Apr 07, 2016

Check for and handle IPv6 extended headers so that Tx checksum offload
can be done. Also use skb_checksum_help for unexpected cases. This was
originally discovered in ixgbe.
Reported-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

dc1b4c2b

fm10k: consistently use Intel(R) for driver names · 86641094

Jacob Keller authored Apr 07, 2016

Update every header file and other locations to consistently use
Intel(R) instead of just Intel. Also update copyright year of files
which we modified.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

86641094

fm10k: fix possible null pointer deref after kcalloc · 540a5d85

Jacob Keller authored Apr 07, 2016

When writing a new default redirection table, we needed to populate
a new RSS table using ethtool_rxfh_indir_default. We populated this
table into a region of memory allocated using kcalloc, but never checked
this for NULL. Fix this by moving the default table generation into
fm10k_write_reta. If this function is passed a table, use it. Otherwise,
generate the default table using ethtool_rxfh_indir_default, 4 at a
time.
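
The shape of the fix can be sketched as follows: the table writer accepts an optional caller-supplied table and otherwise computes the default entries on the fly, four per 32-bit register, so no temporary allocation (and no NULL check) is needed. This is a userspace model; RETA_REGS, the 8-bit entry packing, and the function names are assumptions made for the example. indir_default() uses the same index-modulo-ring-count rule as the kernel's ethtool_rxfh_indir_default().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RETA_REGS 32u	/* assumed number of 32-bit redirection registers */

/* Default spreading rule: entry index modulo the number of RX rings. */
static uint32_t indir_default(uint32_t index, uint32_t n_rx_rings)
{
	return index % n_rx_rings;
}

/* Write the redirection table into regs[]. If indir is NULL, generate
 * the default entries four at a time instead of reading a table. */
static void write_reta(uint32_t regs[RETA_REGS], const uint32_t *indir,
		       uint32_t n_rx_rings)
{
	assert(n_rx_rings > 0);

	for (uint32_t i = 0; i < RETA_REGS; i++) {
		uint32_t reta;

		if (indir) {
			reta = indir[i];	/* caller-provided register value */
		} else {
			uint32_t n = i * 4;	/* first entry index in this register */

			reta  = indir_default(n + 0, n_rx_rings);
			reta |= indir_default(n + 1, n_rx_rings) << 8;
			reta |= indir_default(n + 2, n_rx_rings) << 16;
			reta |= indir_default(n + 3, n_rx_rings) << 24;
		}
		regs[i] = reta;
	}
}
```

Passing NULL selects the generated defaults, which removes the temporary table and with it the unchecked allocation.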

Fixes: 0ea7fae4 ("fm10k: use ethtool_rxfh_indir_default for default redirection table")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

540a5d85

fm10k: Reset multicast mode when deleting lport · 11ec36a9

Ngai-Mint Kwan authored Apr 01, 2016

Deleting lport when multicast mode is configured to
FM10K_XCAST_MODE_ALLMULTI or FM10K_XCAST_MODE_PROMISC will result in
generating orphaned multicast-group entries in the switch manager.
Before deleting the lport, reset multicast mode to FM10K_XCAST_MODE_NONE
to flush out these multicast-group entries.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

11ec36a9

fm10k: update comment regarding reserved bits check · fb6515c8

Jacob Keller authored Apr 01, 2016

The original comment may be read incorrectly as referring to checking
that the *entire* length is zero. However, it checks only the reserved
bits of both length and reserved in a small amount of code. Update the
comment to indicate this is a clever trick and clearly spell out that it
only checks the reserved bits.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

fb6515c8

fm10k: use different name than FM10K_VLAN_CLEAR for override bit · 5c69df8a

Jacob Keller authored Apr 01, 2016

Use a new #define FM10K_VLAN_OVERRIDE even though we're using the exact
same bit. The reason for this is clarity in the code, otherwise you can
read FM10K_VLAN_CLEAR and think it should be removed. Also add a comment
explaining why the FM10K_VLAN_OVERRIDE bit is set.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

5c69df8a

fm10k: use 8bit notation instead of 10bit notation for diagram · d057d9a9

Jacob Keller authored Apr 01, 2016

The diagram represents bit layout of the multi-bit VLAN update message
format. Typically these diagrams are drawn using some power of 2 as the
base, to more easily grasp where fields split. Although the numbers
above can make it somewhat easy to understand which bit you're looking
at, it makes the break points not line up. Re-draw the numbers using
base 8, and mark the bit values every 8 bits at the top. This should
make it easier to grasp the table quickly.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

d057d9a9

fm10k: fix documentation of fm10k_tlv_parse_attr · 4e160f2a

Jacob Keller authored Apr 01, 2016

fm10k_tlv_parse_attr is supposed to return FM10K_NOT_IMPLEMENTED for any
TLV whose attribute id lies outside the range of results. It does not do
this today. In addition, the documentation does not indicate that other
attributes which are not implemented for a given TLV will be silently
ignored. Fix this. Clean up the logic so that we don't rely on the fact
that FM10K_NOT_IMPLEMENTED is greater than zero, as this can easily
cause confusion.

A future extension could look into some way of reporting unknown TLVs
in order to make issues more easily discoverable. We can't just return
FM10K_NOT_IMPLEMENTED here because we don't want to drop the entire
message if it has an unknown TLV.

While here, update the copyright year.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

4e160f2a