- 15 Nov, 2019 23 commits
-
-
Geetha sowjanya authored
Each of the NIX/NPA LFs can choose which ways of their respective NDC caches should be used to cache their contexts. This enables flexible configurations like disabling caching for a LF, limiting it's context to a certain set of ways etc etc. Separate way_mask for NIX-TX and NIX-RX is not supported. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sunil Goutham authored
Ingress packet replication support has been added to 96xx B0 silicon. This patch enables using that feature to replicate ingress broadcast packets to PF and it's VFs. Also fixed below issues - VFs can also install NPC MCAM entry to forward broadcast pkts. Otherwise, unless PF's interface is UP, VFs will not receive bcast packets. - NPC MCAM entry is disabled when PF and all it's VFs are down. - Few corner cases in installing multicast entry list. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sunil Goutham authored
CN96xx initial silicon doesn't support all features pertaining to NIX transmit scheduling and shaping. - It supports a fixed topology of 1:1 mapped transmit limiters at all levels. - Supports DWRR only at SMQ/MDQ and TL1. - Doesn't support shaping and coloring. This patch adds HW capability structure by which each variant and skew of silicon can be differentiated by their supported features. And adds support for A0 silicon's transmit scheduler capabilities or rather limitations. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kiran Kumar K authored
This patch adds support for few more RSS key types for flow key algorithm to compute rss hash index. Following flow key types have been added. - Tunnel types like NVGRE, VXLAN, GENEVE. - L2 offload type ETH_DMAC, Here we will consider only DMAC 6 bytes. - And extension header IPV6_EXT (1 byte followed by IPV6 header - Hashing inner protocol fields for inner DMAC, IPv4/v6, TCP, UDP, SCTP. Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nithin Dabilpuram authored
Writing into NPC MCAM1 and MCAM0 registers are suppressed if they happened to form a reserved combination. Hence clear and disable MCAM entries before update. For HRM: [CAM(1)]<n>=1, [CAM(0)]<n>=1: Reserved. The reserved combination is not allowed. Hardware suppresses any write to CAM(0) or CAM(1) that would result in the reserved combination for any CAM bit. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hao Zheng authored
Updated NPC KPU packet parsing profile with support for following - Fragmentation support for IPv4 IPv6 outer header - NIX instruction header support - QinQ with TPID of 0x8100 as non inner most vlan tag, as legacy network equipments still generate QinQ packets with this configuration. - To better support RSS for tunnelled packets, udp based tunnel protocols such as vxlan, vxlan-gpe, geneve and gtpu are now captured into a separate layer E. Consequently, the inner packet headers are pushed one layer down to LF, LG, and LH accordingly. - Support for rfc7510 mpls in udp. Up to 4 MPLS labels can be parsed and captured in one layer LE. - Parser support for DSA, extended DSA and eDSA tags right after ethernet header by Marvell SOHO and Falcon switches. For extended DSA and eDSA tags, a special PKIND of 62 is used, as these tags don't contain a tpid field. - Higig2 protocol header parsing support, added a NPC_LT_LA_HIGIG2_ETHER for a combined header of HIGIG2 and Ethernet. Add a NPC_LT_LA_IH_NIX_HIGIG2_ETHER for a combined header of nix_ih, HIGIG2 and Ethernet on egress side. Also added 2 upper flags in LA to indicate the presence of nix_ih and HIGIG2. Other changes include - IPv4.TTL==0 IPv6.HLIM==0 check - Per RFC 1858, mark fragment offset == 1 as error - TCP invalid flags check - Separate error codes for outer and inner IPv4 checksum errors. - Fix a parser error when KPU parses incoming IPSec ESP and AH packets - NPC vtag capture/strip hardware expect tag pointer to point to tpid/ethertype instead of tci. So move lb_ptr to point to tpid/ethertype. - Fix npc parser error when parsing udp packets that don't have any payload. - For a single MCAM entry to match on packets with one or stacked vlan tags combine NPC_LT_LB_STAG and NPC_LT_LB_QINQ to NPC_LT_LB_STAG_QINQ. - NVGRE to have a separate ltype LD_NVGRE instead of combined with LD_GRE. - Reserve top LD/LTYPEs to support custom KPU profile fields. Signed-off-by: Hao Zheng <haoz@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Subbaraya Sundeep authored
For every mailbox handler added to rvu, we are adding a function declaration in rvu header file. Cleaned this up by adding a macro to generate these declarations automatically. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geetha sowjanya authored
If mailbox client has a bounce buffer or a intermediate buffer where mbox messages are framed then copy them from there to HW buffer. If 'mbase' and 'hw_mbase' are not same then assume 'mbase' points to bounce buffer. This patch also adds msg_size field to mbox header to copy only valid data instead of whole buffer. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sunil Goutham authored
Added a new mailbox API which goes through all responses to check their IDs and response codes. Also added logic to prevent queuing multiple works to process the same mailbox message. This scenario happens when AF is processing a PF's request and menawhile PF sends ACK to AF sent UP message, then mbox_hdr->num_msgs in the PF->AF DOWN mbox region will be nonzero and AF will end up processing PF's request again. This is fixed by taking a backup of num_msgs counter and clearing the same in the mbox region before scheduling work. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sunil Goutham authored
Added support to display current NPC MCAM entries and counter's allocation status ín debugfs. cat /sys/kernel/debug/octeontx2/npc/mcam_info' will dump following info - MCAM Rx and Tx keysize - Total MCAM entries and counters - Current available count - Count of number of MCAM entries and counters allocated by a RVU PF/VF device. Also, one NPC MCAM counter (last one) is reserved and mapped to NPC RX_INTF's MISS_ACTION to count dropped packets due to no MCAM entry match. This pkt drop counter can be checked via debugfs. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linu Cherian authored
A CGX port is shared by a RVU PF and it's VFs. These per CGX port level NIX Rx/Tx counters are cumilative stats of all NIXLFs sharing this port. These stats when compared to CGX Rx/Tx stats helps in identifying pkts dropped within the system, if any. Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Prakash Brahmajyosyula authored
This patch adds CGX LMAC physical interface or serdes Rx/Tx packet stats to debugfs. 'cat cgx<idx>/lmac<idx>/stats' dumps the current interface link status and Rx/Tx stats. Stats include pkt received/transmitted, dropped, pause frames etc etc. Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Prakash Brahmajyosyula authored
NDC is a data cache unit which caches NPA and NIX block's aura/pool/RQ/SQ/CQ/etc contexts to reduce number of costly DRAM accesses. This patch adds support to dump cache's performance stats like cache line hit/miss counters, average cycles taken for accessing cached and non-cached data. This will help in checking if NPA/NIX context reads/writes are having NDC cache misses which inturn might effect performance. Also changed NDC enums to reflect correct NDC hardware instance. Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Prakash Brahmajyosyula authored
To aid in debugging NIX block related issues, added support to dump NIX block LF's RQ, SQ and CQ hardware contexts in debugfs. User can check which contexts are enabled currently and dump it's current HW context. Four new files 'qsize', 'rq_ctx', 'sq_ctx' and 'cq_ctx' are added to the debugfs at 'sys/kernel/debug/octeontx2/nix/' 'echo <nixlf index> > qsize' will display current enabled CQ/SQ/RQs. 'echo <nixlf> [rq number/all] > rq_ctx', 'echo <nixlf> [sq number/all] > sq_ctx' & 'echo <nixlf> [cq number/all] > cq_ctx' will dump RQ/SQ/CQ's current hardware context. Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christina Jacob authored
To aid in debugging NPA related issues, added support to dump NPA (pool allocator) block LF's aura and pool hardware contexts in debugfs. User can check which contexts are enabled currently and dump it's current HW context. Three new files 'qsize', 'aura_ctx', 'pool_ctx' are added to the debugfs at 'sys/kernel/debug/octeontx2/npa/' 'echo <npalf index> > qsize' will display current enabled Aura/Pools. 'echo <npalf> [aura number/all] > aura_ctx' & 'echo <npalf> [aura number/all] > pool_ctx' will dump Aura/Pool context info. Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christina Jacob authored
Added support to dump current resource provisioning status of all resource virtualization unit (RVU) block's (i.e NPA, NIX, SSO, SSOW, CPT, TIM) local functions attached to a PF_FUNC into a debugfs file. 'cat /sys/kernel/debug/octeontx2/rsrc_alloc' will show the current block LF's allocation status. Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Fix build_skb for bm capable devices when they fall-back using swbm path (e.g. when bm properties are configured in device tree but CONFIG_MVNETA_BM_ENABLE is not set). In this case rx_offset_correction is overwritten so we need to use it building skb instead of MVNETA_SKB_HEADROOM directly Fixes: 8dc9a088 ("net: mvneta: rely on build_skb in mvneta_rx_swbm poll routine") Fixes: 0db51da7 ("net: mvneta: add basic XDP support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reported-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== mlx5-updates-2019-11-12 1) Merge mlx5-next for devlink reload and flowtable offloads dependencies 2) Devlink reload support 3) TC Flowtable offloads 4) Misc cleanup ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use r8168d_modify_extpage() also in rtl8168f_config_eee_phy() to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cris Forno authored
Currently, when ibmveth receive a loopback packet, it reports an ambiguous error message "tx: h_send_logical_lan failed with rc=-4" because the hypervisor rejects those types of packets. This fix detects loopback packet and assures the source packet's MAC address matches the driver's MAC address before transmitting to the hypervisor. Signed-off-by: Cris Forno <cforno12@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Murphy authored
Add support for the TI DP83869 Gigabit ethernet phy device. The DP83869 is a robust, low power, fully featured Physical Layer transceiver with integrated PMD sublayers to support 10BASE-T, 100BASE-TX and 1000BASE-T Ethernet protocols. Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Murphy authored
Add dt bindings for the TI dp83869 Gigabit ethernet phy device. Signed-off-by: Dan Murphy <dmurphy@ti.com> CC: Rob Herring <robh+dt@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tonghao Zhang authored
When using the kernel datapath, the upcall don't include skb hash info relatived. That will introduce some problem, because the hash of skb is important in kernel stack. For example, VXLAN module uses it to select UDP src port. The tx queue selection may also use the hash in stack. Hash is computed in different ways. Hash is random for a TCP socket, and hash may be computed in hardware, or software stack. Recalculation hash is not easy. Hash of TCP socket is computed: tcp_v4_connect -> sk_set_txhash (is random) __tcp_transmit_skb -> skb_set_hash_from_sk There will be one upcall, without information of skb hash, to ovs-vswitchd, for the first packet of a TCP session. The rest packets will be processed in Open vSwitch modules, hash kept. If this tcp session is forward to VXLAN module, then the UDP src port of first tcp packet is different from rest packets. TCP packets may come from the host or dockers, to Open vSwitch. To fix it, we store the hash info to upcall, and restore hash when packets sent back. +---------------+ +-------------------------+ | Docker/VMs | | ovs-vswitchd | +----+----------+ +-+--------------------+--+ | ^ | | | | | | upcall v restore packet hash (not recalculate) | +-+--------------------+--+ | tap netdev | | vxlan module +---------------> +--> Open vSwitch ko +--> or internal type | | +-------------------------+ Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.htmlSigned-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 14 Nov, 2019 8 commits
-
-
David S. Miller authored
MarkLee says: ==================== Rework mt762x GDM setup flow The mt762x GDM block is mainly used to setup the HW internal rx path from GMAC to RX DMA engine(PDMA) and the packet switching engine(PSE) is responsed to do the data forward following the GDM configuration. This patch set have three goals : 1. Integrate GDM/PSE setup operations into single function "mtk_gdm_config" 2. Refine the timing of GDM/PSE setup, move it from mtk_hw_init to mtk_open 3. Enable GDM GDMA_DROP_ALL mode to drop all packet during the stop operation ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
MarkLee authored
Enable GDM GDMA_DROP_ALL mode to drop all packet during the stop operation. This is recommended by the mt762x HW design to drop all packet from GMAC before stopping PDMA. Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
MarkLee authored
Refine the timing of GDM/PSE setup, move it from mtk_hw_init to mtk_open. This is recommended by the mt762x HW design to do GDM/PSE setup only after PDMA has been started. We exclude mt7628 in mtk_gdm_config function since it is a old IP and there is no GDM/PSE block on it. Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
MarkLee authored
Integrate GDM/PSE setup operations into single function "mtk_gdm_config" Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
We don't really need 10k species of reset. Remove everything except cold reset which is what is actually used. Too bad the hardware designers couldn't agree to use the same bit field for rev 1 and rev 2, so the (*reset_cmd) function pointer is there to stay. However let's simplify the prototype and give it a struct dsa_switch (we want to avoid forward-declarations of structures, in this case struct sja1105_private, wherever we can). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Vladimir Oltean says: ==================== PTP clock source for SJA1105 tc-taprio offload This series makes the IEEE 802.1Qbv egress scheduler of the sja1105 switch use a time reference that is synchronized to the network. This enables quite a few real Time Sensitive Networking use cases, since in this mode the switch can offer its clients a TDMA sort of access to the network, and guaranteed latency for frames that are properly scheduled based on the common PTP time. The driver needs to do a 2-part activity: - Program the gate control list into the static config and upload it over SPI to the switch (already supported) - Write the activation time of the scheduler (base-time) into the PTPSCHTM register, and set the PTPSTRTSCH bit. - Monitor the activation of the scheduler at the planned time and its health. Ok, 3 parts. The time-aware scheduler cannot be programmed to activate at a time in the past, and there is some logic to avoid that. PTPCLKCORP is one of those "black magic" registers that just need to be written to the length of the cycle. There is a 40-line long comment in the second patch which explains why. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Tested using the following bash script and the tc from iproute2-next: #!/bin/bash set -e -u -o pipefail NSEC_PER_SEC="1000000000" gatemask() { local tc_list="$1" local mask=0 for tc in ${tc_list}; do mask=$((${mask} | (1 << ${tc}))) done printf "%02x" ${mask} } if ! systemctl is-active --quiet ptp4l; then echo "Please start the ptp4l service" exit fi now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }') # Phase-align the base time to the start of the next second. sec=$(echo "${now}" | gawk -F. '{ print $1; }') base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))" tc qdisc add dev swp5 parent root handle 100 taprio \ num_tc 8 \ map 0 1 2 3 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S $(gatemask 7) 100000 \ sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \ clockid CLOCK_TAI flags 2 The "state machine" is a workqueue invoked after each manipulation command on the PTP clock (reset, adjust time, set time, adjust frequency) which checks over the state of the time-aware scheduler. So it is not monitored periodically, only in reaction to a PTP command typically triggered from a userspace daemon (linuxptp). Otherwise there is no reason for things to go wrong. Now that the timecounter/cyclecounter has been replaced with hardware operations on the PTP clock, the TAS Kconfig now depends upon PTP and the standalone clocksource operating mode has been removed. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
The PTPSTRTSCH and PTPSTOPSCH bits are actually readable and indicate whether the time-aware scheduler is running or not. We will be using that for monitoring the scheduler in the next patch, so refactor the PTP command API in order to allow that. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 13 Nov, 2019 9 commits
-
-
Dan Carpenter authored
"ret" is zero or possibly uninitialized on this error path. It should be a negative error code instead. Fixes: 2d0cb84d ("cxgb4: add ETHOFLD hardware queue support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
irqreturn_t type is an enum and in this context it's unsigned, so "err" can't be irqreturn_t or it breaks the error handling. In fact the "err" variable is only used to store integers (never irqreturn_t) so it should be declared as int. I removed the initialization because it's not required. Using a bogus initializer turns off GCC's uninitialized variable warnings. Secondly, there is a GCC warning about unused assignments and we would like to enable that feature eventually so we have been trying to remove these unnecessary initializers. Fixes: 7b0c342f ("net: atlantic: code style cleanup") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Venkat Duvvuru authored
Fix the array overrun while keeping the eth_addr and eth_addr_mask pointers as u16 to avoid unaligned u16 access. These were overlooked when modifying the code to use u16 pointer for proper alignment. Fixes: 90f90624 ("bnxt_en: Add support for L2 rewrite") Reported-by: Olof Johansson <olof@lixom.net> Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paul Blakey authored
Since both tc rules and flow table rules are of the same format, we can re-use tc parsing for that, and move the flow table rules to their steering domain - In this case, the next chain after max tc chain. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Michael Guralnik authored
Implement devlink reload for mlx5. Usage example: devlink dev reload pci/0000:06:00.0 Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Michael Guralnik authored
Use devlink instance name space to set the netdev net namespace. Preparation patch for devlink reload implementation. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Eli Cohen authored
Be sure to release the neighbour in case of failures after successful route lookup. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Eli Cohen authored
Neighbour initializations to NULL are not necessary as the pointers are not used if an error is returned, and if success returned, pointers are initialized. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Parav Pandit authored
mlx5_device_disable_sriov() currently reads num_vfs from the PCI core. However when mlx5_device_disable_sriov() is executed, SR-IOV is already disabled at the PCI level. Due to this disable_hca() cleanup is not done during SR-IOV disable flow. mlx5_sriov_disable() pci_enable_sriov() mlx5_device_disable_sriov() <- num_vfs is zero here. When SR-IOV enablement fails during mlx5_sriov_enable(), HCA's are left in enabled stage because mlx5_device_disable_sriov() relies on num_vfs from PCI core. mlx5_sriov_enable() mlx5_device_enable_sriov() pci_enable_sriov() <- Fails Hence, to overcome above issues, (a) Read num_vfs before disabling SR-IOV and use it. (b) Use num_vfs given when enabling sriov in error unwinding path. Fixes: d886aba6 ("net/mlx5: Reduce dependency on enabled_vfs counter and num_vfs") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-