- 03 Jul, 2019 14 commits
-
-
Jason Gunthorpe authored
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Required for dependencies in the next patches. Resolved the conflicts: - esw_destroy_offloads_acl_tables() use the newer mlx5_esw_for_all_vports() version - esw_offloads_steering_init() drop the cap test - esw_offloads_init() drop the extra function arguments * branch 'mlx5-next': (39 commits) net/mlx5: Expose device definitions for object events net/mlx5: Report EQE data upon CQ completion net/mlx5: Report a CQ error event only when a handler was set net/mlx5: mlx5_core_create_cq() enhancements net/mlx5: Expose the API to register for ANY event net/mlx5: Use event mask based on device capabilities net/mlx5: Fix mlx5_core_destroy_cq() error flow net/mlx5: E-Switch, Handle UC address change in switchdev mode net/mlx5: E-Switch, Consider host PF for inline mode and vlan pop net/mlx5: E-Switch, Use iterator for vlan and min-inline setups net/mlx5: E-Switch, Reg/unreg function changed event at correct stage net/mlx5: E-Switch, Consolidate eswitch function number of VFs net/mlx5: E-Switch, Refactor eswitch SR-IOV interface net/mlx5: Handle host PF vport mac/guid for ECPF net/mlx5: E-Switch, Use correct flags when configuring vlan net/mlx5: Reduce dependency on enabled_vfs counter and num_vfs net/mlx5: Don't handle VF func change if host PF is disabled net/mlx5: Limit scope of mlx5_get_next_phys_dev() to PCI PF devices net/mlx5: Move pci status reg access mutex to mlx5_pci_init net/mlx5: Rename mlx5_pci_dev_type to mlx5_coredev_type ... Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Parav Pandit authored
Currently during dual port IB device registration in below code flow, ib_register_device() ib_device_register_sysfs() ib_setup_port_attrs() add_port() get_counter_table() get_perf_mad() process_mad() mlx5_ib_process_mad() mlx5_ib_process_mad() fails on 2nd port when both the ports are not fully setup at the device level (because 2nd port is unaffiliated). As a result, get_perf_mad() registers different PMA counter group for 1st and 2nd port, namely pma_counter_ext and pma_counter. However both ports have the same capability and counter offsets. Due to this when counters are read by the user via sysfs in below code flow, counters are queried from wrong location from the device mainly from PPCNT instead of VPORT counters. show_pma_counter() get_perf_mad() process_mad() mlx5_ib_process_mad() process_pma_cmd() This shows all zero counters for 2nd port. To overcome this, process_pma_cmd() is invoked, and when unaffiliated port is not yet setup during device registration phase, make the query on the first port. while at it, only process_pma_cmd() needs to work on the native port number and underlying mdev, so shift the get, put calls to where its needed inside process_pma_cmd(). Fixes: 212f2a87 ("IB/mlx5: Route MADs for dual port RoCE") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Yishai Hadas authored
Expose an extra device definitions for objects events. It includes: object_type values for legacy objects and generic data header for any other object. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Yishai Hadas authored
Report EQE data upon CQ completion to let upper layers use this data. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Yishai Hadas authored
Report a CQ error event only when a handler was set. This enables mlx5_ib to not set a handler upon CQ creation and use some other mechanism to get this event as of other events by the mlx5_eq_notifier_register API. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Yishai Hadas authored
Enhance mlx5_core_create_cq() to get the command out buffer from the callers to let them use the output. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Yishai Hadas authored
Expose the API to register for ANY event, mlx5_ib will be able to use this functionality for its needs. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Yishai Hadas authored
Use the reported device capabilities for the supported user events (i.e. affiliated and un-affiliated) to set the EQ mask. As the event mask can be up to 256 defined by 4 entries of u64 change the applicable code to work accordingly. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Yishai Hadas authored
The firmware command to destroy a CQ might fail when the object is referenced by other object and the ref count is managed by the firmware. To enable a second successful destruction post the first failure need to change mlx5_eq_del_cq() to be a void function. As an error in mlx5_eq_del_cq() is quite fatal from the option to recover, a debug message inside it should be good enougth and it was changed to be void. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
YueHaibing authored
Fixes gcc '-Wunused-but-set-variable' warning: drivers/infiniband/hw/hns/hns_roce_hw_v2.c: In function 'hns_roce_function_clear': drivers/infiniband/hw/hns/hns_roce_hw_v2.c:1135:7: warning: variable 'fclr_write_fail_flag' set but not used [-Wunused-but-set-variable] It is never used, so can be removed. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Liu, Changcheng authored
The API for ib_query_qp requires the driver to set qp_state and cur_qp_state on return, add the missing sets. Fixes: d3749841 ("i40iw: add files for iwarp interface") Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Fuqian Huang authored
Use kmemdump instead of kzmalloc + memcpy. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Fuqian Huang authored
vzalloc has already zeroed the memory. So a memset is unneeded. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Fuqian Huang authored
In commit af7ddd8a ("Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping"), dma_alloc_coherent/dmam_alloc_coherent always zeroed the returned memory. So the memset after a coherent allocation function is not needed. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 02 Jul, 2019 12 commits
-
-
Jason Gunthorpe authored
Bernard Metzler says: ==================== This patch set contributes the SoftiWarp driver rebased for latest rdma-next. SoftiWarp (siw) implements the iWarp RDMA protocol over kernel TCP sockets. The driver integrates with the linux-rdma framework. A matching userlevel driver is available as PR at https://github.com/linux-rdma/rdma-core/pull/536 Many thanks for reviewing and testing the driver, especially to Leon, Jason, Steve, Doug, Olga, Dennis, Gal. You all helped to significantly improve the driver over the last year. Please find below a list of changes and comments, compared to older versions of the siw driver. Many thanks! Bernard. CHANGES: ======== v3 (this version) ----------------- - Rebased to rdma-next - Removed unneccessary initialization of enums in siw-abi.h - Added comment on sizing of all work queues to power of two. v2 ----------------- - Changed recieve path CRC calculation to compute CRC32c not on target buffer after placement, but on original skbuf. This change severely hurts performance, if CRC is switched on, since skb must now be walked twice. It is planned to work on an extension to skb_copy_bits() to fold in CRC computation. - Moved debugging to using ibdev_dbg(). - Dropped detailed packet debug printing. - Removed siw_debug.[ch] files. - Removed resource tracking, code now relies on restrack of RDMA midlayer. Only object counting to enforce reported device limits is left in place. - Removed all nested switch-case statements. - Cleaned up header file #include's - Moved CQ create/destroy to new semantics, where midlayer creates/destroys containing object. - Set siw's ABI version to 1 (was 0 before) - Removed all enum initialization where not needed. - Fixed MAINTANERS entry for siw driver - This version stays with the current siw specific management of user memory (siw_umem_get() vs. ib_umem_get(), etc.). This, since the current ib_umem implementation is less efficient for user page lookup on the fast path, where effciency is important for a SW RDMA driver. It is planned to contribute enhancements to the ib_umem framework, wich makes it suitable for SW drivers as well. v1 (first version after v9 of siw RFC) -------------------------------------- - Rebased to 5.2-rc1 - All IDR code got removed. - Both MR and QP deallocation verbs now synchronously free the resources referenced by the RDMA mid-layer. - IPv6 support was added. - For compatibility with Chelsio iWarp hardware, the RX path was slightly reworked. It now allows packet intersection between tagged and untagged RDMAP operations. While not a defined behavior as of IETF RFC 5040/5041, some RDMA hardware may intersect an ongoing outbound (large) tagged message, such as an multisegment RDMA Read Response with sending an untagged message, such as an RDMA Send frame. This behavior was only detected in an NVMeF setup, where siw was used at target side, and RDMA hardware at client side (during file write). siw now implements two input paths for tagged and untagged messages each, and allows the intersected placement of both messages. - The siw kernel abi file got renamed from siw_user.h to siw-abi.h. ==================== * branch 'siw': SIW addition to kernel build environment SIW completion queue methods SIW receive path SIW transmit path SIW queue pair methods SIW application buffer management SIW application interface SIW connection management SIW network and RDMA core interface SIW main include file iWarp wire packet format
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Bernard Metzler authored
Broken up commit to add the Soft iWarp RDMA driver. Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 01 Jul, 2019 14 commits
-
-
Bodong Wang authored
When NVME device emulation mode is enabled, more than one PFs use the same physical port. In this case, MPFS is required to program L2 addresses. It used to rely on netdev set_rx_mode in switchdev mode, but driver later changed to not create netdev for eswitch manager once in switchdev mode. So, UC address event should be handled. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Bodong Wang authored
When ECPF is the eswitch manager, host PF is treated like other VFs. Driver should do the same for inline mode and vlan pop. Add new iterators to include host PF if ECPF is the eswitch manager. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Bodong Wang authored
Use the defined iterators to traversal VF reps/vport. Also, rely on num of VFs rather than the counter of enabled vports as PF will also be enabled from ECPF side, and the counter will be different from num of VFs. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Bodong Wang authored
When driver is doing eswitch mode change, it's critical to keep number of enabled VFs unchanged. However, it can be changed on the fly once function changed event is registered. To remove this uncertainty, function changed event should not be registered before all setups, and first be unregistered before all cleanups. Wrap this functionality together with vport event handler. Fixes: 61fc880839e6 ("net/mlx5: E-Switch, Handle representors creation in handler context") Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Bodong Wang authored
Enabled number of VFs is key for eswich manager to do flow steering initialization and vport configurations. However, the number of enabled VFs may come from two sources as below. PF: num of VFs is provided by enabled SR-IOV of itself. ECPF: num of VFs is provided by enabled SR-IOV from its peer PF. And SR-IOV can't be enabled from ECPF itself. Current driver handles the two cases in different stages and passing the number of enabled VFs among a large scope of internal functions. It is usually hard to find out where is the real number of VFs from due to layers of argument pass-in. This patch consolidated that number from the entry point of doing eswitch setup, and maintained a copy so that eswitch functions can refer to it directly. Eswitch driver shall always use this number when referring to enabled number of VFs, don't use other numbers such as from SR-IOV. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Bodong Wang authored
Devlink eswitch mode is not necessarily related to SR-IOV, e.g, ECPF can be at offload mode when SR-IOV is not enabled. Rename the interface and eswitch mode names to decouple from SR-IOV, and cleanup eswitch messages accordingly. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Bodong Wang authored
When ECPF is eswitch manager, it has the privilege to query and configure the mac and node guid of host PF. While vport number of host PF is 0, the vport command should be issued with other_vport set in this case as the cmd is issued by ECPF vport(0xfffe). Add a specific function to query own vport mac. Low level functions are used by vport manager to query/modify any vport mac and node guid. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Bodong Wang authored
Before the offending commit, vlan will be configured if either vlan or qos is set. After the change with new set flags, function callers should provide flags accordingly. Fixes: e33dfe31 ("net/mlx5: E-Switch, Allow fine tuning of eswitch vport push/pop vlan") Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Parav Pandit authored
While enabling SR-IOV, PCI core already checks that if SR-IOV is already enabled, it returns failure error code. Hence, remove such duplicate check from mlx5_core driver. While at it, make mlx5_device_disable_sriov() to perform cleanup of VFs in reverse order of mlx5_device_enable_sriov(). Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Bodong Wang authored
When ECPF eswitch manager is at offloads mode, it monitors functions changed event from host PF side and acts according to the number of VFs enabled/disabled. As ECPF and host PF work in two independent hosts, it's possible that host PF OS reboots but ECPF system is still kept on and continues monitoring events from host PF. When kernel from host PF side is booting, PCI iov driver does sriov_init and compute_max_vf_buses by iterating over all valid num of VFs. This triggers FLR and generates functions changed events, even though host PF HCA is not enabled at this time. However, ECPF is not aware of this information, and still handles these events as usual. ECPF system will see massive number of reps are created, but destroyed immediately once creation finished. To eliminate this noise, a bit is added to host parameter context to indicate host PF is disabled. ECPF will not handle the VF changed event if this bit is set. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Parav Pandit authored
As mlx5_get_next_phys_dev is used only for PCI PF devices use case, limit it to search only for PCI devices. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Parav Pandit authored
mlx5_pci_init() performs pci specific initialization of the mlx5_core_dev struct. Hence move pci_status_mutex to pci initialization routine mlx5_pci_init(). This allows reusing mlx5_mdev_init() to non PCI devices. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Huy Nguyen authored
Rename mlx5_pci_dev_type to mlx5_coredev_type to distinguish different mlx5 device types. mlx5_coredev_type represents mlx5_core_dev instance type. Hence keep mlx5_coredev_type in mlx5_core_dev structure. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Bodong Wang authored
When an IB rep is loaded, netdev for the same vport is saved for later reference. However, it's not cleaned up when doing unload. For ECPF, kernel crashes when driver is referring to the already removed netdev. Following steps lead to a shown call trace: 1. Create n VFs from host PF 2. Distroy the VFs 3. Run "rdma link" from ARM Call trace: mlx5_ib_get_netdev+0x9c/0xe8 [mlx5_ib] mlx5_query_port_roce+0x268/0x558 [mlx5_ib] mlx5_ib_rep_query_port+0x14/0x34 [mlx5_ib] ib_query_port+0x9c/0xfc [ib_core] fill_port_info+0x74/0x28c [ib_core] nldev_port_get_doit+0x1a8/0x1e8 [ib_core] rdma_nl_rcv_msg+0x16c/0x1c0 [ib_core] rdma_nl_rcv+0xe8/0x144 [ib_core] netlink_unicast+0x184/0x214 netlink_sendmsg+0x288/0x354 sock_sendmsg+0x18/0x2c __sys_sendto+0xbc/0x138 __arm64_sys_sendto+0x28/0x34 el0_svc_common+0xb0/0x100 el0_svc_handler+0x6c/0x84 el0_svc+0x8/0xc Cleanup the rep and netdev reference when unloading IB rep. Fixes: 26628e2d ("RDMA/mlx5: Move to single device multiport ports in switchdev mode") Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-