- 10 Aug, 2017 23 commits
-
-
Leon Romanovsky authored
Provide the ability to get device-specific and port-specific information. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
This patch implements the query interface to get all ports data for the specific device. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
Provide the ability to query a specific device. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
This patch adds the ability to return all available devices together with their properties. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
Add nldev init and exit flows to the RDMA/core. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
Introduce new defines to rdma_netlink.h, so the RDMA configuration tool will be able to communicate with the RDMA subsystem by using the shared defines.

The addition of a new client (NLDEV) revealed that we exposed the RDMA_NL_I40IW define by mistake. It is not backed by any RDMA netlink code so far, and it won't be exposed in the future either. So this patch reuses the value and deletes the old defines.

NLDEV operates on objects. The struct ib_device has two straightforward objects: the device itself and the ports of that device. This leads us to propose the following commands to work on those objects:

* RDMA_NLDEV_CMD_{GET,SET,NEW,DEL} - works on the ib_device itself
* RDMA_NLDEV_CMD_PORT_{GET,SET,NEW,DEL} - works on the ports of a specific ib_device

Those commands receive/return the device index (RDMA_NLDEV_ATTR_DEV_INDEX) and the port index (RDMA_NLDEV_ATTR_PORT_INDEX). For device object accesses, RDMA_NLDEV_ATTR_PORT_INDEX returns the maximum number of ports for the specific ib_device; for port accesses, it returns the actual port index. The port index starts from 1 to follow RDMA/core internal semantics and the knobs exposed through sysfs.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
Add Mellanox to the copyright header. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
The RDMA_NL_LS protocol does not actually dump anything; it sets data, so it should be handled by the doit callback. This patch converts RDMA_NL_LS to the doit callback, while preserving the IWCM and RDMA_CM flows through netlink_dump_start(). Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
Introduce intermediate variable to store access to fields of cb_table. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
The .doit callback is used by the netlink core to differentiate between get and set operations. The common convention is to use that callback for command operations (SET, ADD, etc.) and/or access without the NLM_F_DUMP flag. This commit adds the proper declaration and implementation to RDMA netlink. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
This patch adds a static device index, similar to the one already available in the netdev world (struct net->ifindex). In downstream patches, RDMA netlink will use this idx-to-ib_device conversion, so as part of this commit we expose the translation function to IB/core users. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Leon Romanovsky authored
The upcoming nldev needs to iterate over all IB devices in the system. To avoid exposing the ib_devices list outside devices.c, it is necessary to provide an iterator function. The current version is written explicitly for the nldev callback to avoid over-engineering at this stage, but it can be easily extended to other types. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
The RDMA netlink client infrastructure was removed and made obsolete. The old infrastructure defined struct ibnl_client_cbs. Now that all uses of this have been updated to the new infrastructure, rename the struct to be compliant with the current stack naming standards: struct rdma_nl_cbs. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
Reduce the ibnl_chk_listeners function to one line by removing an unneeded comparison. Rename the function to be consistent with the other functions in RDMA netlink. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
The pointer to netlink header was not used in the ibnl_multicast function, so let's remove it and simplify the function signature. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
Netlink message header is not needed for unicast reply, hence remove it. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
Reuse standard macros to cancel the netlink message in case of error. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
-
Leon Romanovsky authored
Add the ability to provide flags that control RDMA netlink callbacks, and convert addr.c and sa_query.c to be the first users of this infrastructure. This allows their CAP_NET_ADMIN checks to move into the netlink core. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>
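The idea of per-callback flags checked once in a central dispatcher can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `FLG_ADMIN_PERM`, `struct flg_op`, `flg_core_rcv` and the `caller_is_admin` parameter are hypothetical names invented for the sketch, not the kernel's identifiers, and the boolean stands in for a real capability check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: the caller must be privileged to run this callback. */
#define FLG_ADMIN_PERM (1u << 0)

/* One entry per operation: the handler plus the flags that guard it. */
struct flg_op {
    int (*doit)(int arg);
    unsigned int flags;
};

static int flg_get(int arg) { return arg; }
static int flg_set(int arg) { return arg * 2; }

static const struct flg_op flg_ops[] = {
    { .doit = flg_get, .flags = 0 },              /* open to everyone */
    { .doit = flg_set, .flags = FLG_ADMIN_PERM }, /* privileged only  */
};

/*
 * Central dispatcher: the permission check lives here, so individual
 * handlers no longer need to repeat it.
 * Returns -2 for an unknown operation, -1 when permission is denied,
 * otherwise the handler's result.
 */
static int flg_core_rcv(size_t op, int arg, bool caller_is_admin)
{
    if (op >= sizeof(flg_ops) / sizeof(flg_ops[0]))
        return -2;
    if ((flg_ops[op].flags & FLG_ADMIN_PERM) && !caller_is_admin)
        return -1;
    return flg_ops[op].doit(arg);
}
```

With this shape, adding a new privileged operation only requires setting the flag in its table entry.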
-
Leon Romanovsky authored
The iwcm exports functions which are not used outside of ib_core. This patch simply removes these EXPORT_SYMBOLS. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Chien Tin Tung <chien.tin.tung@intel.com>
-
Leon Romanovsky authored
The RDMA netlink implementation guarantees that the supplied client number is in the allowed range. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Chien Tin Tung <chien.tin.tung@intel.com>
-
Leon Romanovsky authored
The standard netlink_rcv_skb function skips messages without the NLM_F_REQUEST flag, while the SA netlink client issues them. In commit bc10ed7d ("IB/core: Add rdma netlink helper functions") a local function was introduced to allow such messages. This led to a double pass for every incoming message. In this patch, we unify that local implementation and the netlink_rcv_skb function, so there is no need for a double pass anymore. As a result, the combined function gained a stricter check for the NLM_F_REQUEST flag: messages without it are now allowed for the SA pathquery client only. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Leon Romanovsky authored
The owner field does not need to be set, because netlink is part of ib_core, which will be unloaded last, after all other modules are unloaded. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Leon Romanovsky authored
RDMA netlink has a complicated infrastructure for dynamically registering and de-registering netlink clients to the NETLINK_RDMA group. The complicated portion of this code is not widely used because 2 of the 3 current clients are statically compiled together with netlink.c. The infrastructure, therefore, is deemed overkill. Refactor the code to eliminate the dynamically added clients. Now all clients are pre-registered in a client array at compile time, and at run time they merely check-in with the infrastructure to pass their callback table for inclusion in the pre-sized client array. This also allows for future cleanups and removal of unneeded code in the iwcm* netlink handler. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Chien Tin Tung <chien.tin.tung@intel.com>
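The "pre-sized client array with run-time check-in" described above can be sketched in user-space C as follows. All names here (`enum cli_index`, `struct cli_cbs`, `cli_register`, `cli_dispatch`) are hypothetical and chosen for illustration; the sketch does not reproduce the kernel's actual structures or locking.

```c
#include <stddef.h>

/* Hypothetical client indices, fixed at compile time. */
enum cli_index { CLI_A, CLI_B, CLI_C, CLI_NUM_CLIENTS };

/* Callback table that each client hands over when it checks in. */
struct cli_cbs {
    int (*handle)(int op);
};

/* Pre-sized table: one slot per known client, empty until check-in. */
static const struct cli_cbs *cli_table[CLI_NUM_CLIENTS];

/* Check-in: store the callback table in the client's fixed slot. */
static int cli_register(unsigned int index, const struct cli_cbs *cbs)
{
    if (index >= CLI_NUM_CLIENTS || cli_table[index])
        return -1;
    cli_table[index] = cbs;
    return 0;
}

/* Check-out: clear the slot again. */
static void cli_unregister(unsigned int index)
{
    if (index < CLI_NUM_CLIENTS)
        cli_table[index] = NULL;
}

/* Dispatch: reject unknown clients and clients that have not checked in. */
static int cli_dispatch(unsigned int index, int op)
{
    if (index >= CLI_NUM_CLIENTS || !cli_table[index])
        return -1;
    return cli_table[index]->handle(op);
}

/* Sample client used to exercise the table. */
static int cli_echo(int op) { return op + 100; }
static const struct cli_cbs cli_echo_cbs = { .handle = cli_echo };
```

Because the set of slots is known at compile time, there is no list to walk and nothing to allocate when a client checks in.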
-
- 09 Aug, 2017 1 commit
-
-
Ismail, Mustafa authored
Add a wait/retry version of ibnl_unicast, ibnl_unicast_wait, and modify ibnl_unicast to not wait/retry. This eliminates the undesirable wait for future users of ibnl_unicast. Change Portmapper calls originating from kernel to user-space to use ibnl_unicast_wait and take advantage of the wait/retry logic in netlink_unicast. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
-
- 31 Jul, 2017 16 commits
-
-
Sebastian Sanchez authored
Always initiate an offline transition request when a link down occurs. The firmware will use this request to confirm that the driver has seen the link down message. A host version is set to indicate this driver behavior to the firmware. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Sebastian Sanchez authored
When link interrupts occur, multiple link down requests could be queued up when only one is needed. This could get the hfi1 out of sync with its link partner during LNI. Only allow one link down request to be queued at any one time. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
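One way to allow only a single pending request is a "queued" flag that is claimed atomically before queuing and released by the handler. The following user-space C11 sketch shows that pattern only; `struct ld_port`, `ld_queue_link_down` and `ld_handle_link_down` are hypothetical names, and the commit itself may use a different mechanism (for example a flag protected by a lock).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-port state for the sketch. */
struct ld_port {
    atomic_bool link_down_queued; /* true while a request is pending */
    int queued_count;             /* requests actually queued (demo only,
                                     not thread-safe) */
};

/*
 * Called when a link down interrupt arrives.
 * Returns true if a request was queued, false if one was already pending.
 */
static bool ld_queue_link_down(struct ld_port *p)
{
    bool expected = false;

    /* Only the caller that flips false -> true gets to queue. */
    if (!atomic_compare_exchange_strong(&p->link_down_queued,
                                        &expected, true))
        return false;
    p->queued_count++;
    return true;
}

/* Called by the worker once the request has been handled. */
static void ld_handle_link_down(struct ld_port *p)
{
    /* ... bring the link down here ... */
    atomic_store(&p->link_down_queued, false); /* re-arm */
}
```

Repeated interrupts that arrive before the worker runs collapse into the one pending request.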
-
Sebastian Sanchez authored
Currently, link down interrupts queue link entries on a workqueue intended for sending events only. Create a workqueue for queuing link events. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Mike Marciniszyn authored
The server side of qperf panics as follows:
[242446.336860] IP: report_bug+0x64/0x10
[242446.341031] PGD 1c0c067
[242446.341032] P4D 1c0c067
[242446.343951] PUD 1c0d063
[242446.346870] PMD 8587ea067
[242446.349788] PTE 800000083e14016
[242446.352901]
[242446.358352] Oops: 0003 [#1] SM
[242446.437919] CPU: 1 PID: 7442 Comm: irq/92-hfi1_0 k Not tainted 4.12.0-mam-asm #1
[242446.446365] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/201
[242446.458397] task: ffff8808392d2b80 task.stack: ffffc9000664000
[242446.465097] RIP: 0010:report_bug+0x64/0x10
[242446.469859] RSP: 0018:ffffc900066439c0 EFLAGS: 0001000
[242446.475784] RAX: ffffffffa06647e4 RBX: ffffffffa06461e1 RCX: 000000000000000
[242446.483840] RDX: 0000000000000907 RSI: ffffffffa0675040 RDI: ffffffffffff740
[242446.491897] RBP: ffffc900066439e0 R08: 0000000000000001 R09: 000000000000025
[242446.499953] R10: ffffffff81a253df R11: 0000000000000133 R12: ffffc90006643b3
[242446.508010] R13: ffffffffa065bbf0 R14: 00000000000001e5 R15: 000000000000000
[242446.516067] FS: 0000000000000000(0000) GS:ffff88085f640000(0000) knlGS:000000000000000
[242446.525191] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003
[242446.531698] CR2: ffffffffa06647ee CR3: 0000000001c09000 CR4: 00000000001406e
[242446.539756] Call Trace
[242446.542582] fixup_bug+0x2c/0x5
[242446.546277] do_trap+0x12b/0x18
[242446.549972] do_error_trap+0x89/0x11
[242446.554171] ? hfi1_copy_sge+0x271/0x2b0 [hfi1
[242446.559324] ? ttwu_do_wakeup+0x1e/0x14
[242446.563795] ? ttwu_do_activate+0x77/0x8
[242446.568363] do_invalid_op+0x20/0x3
[242446.572448] invalid_op+0x1e/0x3
[242446.576247] RIP: 0010:hfi1_copy_sge+0x271/0x2b0 [hfi1
[242446.582075] RSP: 0018:ffffc90006643be8 EFLAGS: 0001004
[242446.587999] RAX: 0000000000000000 RBX: ffff88083e0fa240 RCX: 000000000000000
[242446.596058] RDX: 0000000000000000 RSI: ffff880842508000 RDI: ffff88083e0fa24
[242446.604116] RBP: ffffc90006643c28 R08: 0000000000000000 R09: 000000000000000
[242446.612172] R10: ffffc90009473640 R11: 0000000000000133 R12: 000000000000000
[242446.620228] R13: 0000000000000000 R14: 0000000000002000 R15: ffff88084250800
[242446.628293] ? hfi1_copy_sge+0x1a1/0x2b0 [hfi1
[242446.633449] hfi1_rc_rcv+0x3da/0x1270 [hfi1
[242446.638312] ? sc_buffer_alloc+0x113/0x150 [hfi1
[242446.643662] hfi1_ib_rcv+0x1c9/0x2e0 [hfi1
[242446.648428] process_receive_ib+0x19a/0x270 [hfi1
[242446.653866] ? process_rcv_qp_work+0xd2/0x160 [hfi1
[242446.659505] handle_receive_interrupt_nodma_rtail+0x184/0x2e0 [hfi1
[242446.666693] ? irq_finalize_oneshot+0x100/0x10
[242446.671846] receive_context_thread+0x1b/0x140 [hfi1
[242446.677576] irq_thread_fn+0x1e/0x4
[242446.681659] irq_thread+0x13c/0x1b
[242446.685646] ? irq_forced_thread_fn+0x60/0x6
[242446.690604] kthread+0x112/0x15
[242446.694298] ? irq_thread_check_affinity+0xe0/0xe
[242446.699738] ? kthread_park+0x60/0x6
[242446.703919] ? do_syscall_64+0x67/0x15
[242446.708292] ret_from_fork+0x25/0x3
[242446.712374] Code: 63 78 04 44 0f b7 70 08 41 89 d0 4c 8d 2c 38 41 83 e0 01 f6 c2 02 74 17 66 45 85 c0 74 11 f6 c2 04 b9 01 00 00 00 75 bb 83 ca 04 <66> 89 50 0a 66 45 85 c0 74 52 0f b6 48 0b 41 0f b7 f6 4d 89 e0
[242446.733527] RIP: report_bug+0x64/0x100 RSP: ffffc900066439c
[242446.739935] CR2: ffffffffa06647e
[242446.743763] ---[ end trace 0e90a20d0aa494f7 ]--
The root cause is that the qib/hfi1 post receive call to rvt_lkey_ok() doesn't interpret the new return value from rvt_lkey_ok() properly, leading to an MR reference count underrun.
Additionally, remove an unused argument in rvt_sge_adjacent() as well as an unneeded incr local in rvt_post_one_wr(). Fixes: Commit 14fe13fc ("IB/rdmavt: Compress adjacent SGEs in rvt_lkey_ok()") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Jan Sokolowski authored
The error messages when checksum validation of the platform configuration fields populated into the ASIC scratch registers fails are ambiguous. Disambiguate them. Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Michael J. Ruhl authored
The allocate_ctxt() function adds the context to the fd data structure. Since the context is not completely initialized, this can cause confusion as to whether the context is valid or not. Move the fd reference from allocate_ctxt() to setup_base_ctxt(). Update the necessary functions to be aware of this move. Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Jan Sokolowski authored
Fix issue where a disabled port can be enabled by inserting a cable. The port should be explicitly enabled instead. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Alex Estrin authored
There is a window that allows other threads to read 'host_link_state' as the new state before the actual hardware state is set. This patch closes the window by indicating a new state only after the hardware transition is complete. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Michael J. Ruhl authored
A copy_to_user() call assumes that two members of a data structure are sequential. Since this may not always be true, separate the copies to ensure a safe copy. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
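The hazard can be illustrated in user-space C, with `memcpy` standing in for `copy_to_user()`. The struct and function names below are hypothetical; the point is only that copying each member separately does not depend on the two members being adjacent in memory, whereas a single copy spanning both would also pick up any padding the compiler inserts between them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: compilers typically insert padding after 'kind'. */
struct cpy_info {
    char kind;
    int  value;
};

/*
 * Copy the two members into a packed destination buffer, one copy per
 * member. A single memcpy of sizeof(kind) + sizeof(value) bytes starting
 * at &src->kind would be wrong whenever the members are not adjacent.
 * Returns the number of bytes written.
 */
static size_t cpy_members(unsigned char *dst, const struct cpy_info *src)
{
    size_t off = 0;

    memcpy(dst + off, &src->kind, sizeof(src->kind));
    off += sizeof(src->kind);
    memcpy(dst + off, &src->value, sizeof(src->value));
    off += sizeof(src->value);
    return off;
}
```

Comparing `offsetof(struct cpy_info, value)` with `sizeof(char)` on a given platform shows whether padding is present there.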
-
Alex Estrin authored
There is a window where the FM can read the buffer control table and decide not to program buffers. When a port goes down, the code clears the table and if it is not programmed, posted SDMA descriptors will never complete due to no buffer credits. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Bartlomiej Dudek authored
During PCIe initialization, some register values from PCI config space are saved in order to restore them later (e.g. after reset). Restoring those values is done by a function called restore_pci_variables, while saving them is done directly in the function hfi1_pcie_ddinit. Move saving the values to a separate function that mirrors the restore functionality. Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Bartlomiej Dudek <bartlomiej.dudek@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Byczkowski, Jakub authored
Loading debug signed firmware fails if started immediately after a failed attempt to load production firmware. A short delay is required, so add a delay of about 100us after an RSA check failure. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Jan Sokolowski authored
Code structure is not consistent for if/else blocks and break instructions in set_link_state for case HLS_UP_INIT. Physical state uses break in case of an error and if/else blocks for logical use cases. These blocks should be implemented consistently. Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Michael J. Ruhl authored
A trap should be sent to the FM until the FM sends a repress message. This is in line with the IBTA 13.4.9. Add the ability to resend traps until a repress message is received. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael N. Henry <michael.n.henry@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Michael J. Ruhl authored
The hfi1_rcvctrl() function receives an index which it then converts to an rcd. Since most functions have the rcd, use that instead. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Michael J. Ruhl authored
The hfi1_<set|clear>_ctxt_<j|p>key functions take a context index and look up the context based on that index. Since the context index is being retrieved from the context, this doesn't seem optimal. Pass the context pointer for use, rather than the context index. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-