- 21 May, 2012 2 commits
-
-
Roland Dreier authored
Merge branches 'core', 'cxgb4', 'ipath', 'iser', 'lockdep', 'mlx4', 'nes', 'ocrdma', 'qib' and 'raw-qp' into for-linus
-
Vipul Pandya authored
Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
- 19 May, 2012 4 commits
-
-
Jack Morgenstein authored
We need to use a different loop index for mlx4_counter_alloc() and for device_create_file() iterations: the mlx4_counter_alloc() loop index is used in the error flow to free counters. If the same loop index is used for device_create_file() and, say, the device_create_file() loop fails on the first iteration, the allocated counters will not be freed. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Jack Morgenstein authored
It needs parentheses around the argument, so that it can be used with complex arguments (e.g., "n+5"). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Or Gerlitz authored
The current error flow code was releasing the IB connection object and calling iscsi_destroy_endpoint() directly without going through the reference counting mechanism introduced in commit 39ff05db ("IB/iser: Enhance disconnection logic for multi-pathing"). This resulted in a double free of the iscsi endpoint object, which causes a kernel NULL pointer dereference. Fix that by plugging into the IB conn reference counting correctly. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Shlomo Pongratz authored
Enable IB ULPs to use a larger portion of the device EQs (which map to IRQs). The mlx4_ib driver follows the mlx4_core framework of the EQs to be divided among the device ports. In this scheme, for each IB port, the number of allocated EQs follows the number of cores, subject to other system constraints, such as number available MSI-X vectors. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
- 18 May, 2012 10 commits
-
-
Vipul Pandya authored
This allows querying the QP state before flushing. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Vipul Pandya authored
Using kfifos for ID management was limiting the number of QPs and preventing NP384 MPI jobs. So replace it with a simple bitmap allocator. Remove IDs from the IDR tables before deallocating them. This bug was causing the BUG_ON() in insert_handle() to fire because the ID was getting reused before being removed from the IDR table. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Vipul Pandya authored
This allows dumping thousands of QPs. Log active open failures of interest. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Vipul Pandya authored
Add module option db_fc_threshold which is the count of active QPs that trigger automatic db flow control mode. Automatically transition to/from flow control mode when the active qp count crosses db_fc_theshold. Add more db debugfs stats On DB DROP event from the LLD, recover all the iwarp queues. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Vipul Pandya authored
Use GFP_ATOMIC in _insert_handle() if ints are disabled. Don't panic if we get an abort with no endpoint found. Just log a warning. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Vipul Pandya authored
Get FULL/EMPTY/DROP events from LLD. On FULL event, disable normal user mode DB rings. Add modify_qp semantics to allow user processes to call into the kernel to ring doobells without overflowing. Add DB Full/Empty/Drop stats. Mark queues when created indicating the doorbell state. If we're in the middle of db overflow avoidance, then newly created queues should start out in this mode. Bump the C4IW_UVERBS_ABI_VERSION to 2 so the user mode library can know if the driver supports the kernel mode db ringing. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Vipul Pandya authored
Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Vipul Pandya authored
recover LLD EQs for DB drop interrupts. This includes adding a new db_lock, a spin lock disabling BH too, used by the recovery thread and the ring_tx_db() paths to allow db drop recovery. Clean up initial DB avoidance code. Add read_eq_indices() - this allows the LLD to use the PCIe mw to efficiently read hw eq contexts. Add cxgb4_sync_txq_pidx() - called by iw_cxgb4 to sync up the sw/hw pidx value. Add flush_eq_cache() and cxgb4_flush_eq_cache(). This allows iw_cxgb4 to flush the sge eq context cache before beginning db drop recovery. Add module parameter, dbfoifo_int_thresh, to allow tuning the db interrupt threshold value. Add dbfifo_int_thresh to cxgb4_lld_info so iw_cxgb4 knows the threshold. Add module parameter, dbfoifo_drain_delay, to allow tuning the amount of time delay between DB FULL and EMPTY upcalls to iw_cxgb4. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Vipul Pandya authored
Add platform-specific callback functions for interrupts. This is needed to do a single read-clear of the CAUSE register and then call out to platform specific functions for DB threshold interrupts and DB drop interrupts. Add t4_mem_win_read_len() - mem-window reads for arbitrary lengths. This is used to read the CIDX/PIDX values from EC contexts during DB drop recovery. Add t4_fwaddrspace_write() - sends addrspace write cmds to the fw. Needed to flush the sge eq context cache. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Vipul Pandya authored
Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
- 15 May, 2012 2 commits
-
-
Steve Wise authored
Log a warning and drop the abort message. Otherwise we will do a bogus wake_up() and crash. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Steve Wise authored
This fixes a race where an ingress abort fails to wake up the thread blocked in rdma_init() causing the app to hang. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
- 14 May, 2012 12 commits
-
-
Jack Morgenstein authored
Under most circumstances, the bitmap allocator does not allocate the same full 24-bit QP number immediately after a QP is destroyed. This works by using the upper bits of a 24-bit QP number, beyond the number of QPs that are actually available in the low level driver. For example, say that the HCA is willing to allocate a maximum of 64K qps. We use the bits 23..16 as a "counter" which is incremented by 1 at each allocation so that even if the same physical QP is re-allocated, it will not receive the same 24-bit QP number. However, we have seen the following scenario: 1. Allocate, say, 255 QPs in succession. This will cause a wrap of the "counter". 2. Destroy the first QP allocated, then allocate a new QP. The new QP, because of the counter wraparound, will get the same FULL QP number as the QP just destroyed! This is a problem because packets in transit can be erroneously delivered to the new QP when they were meant for the old (destroyed) QP, because the full QP number of the new QP is identical to the destroyed QP. (The "counter" mechanism is meant to prevent this by having the full 24-bit QP numbers differ even if the physical QP on the HCA is the same. As we see above, however, this mechanism does not always work). The best fix for this problem is to allocate QPs in round-robin mode, so that the physical QP numbers are not immediately re-used. Found-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Tatyana Nikolova authored
Don't call the ibqp event_handler pointer in the case it wasn't initialized. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Donald Wood <Donald.E.Wood@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Tatyana Nikolova authored
Set ORD value of the connecting peer to be at least one in order to accommodate an RDMA READ Request message. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Donald Wood <Donald.E.Wood@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Mike Marciniszyn authored
This patch reorganizes the QP and devdata files to be more cache line aware. qib_qp fields in particular are split into read-mostly, send, and receive fields. qib_devdata fields are split into read-mostly and read/write fields Testing has show that bidirectional tests improve by as much as 100% with this patch. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Jim Foraker authored
If a MAD is sent directly to the local HCA rather than placed on a QP and the MAD fails M_Key checks, there is no means to generate a timeout for the client, which may hang. Instead we report IB_MAD_RESULT_FAILURE, which operates the same for on-the-wire packets, but will generate a send failure back to the client. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jim Foraker <foraker1@llnl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Jim Foraker authored
If a port has an M_Key lease set, the M_Key protect bits set to 1, and a SubnSet arrives with an invalid M_Key, an M_Key mismatch trap is generated, the lease timer begins as expected, and eventually the M_Key protect bits will be set back to 0 as per the spec. However, if any other SMP with an invalid M_Key arrives, the lease timer is expired and the M_Key protect bits remain in force. This is not according to to spec. In particular, C14-17 says that a lease timer that is underway is not affected by protection level checks (ie, at protection level 1, a SubnGet with a bad M_Key may be successful, but does not stop the timer), and C14-19 says that the timer shall stop when a valid M_Key has been received. C14-19 is the only compliance statement that specifies a stopping condition for the timer. This behavior is magnified if the port's Master SM LID attribute points at itself. In that case, the M_Key mismatch trap is sufficient to expire the timer, and the mkey lease attribute is rendered useless. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Signed-off-by: Jim Foraker <foraker1@llnl.gov> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Mitko Haralanov authored
The SERDES was using the incorrect Frequency Loop Bandwidth setting causing the link to cycle through the Physical link negotiation state machine. Fixing the Frequency Loop Bandwidth setting in the SERDES helps the link come up faster and more reliably. Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Mitko Haralanov authored
A "fix" for a bug with the number of contexts on a single-port board caused the calculation to be off by one, which causes problems with the upper layers. The same problem exists for number of free contexts, which is also fixed here. Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Todd Rimmer authored
When a port first goes active with SMA Set(PortInfo) and reregister bit set, the driver sends up the reregister event followed by a port active event. The problem is that in response to reregister event most apps try to issue a SA query of some sort, but that fails because port is not active. The qib driver needs to a trivial change to correct this behavior. This issue has been there for a while; however the recent serdes work has probably made the delay between the reregister event and the active event larger and hence opened the race far enough so that its being seen more often. The patch also changes the clientrereg local to a u8 and saves off the rereg bit into it. The code following the nested subn_get_portinfo() now restores that bit per o14-12.2.1 with a logical OR from that copy. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Mike Marciniszyn authored
This patch optimizes pio buffer allocation in the kernel. For qib, kernel pio buffers are used for sending acks. The code to allocate the buffer would always start at 0 until it found a buffer. This means that an average of 64 comparisions were done on each allocate, since the busy bit won't be cleared until the bits are refreshed when buffers are exhausted. This patch adds two new fields in the devdata struct, last_pio and min_kernel_pio. last_pio is the last buffer that was allocated. min_kernel_pio is the lowest potential available buffer. min_kernel_pio is modifed as contexts are allocated and deallocted. Reviewed-by: Ramkrishna Vepa <ramkrishna.vepa@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Mike Marciniszyn authored
Add a prefetch call when a packet has been stored. The nature of the prefetch is correctly determined by the alternatives mechanism. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Mike Marciniszyn authored
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
- 11 May, 2012 1 commit
-
-
Yishai Hadas authored
Commit bc3e53f6 ("mm: distinguish between mlocked and pinned pages") introduced a separate counter for pinned pages and used it in the IB stack. However, in ib_umem_get() the pinned counter is incremented, but ib_umem_release() wrongly decrements the locked counter. Fix this. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Christoph Lameter <cl@linux.com> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
- 08 May, 2012 9 commits
-
-
Shlomo Pongratz authored
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> [ Replace one more printk_once() with pr_info_once(). - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Shlomo Pongratz authored
This patch adds a 64-bit flags2 features member to struct mlx4_dev to export further features of the hardware. The original flags field tracks features whose support bits are advertised by the firmware in offsets 0x40 and 0x44 of the query device capabilities command. flags2 will track features whose support bits are scattered at various offsets. RSS support is the first feature to be exported through flags2. RSS capabilities are located at offset 0x2e. The size of the RSS indirection table is also given in this offset. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Oren Duer authored
Otherwise CM packets going over MLX QP1 get fixed scheduling priority 0. We want CM packets to get the same scheduling priority, and therefore map to the same SQ (Schedule Queue) and eventually TC (Traffic Class), as the application requested for the actual QP used for the connection. Signed-off-by: Oren Duer <oren@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Or Gerlitz authored
Implement raw packet QPs for Ethernet ports using the MLX transport (as done by the mlx4_en Ethernet netdevice driver). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Or Gerlitz authored
IB_QPT_RAW_PACKET allows applications to build a complete packet, including L2 headers, when sending; on the receive side, the HW will not strip any headers. This QP type is designed for userspace direct access to Ethernet; for example by applications that do TCP/IP themselves. Only processes with the NET_RAW capability are allowed to create raw packet QPs (the name "raw packet QP" is supposed to suggest an analogy to AF_PACKET / SOL_RAW sockets). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Roland Dreier authored
When IPV6 is not enabled: ERROR: "register_inet6addr_notifier" [drivers/infiniband/hw/ocrdma/ocrdma.ko] undefined! ERROR: "unregister_inet6addr_notifier" [drivers/infiniband/hw/ocrdma/ocrdma.ko] undefined! Fix this by wrapping the inet6 calls in #ifdef IPV6. Also make the ocrdma module depend on (IPV6 || IPV6=n) to forbid the case of modular ipv6 but built-in ocrdma (which can't work, because ocrdma calls ipv6 functions). Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Dan Carpenter authored
We only need to disable the IRQs one time. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> [ Rename "wq_flags" to more conventional "flags." - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Dan Carpenter authored
The ocrdma_alloc_lkey() function never returns NULL pointers -- it returns ERR_PTRs. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Sasha Levin authored
Events sent to ocrdma_inet6addr_event() are sent from an atomic context, therefore we can't try to lock a mutex within the notifier callback. We could just switch the mutex to a spinlock since all it does it protect a list, but I've gone ahead and switched the list to use RCU instead. I couldn't fully test it since I don't have IB hardware, so if it doesn't fully work for some reason let me know and I'll switch it back to using a spinlock. Signed-off-by: Sasha Levin <levinsasha928@gmail.com> [ Fixed locking in ocrdma_add(). - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
-