Commits · 0c86e280fe8a08d4ae30b77e46a1e7da28d756c9 · nexedi / linux

25 Jan, 2008 · 40 commits

IB/ehca: Remove CQ-QP-link before destroying QP in error path of create_qp() · 0c86e280

Hoang-Nam Nguyen authored Jan 17, 2008

Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/iser: Add change_queue_depth method · 6410627e

Erez Zilber authored Jan 17, 2008

Add a .change_queue_depth handler to the scsi_host_template in the
iSER driver.  iscsi_change_queue_depth was added to iscsi_tcp to
solve the problem of a queue depth that was too high for some
targets.  It is also applicable to iSER.
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/iser: Print information about unhandled RDMA CM events · a4ef1451

Erez Zilber authored Jan 17, 2008

Some RDMA CM events are not supported or not handled in iSER.
This patch adds some info (printk) for the user about them.
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/fmr_pool: ib_fmr_pool_flush() should flush all dirty FMRs · a3cd7d90

Olaf Kirch authored Jan 16, 2008

When an FMR is released via ib_fmr_pool_unmap(), the FMR usually ends
up on the free_list rather than the dirty_list (because we allow a
certain number of remappings before actually requiring a flush).

However, ib_fmr_batch_release() only looks at dirty_list when flushing
out old mappings.  This means that when ib_fmr_pool_flush() is used to
force a flush of the FMR pool, some dirty FMRs that have not reached
their maximum remap count will not actually be flushed.

Fix this by flushing all FMRs that have been used at least once in
ib_fmr_batch_release().
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

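
The difference between the old and the fixed flush behaviour described above can be sketched with a small user-space model. This is only an illustration, not the kernel code: `struct model_fmr` and both function names are hypothetical, and the two lists are reduced to a single flag per entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a pooled FMR (not the kernel struct). */
struct model_fmr {
    int remap_count;   /* mappings performed since the last flush */
    int on_dirty_list; /* 1 = on the dirty list, 0 = on the free list */
};

/* Old behaviour: only entries that already reached the dirty list are flushed. */
static size_t flush_dirty_only(struct model_fmr *fmrs, size_t n)
{
    size_t flushed = 0;

    assert(fmrs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (fmrs[i].on_dirty_list) {
            fmrs[i].remap_count = 0;
            fmrs[i].on_dirty_list = 0;
            flushed++;
        }
    }
    return flushed;
}

/* Fixed behaviour: flush every entry that has been used at least once. */
static size_t flush_all_used(struct model_fmr *fmrs, size_t n)
{
    size_t flushed = 0;

    assert(fmrs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (fmrs[i].on_dirty_list || fmrs[i].remap_count > 0) {
            fmrs[i].remap_count = 0;
            fmrs[i].on_dirty_list = 0;
            flushed++;
        }
    }
    return flushed;
}
```

With one unused entry, one entry remapped twice but still on the free list, and one entry on the dirty list, the first function flushes a single entry while the second flushes two.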

IB/fmr_pool: Flush serial numbers can get out of sync · a656eb75

Olaf Kirch authored Jan 16, 2008

Normally, the serial numbers for flush requests and flushes executed
for an FMR pool should be in sync.

However, if the FMR pool flushes dirty FMRs because the
dirty_watermark was reached, we wake up the cleanup thread and let it
do its stuff.  As a side effect, the cleanup thread increments
pool->flush_ser, which leaves it one higher than pool->req_ser.  The
next time the user calls ib_flush_fmr_pool(), the cleanup thread will
be woken up, but ib_flush_fmr_pool() won't wait for the flush to
complete because flush_ser is already past req_ser.  This means the
FMRs that the user expects to be flushed may not have all been flushed
when the function returns.

Fix this by telling the cleanup thread to do work exclusively by
incrementing req_ser, and by moving the comparison of dirty_len and
dirty_watermark into ib_fmr_pool_unmap().
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>

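
The serial-number drift described above can be reproduced in a few lines. The following is a hedged, single-threaded sketch: `struct model_pool` and the function names are hypothetical, and the cleanup thread is modelled as a plain function call rather than a real kernel thread.

```c
#include <assert.h>

/* Hypothetical model of the two serial numbers kept per FMR pool. */
struct model_pool {
    int req_ser;   /* flushes requested */
    int flush_ser; /* flushes executed  */
};

/* Old scheme: a watermark wakeup makes the thread flush unconditionally,
 * so flush_ser advances without a matching request. */
static void old_watermark_wakeup(struct model_pool *p)
{
    assert(p);
    p->flush_ser++;
}

/* New scheme: the thread only works when it is behind the requests. */
static void model_thread_run(struct model_pool *p)
{
    if (p->flush_ser - p->req_ser < 0)
        p->flush_ser++;
}

/* New scheme: reaching the watermark is recorded as a request first. */
static void new_watermark_wakeup(struct model_pool *p)
{
    assert(p);
    p->req_ser++;
    model_thread_run(p);
}

/* Caller side of an explicit flush: returns 1 if the caller would have to
 * wait for the cleanup thread, 0 if it would return immediately. */
static int flush_request_must_wait(struct model_pool *p)
{
    int serial = ++p->req_ser;

    return p->flush_ser - serial < 0;
}
```

After `old_watermark_wakeup()`, a following `flush_request_must_wait()` returns 0, which is the bug: the caller does not wait. After `new_watermark_wakeup()`, it returns 1, because both counters moved together.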

IB/umad: Simplify and fix locking · 2fe7e6f7

Roland Dreier authored Jan 25, 2008

In addition to being overly complex, the locking in user_mad.c is
broken: there were multiple reports of deadlocks and lockdep warnings.
In particular it seems that a single thread may end up trying to take
the same rwsem for reading more than once, which is explicitly
forbidden in the comments in <linux/rwsem.h>.

To solve this, we change the locking to use plain mutexes instead of
rwsems. There is one mutex per open file, which protects the contents
of the struct ib_umad_file, including the array of agents and list of
queued packets; and there is one mutex per struct ib_umad_port, which
protects the contents, including the list of open files. We never
hold the file mutex across calls to functions like ib_unregister_mad_agent(),
which can call back into other ib_umad code to queue a packet, and we
always hold the port mutex as long as we need to make sure that a
device is not hot-unplugged from under us.

This even makes things nicer for users of the -rt patch, since we
remove calls to downgrade_write() (which is not implemented in -rt).
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: Fix some sparse warnings about shadowed symbols · cf9542aa

Roland Dreier authored Jan 25, 2008

There are a few places in the ipath driver where a variable is
re-declared within a block where it is already in scope.  Most of these
extra declarations can simply be removed, since the variable from the
outer scope is used in a way so that it does not need to keep its
value across the block with the re-declaration.
Signed-off-by: Roland Dreier <rolandd@cisco.com>


RDMA/cxgb3: Endianness annotation for irs field · 1d6e658e

Roland Dreier authored Jan 25, 2008

t3_rdma_init_wr.irs is a big-endian field, so declare it as __be32.
This fixes one sparse warning.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

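
As a user-space illustration of why a big-endian field gets its own type, the sketch below stores a value in network byte order and converts it explicitly on every access. The `model_be32` typedef is only a stand-in for the kernel's `__be32` (which sparse can type-check; a plain typedef cannot), and the struct and accessors are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Stand-in for __be32: documents that the value is stored big-endian. */
typedef uint32_t model_be32;

/* Hypothetical work request with a single big-endian field. */
struct model_init_wr {
    model_be32 irs;
};

/* Convert from host order when storing into the big-endian field. */
static void model_set_irs(struct model_init_wr *wr, uint32_t host_value)
{
    assert(wr);
    wr->irs = htonl(host_value);
}

/* Convert back to host order when reading the field. */
static uint32_t model_get_irs(const struct model_init_wr *wr)
{
    assert(wr);
    return ntohl(wr->irs);
}
```

Keeping the conversions inside the accessors means the raw field is never mixed with host-order integers by accident, which is the class of mistake the annotation is meant to expose.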

IB/ehca: Use round_jiffies() for EQ polling timer · 1a7d2dce

Anton Blanchard authored Oct 15, 2007

Use round_jiffies() to align ehca's 1-second timer with other timers
and potentially save power by sleeping cores for longer.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

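
The idea of aligning a timer to whole-second boundaries can be shown with a simplified model. The tick rate `MODEL_HZ` and the rounding rule below (round down when just past a boundary, otherwise round up) are assumptions made for illustration; the kernel's round_jiffies() has additional details that are not modelled here.

```c
#include <assert.h>

#define MODEL_HZ 250UL /* assumed number of ticks per second */

/* Round an expiry time (in ticks) to a whole-second boundary so that
 * timers from different subsystems tend to fire together. */
static unsigned long model_round_ticks(unsigned long j)
{
    unsigned long rem = j % MODEL_HZ;

    assert(MODEL_HZ >= 4);
    if (rem < MODEL_HZ / 4)
        return j - rem;        /* just past a boundary: round down */
    return j - rem + MODEL_HZ; /* otherwise: round up to the next one */
}
```

Two timers set for ticks 1010 and 1100 would fire at 1000 and 1250 respectively, each on a second boundary where other rounded timers also expire.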

RDMA/cma: Override default responder_resources with user value · 5851bb89

Sean Hefty authored Jan 04, 2008

By default, the responder_resources parameter is set to that received
in a connection request.  The passive side may override this value
when accepting the connection.  Use the value provided by the passive
side when transitioning the QP to RTR state, rather than the value
given in the connect request.  Without this change, the RTR transition
may fail if the passive side supports fewer responder_resources than
that in the request.

For code consistency and to protect against QP destruction, restructure
overriding initiator_depth to match how responder_resources is set.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: Drop support for the original QHT7040 board · 1f813ca8

Dave Olson authored Jan 06, 2008

The original QHT7040 had significant performance issues so there was an
additional check in the driver for a newer serial number.  Support for
the small quantities of that board shipped has been dropped, so this
patch removes the special checks to simplify the code.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: Add ipath_read_ireg() abstraction · 7da0498e

Arthur Jones authored Jan 06, 2008

Different chips have different width interrupt status registers, so add
a flag and accessor function to decide which width register read to use.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

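
A flag-driven accessor of the kind described above might look like the following sketch. Everything here is hypothetical: the flag name, the device struct, and the use of a byte buffer in place of memory-mapped registers are assumptions for a self-contained example, not the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_FLAG_32BIT_IREG 0x1u /* hypothetical: chip has a 32-bit status register */

/* Hypothetical device: a flags word plus a buffer standing in for MMIO space. */
struct model_dev {
    uint32_t flags;
    const unsigned char *regs;
};

/* Read the interrupt status register with the width the chip requires. */
static uint64_t model_read_ireg(const struct model_dev *dd, size_t offset)
{
    assert(dd && dd->regs);
    if (dd->flags & MODEL_FLAG_32BIT_IREG) {
        uint32_t v32;

        memcpy(&v32, dd->regs + offset, sizeof(v32));
        return v32; /* zero-extended to 64 bits */
    }

    uint64_t v64;

    memcpy(&v64, dd->regs + offset, sizeof(v64));
    return v64;
}
```

Callers always receive a 64-bit value and never need to know which chip they are talking to; the width decision lives in one place.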

IB/ipath: Add flag and handling for chips with swapped register bug · 4ea61b54

Ralph Campbell authored Jan 06, 2008

The 6110 had a bug that caused some registers to be swapped; it was
fixed for the 7220 (and didn't affect the 6120 because it had fewer
registers).  This adds a flag and related code to handle that, and
includes some minor cleanups in the same area.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>


IB/ipath: Port config has on-chip effects for 7220 · 60948a41

Ralph Campbell authored Jan 06, 2008

The number of configured ports for the 7220 changes the number of eager
TIDs available per port, for all but port 0 (kernel port) which remains
constant, so add a field to give port0 count separate from the portdata
structure.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: Allow more flexible user register alignments · a18e26ae

Ralph Campbell authored Jan 06, 2008

User registers have different alignments on different chips (4KB on
older, 64KB on 7220).  Allow mapping the user registers on kernels with
page sizes up to 64K.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: Clean up some comments · 9e2ef36b

Dave Olson authored Jan 06, 2008

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: Export hardware counters more consistently · 3029fcc3

Ralph Campbell authored Jan 06, 2008

Various hardware counters are exported via the ipath file system (since
it is binary data).  The old file format was very dependent on the HW
offsets for these registers.  Newer HCA chips can have different
counters at different offsets.  This patch adds a level of indirection
to make the file format consistent across HCAs.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: MAD performance sampling registers support · 6c719cae

Ralph Campbell authored Jan 06, 2008

Add support for QLogic HCAs which have hardware performance sampling
registers for PortSamplesControl and PortSamplesResult MADs.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/srp: Add identifying information to log messages · 7aa54bd7

David Dillow authored Jan 07, 2008

When you have multiple targets, it gets really confusing when you try
to track down who did a reset when there is no identifying information
in the log message, especially when the same extension ID is mapped
through two different local IB ports.  So, add an identifier that can
be used to track back to which local IB port/remote target pair is the
one having problems.
Signed-off-by: David Dillow <dillowda@ornl.gov>
Acked-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IPoIB/CM: Enable SRQ support on HCAs that support fewer than 16 SG entries · 586a6934

Pradeep Satyanarayana authored Dec 21, 2007

Some HCAs (such as ehca2) support SRQ, but with fewer than 16 SG
entries for SRQs. Currently IPoIB/CM implicitly assumes all HCAs will
support 16 SG entries for SRQs (to handle a 64K MTU with 4K pages). This
patch removes that restriction by limiting the maximum MTU in connected
mode to what the maximum number of SRQ SG entries allows.

This patch addresses <https://bugs.openfabrics.org/show_bug.cgi?id=728>
Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

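
The relationship between SG entries, page size and MTU mentioned above (16 entries of 4K each cover 64K) can be written out as a small calculation. The function name is hypothetical, and the sketch deliberately ignores any header overhead that a real driver would subtract.

```c
#include <assert.h>

#define MODEL_MAX_CM_BYTES 65536UL /* 64K upper bound from the commit text */

/* Largest receive size that fits in max_srq_sge entries of one page each,
 * capped at 64K. Header overhead is not modelled. */
static unsigned long model_max_cm_bytes(unsigned int max_srq_sge,
                                        unsigned long page_size)
{
    unsigned long by_sge;

    assert(page_size > 0);
    by_sge = (unsigned long)max_srq_sge * page_size;
    return by_sge < MODEL_MAX_CM_BYTES ? by_sge : MODEL_MAX_CM_BYTES;
}
```

With 4K pages, 16 SG entries give the full 64K, while an HCA limited to, say, 4 entries would be held to 16K instead of being refused SRQ support altogether.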

IB/srp: Enable SG list chaining · fff09a8e

David Dillow authored Dec 19, 2007

By default, the SCSI mid-layer seems to send down 512KB requests
(sg_tablesize = 256), with some requests occasionally combined. By
allowing the mid-layer to chain requests, we can easily grow to 1024KB
or larger -- I've tested 4096KB I/O requests with no problems.

I looked through the DMA paths on the hardware drivers to ensure they
could take advantage of the SG chaining, and it seems that every one
except ipath uses the system's DMA routines, which have been converted
to handle chaining.  ipath looks like it should be OK, but I have no
way to test it.
Signed-off-by: David Dillow <dillowda@ornl.gov>

[ Tested on ipath.  - Roland ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/srp: Respect target credit limit · 8cba2077

David Dillow authored Dec 19, 2007

The current SRP initiator will send requests even if it has no credits
available.  The results of sending extra requests are vendor specific,
but on some devices, overrunning credits will cost 85% of peak
performance -- e.g. 100 MB/s vs 720 MB/s.  Other devices may just drop
the requests.

This patch will tell the SCSI midlayer to queue requests if there are
fewer than two credits remaining, and will not issue a task management
request if there are no credits remaining.  The mid-layer will retry
the queued command once an outstanding command completes.

The patch also removes the unlikely() in __srp_get_tx_iu(), as it is
not at all unlikely to hit this limit under heavy load.
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

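
The credit rule stated above (queue ordinary requests when fewer than two credits remain, refuse task management requests when none remain) reduces to a one-line check. The enum and function names below are hypothetical and only model the decision, not the driver's data structures.

```c
#include <assert.h>

/* Hypothetical request classes for the model. */
enum model_req_type {
    MODEL_REQ_NORMAL,
    MODEL_REQ_TASK_MGMT
};

/* Returns 1 if a request of this type may be sent with req_lim credits
 * left, 0 if it must be queued (normal) or refused (task management). */
static int model_may_send(int req_lim, enum model_req_type type)
{
    /* Normal requests leave one credit in reserve for task management. */
    int min_credits = (type == MODEL_REQ_TASK_MGMT) ? 1 : 2;

    assert(req_lim >= 0);
    return req_lim >= min_credits;
}
```

Holding back the last credit for task management means an abort or reset can still be issued when ordinary traffic has used up everything else.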

IPoIB: improve IPv4/IPv6 to IB mcast mapping functions · a9e527e3

Rolf Manderscheid authored Dec 10, 2007

An IPoIB subnet on an IB fabric that spans multiple IB subnets can't
use link-local scope in multicast GIDs.  The existing routines that
map IP/IPv6 multicast addresses into IB link-level addresses hard-code
the scope to link-local, and they also leave the partition key field
uninitialised.  This patch adds a parameter (the link-level broadcast
address) to the mapping routines, allowing them to initialise both the
scope and the P_Key appropriately, and fixes up the call sites.

The next step will be to add a way to configure the scope for an IPoIB
interface.
Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

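
A sketch of the IPv4 case may help show what taking the scope and P_Key from the broadcast address means. The byte offsets below follow one reading of the IPoIB multicast GID layout (RFC 4391) and are assumptions that should be checked against the kernel source; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_IPOIB_ALEN 20 /* 4 bytes flags/QPN followed by a 16-byte GID */

/* Map an IPv4 multicast address (host byte order) to an IPoIB link-level
 * multicast address, copying scope and P_Key from the broadcast address
 * instead of hard-coding link-local scope. Offsets are assumed. */
static void model_ip_ib_mc_map(uint32_t ipv4,
                               const uint8_t broadcast[MODEL_IPOIB_ALEN],
                               uint8_t out[MODEL_IPOIB_ALEN])
{
    uint8_t scope = broadcast[5] & 0x0f;

    assert(out != broadcast);
    memset(out, 0, MODEL_IPOIB_ALEN);
    out[1] = out[2] = out[3] = 0xff;  /* multicast QPN */
    out[4] = 0xff;                    /* multicast GID prefix */
    out[5] = (uint8_t)(0x10 | scope); /* flags plus inherited scope */
    out[6] = 0x40;                    /* IPv4 signature, high byte */
    out[7] = 0x1b;                    /* IPv4 signature, low byte */
    out[8] = broadcast[8];            /* P_Key, high byte */
    out[9] = broadcast[9];            /* P_Key, low byte */
    out[16] = (ipv4 >> 24) & 0x0f;    /* low 28 bits of the group address */
    out[17] = (ipv4 >> 16) & 0xff;
    out[18] = (ipv4 >> 8) & 0xff;
    out[19] = ipv4 & 0xff;
}
```

Because both values are copied from the interface's broadcast address, changing the scope of an interface would automatically change the scope of every multicast GID derived for it.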

IB/ipath: Changes for fields moving from devdata to portdata · 755807a2

Dave Olson authored Dec 06, 2007

This patch moves some arrays that were defined per-device to be
variables defined in the per-context data structure, thus avoiding extra
kzalloc() calls.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: Generalize some xxx_SHIFT macros · d8274869

Dave Olson authored Dec 21, 2007

In preparation for upcoming chips that have different values for
INFINIPATH_R_PORTENABLE_SHIFT, INFINIPATH_R_INTRAVAIL_SHIFT,
INFINIPATH_R_TAILUPD_SHIFT, and portcfg_shift, remove the shared
#defines and use device-specific variables instead.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: kreceive uses portdata rather than devdata · c59a80ac

Ralph Campbell authored Dec 20, 2007

kreceive now takes a portdata * instead of a devdata *; this patch
also includes other kreceive-related cleanups.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: Cleanup ipath_get_egrbuf() · d65708f3

Ralph Campbell authored Dec 21, 2007

Remove an unused parameter and fix up the comment.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ipath: Fix RNR NAK handling · cc65edcf

Ralph Campbell authored Dec 14, 2007

This patch fixes a couple of minor problems with RNR NAK handling:
 - The insertion sort was causing extra delay when inserting ahead
   vs. behind an existing entry on the list.
 - A resend of the first packet of a message that is still not ready
   needs another RNR NAK (i.e., it was suppressed when it shouldn't
   have been).
 - Also, the resend tasklet doesn't need to be woken up unless the
   ACK/NAK actually indicates progress has been made.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/ehca: Forward event client-reregister-required to registered clients · e57d62a1

Hoang-Nam Nguyen authored Dec 20, 2007

This patch allows ehca to forward the client-reregister-required event
to registered clients.  Such an event is generated by a switch, e.g.
after it reboots.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/mlx4: Micro-optimize mlx4_ib_poll_one() · b3226184

Roland Dreier authored Jan 25, 2008

Rather than byte-swapping cqe->g_mlpath_rqpn each time we extract a
field from it, byte-swap it once into a temporary variable.  This
results in smaller, better code -- e.g., on 32-bit x86:

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5)
function                                     old     new   delta
mlx4_ib_poll_cq                             1188    1183      -5
Signed-off-by: Roland Dreier <rolandd@cisco.com>

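
The swap-once pattern can be illustrated in user space as follows. The field name g_mlpath_rqpn comes from the commit message; the struct, the function, and in particular the bit layout (low 24 bits as a queue pair number, top bit as a flag) are assumptions chosen only to have two fields to extract.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical completion entry with one big-endian word. */
struct model_cqe {
    uint32_t g_mlpath_rqpn; /* stored in big-endian byte order */
};

/* Extract two fields after a single byte swap into a temporary,
 * rather than swapping once per field. Bit layout is assumed. */
static void model_decode(const struct model_cqe *cqe,
                         uint32_t *qpn, int *flag)
{
    uint32_t v;

    assert(cqe && qpn && flag);
    v = ntohl(cqe->g_mlpath_rqpn); /* the only byte swap */
    *qpn = v & 0x00ffffffu;
    *flag = (int)((v >> 31) & 1u);
}
```

Every additional field taken from the same word then costs only a mask or shift on the temporary, which is where the code-size saving comes from.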

IB/mthca: Remove MSI support as scheduled · e57895d3

Adrian Bunk authored Jan 01, 2008

Remove MSI support from the mthca driver, as scheduled.  There is no
reason to use MSI instead of MSI-X, since MSI-X performs better.  No
one has spoken up since MSI support was deprecated in commit f6be6fbe
("IB/mthca: Schedule MSI support for removal"), so apparently the MSI
support is unused.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/iser: Typo fix (s/destory/destroy/) · 38dc732f

Oliver Pinter authored Jan 25, 2008

Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/iser: update URLs of iSER docs · bd5d7a85

Erez Zilber authored Jan 25, 2008

Signed-off-by: Erez Zilber <erezz@voltaire.com>


RDMA/cma: add support for rdma_migrate_id() · 88314e4d

Sean Hefty authored Nov 14, 2007

This is based on user feedback from Doug Ledford at Red Hat:

Events that occur on an rdma_cm_id are reported to userspace through an
event channel.  Connection request events are reported on the event
channel associated with the listen.  When the connection is accepted, a
new rdma_cm_id is created and automatically uses the listen event
channel.  This is suboptimal where the user only wants listen events on
that channel.

Additionally, it may be desirable to have events related to connection
establishment use a different event channel than those related to
already established connections.

Allow the user to migrate an rdma_cm_id between event channels. All
pending events associated with the rdma_cm_id are moved to the new event
channel.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


RDMA/cma: Reenable device removal on passive side · 45d9478d

Vladimir Sokolovsky authored Dec 07, 2007

Enable conn_id remove on the passive side after connection
establishment.  This corrects an issue where the IB driver can't be
unloaded after running applications over RDS.  The 'dev_remove' counter
does not reach 0 for established connections on the passive side.

This problem is limited to device removal, and only occurs on the
passive side if there are established connections.
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/mad: Fix incorrect access to items on local_list · b61d92d8

Sean Hefty authored Nov 30, 2007

In cancel_mads(), MADs are moved from the wait_list and local_list
to a cancel_list for processing.  However, the structures on these two
lists are not the same.  The wait_list references struct
ib_mad_send_wr_private, but local_list references struct
ib_mad_local_private.  cancel_mads() treats all items moved to the
cancel_list as struct ib_mad_send_wr_private.  This leads to a system
crash when requests are moved from the local_list to the cancel_list.

Fix this by leaving local_list alone.  All requests on the local_list
have completed and are just awaiting processing by a queued worker thread.

Bug (crash) reported by Dotan Barak <dotanb@dev.mellanox.co.il>.
Problem with local_list access reported by Robert Reynolds
<rreynolds@opengridcomputing.com>.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/cm: Add basic performance counters · 9af57b7a

Sean Hefty authored Jul 16, 2007

Add performance/debug counters to track sent/received messages, retries,
and duplicates. Counters are tracked per CM message type, per port.

The counters are always enabled, so intrusive state tracking is not done.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

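
Counters kept per message type and per port, as described above, fit naturally into a small two-dimensional table. The sketch below is a hypothetical user-space model: the enum values, the reduced set of message types, and the struct are all assumptions, and no locking or atomic access is shown.

```c
#include <assert.h>

/* Hypothetical subset of CM message types. */
enum model_cm_msg { MODEL_MSG_REQ, MODEL_MSG_REP, MODEL_MSG_RTU, MODEL_MSG_MAX };

/* The four kinds of events named in the commit text. */
enum model_cm_kind {
    MODEL_CNT_SENT,
    MODEL_CNT_RECV,
    MODEL_CNT_RETRY,
    MODEL_CNT_DUP,
    MODEL_CNT_MAX
};

/* One table of counters per port. */
struct model_port_counters {
    unsigned long count[MODEL_CNT_MAX][MODEL_MSG_MAX];
};

/* Bump a single counter; cheap enough to leave enabled at all times. */
static void model_count(struct model_port_counters *pc,
                        enum model_cm_kind kind, enum model_cm_msg msg)
{
    assert(pc && kind < MODEL_CNT_MAX && msg < MODEL_MSG_MAX);
    pc->count[kind][msg]++;
}
```

Each event is a single increment with no extra state, which matches the stated goal of avoiding intrusive tracking on a path that is always active.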

IB/mad: Report number of times a mad was retried · 4fc8cd49

Sean Hefty authored Nov 27, 2007

To allow ULPs to tune timeout values and capture retry statistics,
report the number of times that a mad send operation was retried.

For RMPP mads, report the total number of times that any portion
(send window) of the send operation was retried.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB/multicast: Report errors on multicast groups if P_key changes · 547af765

Sean Hefty authored Oct 22, 2007

P_Key changes can invalidate multicast groups.  Report errors on all
multicast groups affected by a P_Key change.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>


IB: Spelling fixes in comments · 94545e8c

Joe Perches authored Dec 17, 2007

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
