Commits · 3b661a92e869ebe2358de8f4b3230ad84f7fce51 · nexedi / linux

20 Jul, 2012 40 commits

[SCSI] fix hot unplug vs async scan race · 3b661a92

Dan Williams authored Jun 21, 2012

The following crash results from cases where the end_device has been
removed before scsi_sysfs_add_sdev has had a chance to run.

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
 IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6
 ...
 Call Trace:
  [<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3
  [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf
  [<ffffffff8125e641>] kobject_add_varg+0x41/0x50
  [<ffffffff8125e70b>] kobject_add+0x64/0x66
  [<ffffffff8131122b>] device_add+0x12d/0x63a
  [<ffffffff814b65ea>] ? _raw_spin_unlock_irqrestore+0x47/0x56
  [<ffffffff8107de15>] ? module_refcount+0x89/0xa0
  [<ffffffff8132f348>] scsi_sysfs_add_sdev+0x4e/0x28a
  [<ffffffff8132dcbb>] do_scan_async+0x9c/0x145

...teach scsi_sysfs_add_devices() to check for deleted devices before
trying to add them, and teach scsi_remove_target() how to remove targets
that have not been added via device_add().

Cc: <stable@vger.kernel.org>
Reported-by: Dariusz Majchrzak <dariusz.majchrzak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] aacraid: Fix endian issues in core and SRC portions of driver · b5f1758f

Ben Collins authored Jun 11, 2012

This may not fix all endian issues in this driver, but it does get the
driver working on PowerPC for a PMC SRC card. So it should at least fix
all the problems in the core and in the SRC support.

[jejb: fix >> 32 breakage reported by Fengguang Wu]
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Acked-by: Achim Leubner <Achim_Leubner@pmc-sierra.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] aacraid: Relax the tight timeout loop on fib commands · 30002f1c

Ben Collins authored Jun 11, 2012

The loop that waited for synchronous fib commands was causing a CPU stall
when a timeout actually occurred.

1) Switch to using a more accurate timeout mechanism.
2) Do not pace the loop with udelay(). Use cpu_relax() to allow for
   scheduling to occur.
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Acked-by: Achim Leubner <Achim_Leubner@pmc-sierra.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
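
A userspace sketch of the two changes listed above, not the driver code. It assumes a monotonic clock in place of the kernel's timekeeping and uses sched_yield() as a stand-in for cpu_relax(); the names poll_until() and countdown_done() are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds (assumed stand-in for kernel timekeeping). */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Poll done(ctx) until it returns true or timeout_ns has elapsed.
 * The deadline is computed once from the clock, so the timeout does not
 * depend on how long each loop iteration takes.
 * Returns true on completion, false on timeout.
 */
static bool poll_until(bool (*done)(void *), void *ctx, uint64_t timeout_ns)
{
    uint64_t deadline = now_ns() + timeout_ns;

    while (!done(ctx)) {
        if (now_ns() >= deadline)
            return false;
        sched_yield(); /* give other work a chance instead of a fixed delay */
    }
    return true;
}

/* Hypothetical completion check: reports done after *ctx calls. */
static bool countdown_done(void *ctx)
{
    int *remaining = ctx;

    return --(*remaining) <= 0;
}
```

Calling poll_until(countdown_done, &n, timeout) with a small n completes quickly; with a very large n and a short timeout it gives up at the deadline rather than spinning.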


[SCSI] aacraid: Better handling of in-flight events on thread stop · 361ee9c3

Ben Collins authored Jun 11, 2012

When an error occurred that would shut down the driver, some in-flight
events were getting caught up, deadlocking a CPU or two.
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Acked-by: Achim Leubner <Achim_Leubner@pmc-sierra.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] aacraid: Use resource_size_t for IO mem pointers and offsets · ff08784b

Ben Collins authored Jun 11, 2012

This also stops using the "legacy crap" in Scsi_Host (shost->base is an
unsigned long).

This affected 32-bit systems that have 64-bit resource sizes, causing the
IO address to be truncated.
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Acked-by: Achim Leubner <Achim_Leubner@pmc-sierra.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
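
The truncation described above can be illustrated in portable C. This is only a sketch: ulong32_t and res_size_t are hypothetical stand-ins for a 32-bit unsigned long and a 64-bit resource_size_t, and the address used is made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types on a 32-bit system
 * configured with 64-bit resource sizes. */
typedef uint32_t ulong32_t;  /* what "unsigned long" holds there */
typedef uint64_t res_size_t; /* what resource_size_t holds there */

/* Storing a resource address in the narrower type keeps only the low 32 bits. */
static ulong32_t store_as_ulong32(res_size_t addr)
{
    return (ulong32_t)addr;
}

/* True when the round trip through the narrow type loses information. */
static bool is_truncated(res_size_t addr)
{
    return (res_size_t)store_as_ulong32(addr) != addr;
}
```

Any address at or above 4 GiB loses its upper bits in the narrow type, which is why keeping the value in the wider type end to end avoids the problem.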


[SCSI] scsi_dh: add scsi_dh_attached_handler_name · 7e8a74b1

Mike Snitzer authored Jun 26, 2012

Introduce scsi_dh_attached_handler_name() to retrieve the name of the
scsi_dh that is attached to the scsi_device associated with the provided
request queue.  Returns NULL if a scsi_dh is not attached.

Also, fix scsi_dh_{attach,detach} function header comments to document
@q rather than @sdev.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Babu Moger <babu.moger@netapp.com>
Reviewed-by: Chandra Seetharaman <sekharan@us.ibm.com>
Acked-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] cxgb4i: tcp push bit fix · 6aca4112

Karen Xie authored Jun 28, 2012

Fixed the parentheses so the tcp push bit would be sent properly.
Signed-off-by: Karen Xie <kxie@chelsio.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] Stop accepting SCSI requests before removing a device · b485462a

Bart Van Assche authored Jun 29, 2012

Prevent the code that requeues SCSI requests from triggering a
crash by making sure that code is no longer scheduled after a
device has been removed.

Also, source code inspection of __scsi_remove_device() revealed
a race condition in this function: no new SCSI requests must be
accepted for a SCSI device after device removal started.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] Change return type of scsi_queue_insert() into void · 84feb166

Bart Van Assche authored Jun 29, 2012

The return value of scsi_queue_insert() is ignored by all its
callers, hence change the return type of this function into
void.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] Avoid dangling pointer in scsi_requeue_command() · 940f5d47

Bart Van Assche authored Jun 29, 2012

When we call scsi_unprep_request() the command associated with the request
gets destroyed and therefore drops its reference on the device.  If this was
the only reference, the device may get released and we end up with a NULL
pointer deref when we call blk_requeue_request.
Reported-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: <stable@kernel.org>
[jejb: enhance comment and add commit log for stable]
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
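
The lifetime problem described above can be modelled with a plain reference count. This is a userspace sketch under assumptions, not the SCSI code: struct device, get_dev(), put_dev(), unprep() and requeue() are hypothetical, and a "released" flag stands in for the device actually being freed.

```c
#include <stdbool.h>
#include <stddef.h>

struct device {
    int refcount;
    bool released; /* set when the last reference is dropped */
};

static void get_dev(struct device *d)
{
    d->refcount++;
}

static void put_dev(struct device *d)
{
    if (--d->refcount == 0)
        d->released = true;
}

struct command {
    struct device *dev; /* the command holds one reference on dev */
};

/* Destroying the command drops the command's reference on the device. */
static void unprep(struct command *cmd)
{
    put_dev(cmd->dev);
    cmd->dev = NULL;
}

/*
 * Requeue with the device pinned: take an extra reference before the
 * command is torn down and drop it only after the last use of dev.
 * Returns true if the device was still alive during the requeue step.
 */
static bool requeue(struct command *cmd)
{
    struct device *dev = cmd->dev;
    bool alive;

    get_dev(dev);
    unprep(cmd);
    alive = !dev->released; /* stands in for work that touches the device */
    put_dev(dev);
    return alive;
}
```

Without the extra get_dev()/put_dev() pair, a command holding the only reference would release the device inside unprep(), before the requeue step touches it.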


[SCSI] Fix device removal NULL pointer dereference · 67bd9413

Bart Van Assche authored Jun 29, 2012

Use blk_queue_dead() to test whether the queue is dead instead
of !sdev. Since scsi_prep_fn() may be invoked concurrently with
__scsi_remove_device(), keep the queuedata (sdev) pointer in
__scsi_remove_device(). This patch fixes a kernel oops that
can be triggered by USB device removal. See also
http://www.spinics.net/lists/linux-scsi/msg56254.html.

Other changes included in this patch:
- Swap the blk_cleanup_queue() and kfree() calls in
  scsi_host_dev_release() to make that code easier to grasp.
- Remove the queue dead check from scsi_run_queue() since the
  queue state can change anyway at any point in that function
  where the queue lock is not held.
- Remove the queue dead check from the start of scsi_request_fn()
  since it is redundant with the scsi_device_online() check.
Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] block: Fix blk_execute_rq_nowait() dead queue handling · e81ca6fe

Muthukumar Ratty authored Jun 29, 2012

If the queue is dead blk_execute_rq_nowait() doesn't invoke the done()
callback function. That will result in blk_execute_rq() being stuck
in wait_for_completion(). Avoid this by initializing rq->end_io to the
done() callback before we check the queue state. Also, make sure the
queue lock is held around the invocation of the done() callback. Found
this through source code review.
Signed-off-by: Muthukumar Ratty <muthur@gmail.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
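
A simplified model of the ordering described above, with locking omitted. The struct layouts, mark_done() and execute_nowait() are hypothetical; only the idea of assigning the completion callback before testing for a dead queue comes from the commit message.

```c
#include <errno.h>
#include <stdbool.h>

struct request;
typedef void (*end_io_fn)(struct request *rq, int error);

struct request {
    end_io_fn end_io;
    bool queued;
    bool completed;
    int error;
};

struct queue {
    bool dead;
};

/* Hypothetical completion callback: records that the waiter was woken. */
static void mark_done(struct request *rq, int error)
{
    rq->completed = true;
    rq->error = error;
}

static void execute_nowait(struct queue *q, struct request *rq, end_io_fn done)
{
    rq->end_io = done; /* set first, so the dead-queue path can call it */

    if (q->dead) {
        if (rq->end_io)
            rq->end_io(rq, -ENXIO); /* complete with an error; nobody waits forever */
        return;
    }
    rq->queued = true; /* normal path: hand the request to the queue */
}
```

On a dead queue the request is completed immediately with an error, so a caller blocked on the completion is always released.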


[SCSI] megaraid: remove a spurious IRQ enable · 6548b0e5

Dan Carpenter authored Jun 27, 2012

We took this lock with spin_lock() so we should unlock it with
spin_unlock() instead of spin_unlock_irq().  This was introduced in
f2c8dc40 "[SCSI] megaraid_mbox: remove scsi_assign_lock usage".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] megaraid: cleanup type issue in mega_build_cmd() · 9d5d93e3

Dan Carpenter authored Jun 27, 2012

On 64 bit systems the current code sets 32 bits of "seg" and leaves the
other 32 uninitialized.  It doesn't matter since the variable is never
used.  But it's still messy and we should fix it.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] bfa: dereferencing freed memory in bfad_im_probe() · a5254dbb

Dan Carpenter authored Jun 27, 2012

If bfad_thread_workq(bfad) was not BFA_STATUS_OK then we freed "im"
and then dereferenced it.

I did a little clean up because it seemed nicer to return directly
instead of doing a superfluous goto.  I looked at other functions in
this file and it seems like returning directly is standard.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] bfa: off by one in bfa_ioc_mbox_isr() · fffa6923

Dan Carpenter authored Jun 27, 2012

If mc == BFI_MC_MAX then we're reading past the end of the
mod->mbhdlr[] array.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
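
A generic sketch of the bounds check involved, not the bfa source. MC_MAX, handlers[] and dispatch() are hypothetical; the point is that an array of MC_MAX entries has valid indices 0 through MC_MAX - 1, so the guard has to reject mc == MC_MAX as well.

```c
#define MC_MAX 4u /* hypothetical; the driver's constant is BFI_MC_MAX */

typedef int (*handler_fn)(int msg);

static int echo_handler(int msg)
{
    return msg;
}

static handler_fn handlers[MC_MAX] = {
    echo_handler, echo_handler, echo_handler, echo_handler
};

/* Returns the handler's result, or -1 for an out-of-range message class. */
static int dispatch(unsigned int mc, int msg)
{
    if (mc >= MC_MAX) /* a '>' test here would let mc == MC_MAX through */
        return -1;
    return handlers[mc](msg);
}
```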


[SCSI] properly initialize atomic_t · 9e1a1537

Josh Hunt authored Jun 09, 2012

Initialize atomic_t scsi_host_next_hn and ioerr_cnt as per the guidelines
defined in Documentation/atomic_ops.txt.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] scsi_dh_alua: Re-enable STPG for unavailable ports · bb2c94a3

Bart Van Assche authored Jun 22, 2012

A quote from SPC-4: "While in the unavailable primary target port
asymmetric access state, the device server shall support those of
the following commands that it supports while in the active/optimized
state: [ ... ] d) SET TARGET PORT GROUPS; [ ... ]". Hence re-enable
sending STPG to a target port group that is in the unavailable state.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] qla4xxx: Update driver version to 5.02.00-k18 · efb6c717

Vikas Chaudhary authored Jun 14, 2012

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] qla4xxx: Fix Spell check. · 18e2df93

Vikas Chaudhary authored Jun 14, 2012

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] qla4xxx: Fix a Sparse warning message · 68b6d5d3

Vikas Chaudhary authored Jun 14, 2012

Fix the following message:
drivers/scsi/qla4xxx/ql4_os.c:3266:5: error: symbol 'qla4xxx_post_aen_work' redeclared with different type (originally declared at drivers/scsi/qla4xxx/ql4_glbl.h:186) - incompatible argument 2 (different signedness)
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] qla4xxx: multi-session fix for flash ddbs · 1cb78d73

Vikas Chaudhary authored Jun 14, 2012

Allow multi-session to a target (for flash ddbs) accessible via
multiple network portals.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] scsi_dh_alua: backoff alua rtpg retry linearly vs. geometrically · bc97f4bb

Rob Evers authored May 18, 2012

Currently the backoff algorithm for when to retry alua rtpg
requests progresses geometrically as so:

2, 4, 8, 16, 32, 64... seconds.

This progression can lead to un-needed delay in retrying
alua rtpg requests when the rtpgs are delayed.  A less
aggressive backoff algorithm that is additive would not
lead to such large jumps when delays start getting long, but
would backoff linearly:

2, 4, 6, 8, 10... seconds.
Signed-off-by: Martin George <marting@netapp.com>
Signed-off-by: Rob Evers <revers@redhat.com>
Reviewed-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
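
The two progressions quoted above can be written as small functions. This is an illustrative sketch only; the function names are hypothetical and the driver's actual interval bookkeeping is not shown here.

```c
/* Delay in seconds before retry number `attempt` (1-based). */

/* Geometric: 2, 4, 8, 16, 32, 64, ... */
static unsigned int geometric_backoff(unsigned int attempt)
{
    return 1u << attempt;
}

/* Linear (additive): 2, 4, 6, 8, 10, ... */
static unsigned int linear_backoff(unsigned int attempt)
{
    return 2u * attempt;
}

/* Total time spent waiting across the first `attempts` retries. */
static unsigned int total_wait(unsigned int (*backoff)(unsigned int),
                               unsigned int attempts)
{
    unsigned int total = 0;
    unsigned int i;

    for (i = 1; i <= attempts; i++)
        total += backoff(i);
    return total;
}
```

After six retries the geometric schedule has waited 126 seconds in total, against 42 seconds for the linear one, which is the gap the commit is addressing.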


[SCSI] scsi_dh_alua: retry alua rtpg extended header for illegal request response · 8e67ce60

Rob Evers authored May 18, 2012

Some storage arrays are known to return 'illegal request'
when an rtpg extended header request is made.  T10 says the
array should ignore the bit, and return the non-extended
rtpg as the array doesn't support the request.  Working
around this by retrying the rtpg request without the extended
header bit set when the extended rtpg request results in
illegal request.
Signed-off-by: Rob Evers <revers@redhat.com>
Reviewed-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
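
The workaround above amounts to a one-step fallback. The sketch below models it with a callback in place of real SCSI command submission; the enum, rtpg_with_fallback() and the two simulated arrays are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum rtpg_status { RTPG_OK, RTPG_ILLEGAL_REQUEST, RTPG_OTHER_ERROR };

/* Hypothetical transport hook: issue an rtpg with or without the extended header bit. */
typedef enum rtpg_status (*send_rtpg_fn)(bool ext_hdr, void *ctx);

/*
 * Try the extended-header request first; if the array answers with
 * 'illegal request', retry once without the extended header bit.
 * *used_ext_hdr reports which form produced the returned status.
 */
static enum rtpg_status rtpg_with_fallback(send_rtpg_fn send, void *ctx,
                                           bool *used_ext_hdr)
{
    enum rtpg_status st = send(true, ctx);

    *used_ext_hdr = true;
    if (st == RTPG_ILLEGAL_REQUEST) {
        st = send(false, ctx);
        *used_ext_hdr = false;
    }
    return st;
}

/* Simulated array that rejects the extended header bit. */
static enum rtpg_status legacy_array(bool ext_hdr, void *ctx)
{
    (void)ctx;
    return ext_hdr ? RTPG_ILLEGAL_REQUEST : RTPG_OK;
}

/* Simulated array that accepts either form. */
static enum rtpg_status compliant_array(bool ext_hdr, void *ctx)
{
    (void)ext_hdr;
    (void)ctx;
    return RTPG_OK;
}
```

Other errors are returned unchanged, so the fallback only triggers for the specific 'illegal request' response.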


[SCSI] scsi_dh_alua: implement 'implied transition timeout' · 3588c5a2

Rob Evers authored May 18, 2012

During alua transitions, an array can return transitioning
status in response to rtpg requests.  These requests get
retried for a maximum of 60 seconds by default before timing
out.  Sometimes this timeout isn't sufficient to allow the
array to complete the transition.  T10-spc4 addresses this
under 'Report Target Port Groups' command.

This update retrieves the timeout value from the storage
array if available and retries the transitioning rtpgs
for up to the 'implied transitioning timeout' value.
Signed-off-by: Rob Evers <revers@redhat.com>
Reviewed-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] arcmsr: fix misuse of | instead of & · 6ad819b0

Dan Carpenter authored Jun 09, 2012

ARCMSR_ARC1880_DiagWrite_ENABLE is 0x00000080 so (x | 0x00000080) is
never zero.  The intent here was to loop until
ARCMSR_ARC1880_DiagWrite_ENABLE was turned on, but because the test was
wrong, we would do five loops regardless of whether it succeeded or not.

Also I simplified the condition a little by removing the unused
assignment.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Nick Cheng <nick.cheng@areca.com.tw>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
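
The difference between the two operators can be checked in isolation. In this sketch DIAG_WRITE_ENABLE carries the value quoted above; the function names are hypothetical and no register access is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIAG_WRITE_ENABLE 0x00000080u /* value quoted in the commit message */

/* Buggy form: OR-ing in the bit forces it on, so the result is never zero. */
static bool looks_clear_with_or(uint32_t reg)
{
    return (reg | DIAG_WRITE_ENABLE) == 0;
}

/* Intended form: AND masks out everything except the bit being tested. */
static bool bit_is_clear(uint32_t reg)
{
    return (reg & DIAG_WRITE_ENABLE) == 0;
}
```

Because the OR form returns the same answer for every register value, a loop conditioned on it cannot react to the bit turning on.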


[SCSI] hptiop: fix RR312x in hosts with >12GB · 23f0bb47

HighPoint Linux Team authored Jun 14, 2012

Due to a limitation of the RR312x's DMA engine, the HBA cannot access host
memory above 12GB.  This fixes

https://bugzilla.kernel.org/show_bug.cgi?id=14311

[alan: resurrected bug from 2009 and pushed upstream]
Reported-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: HighPoint Linux Team <linux@highpoint-tech.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: Update lpfc to version 8.3.32 · f3d8af9e

James Smart authored Jun 12, 2012

Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: Fix error reporting of misconfigured ports · 4b8bae08

James Smart authored Jun 12, 2012

Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: Fix system panic due to node state change · 6b415f5d

James Smart authored Jun 12, 2012

Fix System Panic During IO Test using Medusa tool
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: Fix ability to change FCP EQ delay multiplier · 173edbb2

James Smart authored Jun 12, 2012

Fix fcp_imax module parameter to dynamically change FCP EQ delay multiplier
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: Correct successful aborts returning error status · 3a70730a

James Smart authored Jun 12, 2012

Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: Correct provisioning change failure on local function · 618a5230

James Smart authored Jun 12, 2012

Fixed a system hold-up when performing resource provisioning through the
same PCI function
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: Correct host DIF configuration that hung system · bbeb79b9

James Smart authored Jun 12, 2012

Fix system hang due to bad protection module parameters (CR: 130769)
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: Fix CQ and EQ dump failure for debugfs · 3b3da6a9

James Smart authored Jun 12, 2012

Fixed the debug helper routine failing to dump CQ and EQ entries in non-MSI-X mode
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: Correct null pointer Error in lpfc_sli.c · a629852a

James Smart authored Jun 12, 2012

This patch corrects the issue caught via Smatch and reported by Dan Carpenter:
http://marc.info/?l=linux-scsi&m=133693516103343

Resolve the odd ordering of null pointer checks
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] lpfc 8.3.32: lpfc_sli.c: add missing jumps to mempool_free · 4f4c1863

James Smart authored Jun 12, 2012

Incorporate patch originally supplied by Julia Lawall <Julia.Lawall@lip6.fr>
http://marc.info/?l=linux-scsi&m=133572879711140&w=2

"It appears that mempool_free should be performed on these failures as on
 the other exists from the containing functions."
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] bnx2fc: Bumped version to 1.0.12 · eb47aa2c

Bhanu Prakash Gollapudi authored Jun 07, 2012

Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>


[SCSI] bnx2fc: use list_entry instead of explicit cast · d71fb3bd

Bhanu Prakash Gollapudi authored Jun 07, 2012

Use list_for_each_entry_safe() instead of explicit cast to avoid relying on
struct layout
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
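
Why a direct cast from a list node to its containing structure depends on layout can be shown with a userspace analogue of the kernel's list_entry(). The types and the entry_of() macro below are hypothetical; the technique is the usual offsetof-based recovery of the container.

```c
#include <stddef.h>

struct list_node {
    struct list_node *next;
};

struct io_req {
    int id;                /* a field placed before the link ... */
    struct list_node link; /* ... so the link is not at offset 0 */
};

/* Recover the containing structure from a pointer to one of its members. */
#define entry_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Safe regardless of where `link` sits inside struct io_req. */
static int id_via_entry(struct list_node *node)
{
    return entry_of(node, struct io_req, link)->id;
}
```

A plain (struct io_req *) cast of the node pointer is only correct while the link happens to be the first member; the offset-based form keeps working if fields are reordered.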


[SCSI] bnx2fc: Improve error recovery by handling parity errors · 5c17ae21

Bhanu Prakash Gollapudi authored Jun 07, 2012

During parity errors, the ramrods are not issued to FW. bnx2fc waits for the
timeout value, and proceeds with cleaning up the IOs. Since we are already
out-of-sync with FW, cleanup commands timeout too, and do not get the
completion. This operation takes 36 secs for each session to upload causing
huge delays. To fix this, bnx2fc now gets a PARITY_ERROR from cnic driver, and
upon failure, the driver does not issue any commands to the FW and finishes the
upload process sooner.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
