1. 17 May, 2012 9 commits
    • Jörn Engel's avatar
      [SCSI] sg: remove sg_mutex · 37b9d1e0
      Jörn Engel authored
      With the exception of the detached field, sg_mutex no longer adds any
      locking.  detached handling has been broken before and is still broken
      and this patch does not seem to make things worse than they were to
      begin with.
      
      However, I have observed cases of tasks being blocked for >200s waiting
      for sg_mutex.  So the removal clearly adds value for very little cost.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      37b9d1e0
    • Jörn Engel's avatar
      [SCSI] sg: completely protect sfds · 035d12e6
      Jörn Engel authored
      sfds is protected by sg_index_lock - except for sg_open(), where it
      isn't.  Change that and add some documentation.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      035d12e6
    • Jörn Engel's avatar
      [SCSI] sg: protect sdp->exclude · b499e524
      Jörn Engel authored
      Changes since v1: set_exclude now returns the new value, which gets
      rid of the comma expression and the operator precedence bug.  Thanks
      to Douglas for spotting it.
      
      sdp->exclude was previously protected by the BKL.  The sg_mutex, which
      replaced the BKL, only semi-protected it, as it was missing from
      sg_release() and sg_proc_seq_show_debug().  Take an explicit spinlock
      for it.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      b499e524
    • Jörn Engel's avatar
      [SCSI] sg: prevent unwoken sleep · 6acddc5e
      Jörn Engel authored
      srp->done is protected by sfp->rq_list_lock everywhere, except for this
      one case.  Result can be that the wake-up happens before the cacheline
      with the changed srp->done has arrived, so the waiter can go back to
      sleep and never be woken up again.
      
      The wait_event_interruptible() means that anyone trying to debug this
      unlikely race will likely notice everything working fine again, as the
      next signal will unwedge things.  Evil.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      6acddc5e
    • Jörn Engel's avatar
      [SCSI] sg: remove closed flag · ebaf466b
      Jörn Engel authored
      After sg_release() has been called, noone should be able to actually use
      that filedescriptor anymore.  So if closed ever made a difference in the
      past five years or so, it would have meant a bug.  Remove it.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      [jejb: fix up checkpatch warnings]
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      ebaf466b
    • Jörn Engel's avatar
      [SCSI] sg: use wait_event_interruptible() · 3f0c6aba
      Jörn Engel authored
      Afaics the use of __wait_event_interruptible() as opposed to
      wait_event_interruptible() is purely historic.  So let's follow the rest
      of the kernel and check the condition before prepare_to_wait() - and
      also make the code a bit nicer.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      3f0c6aba
    • Jörn Engel's avatar
      [SCSI] sg: remove while (1) non-loop · 794c10fa
      Jörn Engel authored
      The while (1) construct isn't actually a loop at all.  So let's not
      pretent and obfuscate the code.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      794c10fa
    • Jörn Engel's avatar
      [SCSI] sg: remove unnecessary indentation · dddbf8d9
      Jörn Engel authored
      blocking is de-facto a constant and the now-removed comment wasn't all
      that useful either.  Without them and the resulting indentation the code
      is a bit nicer to read.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      dddbf8d9
    • Dan Williams's avatar
      [SCSI] sd: limit the scope of the async probe domain · a7a20d10
      Dan Williams authored
      sd injects and synchronizes probe work on the global kernel-wide domain.
      This runs into conflict with PM that wants to perform resume actions in
      async context:
      
      [  494.237079] INFO: task kworker/u:3:554 blocked for more than 120 seconds.
      [  494.294396] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  494.360809] kworker/u:3     D 0000000000000000     0   554      2 0x00000000
      [  494.420739]  ffff88012e4d3af0 0000000000000046 ffff88013200c160 ffff88012e4d3fd8
      [  494.484392]  ffff88012e4d3fd8 0000000000012500 ffff8801394ea0b0 ffff88013200c160
      [  494.548038]  ffff88012e4d3ae0 00000000000001e3 ffffffff81a249e0 ffff8801321c5398
      [  494.611685] Call Trace:
      [  494.632649]  [<ffffffff8149dd25>] schedule+0x5a/0x5c
      [  494.674687]  [<ffffffff8104b968>] async_synchronize_cookie_domain+0xb6/0x112
      [  494.734177]  [<ffffffff810461ff>] ? __init_waitqueue_head+0x50/0x50
      [  494.787134]  [<ffffffff8131a224>] ? scsi_remove_target+0x48/0x48
      [  494.837900]  [<ffffffff8104b9d9>] async_synchronize_cookie+0x15/0x17
      [  494.891567]  [<ffffffff8104ba49>] async_synchronize_full+0x54/0x70  <-- here we wait for async contexts to complete
      [  494.943783]  [<ffffffff8104b9f5>] ? async_synchronize_full_domain+0x1a/0x1a
      [  495.002547]  [<ffffffffa00114b1>] sd_remove+0x2c/0xa2 [sd_mod]
      [  495.051861]  [<ffffffff812fe94f>] __device_release_driver+0x86/0xcf
      [  495.104807]  [<ffffffff812fe9bd>] device_release_driver+0x25/0x32  <-- here we take device_lock()
      
      [  853.511341] INFO: task kworker/u:4:549 blocked for more than 120 seconds.
      [  853.568693] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  853.635119] kworker/u:4     D ffff88013097b5d0     0   549      2 0x00000000
      [  853.695129]  ffff880132773c40 0000000000000046 ffff880130790000 ffff880132773fd8
      [  853.758990]  ffff880132773fd8 0000000000012500 ffff88013288a0b0 ffff880130790000
      [  853.822796]  0000000000000246 0000000000000040 ffff88013097b5c8 ffff880130790000
      [  853.886633] Call Trace:
      [  853.907631]  [<ffffffff8149dd25>] schedule+0x5a/0x5c
      [  853.949670]  [<ffffffff8149cc44>] __mutex_lock_common+0x220/0x351
      [  854.001225]  [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4
      [  854.049082]  [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4
      [  854.097011]  [<ffffffff8149ce48>] mutex_lock_nested+0x2f/0x36   <-- here we wait for device_lock()
      [  854.145591]  [<ffffffff81304bd7>] device_resume+0x58/0x1c4
      [  854.192066]  [<ffffffff81304d61>] async_resume+0x1e/0x45
      [  854.237019]  [<ffffffff8104bc93>] async_run_entry_fn+0xc6/0x173  <-- ...while running in async context
      
      Provide a 'scsi_sd_probe_domain' so that async probe actions actions can
      be flushed without regard for the state of PM, and allow for the resume
      path to handle devices that have transitioned from SDEV_QUIESCE to
      SDEV_DEL prior to resume.
      Acked-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      [alan: uplevel scsi_sd_probe_domain, clarify scsi_device_resume]
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      [jejb: remove unneeded config guards in include file]
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      a7a20d10
  2. 10 May, 2012 31 commits