1. 27 Nov, 2014 2 commits
  2. 25 Nov, 2014 11 commits
  3. 24 Nov, 2014 26 commits
  4. 20 Nov, 2014 1 commit
    • Martin Peschke's avatar
      zfcp: auto port scan resiliency · 18f87a67
      Martin Peschke authored
      This patch improves the Fibre Channel port scan behaviour of the zfcp lldd.
      Without it the zfcp device driver may churn up the storage area network by
      excessive scanning and scan bursts, particularly in big virtual server
      environments, potentially resulting in interference of virtual servers and
      reduced availability of storage connectivity.
      
      The two main issues as to the zfcp device drivers automatic port scan in
      virtual server environments are frequency and simultaneity.
      On the one hand, there is no point in allowing lots of ports scans
      in a row. It makes sense, though, to make sure that a scan is conducted
      eventually if there has been any indication for potential SAN changes.
      On the other hand, lots of virtual servers receiving the same indication
      for a SAN change had better not attempt to conduct a scan instantly,
      that is, at the same time.
      
      Hence this patch has a two-fold approach for better port scanning:
      the introduction of a rate limit to amend frequency issues, and the
      introduction of a short random backoff to amend simultaneity issues.
      Both approaches boil down to deferred port scans, with delays
      comprising parts for both approaches.
      
      The new port scan behaviour is summarised best by:
      
                                                     NEW:    NEW:
                                no_auto_port_rescan  random  rate    flush
                                                     backoff limit   =wait
      
      adapter resume/thaw       yes                  yes     no      yes*
      adapter online (user)     no                   yes     no      yes*
      port rescan (user)        no                   no      no      yes
      adapter recovery (user)   yes                  yes     yes     no
      adapter recovery (other)  yes                  yes     yes     no
      incoming ELS              yes                  yes     yes     no
      incoming ELS lost         yes                  yes     yes     no
      
      Implementation is straight-forward by converting an existing worker to
      a delayed worker. But care is needed whenever that worker is going to be
      flushed (in order to make sure work has been completed), since a flush
      operation cancels the timer set up for deferred execution (see * above).
      
      There is a small race window whenever a port scan work starts
      running up to the point in time of storing the time stamp for that port
      scan. The impact is negligible. Closing that gap isn't trivial, though, and
      would the destroy the beauty of a simple work-to-delayed-work conversion.
      Signed-off-by: default avatarMartin Peschke <mpeschke@linux.vnet.ibm.com>
      Signed-off-by: default avatarSteffen Maier <maier@linux.vnet.ibm.com>
      Reviewed-by: default avatarHannes Reinecke <hare@suse.de>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      18f87a67