1. 10 Nov, 2015 22 commits
  2. 09 Nov, 2015 4 commits
  3. 05 Nov, 2015 1 commit
  4. 30 Oct, 2015 13 commits
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid bypassing context cleanup · a82544c7
      Matthew R. Ochs authored
      Contexts may be skipped over for cleanup in situations where contention
      for the adapter's table-list mutex is experienced in the presence of a
      signal during the execution of the release handler.
      
      This can lead to two known issues:
      
       - A hang condition on remove as that path tries to wait for users to
         cleanup - something that will never complete should this scenario play
         out as the user has already cleaned up from their perspective.
      
       - An Oops in the unmap_mapping_range() call that is made as part of
         the user waiting mechanism that is invoked on remove when contexts
         are found to still exist.
      
      The root cause of this issue can be found in get_context() and how the
      table-list mutex is acquired. As this code path is shared by several
      different access points within the driver, a decision was made during
      the development cycle to acquire this mutex in this location using the
      interruptible version of the mutex locking service. In almost all of
      the use-cases and environmental scenarios this holds up, even when the
      mutex is contended. However, for critical system threads (such as the
      release handler), failing to acquire the mutex and bailing with the
      intention of the user being able to try again later is unacceptable.
      
      In such a scenario, the context _must_ be derived as it is on an
      irreversible path to being freed. Without being able to derive the
      context, the code mistakenly assumes that it has already been freed
      and proceeds to free up the underlying CXL context resources. From
      this point on, any usage of [the now stale] CXL context resources
      will result in undefined behavior. This is root cause of the Oops
      mentioned as the second known issue as the mapping passed to the
      unmap_mapping_range() service is owned by the CXL context.
      
      To fix this problem, acquisition of the table-list mutex within
      get_context() is simply changed to use the uninterruptible version
      of the mutex locking service. This is safe as the timing windows for
      holding this mutex are short and also protected against blocking.
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: default avatarManoj Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      a82544c7
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid lock instrumentation rejection · 0d73122c
      Matthew R. Ochs authored
      When running with lock instrumentation (e.g. lockdep), some of the
      instrumentation can become disabled at probe time for a cxlflash
      adapter. This is due to a missing lock registration for the tmf_slock.
      
      The fix is to call spin_lock_init() for the tmf_slock during probe.
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: default avatarManoj Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      0d73122c
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid corrupting port selection mask · 1a47401b
      Matthew R. Ochs authored
      The port selection mask of a LUN can be corrupted when the manage LUN
      ioctl (DK_CXLFLASH_MANAGE_LUN) is issued more than once for any device.
      
      This mask indicates to the AFU which port[s] can be used for a data
      transfer to/from a particular LUN. The mask is critical to ensuring the
      correct behavior when using the virtual LUN function of this adapter.
      When the mask is configured for both ports, an I/O may be sent to either
      port as the AFU assumes that each port has access to the same physical
      device (specified by LUN ID in the port LUN table).
      
      In a situation where the mask becomes incorrectly configured to reflect
      access to both ports when in fact there is only access through a single
      port, an I/O can be targeted to the wrong physical device. This can lead
      to data corruption among other ill effects (e.g. security leaks).
      
      The cause for this corruption is the assumption that the ioctl will only
      be called a second time for a LUN when it is being configured for access
      via a second port. A boolean 'newly_created' variable is used to
      differentiate between a LUN that was created (and subsequently configured
      for single port access) and one that is destined for access across both
      ports. While initially set to 'true', this sticky boolean is toggled to
      the 'false' state during a lookup on any next ioctl performed on a device
      with a matching WWN/WWID. The code fails to realize that the match could
      in fact be the same device calling in again. From here, an assumption is
      made that any LUN with 'newly_created' set to 'false' is configured for
      access over both ports and the port selection mask is set to reflect this.
      Any future attempts to use this LUN for hosting a virtual LUN will result
      in the port LUN table being incorrectly programmed.
      
      As a remedy, the 'newly_created' concept was removed entirely and replaced
      with code that always constructs the port selection mask based upon the
      SCSI channel of the LUN being accessed. The bits remain sticky, therefore
      allowing for a device to be accessed over both ports when that is in fact
      the correct physical configuration.
      
      Also included in this commit are a few minor related changes to enhance
      the fix and provide better debug information for port selection mask and
      port LUN table bugs in the future. These include renaming refresh_local()
      to lookup_local(), tracing the WWN/WWID as a big-endian entity, and
      tracing the port selection mask, SCSI channel, and LUN ID each time the
      port LUN table is programmed.
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: default avatarManoj Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      1a47401b
    • Manoj Kumar's avatar
      cxlflash: Fix to escalate to LINK_RESET on login timeout · e6e6df3f
      Manoj Kumar authored
      A 'login timed out' asynchronous error interrupt is generated if no
      response is seen to a FLOGI within 2 seconds.  If the time out error
      is not escalated to a LINK_RESET the port will not be available for
      use. This fix provides the required escalation.
      Signed-off-by: default avatarManoj N. Kumar <manoj@linux.vnet.ibm.com>
      Acked-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      e6e6df3f
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid leaving dangling interrupt resources · ee3491ba
      Matthew R. Ochs authored
      When running with an unsupported AFU, the cxlflash driver fails
      the probe. When the driver is removed, the following Oops is
      encountered on a show_interrupts() thread:
      
      Call Trace:
      [c000001fba5a7a10] [0000000000000003] 0x3 (unreliable)
      [c000001fba5a7a60] [c00000000053dcf4] vsnprintf+0x204/0x4c0
      [c000001fba5a7ae0] [c00000000030045c] seq_vprintf+0x5c/0xd0
      [c000001fba5a7b20] [c00000000030051c] seq_printf+0x4c/0x60
      [c000001fba5a7b50] [c00000000013e140] show_interrupts+0x370/0x4f0
      [c000001fba5a7c10] [c0000000002ff898] seq_read+0xe8/0x530
      [c000001fba5a7ca0] [c00000000035d5c0] proc_reg_read+0xb0/0x110
      [c000001fba5a7cf0] [c0000000002ca74c] __vfs_read+0x6c/0x180
      [c000001fba5a7d90] [c0000000002cb464] vfs_read+0xa4/0x1c0
      [c000001fba5a7de0] [c0000000002cc51c] SyS_read+0x6c/0x110
      [c000001fba5a7e30] [c000000000009204] system_call+0x38/0xb4
      
      The Oops is due to not cleaning up correctly on the unsupported
      AFU error path, leaving various allocated and registered resources.
      In this case, interrupts are in a semi-allocated/registered state,
      which the show_interrupts() thread attempts to use.
      
      To fix, the cleanup logic in init_afu() is consolidated to error
      gates at the bottom of the function and the appropriate goto is
      added to each error path. As a mini side fix while refactoring
      in this routine, the else statement following the AFU version
      evaluation is eliminated as it is not needed.
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: default avatarManoj Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      ee3491ba
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid potential deadlock on EEH · aacb4ff6
      Matthew R. Ochs authored
      Ioctl threads that use scsi_execute() can run for an excessive amount
      of time due to the fact that they have lengthy timeouts and retry logic
      built in. Under normal operation this is not an issue. However, once EEH
      enters the picture, a long execution time coupled with the possibility
      that a timeout can trigger entry to the driver via registered reset
      callbacks becomes a liability.
      
      In particular, a deadlock can occur when an EEH event is encountered
      while in running in scsi_execute(). As part of the recovery, the EEH
      handler drains all currently running ioctls, waiting until they have
      completed before proceeding with a reset. As the scsi_execute()'s are
      situated on the ioctl path, the EEH handler will wait until they (and
      the remainder of the ioctl handler they're associated with) have
      completed. Normally this would not be much of an issue aside from the
      longer recovery period. Unfortunately, the scsi_execute() triggers a
      reset when it times out. The reset handler will see that the device is
      already being reset and wait until that reset completed. This creates
      a condition where the EEH handler becomes stuck, infinitely waiting for
      the ioctl thread to complete.
      
      To avoid this behavior, temporarily unmark the scsi_execute() threads
      as an ioctl thread by releasing the ioctl read semaphore. This allows
      the EEH handler to proceed with a recovery while the thread is still
      running. Once the scsi_execute() returns, the ioctl read semaphore is
      reacquired and the adapter state is rechecked in case it changed while
      inside of scsi_execute(). The state check will wait if the adapter is
      still being recovered or returns a failure if the recovery failed. In
      the event that the adapter reset failed, the failure is simply returned
      as the ioctl would be unable to continue.
      Reported-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: default avatarManoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Reviewed-by: default avatarDaniel Axtens <dja@axtens.net>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      aacb4ff6
    • Matthew R. Ochs's avatar
      cxlflash: Correct trace string · fa3f2c6e
      Matthew R. Ochs authored
      The trace following the failure of alloc_mem() incorrectly identifies
      which function failed. This can lead to misdiagnosing a failure.
      
      Fix the string to correctly indicate that alloc_mem() failed.
      Reported-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: default avatarManoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      fa3f2c6e
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid corrupting adapter fops · 17ead26f
      Matthew R. Ochs authored
      The fops owned by the adapter can be corrupted in certain scenarios,
      opening a window where certain fops are temporarily NULLed before being
      reset to their proper value. This can potentially lead software to make
      incorrect decisions, leaving the user with the inability to function as
      intended.
      
      An example of this behavior can be observed when there are a number of
      users with a high rate of turn around (attach to LUN, perform an I/O,
      detach from LUN, repeat). Every so often a user is given a valid
      context and adapter file descriptor, but the file associated with the
      descriptor lacks the correct read permission bit (FMODE_CAN_READ) and
      thus the read system call bails before calling the valid read fop.
      
      Background:
      
      The fops is stored in the adapter structure to provide the ability to
      lookup the adapter structure from within the fop handler. CXL services
      use the file's private_data and at present, the CXL context does not
      have a private section. In an effort to limit areas of the cxlflash
      driver with code specific the superpipe function, a design choice was
      made to keep the details of the fops situated away from the legacy
      portions of the driver. This drove the behavior that the adapter fops
      is set at the beginning of the disk attach ioctl handler when there
      are no users present.
      
      The corruption that this fix remedies is due to the fact that the fops
      is initially defaulted to values found within a static structure. When
      the fops is handed down to the CXL services later in the attach path,
      certain services are patched. The fops structure remains correct until
      the user count drops to 0 and the fops is reset, triggering the process
      to repeat again. The user counts are tightly coupled with the creation
      and deletion of the user context. If multiple users perform a disk
      attach at the same time, when the user count is currently 0, some users
      can be in the middle of obtaining a file descriptor and have not yet
      reached the context creation code that [in addition to creating the
      context] increments the user count. Subsequent users coming in to
      perform the attach see that the user count is still 0, and reinitialize
      the fops, temporarily removing the patched fops. The users that are in
      the middle obtaining their file descriptor may then receive an invalid
      descriptor.
      
      The fix simply removes the user count altogether and moves the fops
      initialization to probe time such that it is only performed one time
      for the life of the adapter. In the future, if the CXL services adopt
      a private member for their context, that could be used to store the
      adapter structure reference and cxlflash could revert to a model that
      does not require an embedded fops.
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: default avatarManoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: default avatarDaniel Axtens <dja@axtens.net>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      17ead26f
    • Manoj Kumar's avatar
      cxlflash: Fix to double the delay each time · b22b4037
      Manoj Kumar authored
      The operator used to double the master context response delay
      is incorrect and does not result in delay doubling.
      
      To fix, use a left shift instead of the XOR operator.
      Reported-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: default avatarManoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      b22b4037
    • Matthew R. Ochs's avatar
      MAINTAINERS: Add cxlflash driver · 11f43ae7
      Matthew R. Ochs authored
      Add stanza for cxlflash SCSI driver.
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: default avatarManoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      11f43ae7
    • Matthew R. Ochs's avatar
      cxlflash: Fix to prevent stale AFU RRQ · af10483e
      Matthew R. Ochs authored
      Following an adapter reset, the AFU RRQ that resides in host memory
      holds stale data. This can lead to a condition where the RRQ interrupt
      handler tries to process stale entries and/or endlessly loops due to an
      out of sync generation bit.
      
      To fix, the AFU RRQ in host memory needs to be cleared after each reset.
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: default avatarManoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Reviewed-by: default avatarDaniel Axtens <dja@axtens.net>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      af10483e
    • Matthew R. Ochs's avatar
      cxlflash: Correct spelling, grammar, and alignment mistakes · f15fbf8d
      Matthew R. Ochs authored
      There are several spelling and grammar mistakes throughout the
      driver. Additionally there are a handful of places where there
      are extra lines and unnecessary variables/statements. These are
      a nuisance and pollute the driver.
      
      Fix spelling and grammar issues. Update some comments for clarity and
      consistency. Remove extra lines and a few unneeded variables/statements.
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: default avatarManoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      f15fbf8d
    • Matthew R. Ochs's avatar
      cxlflash: Fix to prevent EEH recovery failure · 8e782623
      Matthew R. Ochs authored
      The process_sense() routine can perform a read capacity which
      can take some time to complete. If an EEH occurs while waiting
      on the read capacity, the EEH handler will wait to obtain the
      context's mutex in order to put the context in an error state.
      The EEH handler will sit and wait until the context is free,
      but this wait can potentially last forever (deadlock) if the
      scsi_execute() that performs the read capacity experiences a
      timeout and calls into the reset callback. When that occurs,
      the reset callback sees that the device is already being reset
      and waits for the reset to complete. This leaves two threads
      waiting on the other.
      
      To address this issue, make the context unavailable to new,
      non-system owned threads and release the context while calling
      into process_sense(). After returning from process_sense() the
      context mutex is reacquired and the context is made available
      again. The context can be safely moved to the error state if
      needed during the unavailable window as no other threads will
      hold its reference.
      Signed-off-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Signed-off-by: default avatarManoj N. Kumar <manoj@linux.vnet.ibm.com>
      Reviewed-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Reviewed-by: default avatarDaniel Axtens <dja@axtens.net>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      8e782623