1. 12 Aug, 2016 21 commits
  2. 11 Aug, 2016 3 commits
  3. 10 Aug, 2016 1 commit
  4. 09 Aug, 2016 15 commits
    • UBUNTU: Ubuntu-4.4.0-35.54 · 1135656a
      Stefan Bader authored
      Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
      1135656a
    • UBUNTU: SAUCE: i915_bpo: Sync with v4.7 · 1a9ca237
      Timo Aaltonen authored
      BugLink: http://bugs.launchpad.net/bugs/1609742
      
      Sync with v4.7 and un-revert 280201ac which got fixed upstream.
      
      Also drop two workarounds from 9f81d279:
      
      drm/i915/edp: Add WaKVMNotificationOnConfigChange:bdw
      - it's only for BDW which doesn't use i915_bpo
      
      drm/i915/skl: Add WAC6entrylatency
      - it didn't end up in 4.7
      Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      1a9ca237
    • s390/cio: allow to reset channel measurement block · 13f6d6ac
      Sebastian Ott authored
      BugLink: http://bugs.launchpad.net/bugs/1609415
      
      Prior to commit 1bc6664b a call to
      enable_cmf for a device for which channel measurement was already
      enabled resulted in a reset of the measurement data.
      
      What looked like bugs at the time (a 2nd allocation was triggered
      but failed, reset was called regardless of previous failures, and
      errors have not been reported to userspace) was actually something
      at least one userspace tool depended on. Restore that behavior in
      a sane way.
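
The restored behavior can be pictured with a small user-space model. This is a minimal sketch, assuming a device record with an "enabled" flag and a few counters; the struct and function names are hypothetical and are not the s390 cio code:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a device's channel measurement state. */
struct cmf_dev_sketch {
    bool enabled;
    unsigned long samples[4];   /* illustrative measurement counters */
};

/*
 * Enable measurement for a device. If measurement is already enabled,
 * reset the collected data and report success instead of failing.
 */
int cmf_enable_sketch(struct cmf_dev_sketch *dev)
{
    if (dev->enabled) {
        /* Re-enable: treat the call as a request to reset the data. */
        memset(dev->samples, 0, sizeof(dev->samples));
        return 0;
    }

    /* First enable: start from zeroed counters. */
    memset(dev->samples, 0, sizeof(dev->samples));
    dev->enabled = true;
    return 0;
}
```

A userspace tool that calls the enable path twice would, under this model, see its counters start again from zero after the second call.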
      
      Fixes: 1bc6664b ("s390/cio: use device_lock during cmb activation")
      Cc: stable@vger.kernel.org #v4.4+
      Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      (cherry picked from commit 0f5d050c)
      Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      13f6d6ac
    • powerpc/tm: Fix stack pointer corruption in __tm_recheckpoint() · 2ba45ced
      Michael Neuling authored
      BugLink: http://bugs.launchpad.net/bugs/1606786
      
      At the start of __tm_recheckpoint() we save the kernel stack pointer
      (r1) in SPRG SCRATCH0 (SPRG2) so that we can restore it after the
      trecheckpoint.
      
      Unfortunately, the same SPRG is used in the SLB miss handler.  If an
      SLB miss is taken between the save and restore of r1 to the SPRG, the
      SPRG is changed and hence r1 is also corrupted.  We can end up with
      the following crash when we start using r1 again after the restore
      from the SPRG:
      
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        CPU: 658 PID: 143777 Comm: htm_demo Tainted: G            EL   X 4.4.13-0-default #1
        task: c0000b56993a7810 ti: c00000000cfec000 task.ti: c0000b56993bc000
        NIP: c00000000004f188 LR: 00000000100040b8 CTR: 0000000010002570
        REGS: c00000000cfefd40 TRAP: 0300   Tainted: G            EL   X  (4.4.13-0-default)
        MSR: 8000000300001033 <SF,ME,IR,DR,RI,LE>  CR: 02000424  XER: 20000000
        CFAR: c000000000008468 DAR: 00003ffd84e66880 DSISR: 40000000 SOFTE: 0
        PACATMSCRATCH: 00003ffbc865e680
        GPR00: fffffffcfabc4268 00003ffd84e667a0 00000000100d8c38 000000030544bb80
        GPR04: 0000000000000002 00000000100cf200 0000000000000449 00000000100cf100
        GPR08: 000000000000c350 0000000000002569 0000000000002569 00000000100d6c30
        GPR12: 00000000100d6c28 c00000000e6a6b00 00003ffd84660000 0000000000000000
        GPR16: 0000000000000003 0000000000000449 0000000010002570 0000010009684f20
        GPR20: 0000000000800000 00003ffd84e5f110 00003ffd84e5f7a0 00000000100d0f40
        GPR24: 0000000000000000 0000000000000000 0000000000000000 00003ffff0673f50
        GPR28: 00003ffd84e5e960 00000000003d0f00 00003ffd84e667a0 00003ffd84e5e680
        NIP [c00000000004f188] restore_gprs+0x110/0x17c
        LR [00000000100040b8] 0x100040b8
        Call Trace:
        Instruction dump:
        f8a1fff0 e8e700a8 38a00000 7ca10164 e8a1fff8 e821fff0 7c0007dd 7c421378
        7db142a6 7c3242a6 38800002 7c810164 <e9c100e0> e9e100e8 ea0100f0 ea2100f8
      
      We hit this on large memory machines (> 2TB) but it can also be hit on
      smaller machines when 1TB segments are disabled.
      
      To hit this, you also need to be virtualised to ensure SLBs are
      periodically removed by the hypervisor.
      
      This patch moves the saving of r1 to the SPRG to the region where we
      are guaranteed not to take any further SLB misses.
      
      Fixes: 98ae22e1 ("powerpc: Add helper functions for transactional memory context switching")
      Cc: stable@vger.kernel.org # v3.9+
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Acked-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      (cherry picked from commit 6bcb8014)
      Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      2ba45ced
    • powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 · c91257e8
      Michael Neuling authored
      BugLink: http://bugs.launchpad.net/bugs/1606786
      
      Currently we have 2 segments that are bolted for the kernel linear
      mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
      stacks. Anything accessed outside of these regions may need to be
      faulted in. (In practice machines with TM always have 1T segments)
      
      If a machine has < 2TB of memory we never fault on the kernel linear
      mapping as these two segments cover all physical memory. If a machine
      has > 2TB of memory, there may be structures outside of these two
      segments that need to be faulted in. This faulting can occur when
      running as a guest as the hypervisor may remove any SLB that's not
      bolted.
      
      When we treclaim and trecheckpoint we have a window where we need to
      run with the userspace GPRs. This means that we no longer have a valid
      stack pointer in r1. For this window we therefore clear MSR RI to
      indicate that any exceptions taken at this point won't be able to be
      handled. This means that we can't take segment misses in this RI=0
      window.
      
      In this RI=0 region, we currently access the thread_struct for the
      process being context switched to or from. This thread_struct access
      may cause a segment fault since it's not guaranteed to be covered by
      the two bolted segment entries described above.
      
      We've seen this with a crash when running as a guest with > 2TB of
      memory on PowerVM:
      
        Unrecoverable exception 4100 at c00000000004f138
        Oops: Unrecoverable exception, sig: 6 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
        task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
        NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
        REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
        MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
        CFAR: c00000000004ed18 SOFTE: 0
        GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
        GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
        GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
        GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
        GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
        GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
        GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
        GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
        NIP [c00000000004f138] restore_gprs+0xd0/0x16c
        LR [0000000010003a24] 0x10003a24
        Call Trace:
        [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
        [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
        [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
        [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
        [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
        [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
        [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
        [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
        Instruction dump:
        7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
        38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
        ---[ end trace 602126d0a1dedd54 ]---
      
      This fixes this by copying the required data from the thread_struct to
      the stack before we clear MSR RI. Then once we clear RI, we only access
      the stack, guaranteeing there's no segment miss.
      
      We also tighten the region over which we set RI=0 on the treclaim()
      path. This may have a slight performance impact since we're adding an
      mtmsr instruction.
      
      Fixes: 090b9284 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      (cherry picked from commit 190ce869)
      Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      c91257e8
    • Kamal Mostafa's avatar
    • intel-vbtn: new driver for Intel Virtual Button · 48cfa68f
      AceLan Kao authored
      BugLink: https://bugs.launchpad.net/bugs/1609204
      
      This driver currently supports the power button event of Intel Virtual Button.
      The new Dell XPS 13 requires this driver for its power button.
      
      This driver is copied/modified from intel-hid.c
      Most credit goes to the author of intel-hid.c,
      Alex Hung <alex.hung@canonical.com>
      Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
      Signed-off-by: Darren Hart <dvhart@linux.intel.com>
      (cherry picked from commit 332e0812)
      Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
      Acked-by: Tim Gardner <tim.gardner@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      48cfa68f
    • x86/reboot: Add Dell Optiplex 7450 AIO reboot quirk · 0b194e1e
      Alex Hung authored
      BugLink: http://bugs.launchpad.net/bugs/1608762
      
      The Dell Optiplex 7450 AIO works with BOOT_ACPI; however, the quirk for
      "OptiPlex 745" also matches it, changes its boot method to BOOT_BIOS, and
      causes the 7450 AIO to hang when rebooting. A 7450 AIO entry is therefore
      appended to override BOOT_BIOS with BOOT_ACPI without breaking the
      original 745 series.
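
The override depends on how quirk entries are matched. A minimal sketch, assuming substring matching on the product name and that every matching entry is applied in table order, so a later entry wins; the table, enum, and function names are hypothetical and not taken from the kernel source:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical boot methods; names only mirror the commit text. */
enum boot_method_sketch { BOOT_ACPI_SKETCH, BOOT_BIOS_SKETCH };

struct reboot_quirk_sketch {
    const char *product_substr;       /* matched as a substring */
    enum boot_method_sketch method;
};

/* The more specific entry is appended after the older, broader one. */
static const struct reboot_quirk_sketch quirks[] = {
    { "OptiPlex 745",      BOOT_BIOS_SKETCH },
    { "OptiPlex 7450 AIO", BOOT_ACPI_SKETCH },
};

/* Apply every matching quirk in order; the last match decides. */
enum boot_method_sketch pick_boot_method(const char *product,
                                         enum boot_method_sketch dflt)
{
    enum boot_method_sketch method = dflt;

    for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
        if (strstr(product, quirks[i].product_substr))
            method = quirks[i].method;
    }
    return method;
}
```

Under these assumptions, "OptiPlex 7450 AIO" matches both entries and ends up with the ACPI method, while a plain "OptiPlex 745" still gets the BIOS method.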
      Signed-off-by: Alex Hung <alex.hung@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      (cherry picked from commit 4d581259)
      Signed-off-by: Alex Hung <alex.hung@canonical.com>
      Acked-by: Tim Gardner <tim.gardner@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      0b194e1e
    • UBUNTU: SAUCE: xhci: Fix soft lockup in xhci_pci_probe path when XHCI_STATE_HALTED · ed3d7f66
      Kamal Mostafa authored
      Kamal Mostafa authored
      Commit 27a41a83 ("xhci: Cleanup only when releasing primary hcd")
      causes a soft lockup at boot when XHCI_STATE_HALTED, preventing
      VirtualBox 5.1.x from booting if USB3.0 is enabled.
      
      Revert to allowing xhci_irq to handle the interrupt when
      XHCI_STATE_HALTED but not XHCI_STATE_DYING.
      
      Fixes: 27a41a83 ("xhci: Cleanup only when releasing primary hcd")
      BugLink: https://bugs.launchpad.net/bugs/1604058
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      Cc: <stable@vger.kernel.org> #v4.3+
      Cc: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Acked-by: Tim Gardner <tim.gardner@canonical.com>
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      ed3d7f66
    • block: defer timeouts to a workqueue · fc3bda55
      Christoph Hellwig authored
      BugLink: http://bugs.launchpad.net/bugs/1597908
      
      Timer context is not very useful for drivers to perform any meaningful abort
      action from.  So instead of calling the driver from this useless context
      defer it to a workqueue as soon as possible.
      
      Note that while a delayed_work item would seem the right thing here I didn't
      dare to use it due to the magic in blk_add_timer that pokes deep into timer
      internals.  But maybe this encourages Tejun to add a sensible API for that to
      the workqueue API and we'll all be fine in the end :)
      
      Contains a major update from Keith Busch:
      
      "This patch removes synchronizing the timeout work so that the timer can
       start a freeze on its own queue. The timer enters the queue, so timer
       context can only start a freeze, but not wait for frozen."
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      (cherry picked from commit 287922eb)
      Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
      Acked-by: Tim Gardner <tim.gardner@canonical.com>
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      fc3bda55
    • tunnels: Remove encapsulation offloads on decap. · fab0acde
      Jesse Gross authored
      If a packet is either locally encapsulated or processed through GRO
      it is marked with the offloads that it requires. However, when it is
      decapsulated these tunnel offload indications are not removed. This
      means that if we receive an encapsulated TCP packet, aggregate it with
      GRO, decapsulate, and retransmit the resulting frame on a NIC that does
      not support encapsulation, we won't be able to take advantage of hardware
      offloads even though it is just a simple TCP packet at this point.
      
      This fixes the problem by stripping off encapsulation offload indications
      when packets are decapsulated.
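
The idea of stripping the tunnel offload indications can be sketched with a plain bitmask. This is a minimal model, assuming a packet record with an offload bitfield; the flag names and values are hypothetical and are not the kernel's real GSO flags:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical offload bits, chosen only for illustration. */
#define OFFLOAD_TCP_SEG     (1u << 0)   /* plain TCP segmentation */
#define OFFLOAD_TUNNEL_GRE  (1u << 1)   /* GRE encapsulation */
#define OFFLOAD_TUNNEL_UDP  (1u << 2)   /* UDP-based tunnel (e.g. Geneve) */
#define OFFLOAD_ENCAP_ALL   (OFFLOAD_TUNNEL_GRE | OFFLOAD_TUNNEL_UDP)

struct pkt_sketch {
    uint32_t offloads;      /* offloads the packet still requires */
    bool encapsulated;
};

/*
 * Decapsulate: once the outer headers are gone, clear every
 * encapsulation-related offload bit and keep the inner ones.
 */
void decap_sketch(struct pkt_sketch *p)
{
    p->offloads &= ~OFFLOAD_ENCAP_ALL;
    p->encapsulated = false;
}
```

After decapsulation only the inner TCP offload remains set, so a NIC without tunnel support can still offload the frame.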
      
      The performance impacts of this bug are significant. In a test where a
      Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
      and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a
      result of avoiding unnecessary segmentation at the VM tap interface.
      Reported-by: Ramu Ramamurthy <sramamur@linux.vnet.ibm.com>
      Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
      Signed-off-by: Jesse Gross <jesse@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      BugLink: http://bugs.launchpad.net/bugs/1602755
      
      (backported from commit a09a4c8d)
      [adapt iptunnel_pull_header arguments, avoid 7f290c94]
      Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
      Acked-by: Andy Whitcroft <apw@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      fab0acde
    • be2net: perform temperature query in adapter regardless of its interface state · cd2bd529
      Guilherme G. Piccoli authored
      BugLink: http://bugs.launchpad.net/bugs/1607387
      
      The be2net driver performs fw temperature queries in the be_worker() routine,
      which is executed each second for each be_adapter. A frequency threshold
      keeps the fw query from happening on every call to be_worker(); instead,
      a fw query currently occurs once in every 64 runs of the procedure.
      
      Nevertheless, this fw temperature query is invoked only for adapters whose
      interface is up, so reads of hwmon counters from userspace (from tools like
      lm-sensors) can return I/O errors when an adapter function's interface
      is down.
      
      This patch moves the fw query code to be invoked even if interface is down.
      No functional changes were introduced.
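
The reordering can be sketched as a periodic worker in which the temperature query runs before the interface-state check. This is a minimal sketch, assuming a per-adapter run counter; the struct, field, and function names are hypothetical and not the be2net code:

```c
#include <stdbool.h>

/* From the commit text: one fw query per 64 runs of the worker. */
#define TEMP_QUERY_INTERVAL 64u

struct adapter_sketch {
    bool if_up;                  /* is the network interface running? */
    unsigned int work_counter;   /* number of worker runs so far */
    unsigned int temp_queries;   /* stands in for the fw temperature query */
    unsigned int stats_updates;  /* stands in for work that needs the link */
};

/* One run of the periodic worker (once per second in the driver). */
void worker_sketch(struct adapter_sketch *a)
{
    /* The temperature query no longer depends on interface state. */
    if (a->work_counter % TEMP_QUERY_INTERVAL == 0)
        a->temp_queries++;
    a->work_counter++;

    if (!a->if_up)
        return;   /* the remaining periodic work needs a running interface */

    a->stats_updates++;
}
```

With the interface down, 128 runs of this worker still issue two temperature queries, while the interface-dependent work is skipped.
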
      Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Acked-by: Sathya Perla <sathya.perla@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      (cherry picked from commit d3480615)
      Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      cd2bd529
    • r8152: Add support for setting pass through MAC address on RTL8153-AD · 4aff085c
      Mario Limonciello authored
      BugLink: http://bugs.launchpad.net/bugs/1579984
      
      The RTL8153-AD supports a persistent system specific MAC address.
      This means a device plugged into two different systems with host side
      support will show different (but persistent) MAC addresses.
      
      This information for the system's persistent MAC address is burned in when
      the system HW is built and available under \_SB.AMAC in the DSDT at runtime.
      
      This technology is currently implemented in the Dell TB15 and WD15 Type-C
      docks.  More information is available here:
      http://www.dell.com/support/article/us/en/04/SLN301147
      
      Signed-off-by: Mario Limonciello <mario_limonciello@dell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      (back ported from commit 34ee32c9)
      Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
      
       Conflicts:
      	drivers/net/usb/r8152.c
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      4aff085c
    • Driver: Vmxnet3: set CHECKSUM_UNNECESSARY for IPv6 packets · 0947359f
      Shrikrishna Khare authored
      BugLink: http://bugs.launchpad.net/bugs/1605494
      
      For IPv6, if the device indicates that the checksum is correct, set
      CHECKSUM_UNNECESSARY.
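
A rough model of the receive-side decision follows. It is a sketch only, assuming a descriptor that reports the protocol and whether the device verified the checksums; the field and enum names are hypothetical and not the vmxnet3 descriptor layout:

```c
#include <stdbool.h>

/* Hypothetical checksum states; names only mirror the commit text. */
enum csum_state_sketch { CSUM_NONE_SKETCH, CSUM_UNNECESSARY_SKETCH };

/* Hypothetical summary of what the device reported for a packet. */
struct rx_desc_sketch {
    bool is_ipv4;
    bool is_ipv6;
    bool ip_csum_ok;   /* IPv4 header checksum verified by the device */
    bool l4_csum_ok;   /* TCP/UDP checksum verified by the device */
};

/* Decide whether the stack can skip software checksum verification. */
enum csum_state_sketch rx_csum_sketch(const struct rx_desc_sketch *d)
{
    if (d->is_ipv4 && d->ip_csum_ok && d->l4_csum_ok)
        return CSUM_UNNECESSARY_SKETCH;

    /* IPv6 has no header checksum, so only the L4 result matters. */
    if (d->is_ipv6 && d->l4_csum_ok)
        return CSUM_UNNECESSARY_SKETCH;

    return CSUM_NONE_SKETCH;
}
```
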
      Reported-by: Subbarao Narahari <snarahari@vmware.com>
      Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
      Signed-off-by: Jin Heo <heoj@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      (back ported from commit f0d43780)
      Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
      
       Conflicts:
      	drivers/net/vmxnet3/vmxnet3_int.h
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      0947359f
    • UBUNTU: SAUCE: lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt() · 31c09c42
      Mauricio Faria de Oliveira authored
      BugLink: http://bugs.launchpad.net/bugs/1597974
      
      The lpfc_sli4_scmd_to_wqidx_distr() function expects the scsi_cmnd
      'lpfc_cmd->pCmd' not to be null, and point to the midlayer command.
      
      That's not true in the .eh_(device|target|bus)_reset_handler path,
      because lpfc_send_taskmgmt() sends commands not from the midlayer,
      so does not set 'lpfc_cmd->pCmd'.
      
      That is true in the .queuecommand path because lpfc_queuecommand()
      stores the scsi_cmnd from midlayer in lpfc_cmd->pCmd; and lpfc_cmd
      is stored by lpfc_scsi_prep_cmnd() in piocbq->context1 -- which is
      passed to lpfc_sli4_scmd_to_wqidx_distr() as lpfc_cmd parameter.
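
The two paths differ only in whether the midlayer command pointer is set, so one way to make the lookup safe is to check it before dereferencing. This is a minimal sketch of that guard, not the actual patch; the types, the hw_queue field, and the fallback index are hypothetical:

```c
#include <stddef.h>

/* Hypothetical stand-in for the midlayer command. */
struct scsi_cmnd_sketch {
    unsigned int hw_queue;   /* illustrative queue hint from the midlayer */
};

/* Hypothetical stand-in for the driver command that wraps it. */
struct lpfc_cmd_sketch {
    struct scsi_cmnd_sketch *pCmd;   /* NULL for task management commands */
};

#define DEFAULT_WQ_INDEX_SKETCH 0u

/*
 * Pick a work queue index. Commands built by the reset handlers do not
 * come from the midlayer, so pCmd may be NULL and must not be dereferenced.
 */
unsigned int pick_wq_index(const struct lpfc_cmd_sketch *lpfc_cmd)
{
    if (lpfc_cmd && lpfc_cmd->pCmd)
        return lpfc_cmd->pCmd->hw_queue;

    return DEFAULT_WQ_INDEX_SKETCH;
}
```

In this model a command from the .queuecommand path uses the midlayer's hint, while a task management command falls back to the default index instead of faulting.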
      
      This problem can be hit on SCSI EH, and immediately with sg_reset.
      These 2 test-cases demonstrate the problem/fix with next-20160601.
      
      Test-case 1) sg_reset
      
          # strace sg_reset --device /dev/sdm
          <...>
          open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
          ioctl(3, SG_SCSI_RESET, 0x3fffde6d0994 <unfinished ...>
          +++ killed by SIGSEGV +++
          Segmentation fault
      
          # dmesg
          Unable to handle kernel paging request for data at address 0x00000000
          Faulting instruction address: 0xd00000001c88442c
          Oops: Kernel access of bad area, sig: 11 [#1]
          <...>
          CPU: 104 PID: 16333 Comm: sg_reset Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
          <...>
          NIP [d00000001c88442c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
          LR [d00000001c826fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
          Call Trace:
          [c000003c9ec876f0] [c000003c9ec87770] 0xc000003c9ec87770 (unreliable)
          [c000003c9ec87720] [d00000001c82e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
          [c000003c9ec87780] [d00000001c831a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
          [c000003c9ec87880] [d00000001c87f27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
          [c000003c9ec87950] [d00000001c87fd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
          [c000003c9ec87a10] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
          [c000003c9ec87a40] [c0000000006113e8] scsi_ioctl_reset+0x198/0x2c0
          [c000003c9ec87bf0] [c00000000060fe5c] scsi_ioctl+0x13c/0x4b0
          [c000003c9ec87c80] [c0000000006629b0] sd_ioctl+0xf0/0x120
          [c000003c9ec87cd0] [c00000000046e4f8] blkdev_ioctl+0x248/0xb70
          [c000003c9ec87d30] [c0000000002a1f60] block_ioctl+0x70/0x90
          [c000003c9ec87d50] [c00000000026d334] do_vfs_ioctl+0xc4/0x890
          [c000003c9ec87de0] [c00000000026db60] SyS_ioctl+0x60/0xc0
          [c000003c9ec87e30] [c000000000009120] system_call+0x38/0x108
          Instruction dump:
          <...>
      
          With fix:
      
          # strace sg_reset --device /dev/sdm
          <...>
          open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
          ioctl(3, SG_SCSI_RESET, 0x3fffe103c554) = 0
          close(3)                                = 0
          exit_group(0)                           = ?
          +++ exited with 0 +++
      
          # dmesg
          [  424.658649] lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002
      
      Test-case 2) SCSI EH
      
          Using this debug patch to wire an SCSI EH trigger, for lpfc_scsi_cmd_iocb_cmpl():
          -       cmd->scsi_done(cmd);
          +       if ((phba->pport ? phba->pport->cfg_log_verbose : phba->cfg_log_verbose) == 0x32100000)
          +               printk(KERN_ALERT "lpfc: skip scsi_done()\n");
          +       else
          +               cmd->scsi_done(cmd);
      
          # echo 0x32100000 > /sys/class/scsi_host/host11/lpfc_log_verbose
      
          # dd if=/dev/sdm of=/dev/null iflag=direct &
          <...>
      
          After a while:
      
          # dmesg
          lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
          lpfc: skip scsi_done()
          <...>
          Unable to handle kernel paging request for data at address 0x00000000
          Faulting instruction address: 0xd0000000199e448c
          Oops: Kernel access of bad area, sig: 11 [#1]
          <...>
          CPU: 96 PID: 28556 Comm: scsi_eh_11 Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
          <...>
          NIP [d0000000199e448c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
          LR [d000000019986fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
          Call Trace:
          [c000000ff0d0b890] [c000000ff0d0b900] 0xc000000ff0d0b900 (unreliable)
          [c000000ff0d0b8c0] [d00000001998e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
          [c000000ff0d0b920] [d000000019991a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
          [c000000ff0d0ba20] [d0000000199df27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
          [c000000ff0d0baf0] [d0000000199dfd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
          [c000000ff0d0bbb0] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
          [c000000ff0d0bbe0] [c0000000006126cc] scsi_eh_ready_devs+0x49c/0x9c0
          [c000000ff0d0bcb0] [c000000000614160] scsi_error_handler+0x580/0x680
          [c000000ff0d0bd80] [c0000000000ae848] kthread+0x108/0x130
          [c000000ff0d0be30] [c0000000000094a8] ret_from_kernel_thread+0x5c/0xb4
          Instruction dump:
          <...>
      
          With fix:
      
          # dmesg
          lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
          lpfc: skip scsi_done()
          <...>
          lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (0, 0) return x2002
          <...>
          lpfc 0006:01:00.4: 4:(0):0723 SCSI layer issued Target Reset (1, 0) return x2002
          <...>
          lpfc 0006:01:00.4: 4:(0):0714 SCSI layer issued Bus Reset Data: x2002
          <...>
          lpfc 0006:01:00.4: 4:(0):3172 SCSI layer issued Host Reset Data:
          <...>
      
      Fixes: 8b0dff14 ("lpfc: Add support for using block multi-queue")
      Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
      Acked-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      31c09c42