1. 16 Oct, 2016 19 commits
    • Dan Williams's avatar
      x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation · 7216e7a2
      Dan Williams authored
      commit 917db484 upstream.
      
      In commit:
      
        ec776ef6 ("x86/mm: Add support for the non-standard protected e820 type")
      
      Christoph references the original patch I wrote implementing pmem support.
      The intent of the 'max_pfn' changes in that commit were to enable persistent
      memory ranges to be covered by the struct page memmap by default.
      
      However, that approach was abandoned when Christoph ported the patches [1], and
      that functionality has since been replaced by devm_memremap_pages().
      
      In the meantime, this max_pfn manipulation is confusing kdump [2] that
      assumes that everything covered by the max_pfn is "System RAM".  This
      results in kdump hanging or crashing.
      
       [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-March/000348.html
       [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1351098
      
      So fix it.
      Reported-by: default avatarZhang Yi <yizhan@redhat.com>
      Reported-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Tested-by: default avatarZhang Yi <yizhan@redhat.com>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Reviewed-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-nvdimm@lists.01.org
      Fixes: ec776ef6 ("x86/mm: Add support for the non-standard protected e820 type")
      Link: http://lkml.kernel.org/r/147448744538.34910.11287693517367139607.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7216e7a2
    • Mark Rutland's avatar
      arm64: fix dump_backtrace/unwind_frame with NULL tsk · f6dcaea9
      Mark Rutland authored
      commit b5e7307d upstream.
      
      In some places, dump_backtrace() is called with a NULL tsk parameter,
      e.g. in bug_handler() in arch/arm64, or indirectly via show_stack() in
      core code. The expectation is that this is treated as if current were
      passed instead of NULL. Similar is true of unwind_frame().
      
      Commit a80a0eb7 ("arm64: make irq_stack_ptr more robust") didn't
      take this into account. In dump_backtrace() it compares tsk against
      current *before* we check if tsk is NULL, and in unwind_frame() we never
      set tsk if it is NULL.
      
      Due to this, we won't initialise irq_stack_ptr in either function. In
      dump_backtrace() this results in calling dump_mem() for memory
      immediately above the IRQ stack range, rather than for the relevant
      range on the task stack. In unwind_frame we'll reject unwinding frames
      on the IRQ stack.
      
      In either case this results in incomplete or misleading backtrace
      information, but is not otherwise problematic. The initial percpu areas
      (including the IRQ stacks) are allocated in the linear map, and dump_mem
      uses __get_user(), so we shouldn't access anything with side-effects,
      and will handle holes safely.
      
      This patch fixes the issue by having both functions handle the NULL tsk
      case before doing anything else with tsk.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Fixes: a80a0eb7 ("arm64: make irq_stack_ptr more robust")
      Acked-by: default avatarJames Morse <james.morse@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f6dcaea9
    • Dan Carpenter's avatar
      KVM: PPC: BookE: Fix a sanity check · 922ef422
      Dan Carpenter authored
      commit ac0e89bb upstream.
      
      We use logical negate where bitwise negate was intended.  It means that
      we never return -EINVAL here.
      
      Fixes: ce11e48b ('KVM: PPC: E500: Add userspace debug stub support')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: default avatarAlexander Graf <agraf@suse.de>
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      922ef422
    • Christoffer Dall's avatar
      KVM: arm/arm64: vgic: Don't flush/sync without a working vgic · cf75ec87
      Christoffer Dall authored
      commit 0099b770 upstream.
      
      If the vgic hasn't been created and initialized, we shouldn't attempt to
      look at its data structures or flush/sync anything to the GIC hardware.
      
      This fixes an issue reported by Alexander Graf when using a userspace
      irqchip.
      
      Fixes: 0919e84c ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
      Reported-by: default avatarAlexander Graf <agraf@suse.de>
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf75ec87
    • Christoffer Dall's avatar
      KVM: arm64: Require in-kernel irqchip for PMU support · 6b30b92d
      Christoffer Dall authored
      commit 6fe407f2 upstream.
      
      If userspace creates a PMU for the VCPU, but doesn't create an in-kernel
      irqchip, then we end up in a nasty path where we try to take an
      uninitialized spinlock, which can lead to all sorts of breakages.
      
      Luckily, QEMU always creates the VGIC before the PMU, so we can
      establish this as ABI and check for the VGIC in the PMU init stage.
      This can be relaxed at a later time if we want to support PMU with a
      userspace irqchip.
      
      Cc: Shannon Zhao <shannon.zhao@linaro.org>
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b30b92d
    • James Hogan's avatar
      KVM: MIPS: Drop other CPU ASIDs on guest MMU changes · 7398f669
      James Hogan authored
      commit 91e4f1b6 upstream.
      
      When a guest TLB entry is replaced by TLBWI or TLBWR, we only invalidate
      TLB entries on the local CPU. This doesn't work correctly on an SMP host
      when the guest is migrated to a different physical CPU, as it could pick
      up stale TLB mappings from the last time the vCPU ran on that physical
      CPU.
      
      Therefore invalidate both user and kernel host ASIDs on other CPUs,
      which will cause new ASIDs to be generated when it next runs on those
      CPUs.
      
      We're careful only to do this if the TLB entry was already valid, and
      only for the kernel ASID where the virtual address it mapped is outside
      of the guest user address range.
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7398f669
    • Thomas Huth's avatar
      KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register · 0b7d6a74
      Thomas Huth authored
      commit fa73c3b2 upstream.
      
      The MMCR2 register is available twice, one time with number 785
      (privileged access), and one time with number 769 (unprivileged,
      but it can be disabled completely). In former times, the Linux
      kernel was using the unprivileged register 769 only, but since
      commit 8dd75ccb ("powerpc: Use privileged SPR number
      for MMCR2"), it uses the privileged register 785 instead.
      The KVM-PR code then of course also switched to use the SPR 785,
      but this is causing older guest kernels to crash, since these
      kernels still access 769 instead. So to support older kernels
      with KVM-PR again, we have to support register 769 in KVM-PR, too.
      
      Fixes: 8dd75ccbSigned-off-by: default avatarThomas Huth <thuth@redhat.com>
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0b7d6a74
    • Boris Ostrovsky's avatar
      xen/x86: Update topology map for PV VCPUs · ed17bfe0
      Boris Ostrovsky authored
      commit a6a198bc upstream.
      
      Early during boot topology_update_package_map() computes
      logical_pkg_ids for all present processors.
      
      Later, when processors are brought up, identify_cpu() updates
      these values based on phys_pkg_id which is a function of
      initial_apicid. On PV guests the latter may point to a
      non-existing node, causing logical_pkg_ids to be set to -1.
      
      Intel's RAPL uses logical_pkg_id (as topology_logical_package_id())
      to index its arrays and therefore in this case will point to index
      65535 (since logical_pkg_id is a u16). This could lead to either a
      crash or may actually access random memory location.
      
      As a workaround, we recompute topology during CPU bringup to reset
      logical_pkg_id to a valid value.
      
      (The reason for initial_apicid being bogus is because it is
      initial_apicid of the processor from which the guest is launched.
      This value is CPUID(1).EBX[31:24])
      Signed-off-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed17bfe0
    • Uwe Kleine-König's avatar
      mfd: wm8350-i2c: Make sure the i2c regmap functions are compiled · d074f937
      Uwe Kleine-König authored
      commit 88003fb1 upstream.
      
      This fixes a compile failure:
      
      	drivers/built-in.o: In function `wm8350_i2c_probe':
      	core.c:(.text+0x828b0): undefined reference to `__devm_regmap_init_i2c'
      	Makefile:953: recipe for target 'vmlinux' failed
      
      Fixes: 52b461b8 ("mfd: Add regmap cache support for wm8350")
      Signed-off-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Acked-by: default avatarCharles Keepax <ckeepax@opensource.wolfsonmicro.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d074f937
    • Dan Carpenter's avatar
      mfd: 88pm80x: Double shifting bug in suspend/resume · 21c02f87
      Dan Carpenter authored
      commit 9a6dc644 upstream.
      
      set_bit() and clear_bit() take the bit number so this code is really
      doing "1 << (1 << irq)" which is a double shift bug.  It's done
      consistently so it won't cause a problem unless "irq" is more than 4.
      
      Fixes: 70c6cce0 ('mfd: Support 88pm80x in 80x driver')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21c02f87
    • Boris Brezillon's avatar
      mfd: atmel-hlcdc: Do not sleep in atomic context · db910738
      Boris Brezillon authored
      commit 2c2469bc upstream.
      
      readl_poll_timeout() calls usleep_range(), but
      regmap_atmel_hlcdc_reg_write() is called in atomic context (regmap
      spinlock held).
      
      Replace the readl_poll_timeout() call by readl_poll_timeout_atomic().
      
      Fixes: ea31c0cf ("mfd: atmel-hlcdc: Implement config synchronization")
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db910738
    • Lu Baolu's avatar
      mfd: rtsx_usb: Avoid setting ucr->current_sg.status · 6fb0ccbb
      Lu Baolu authored
      commit 8dcc5ff8 upstream.
      
      Member "status" of struct usb_sg_request is managed by usb core. A
      spin lock is used to serialize the change of it. The driver could
      check the value of req->status, but should avoid changing it without
      the hold of the spinlock. Otherwise, it could cause race or error
      in usb core.
      
      This patch could be backported to stable kernels with version later
      than v3.14.
      
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Roger Tseng <rogerable@realtek.com>
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6fb0ccbb
    • Takashi Sakamoto's avatar
      ALSA: usb-line6: use the same declaration as definition in header for MIDI manufacturer ID · be794eba
      Takashi Sakamoto authored
      commit 8da08ca0 upstream.
      
      Currently, usb-line6 module exports an array of MIDI manufacturer ID and
      usb-pod module uses it. However, the declaration is not the definition in
      common header. The difference is explicit length of array. Although
      compiler calculates it and everything goes well, it's better to use the
      same representation between definition and declaration.
      
      This commit fills the length of array for usb-line6 module. As a small
      good sub-effect, this commit suppress below warnings from static analysis
      by sparse v0.5.0.
      
      sound/usb/line6/driver.c:274:43: error: cannot size expression
      sound/usb/line6/driver.c:275:16: error: cannot size expression
      sound/usb/line6/driver.c:276:16: error: cannot size expression
      sound/usb/line6/driver.c:277:16: error: cannot size expression
      
      Fixes: 705ececd ("Staging: add line6 usb driver")
      Signed-off-by: default avatarTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      be794eba
    • Anssi Hannula's avatar
      ALSA: usb-audio: Extend DragonFly dB scale quirk to cover other variants · 972cf5ca
      Anssi Hannula authored
      commit eb1a74b7 upstream.
      
      The DragonFly quirk added in 42e3121d ("ALSA: usb-audio: Add a more
      accurate volume quirk for AudioQuest DragonFly") applies a custom dB map
      on the volume control when its range is reported as 0..50 (0 .. 0.2dB).
      
      However, there exists at least one other variant (hw v1.0c, as opposed
      to the tested v1.2) which reports a different non-sensical volume range
      (0..53) and the custom map is therefore not applied for that device.
      
      This results in all of the volume change appearing close to 100% on
      mixer UIs that utilize the dB TLV information.
      
      Add a fallback case where no dB TLV is reported at all if the control
      range is not 0..50 but still 0..N where N <= 1000 (3.9 dB). Also
      restrict the quirk to only apply to the volume control as there is also
      a mute control which would match the check otherwise.
      
      Fixes: 42e3121d ("ALSA: usb-audio: Add a more accurate volume quirk for AudioQuest DragonFly")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@iki.fi>
      Reported-by: default avatarDavid W <regulars@d-dub.org.uk>
      Tested-by: default avatarDavid W <regulars@d-dub.org.uk>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      972cf5ca
    • Takashi Iwai's avatar
      ALSA: ali5451: Fix out-of-bound position reporting · 8af40120
      Takashi Iwai authored
      commit db685779 upstream.
      
      The pointer callbacks of ali5451 driver may return the value at the
      boundary occasionally, and it results in the kernel warning like
        snd_ali5451 0000:00:06.0: BUG: , pos = 16384, buffer size = 16384, period size = 1024
      
      It seems that folding the position offset is enough for fixing the
      warning and no ill-effect has been seen by that.
      Reported-by: default avatarEnrico Mioso <mrkiko.rs@gmail.com>
      Tested-by: default avatarEnrico Mioso <mrkiko.rs@gmail.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8af40120
    • Lu Baolu's avatar
      usb: dwc3: fix Clear Stall EP command failure · 2a25eb7a
      Lu Baolu authored
      commit 5e6c88d2 upstream.
      
      Commit 50c763f8 ("usb: dwc3: Set the ClearPendIN bit on Clear
      Stall EP command") sets ClearPendIN bit for all IN endpoints of
      v2.60a+ cores. This causes ClearStall command fails on 2.60+ cores
      operating in HighSpeed mode.
      
      In page 539 of 2.60a specification:
      
      "When issuing Clear Stall command for IN endpoints in SuperSpeed
      mode, the software must set the "ClearPendIN" bit to '1' to
      clear any pending IN transcations, so that the device does not
      expect any ACK TP from the host for the data sent earlier."
      
      It's obvious that we only need to apply this rule to those IN
      endpoints that currently operating in SuperSpeed mode.
      
      Fixes: 50c763f8 ("usb: dwc3: Set the ClearPendIN bit on Clear Stall EP command")
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2a25eb7a
    • John Stultz's avatar
      timekeeping: Fix __ktime_get_fast_ns() regression · 7ea90daa
      John Stultz authored
      commit 58bfea95 upstream.
      
      In commit 27727df2 ("Avoid taking lock in NMI path with
      CONFIG_DEBUG_TIMEKEEPING"), I changed the logic to open-code
      the timekeeping_get_ns() function, but I forgot to include
      the unit conversion from cycles to nanoseconds, breaking the
      function's output, which impacts users like perf.
      
      This results in bogus perf timestamps like:
       swapper     0 [000]   253.427536:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.426573:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.426687:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.426800:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.426905:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427022:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427127:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427239:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427346:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   254.427463:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]   255.426572:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
      
      Instead of more reasonable expected timestamps like:
       swapper     0 [000]    39.953768:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.064839:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.175956:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.287103:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.398217:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.509324:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.620437:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.731546:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.842654:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    40.953772:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
       swapper     0 [000]    41.064881:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
      
      Add the proper use of timekeeping_delta_to_ns() to convert
      the cycle delta to nanoseconds as needed.
      
      Thanks to Brendan and Alexei for finding this quickly after
      the v4.8 release. Unfortunately the problematic commit has
      landed in some -stable trees so they'll need this fix as
      well.
      
      Many apologies for this mistake. I'll be looking to add a
      perf-clock sanity test to the kselftest timers tests soon.
      
      Fixes: 27727df2 "timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING"
      Reported-by: default avatarBrendan Gregg <bgregg@netflix.com>
      Reported-by: default avatarAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Tested-and-reviewed-by: default avatarMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1475636148-26539-1-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ea90daa
    • Andrew Donnellan's avatar
      cxl: use pcibios_free_controller_deferred() when removing vPHBs · 3e08bc37
      Andrew Donnellan authored
      commit 6f38a8b9 upstream.
      
      When cxl removes a vPHB, it's possible that the pci_controller may be freed
      before all references to the devices on the vPHB have been released. This
      in turn causes an invalid memory access when the devices are eventually
      released, as pcibios_release_device() attempts to call the phb's
      release_device hook.
      
      In cxl_pci_vphb_remove(), remove the existing call to
      pcibios_free_controller(). Instead, use
      pcibios_free_controller_deferred() to free the pci_controller after all
      devices have been released. Export pci_set_host_bridge_release() so we can
      do this.
      Signed-off-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Acked-by: default avatarIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e08bc37
    • Mauricio Faria de Oliveira's avatar
      powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb) · 83573add
      Mauricio Faria de Oliveira authored
      commit 2dd9c11b upstream.
      
      This patch leverages 'struct pci_host_bridge' from the PCI subsystem
      in order to free the pci_controller only after the last reference to
      its devices is dropped (avoiding an oops in pcibios_release_device()
      if the last reference is dropped after pcibios_free_controller()).
      
      The patch relies on pci_host_bridge.release_fn() (and .release_data),
      which is called automatically by the PCI subsystem when the root bus
      is released (i.e., the last reference is dropped).  Those fields are
      set via pci_set_host_bridge_release() (e.g. in the platform-specific
      implementation of pcibios_root_bridge_prepare()).
      
      It introduces the 'pcibios_free_controller_deferred()' .release_fn()
      and it expects .release_data to hold a pointer to the pci_controller.
      
      The function implictly calls 'pcibios_free_controller()', so an user
      must *NOT* explicitly call it if using the new _deferred() callback.
      
      The functionality is enabled for pseries (although it isn't platform
      specific, and may be used by cxl).
      
      Details on not-so-elegant design choices:
      
       - Use 'pci_host_bridge.release_data' field as pointer to associated
         'struct pci_controller' so *not* to 'pci_bus_to_host(bridge->bus)'
         in pcibios_free_controller_deferred().
      
         That's because pci_remove_root_bus() sets 'host_bridge->bus = NULL'
         (so, if the last reference is released after pci_remove_root_bus()
         runs, which eventually reaches pcibios_free_controller_deferred(),
         that would hit a null pointer dereference).
      
         The cxl/vphb.c code calls pci_remove_root_bus(), and the cxl folks
         are interested in this fix.
      
      Test-case #1 (hold references)
      
        # ls -ld /sys/block/sd* | grep -m1 0021:01:00.0
        <...> /sys/block/sdaa -> ../devices/pci0021:01/0021:01:00.0/<...>
      
        # ls -ld /sys/block/sd* | grep -m1 0021:01:00.1
        <...> /sys/block/sdab -> ../devices/pci0021:01/0021:01:00.1/<...>
      
        # cat >/dev/sdaa & pid1=$!
        # cat >/dev/sdab & pid2=$!
      
        # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
        Validating PHB DLPAR capability...yes.
        [  594.306719] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
        [  594.306738] pci_hp_remove_devices:    Removing 0021:01:00.0...
        ...
        [  598.236381] pci_hp_remove_devices:    Removing 0021:01:00.1...
        ...
        [  611.972077] pci_bus 0021:01: busn_res: [bus 01-ff] is released
        [  611.972140] rpadlpar_io: slot PHB 33 removed
      
        # kill -9 $pid1
        # kill -9 $pid2
        [  632.918088] pcibios_free_controller_deferred: domain 33, dynamic 1
      
      Test-case #2 (don't hold references)
      
        # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
        Validating PHB DLPAR capability...yes.
        [  916.357363] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
        [  916.357386] pci_hp_remove_devices:    Removing 0021:01:00.0...
        ...
        [  920.566527] pci_hp_remove_devices:    Removing 0021:01:00.1...
        ...
        [  933.955873] pci_bus 0021:01: busn_res: [bus 01-ff] is released
        [  933.955977] pcibios_free_controller_deferred: domain 33, dynamic 1
        [  933.955999] rpadlpar_io: slot PHB 33 removed
      Suggested-By: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: default avatarMauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Reviewed-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> # cxl
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      83573add
  2. 07 Oct, 2016 21 commits