1. 28 Oct, 2016 40 commits
    • Will Deacon's avatar
      arm64: KVM: Take S1 walks into account when determining S2 write faults · d08622e1
      Will Deacon authored
      commit 60e21a0e upstream.
      
      The WnR bit in the HSR/ESR_EL2 indicates whether a data abort was
      generated by a read or a write instruction. For stage 2 data aborts
      generated by a stage 1 translation table walk (i.e. the actual page
      table access faults at EL2), the WnR bit therefore reports whether the
      instruction generating the walk was a load or a store, *not* whether the
      page table walker was reading or writing the entry.
      
      For page tables marked as read-only at stage 2 (e.g. due to KSM merging
      them with the tables from another guest), this could result in livelock,
      where a page table walk generated by a load instruction attempts to
      set the access flag in the stage 1 descriptor, but fails to trigger
      CoW in the host since only a read fault is reported.
      
      This patch modifies the arm64 kvm_vcpu_dabt_iswrite function to
      take into account stage 2 faults in stage 1 walks. Since DBM cannot be
      disabled at EL2 for CPUs that implement it, we assume that these faults
      are always causes by writes, avoiding the livelock situation at the
      expense of occasional, spurious CoWs.
      
      We could, in theory, do a bit better by checking the guest TCR
      configuration and inspecting the page table to see why the PTE faulted.
      However, I doubt this is measurable in practice, and the threat of
      livelock is real.
      
      Cc: Julien Grall <julien.grall@arm.com>
      Reviewed-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d08622e1
    • Andre Przywara's avatar
      arm64: Cortex-A53 errata workaround: check for kernel addresses · ac591c10
      Andre Przywara authored
      commit 87261d19 upstream.
      
      Commit 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on
      errata-affected core") adds code to execute cache maintenance instructions
      in the kernel on behalf of userland on CPUs with certain ARM CPU errata.
      It turns out that the address hasn't been checked to be a valid user
      space address, allowing userland to clean cache lines in kernel space.
      Fix this by introducing an address check before executing the
      instructions on behalf of userland.
      
      Since the address doesn't come via a syscall parameter, we can't just
      reject tagged pointers and instead have to remove the tag when checking
      against the user address limit.
      
      Fixes: 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
      Reported-by: default avatarKristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: default avatarAndre Przywara <andre.przywara@arm.com>
      [will: rework commit message + replace access_ok with max_user_addr()]
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac591c10
    • Marc Zyngier's avatar
      arm64: kernel: Init MDCR_EL2 even in the absence of a PMU · 32c0a66f
      Marc Zyngier authored
      commit 85054035 upstream.
      
      Commit f436b2ac ("arm64: kernel: fix architected PMU registers
      unconditional access") made sure we wouldn't access unimplemented
      PMU registers, but also left MDCR_EL2 uninitialized in that case,
      leading to trap bits being potentially left set.
      
      Make sure we always write something in that register.
      
      Fixes: f436b2ac ("arm64: kernel: fix architected PMU registers unconditional access")
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      32c0a66f
    • Will Deacon's avatar
      arm64: percpu: rewrite ll/sc loops in assembly · 69e1b2d4
      Will Deacon authored
      commit 1e6e57d9 upstream.
      
      Writing the outer loop of an LL/SC sequence using do {...} while
      constructs potentially allows the compiler to hoist memory accesses
      between the STXR and the branch back to the LDXR. On CPUs that do not
      guarantee forward progress of LL/SC loops when faced with memory
      accesses to the same ERG (up to 2k) between the failed STXR and the
      branch back, we may end up livelocking.
      
      This patch avoids this issue in our percpu atomics by rewriting the
      outer loop as part of the LL/SC inline assembly block.
      
      Fixes: f97fc810 ("arm64: percpu: Implement this_cpu operations")
      Reviewed-by: default avatarMark Rutland <mark.rutland@arm.com>
      Tested-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      69e1b2d4
    • Ard Biesheuvel's avatar
      arm64: kaslr: fix breakage with CONFIG_MODVERSIONS=y · f1816cea
      Ard Biesheuvel authored
      commit 9c0e83c3 upstream.
      
      As it turns out, the KASLR code breaks CONFIG_MODVERSIONS, since the
      kcrctab has an absolute address field that is relocated at runtime
      when the kernel offset is randomized.
      
      This has been fixed already for PowerPC in the past, so simply wire up
      the existing code dealing with this issue.
      
      Fixes: f80fb3a3 ("arm64: add support for kernel ASLR")
      Tested-by: default avatarTimur Tabi <timur@codeaurora.org>
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f1816cea
    • Will Deacon's avatar
      arm64: swp emulation: bound LL/SC retries before rescheduling · 44870b14
      Will Deacon authored
      commit 1c5b51df upstream.
      
      If a CPU does not implement a global monitor for certain memory types,
      then userspace can attempt a kernel DoS by issuing SWP instructions
      targetting the problematic memory (for example, a framebuffer mapped
      with non-cacheable attributes).
      
      The SWP emulation code protects against these sorts of attacks by
      checking for pending signals and potentially rescheduling when the STXR
      instruction fails during the emulation. Whilst this is good for avoiding
      livelock, it harms emulation of legitimate SWP instructions on CPUs
      where forward progress is not guaranteed if there are memory accesses to
      the same reservation granule (up to 2k) between the failing STXR and
      the retry of the LDXR.
      
      This patch solves the problem by retrying the STXR a bounded number of
      times (4) before breaking out of the LL/SC loop and looking for
      something else to do.
      
      Fixes: bd35a4ad ("arm64: Port SWP/SWPB emulation support from arm")
      Reviewed-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      44870b14
    • Ulf Hansson's avatar
      memstick: rtsx_usb_ms: Manage runtime PM when accessing the device · 5e09db27
      Ulf Hansson authored
      commit 9158cb29 upstream.
      
      Accesses to the rtsx usb device, which is the parent of the rtsx memstick
      device, must not be done unless it's runtime resumed. This is currently not
      the case and it could trigger various errors.
      
      Fix this by properly deal with runtime PM in this regards. This means
      making sure the device is runtime resumed, when serving requests via the
      ->request() callback or changing settings via the ->set_param() callbacks.
      
      Cc: Ritesh Raj Sarraf <rrs@researchut.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e09db27
    • Alan Stern's avatar
      memstick: rtsx_usb_ms: Runtime resume the device when polling for cards · 959288b7
      Alan Stern authored
      commit 796aa46a upstream.
      
      Accesses to the rtsx usb device, which is the parent of the rtsx memstick
      device, must not be done unless it's runtime resumed.
      
      Therefore when the rtsx_usb_ms driver polls for inserted memstick cards,
      let's add pm_runtime_get|put*() to make sure accesses is done when the
      rtsx usb device is runtime resumed.
      Reported-by: default avatarRitesh Raj Sarraf <rrs@researchut.com>
      Tested-by: default avatarRitesh Raj Sarraf <rrs@researchut.com>
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      959288b7
    • Jan Kara's avatar
      isofs: Do not return EACCES for unknown filesystems · d0ee9157
      Jan Kara authored
      commit a2ed0b39 upstream.
      
      When isofs_mount() is called to mount a device read-write, it returns
      EACCES even before it checks that the device actually contains an isofs
      filesystem. This may confuse mount(8) which then tries to mount all
      subsequent filesystem types in read-only mode.
      
      Fix the problem by returning EACCES only once we verify that the device
      indeed contains an iso9660 filesystem.
      
      Fixes: 17b7f7cfReported-by: default avatarKent Overstreet <kent.overstreet@gmail.com>
      Reported-by: default avatarKarel Zak <kzak@redhat.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0ee9157
    • Vaibhav Jain's avatar
      cxl: Prevent adapter reset if an active context exists · 627da039
      Vaibhav Jain authored
      commit 70b565bb upstream.
      
      This patch prevents resetting the cxl adapter via sysfs in presence of
      one or more active cxl_context on it. This protects against an
      unrecoverable error caused by PSL owning a dirty cache line even after
      reset and host tries to touch the same cache line. In case a force reset
      of the card is required irrespective of any active contexts, the int
      value -1 can be stored in the 'reset' sysfs attribute of the card.
      
      The patch introduces a new atomic_t member named contexts_num inside
      struct cxl that holds the number of active context attached to the card
      , which is checked against '0' before proceeding with the reset. To
      prevent against a race condition where a context is activated just after
      reset check is performed, the contexts_num is atomically set to '-1'
      after reset-check to indicate that no more contexts can be activated on
      the card anymore.
      
      Before activating a context we atomically test if contexts_num is
      non-negative and if so, increment its value by one. In case the value of
      contexts_num is negative then it indicates that the card is about to be
      reset and context activation is error-ed out at that point.
      
      Fixes: 62fa19d4 ("cxl: Add ability to reset the card")
      Acked-by: default avatarFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: default avatarVaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      627da039
    • Vladimir Murzin's avatar
      irqchip/gic-v3-its: Fix entry size mask for GITS_BASER · ee13e8a1
      Vladimir Murzin authored
      commit 9224eb77 upstream.
      
      Entry Size in GITS_BASER<n> occupies 5 bits [52:48], but we mask out 8
      bits.
      
      Fixes: cc2d3216 ("irqchip: GICv3: ITS command queue")
      Signed-off-by: default avatarVladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee13e8a1
    • Noam Camus's avatar
      irqchip/eznps: Acknowledge NPS_IPI before calling the handler · eff4f765
      Noam Camus authored
      commit c0ca8df7 upstream.
      
      IPI_IRQ (also TIMER0_IRQ) should be acked before the action->handler is called
      in handle_percpu_devid_irq.
      
      The IPI irq is edge sensitive and we might miss an IPI interrupt if it is
      triggered again while the handler runs.
      
      Fixes: 44df427c ("irqchip: add nps Internal and external irqchips")
      Signed-off-by: default avatarNoam Camus <noamca@mellanox.com>
      Cc: marc.zyngier@arm.com
      Cc: jason@lakedaemon.net
      Link: http://lkml.kernel.org/r/1476364532-12634-1-git-send-email-noamca@mellanox.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eff4f765
    • Dan Carpenter's avatar
      irqchip/gicv3: Handle loop timeout proper · 82301da7
      Dan Carpenter authored
      commit d102eb5c upstream.
      
      The timeout loop terminates when the loop count is zero, but the decrement
      of the count variable is post check. So count is -1 when we check for the
      timeout and therefor the error message is supressed.
      
      Change it to predecrement, so the error message is emitted.
      
      [ tglx: Massaged changelog ]
      
      Fixes: a2c22510 ("irqchip: gic-v3: Refactor gic_enable_redist to support both enabling and disabling")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarSudeep Holla <sudeep.holla@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: kernel-janitors@vger.kernel.org
      Cc: Jason Cooper <jason@lakedaemon.net>
      Link: http://lkml.kernel.org/r/20161014072534.GA15168@mwandaSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      82301da7
    • Peter Zijlstra's avatar
      sched/fair: Fix min_vruntime tracking · 2bb293cb
      Peter Zijlstra authored
      commit b60205c7 upstream.
      
      While going through enqueue/dequeue to review the movement of
      set_curr_task() I noticed that the (2nd) update_min_vruntime() call in
      dequeue_entity() is suspect.
      
      It turns out, its actually wrong because it will consider
      cfs_rq->curr, which could be the entry we just normalized. This mixes
      different vruntime forms and leads to fail.
      
      The purpose of the second update_min_vruntime() is to move
      min_vruntime forward if the entity we just removed is the one that was
      holding it back; _except_ for the DEQUEUE_SAVE case, because then we
      know its a temporary removal and it will come back.
      
      However, since we do put_prev_task() _after_ dequeue(), cfs_rq->curr
      will still be set (and per the above, can be tranformed into a
      different unit), so update_min_vruntime() should also consider
      curr->on_rq. This also fixes another corner case where the enqueue
      (which also does update_curr()->update_min_vruntime()) happens on the
      rq->lock break in schedule(), between dequeue and put_prev_task.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 1e876231 ("sched: Fix ->min_vruntime calculation in dequeue_entity()")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2bb293cb
    • Vincent Guittot's avatar
      sched/fair: Fix incorrect task group ->load_avg · f6ba2614
      Vincent Guittot authored
      commit b5a9b340 upstream.
      
      A scheduler performance regression has been reported by Joseph Salisbury,
      which he bisected back to:
      
        3d30544f ("sched/fair: Apply more PELT fixes)
      
      The regression triggers when several levels of task groups are involved
      (read: SystemD) and cpu_possible_mask != cpu_present_mask.
      
      The root cause is that group entity's load (tg_child->se[i]->avg.load_avg)
      is initialized to scale_load_down(se->load.weight). During the creation of
      a child task group, its group entities on possible CPUs are attached to
      parent's cfs_rq (tg_parent) and their loads are added to the parent's load
      (tg_parent->load_avg) with update_tg_load_avg().
      
      But only the load on online CPUs will then be updated to reflect real load,
      whereas load on other CPUs will stay at the initial value.
      
      The result is a tg_parent->load_avg that is higher than the real load, the
      weight of group entities (tg_parent->se[i]->load.weight) on online CPUs is
      smaller than it should be, and the task group gets a less running time than
      what it could expect.
      
      ( This situation can be detected with /proc/sched_debug. The ".tg_load_avg"
        of the task group will be much higher than sum of ".tg_load_avg_contrib"
        of online cfs_rqs of the task group. )
      
      The load of group entities don't have to be intialized to something else
      than 0 because their load will increase when an entity is attached.
      Reported-by: default avatarJoseph Salisbury <joseph.salisbury@canonical.com>
      Tested-by: default avatarDietmar Eggemann <dietmar.eggemann@arm.com>
      Signed-off-by: default avatarVincent Guittot <vincent.guittot@linaro.org>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: joonwoop@codeaurora.org
      Fixes: 3d30544f ("sched/fair: Apply more PELT fixes)
      Link: http://lkml.kernel.org/r/1476881123-10159-1-git-send-email-vincent.guittot@linaro.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f6ba2614
    • Ville Syrjälä's avatar
      pinctrl: baytrail: Fix lockdep · cbc82f17
      Ville Syrjälä authored
      commit a171bc51 upstream.
      
      Initialize the spinlock before using it.
      
      INFO: trying to register non-static key.
      the code is fine but needs lockdep annotation.
      turning off the locking correctness validator.
      CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.8.0-dwc-bisect #4
      Hardware name: Intel Corp. VALLEYVIEW C0 PLATFORM/BYT-T FFD8, BIOS BLAKFF81.X64.0088.R10.1403240443 FFD8_X64_R_2014_13_1_00 03/24/2014
       0000000000000000 ffff8800788ff770 ffffffff8133d597 0000000000000000
       0000000000000000 ffff8800788ff7e0 ffffffff810cfb9e 0000000000000002
       ffff8800788ff7d0 ffffffff8205b600 0000000000000002 ffff8800788ff7f0
      Call Trace:
       [<ffffffff8133d597>] dump_stack+0x67/0x90
       [<ffffffff810cfb9e>] register_lock_class+0x52e/0x540
       [<ffffffff810d2081>] __lock_acquire+0x81/0x16b0
       [<ffffffff810cede1>] ? save_trace+0x41/0xd0
       [<ffffffff810d33b2>] ? __lock_acquire+0x13b2/0x16b0
       [<ffffffff810cf05a>] ? __lock_is_held+0x4a/0x70
       [<ffffffff810d3b1a>] lock_acquire+0xba/0x220
       [<ffffffff8136f1fe>] ? byt_gpio_get_direction+0x3e/0x80
       [<ffffffff81631567>] _raw_spin_lock_irqsave+0x47/0x60
       [<ffffffff8136f1fe>] ? byt_gpio_get_direction+0x3e/0x80
       [<ffffffff8136f1fe>] byt_gpio_get_direction+0x3e/0x80
       [<ffffffff813740a9>] gpiochip_add_data+0x319/0x7d0
       [<ffffffff81631723>] ? _raw_spin_unlock_irqrestore+0x43/0x70
       [<ffffffff8136fe3b>] byt_pinctrl_probe+0x2fb/0x620
       [<ffffffff8142fb0c>] platform_drv_probe+0x3c/0xa0
      ...
      
      Based on the diff it looks like the problem was introduced in
      commit 71e6ca61 ("pinctrl: baytrail: Register pin control handling")
      but I wasn't able to verify that empirically as the parent commit
      just oopsed when I tried to boot it.
      
      Cc: Cristina Ciocan <cristina.ciocan@intel.com>
      Fixes: 71e6ca61 ("pinctrl: baytrail: Register pin control handling")
      Signed-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Acked-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cbc82f17
    • Mika Westerberg's avatar
      pinctrl: intel: Only restore pins that are used by the driver · 2e141f62
      Mika Westerberg authored
      commit c538b943 upstream.
      
      Dell XPS 13 (and maybe some others) uses a GPIO (CPU_GP_1) during suspend
      to explicitly disable USB touchscreen interrupt. This is done to prevent
      situation where the lid is closed the touchscreen is left functional.
      
      The pinctrl driver (wrongly) assumes it owns all pins which are owned by
      host and not locked down. It is perfectly fine for BIOS to use those pins
      as it is also considered as host in this context.
      
      What happens is that when the lid of Dell XPS 13 is closed, the BIOS
      configures CPU_GP_1 low disabling the touchscreen interrupt. During resume
      we restore all host owned pins to the known state which includes CPU_GP_1
      and this overwrites what the BIOS has programmed there causing the
      touchscreen to fail as no interrupts are reaching the CPU anymore.
      
      Fix this by restoring only those pins we know are explicitly requested by
      the kernel one way or other.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=176361Reported-by: default avatarAceLan Kao <acelan.kao@canonical.com>
      Tested-by: default avatarAceLan Kao <acelan.kao@canonical.com>
      Signed-off-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e141f62
    • Ville Syrjälä's avatar
      x86/boot/smp: Don't try to poke disabled/non-existent APIC · 85533a1a
      Ville Syrjälä authored
      commit ff856051 upstream.
      
      Apparently trying to poke a disabled or non-existent APIC
      leads to a box that doesn't even boot. Let's not do that.
      
      No real clue if this is the right fix, but at least my
      P3 machine boots again.
      Signed-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: dyoung@redhat.com
      Cc: kexec@lists.infradead.org
      Fixes: 2a51fe08 ("arch/x86: Handle non enumerated CPU after physical hotplug")
      Link: http://lkml.kernel.org/r/1477102684-5092-1-git-send-email-ville.syrjala@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85533a1a
    • Alex Thorlton's avatar
      x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates · 5f874b4b
      Alex Thorlton authored
      commit caef78b6 upstream.
      
      Some time ago, we brought our UV BIOS callback code up to speed with the
      new EFI memory mapping scheme, in commit:
      
          d1be84a2 ("x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()")
      
      By leveraging some changes that I made to a few of the EFI runtime
      callback mechanisms, in commit:
      
          80e75596 ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")
      
      This got everything running smoothly on UV, with the new EFI mapping
      code.  However, this left one, small loose end, in that EFI_OLD_MEMMAP
      (a.k.a. efi=old_map) will no longer work on UV, on kernels that include
      the aforementioned changes.
      
      At the time this was not a major issue (in fact, it still really isn't),
      but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
      systems.  This commit adds a check into uv_bios_call(), to see if we have
      the EFI_OLD_MEMMAP bit set in efi.flags.  If it is set, we fall back to
      using our old callback method, which uses efi_call() directly on the __va()
      of our function pointer.
      Signed-off-by: default avatarAlex Thorlton <athorlton@sgi.com>
      Acked-by: default avatarMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1476928131-170101-1-git-send-email-athorlton@sgi.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5f874b4b
    • Jiri Slaby's avatar
      kvm: x86: memset whole irq_eoi · 05574eae
      Jiri Slaby authored
      commit 8678654e upstream.
      
      gcc 7 warns:
      arch/x86/kvm/ioapic.c: In function 'kvm_ioapic_reset':
      arch/x86/kvm/ioapic.c:597:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]
      
      And it is right. Memset whole array using sizeof operator.
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [Added x86 subject tag]
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05574eae
    • Dan Williams's avatar
      x86/e820: Don't merge consecutive E820_PRAM ranges · 9c50927f
      Dan Williams authored
      commit 23446cb6 upstream.
      
      Commit:
      
        917db484 ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation")
      
      ... fixed up the broken manipulations of max_pfn in the presence of
      E820_PRAM ranges.
      
      However, it also broke the sanitize_e820_map() support for not merging
      E820_PRAM ranges.
      
      Re-introduce the enabling to keep resource boundaries between
      consecutive defined ranges. Otherwise, for example, an environment that
      boots with memmap=2G!8G,2G!10G will end up with a single 4G /dev/pmem0
      device instead of a /dev/pmem0 and /dev/pmem1 device 2G in size.
      Reported-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Zhang Yi <yizhan@redhat.com>
      Cc: linux-nvdimm@lists.01.org
      Fixes: 917db484 ("x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation")
      Link: http://lkml.kernel.org/r/147629530854.10618.10383744751594021268.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c50927f
    • Bart Van Assche's avatar
      blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL · f974fb26
      Bart Van Assche authored
      commit bbb427e3 upstream.
      
      Unlocking a mutex twice is wrong. Hence modify blkcg_policy_register()
      such that blkcg_pol_mutex is unlocked once if cpd == NULL. This patch
      avoids that smatch reports the following error:
      
      block/blk-cgroup.c:1378: blkcg_policy_register() error: double unlock 'mutex:&blkcg_pol_mutex'
      
      Fixes: 06b285bd ("blkcg: fix blkcg_policy_data allocation bug")
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f974fb26
    • Sachin Prabhu's avatar
      Fix regression which breaks DFS mounting · 38c7c171
      Sachin Prabhu authored
      commit d171356f upstream.
      
      Patch a6b5058f results in -EREMOTE returned by is_path_accessible() in
      cifs_mount() to be ignored which breaks DFS mounting.
      Signed-off-by: default avatarSachin Prabhu <sprabhu@redhat.com>
      Reviewed-by: default avatarAurelien Aptel <aaptel@suse.com>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      38c7c171
    • Steve French's avatar
      Cleanup missing frees on some ioctls · 917395fb
      Steve French authored
      commit 24df1483 upstream.
      
      Cleanup some missing mem frees on some cifs ioctls, and
      clarify others to make more obvious that no data is returned.
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Acked-by: default avatarSachin Prabhu <sprabhu@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      917395fb
    • Steve French's avatar
      Do not send SMB3 SET_INFO request if nothing is changing · b56cd33b
      Steve French authored
      commit 18dd8e1a upstream.
      
      [CIFS] We had cases where we sent a SMB2/SMB3 setinfo request with all
      timestamp (and DOS attribute) fields marked as 0 (ie do not change)
      e.g. on chmod or chown.
      Signed-off-by: default avatarSteve French <steve.french@primarydata.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b56cd33b
    • Steve French's avatar
      SMB3: GUIDs should be constructed as random but valid uuids · 72c4b996
      Steve French authored
      commit fa70b87c upstream.
      
      GUIDs although random, and 16 bytes, need to be generated as
      proper uuids.
      Signed-off-by: default avatarSteve French <steve.french@primarydata.com>
      Reviewed-by: default avatarAurelien Aptel <aaptel@suse.com>
      Reported-by: default avatarDavid Goebels <davidgoe@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      72c4b996
    • Steve French's avatar
    • Steve French's avatar
      Display number of credits available · 14d73491
      Steve French authored
      commit 9742805d upstream.
      
      In debugging smb3, it is useful to display the number
      of credits available, so we can see when the server has not granted
      sufficient operations for the client to make progress, or alternatively
      the client has requested too many credits (as we saw in a recent bug)
      so we can compare with the number of credits the server thinks
      we have.
      
      Add a /proc/fs/cifs/DebugData line to display the client view
      on how many credits are available.
      Signed-off-by: default avatarSteve French <steve.french@primarydata.com>
      Reported-by: default avatarGermano Percossi <germano.percossi@citrix.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      14d73491
    • Steve French's avatar
      Clarify locking of cifs file and tcon structures and make more granular · 1e70e271
      Steve French authored
      commit 3afca265 upstream.
      
      Remove the global file_list_lock to simplify cifs/smb3 locking and
      have spinlocks that more closely match the information they are
      protecting.
      
      Add new tcon->open_file_lock and file->file_info_lock spinlocks.
      Locks continue to follow a heirachy,
      	cifs_socket --> cifs_ses --> cifs_tcon --> cifs_file
      where global tcp_ses_lock still protects socket and cifs_ses, while the
      the newer locks protect the lower level structure's information
      (tcon and cifs_file respectively).
      Signed-off-by: default avatarSteve French <steve.french@primarydata.com>
      Signed-off-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Reviewed-by: default avatarAurelien Aptel <aaptel@suse.com>
      Reviewed-by: default avatarGermano Percossi <germano.percossi@citrix.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e70e271
    • Aurelien Aptel's avatar
      fs/cifs: keep guid when assigning fid to fileinfo · 36390678
      Aurelien Aptel authored
      commit 94f87371 upstream.
      
      When we open a durable handle we give a Globally Unique
      Identifier (GUID) to the server which we must keep for later reference
      e.g. when reopening persistent handles on reconnection.
      
      Without this the GUID generated for a new persistent handle was lost and
      16 zero bytes were used instead on re-opening.
      Signed-off-by: default avatarAurelien Aptel <aaptel@suse.com>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      36390678
    • Ross Lagerwall's avatar
      cifs: Limit the overall credit acquired · e3c8bb30
      Ross Lagerwall authored
      commit 7d414f39 upstream.
      
      The kernel client requests 2 credits for many operations even though
      they only use 1 credit (presumably to build up a buffer of credit).
      Some servers seem to give the client as much credit as is requested.  In
      this case, the amount of credit the client has continues increasing to
      the point where (server->credits * MAX_BUFFER_SIZE) overflows in
      smb2_wait_mtu_credits().
      
      Fix this by throttling the credit requests if an set limit is reached.
      For async requests where the credit charge may be > 1, request as much
      credit as what is charged.
      The limit is chosen somewhat arbitrarily. The Windows client
      defaults to 128 credits, the Windows server allows clients up to
      512 credits (or 8192 for Windows 2016), and the NetApp server
      (and at least one other) does not limit clients at all.
      Choose a high enough value such that the client shouldn't limit
      performance.
      
      This behavior was seen with a NetApp filer (NetApp Release 9.0RC2).
      Signed-off-by: default avatarRoss Lagerwall <ross.lagerwall@citrix.com>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e3c8bb30
    • Oleg Nesterov's avatar
      fs/super.c: fix race between freeze_super() and thaw_super() · 39bcd9f9
      Oleg Nesterov authored
      commit 89f39af1 upstream.
      
      Change thaw_super() to check frozen != SB_FREEZE_COMPLETE rather than
      frozen == SB_UNFROZEN, otherwise it can race with freeze_super() which
      drops sb->s_umount after SB_FREEZE_WRITE to preserve the lock ordering.
      
      In this case thaw_super() will wrongly call s_op->unfreeze_fs() before
      it was actually frozen, and call sb_freeze_unlock() which leads to the
      unbalanced percpu_up_write(). Unfortunately lockdep can't detect this,
      so this triggers misc BUG_ON()'s in kernel/rcu/sync.c.
      Reported-and-tested-by: default avatarNikolay Borisov <kernel@kyup.com>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      39bcd9f9
    • Al Viro's avatar
      arc: don't leak bits of kernel stack into coredump · 54203cf6
      Al Viro authored
      commit 7798bf21 upstream.
      
      On faulting sigreturn we do get SIGSEGV, all right, but anything
      we'd put into pt_regs could end up in the coredump.  And since
      __copy_from_user() never zeroed on arc, we'd better bugger off
      on its failure without copying random uninitialized bits of
      kernel stack into pt_regs...
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      54203cf6
    • Vladimir Murzin's avatar
      arm64: KVM: VHE: reset PSTATE.PAN on entry to EL2 · aa8f9876
      Vladimir Murzin authored
      commit cb96408d upstream.
      
      SCTLR_EL2.SPAN bit controls what happens with the PSTATE.PAN bit on an
      exception. However, this bit has no effect on the PSTATE.PAN when
      HCR_EL2.E2H or HCR_EL2.TGE is unset. Thus when VHE is used and
      exception taken from a guest PSTATE.PAN bit left unchanged and we
      continue with a value guest has set.
      
      To address that always reset PSTATE.PAN on entry from EL1.
      
      Fixes: 1f364c8c ("arm64: VHE: Add support for running Linux in EL2 mode")
      Signed-off-by: default avatarVladimir Murzin <vladimir.murzin@arm.com>
      Reviewed-by: default avatarJames Morse <james.morse@arm.com>
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      [ rebased for v4.7+ ]
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      aa8f9876
    • Christophe Leroy's avatar
      soc/fsl/qe: fix Oops on CPM1 (and likely CPM2) · 8790ff41
      Christophe Leroy authored
      commit 4d486e00 upstream.
      
      Commit 0e6e01ff ("CPM/QE: use genalloc to manage CPM/QE muram")
      has changed the way muram is managed.
      genalloc uses kmalloc(), hence requires the SLAB to be up and running.
      
      On powerpc 8xx, cpm_reset() is called early during startup.
      cpm_reset() then calls cpm_muram_init() before SLAB is available,
      hence the following Oops.
      
      cpm_reset() cannot be called during initcalls because the CPM is
      needed for console.
      
      This patch removes the call to cpm_muram_init() from cpm_reset().
      cpm_muram_init() will be called from a new function called cpm_init()
      which is declared as subsys_initcall, unless cpm_muram_alloc() is
      called earlier for the serial console in which case cpm_muram_init()
      will be called from there.
      
      The reason for calling it from two places is that some drivers
      (e.g. i2c-cpm) need some of the initialisations done by
      cpm_muram_init() but don't call cpm_muram_alloc(). The console
      driver calls cpm_muram_alloc() but some platforms might not use
      the CPM serial ports for console.
      
      [    0.000000] Unable to handle kernel paging request for data at address 0x00000008
      [    0.000000] Faulting instruction address: 0xc01acce0
      [    0.000000] Oops: Kernel access of bad area, sig: 11 [#1]
      [    0.000000] PREEMPT CMPC885
      [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.14-g0886ed8 #5
      [    0.000000] task: c05183e0 ti: c0536000 task.ti: c0536000
      [    0.000000] NIP: c01acce0 LR: c0011068 CTR: 00000000
      [    0.000000] REGS: c0537e50 TRAP: 0300   Not tainted (4.4.14-s3k-dev-g0886ed8-svn)
      [    0.000000] MSR: 00001032 <ME,IR,DR,RI>  CR: 28044428  XER: 00000000
      [    0.000000] DAR: 00000008 DSISR: c0000000
      GPR00: c0011068 c0537f00 c05183e0 00000000 00009000 ffffffff 00000bc0 ffffffff
      GPR08: ff003000 ff00b000 ff003bbf 00000000 22044422 100d43a8 00000000 07ff94e8
      GPR16: 00000000 07bb5d70 00000000 07ff81f4 07ff81f4 07ff81f4 00000000 00000000
      GPR24: 07ffb3a0 07fe7628 c0550000 c7ffa190 c0540000 ff003bbf 00000000 00000001
      [    0.000000] NIP [c01acce0] gen_pool_add_virt+0x14/0xdc
      [    0.000000] LR [c0011068] cpm_muram_init+0xd4/0x18c
      [    0.000000] Call Trace:
      [    0.000000] [c0537f00] [00000200] 0x200 (unreliable)
      [    0.000000] [c0537f20] [c0011068] cpm_muram_init+0xd4/0x18c
      [    0.000000] [c0537f70] [c0494684] cpm_reset+0xb4/0xc8
      [    0.000000] [c0537f90] [c0494c64] cmpc885_setup_arch+0x10/0x30
      [    0.000000] [c0537fa0] [c0493cd4] setup_arch+0x130/0x168
      [    0.000000] [c0537fb0] [c04906bc] start_kernel+0x88/0x380
      [    0.000000] [c0537ff0] [c0002224] start_here+0x38/0x98
      [    0.000000] Instruction dump:
      [    0.000000] 91430010 91430014 80010014 83e1000c 7c0803a6 38210010 4e800020 7c0802a6
      [    0.000000] 9421ffe0 bf61000c 90010024 7c7e1b78 <80630008> 7c9c2378 7cc31c30 3863001f
      [    0.000000] ---[ end trace dc8fa200cb88537f ]---
      
      fixes: 0e6e01ff ("CPM/QE: use genalloc to manage CPM/QE muram")
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      [scottwood: Removed some string changes unrelated to bugfix]
      Signed-off-by: default avatarScott Wood <oss@buserror.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8790ff41
    • Christophe Leroy's avatar
      soc/fsl/qe: fix gpio save_regs functions · b726aaa1
      Christophe Leroy authored
      commit 5dc6f3fe upstream.
      
      of_mm_gpiochip_add_data() calls mm_gc->save_regs() before
      setting the data. Therefore ->save_regs() cannot use
      gpiochip_get_data()
      
      An Oops is encountered without this fix.
      
      fixes: 1e714e54 ("powerpc: qe_lib-gpio: use gpiochip data pointer")
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarScott Wood <oss@buserror.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b726aaa1
    • Guenter Roeck's avatar
      metag: Only define atomic_dec_if_positive conditionally · eeacb70b
      Guenter Roeck authored
      commit 35d04077 upstream.
      
      The definition of atomic_dec_if_positive() assumes that
      atomic_sub_if_positive() exists, which is only the case if
      metag specific atomics are used. This results in the following
      build error when trying to build metag1_defconfig.
      
      kernel/ucount.c: In function 'dec_ucount':
      kernel/ucount.c:211: error:
      	implicit declaration of function 'atomic_sub_if_positive'
      
      Moving the definition of atomic_dec_if_positive() into the metag
      conditional code fixes the problem.
      
      Fixes: 6006c0d8 ("metag: Atomics, locks and bitops")
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eeacb70b
    • Guenter Roeck's avatar
      watchdog: mt7621_wdt: Remove assignment of dev pointer · f316bc2b
      Guenter Roeck authored
      commit 8355b3f9 upstream.
      
      Commit 0254e953 ("watchdog: Drop pointer to watchdog device from
      struct watchdog_device") removed the dev pointer from struct
      watchdog_device, but this driver was still assigning it, leading to
      a compilation error:
      
      drivers/watchdog/mt7621_wdt.c: In function 'mt7621_wdt_probe':
      drivers/watchdog/mt7621_wdt.c:142:16: error:
      	'struct watchdog_device' has no member named 'dev'
      
      Fix this by removing the assignment.
      
      Fixes: 0254e953 ("watchdog: Drop pointer to watchdog device ...")
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWim Van Sebroeck <wim@iguana.be>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f316bc2b
    • Matt Redfearn's avatar
      watchdog: rt2880_wdt: Remove assignment of dev pointer · bfdc15a4
      Matt Redfearn authored
      commit df9c692b upstream.
      
      Commit 0254e953 ("watchdog: Drop pointer to watchdog device from
      struct watchdog_device") removed the dev pointer from struct
      watchdog_device, but this driver was still assigning it, leading to a
      compilation error:
      
      drivers/watchdog/rt2880_wdt.c: In function ‘rt288x_wdt_probe’:
      drivers/watchdog/rt2880_wdt.c:161:16: error: ‘struct watchdog_device’
      has no member named ‘dev’
        rt288x_wdt_dev.dev = &pdev->dev;
                      ^
      scripts/Makefile.build:289: recipe for target
      'drivers/watchdog/rt2880_wdt.o' failed
      
      Fix this by removing the assignment.
      
      Fixes: 0254e953 ("watchdog: Drop pointer to watchdog device ...")
      Signed-off-by: default avatarMatt Redfearn <matt.redfearn@imgtec.com>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWim Van Sebroeck <wim@iguana.be>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bfdc15a4
    • Ming Lei's avatar
      scsi: Fix use-after-free · 8bfe1c78
      Ming Lei authored
      commit bcd8f2e9 upstream.
      
      This patch fixes one use-after-free report[1] by KASAN.
      
      In __scsi_scan_target(), when a type 31 device is probed,
      SCSI_SCAN_TARGET_PRESENT is returned and the target will be scanned
      again.
      
      Inside the following scsi_report_lun_scan(), one new scsi_device
      instance is allocated, and scsi_probe_and_add_lun() is called again to
      probe the target and still see type 31 device, finally
      __scsi_remove_device() is called to remove & free the device at the end
      of scsi_probe_and_add_lun(), so cause use-after-free in
      scsi_report_lun_scan().
      
      And the following SCSI log can be observed:
      
      	scsi 0:0:2:0: scsi scan: INQUIRY pass 1 length 36
      	scsi 0:0:2:0: scsi scan: INQUIRY successful with code 0x0
      	scsi 0:0:2:0: scsi scan: peripheral device type of 31, no device added
      	scsi 0:0:2:0: scsi scan: Sending REPORT LUNS to (try 0)
      	scsi 0:0:2:0: scsi scan: REPORT LUNS successful (try 0) result 0x0
      	scsi 0:0:2:0: scsi scan: REPORT LUN scan
      	scsi 0:0:2:0: scsi scan: INQUIRY pass 1 length 36
      	scsi 0:0:2:0: scsi scan: INQUIRY successful with code 0x0
      	scsi 0:0:2:0: scsi scan: peripheral device type of 31, no device added
      	BUG: KASAN: use-after-free in __scsi_scan_target+0xbf8/0xe40 at addr ffff88007b44a104
      
      This patch fixes the issue by moving the putting reference at
      the end of scsi_report_lun_scan().
      
      [1] KASAN report
      ==================================================================
      [    3.274597] PM: Adding info for serio:serio1
      [    3.275127] BUG: KASAN: use-after-free in __scsi_scan_target+0xd87/0xdf0 at addr ffff880254d8c304
      [    3.275653] Read of size 4 by task kworker/u10:0/27
      [    3.275903] CPU: 3 PID: 27 Comm: kworker/u10:0 Not tainted 4.8.0 #2121
      [    3.276258] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [    3.276797] Workqueue: events_unbound async_run_entry_fn
      [    3.277083]  ffff880254d8c380 ffff880259a37870 ffffffff94bbc6c1 ffff880078402d80
      [    3.277532]  ffff880254d8bb80 ffff880259a37898 ffffffff9459fec1 ffff880259a37930
      [    3.277989]  ffff880254d8bb80 ffff880078402d80 ffff880259a37920 ffffffff945a0165
      [    3.278436] Call Trace:
      [    3.278528]  [<ffffffff94bbc6c1>] dump_stack+0x65/0x84
      [    3.278797]  [<ffffffff9459fec1>] kasan_object_err+0x21/0x70
      [    3.279063] device: 'psaux': device_add
      [    3.279616]  [<ffffffff945a0165>] kasan_report_error+0x205/0x500
      [    3.279651] PM: Adding info for No Bus:psaux
      [    3.280202]  [<ffffffff944ecd22>] ? kfree_const+0x22/0x30
      [    3.280486]  [<ffffffff94bc2dc9>] ? kobject_release+0x119/0x370
      [    3.280805]  [<ffffffff945a0543>] __asan_report_load4_noabort+0x43/0x50
      [    3.281170]  [<ffffffff9507e1f7>] ? __scsi_scan_target+0xd87/0xdf0
      [    3.281506]  [<ffffffff9507e1f7>] __scsi_scan_target+0xd87/0xdf0
      [    3.281848]  [<ffffffff9507d470>] ? scsi_add_device+0x30/0x30
      [    3.282156]  [<ffffffff94f7f660>] ? pm_runtime_autosuspend_expiration+0x60/0x60
      [    3.282570]  [<ffffffff956ddb07>] ? _raw_spin_lock+0x17/0x40
      [    3.282880]  [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160
      [    3.283200]  [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0
      [    3.283563]  [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250
      [    3.283882]  [<ffffffff9507efc1>] do_scan_async+0x41/0x450
      [    3.284173]  [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610
      [    3.284492]  [<ffffffff941a8954>] ? pwq_dec_nr_in_flight+0x124/0x2a0
      [    3.284876]  [<ffffffff941d1770>] ? preempt_count_add+0x130/0x160
      [    3.285207]  [<ffffffff941a9a84>] process_one_work+0x544/0x12d0
      [    3.285526]  [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0
      [    3.285844]  [<ffffffff941aa810>] ? process_one_work+0x12d0/0x12d0
      [    3.286182]  [<ffffffff941bb365>] kthread+0x1c5/0x260
      [    3.286443]  [<ffffffff940855cd>] ? __switch_to+0x88d/0x1430
      [    3.286745]  [<ffffffff941bb1a0>] ? kthread_worker_fn+0x5a0/0x5a0
      [    3.287085]  [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40
      [    3.287368]  [<ffffffff941bb1a0>] ? kthread_worker_fn+0x5a0/0x5a0
      [    3.287697] Object at ffff880254d8bb80, in cache kmalloc-2048 size: 2048
      [    3.288064] Allocated:
      [    3.288147] PID = 27
      [    3.288218]  [<ffffffff940b27ab>] save_stack_trace+0x2b/0x50
      [    3.288531]  [<ffffffff9459f246>] save_stack+0x46/0xd0
      [    3.288806]  [<ffffffff9459f4bd>] kasan_kmalloc+0xad/0xe0
      [    3.289098]  [<ffffffff9459c07e>] __kmalloc+0x13e/0x250
      [    3.289378]  [<ffffffff95078e5a>] scsi_alloc_sdev+0xea/0xcf0
      [    3.289701]  [<ffffffff9507de76>] __scsi_scan_target+0xa06/0xdf0
      [    3.290034]  [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160
      [    3.290362]  [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0
      [    3.290724]  [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250
      [    3.291055]  [<ffffffff9507efc1>] do_scan_async+0x41/0x450
      [    3.291354]  [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610
      [    3.291695]  [<ffffffff941a9a84>] process_one_work+0x544/0x12d0
      [    3.292022]  [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0
      [    3.292325]  [<ffffffff941bb365>] kthread+0x1c5/0x260
      [    3.292594]  [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40
      [    3.292886] Freed:
      [    3.292945] PID = 27
      [    3.293016]  [<ffffffff940b27ab>] save_stack_trace+0x2b/0x50
      [    3.293327]  [<ffffffff9459f246>] save_stack+0x46/0xd0
      [    3.293600]  [<ffffffff9459fa61>] kasan_slab_free+0x71/0xb0
      [    3.293916]  [<ffffffff9459bac2>] kfree+0xa2/0x1f0
      [    3.294168]  [<ffffffff9508158a>] scsi_device_dev_release_usercontext+0x50a/0x730
      [    3.294598]  [<ffffffff941ace9a>] execute_in_process_context+0xda/0x130
      [    3.294974]  [<ffffffff9508107c>] scsi_device_dev_release+0x1c/0x20
      [    3.295322]  [<ffffffff94f566f6>] device_release+0x76/0x1e0
      [    3.295626]  [<ffffffff94bc2db7>] kobject_release+0x107/0x370
      [    3.295942]  [<ffffffff94bc29ce>] kobject_put+0x4e/0xa0
      [    3.296222]  [<ffffffff94f56e17>] put_device+0x17/0x20
      [    3.296497]  [<ffffffff9505201c>] scsi_device_put+0x7c/0xa0
      [    3.296801]  [<ffffffff9507e1bc>] __scsi_scan_target+0xd4c/0xdf0
      [    3.297132]  [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160
      [    3.297458]  [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0
      [    3.297829]  [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250
      [    3.298156]  [<ffffffff9507efc1>] do_scan_async+0x41/0x450
      [    3.298453]  [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610
      [    3.298777]  [<ffffffff941a9a84>] process_one_work+0x544/0x12d0
      [    3.299105]  [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0
      [    3.299408]  [<ffffffff941bb365>] kthread+0x1c5/0x260
      [    3.299676]  [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40
      [    3.299967] Memory state around the buggy address:
      [    3.300209]  ffff880254d8c200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [    3.300608]  ffff880254d8c280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [    3.300986] >ffff880254d8c300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [    3.301408]                    ^
      [    3.301550]  ffff880254d8c380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [    3.301987]  ffff880254d8c400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [    3.302396]
      ==================================================================
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarMing Lei <tom.leiming@gmail.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8bfe1c78