1. 27 Jul, 2016 40 commits
    • David S. Miller's avatar
      packet: Use symmetric hash for PACKET_FANOUT_HASH. · 424848bd
      David S. Miller authored
      [ Upstream commit eb70db87 ]
      
      People who use PACKET_FANOUT_HASH want a symmetric hash, meaning that
      they want packets going in both directions on a flow to hash to the
      same bucket.
      
      The core kernel SKB hash became non-symmetric when the ipv6 flow label
      and other entities were incorporated into the standard flow hash order
      to increase entropy.
      
      But there are no users of PACKET_FANOUT_HASH who want an assymetric
      hash, they all want a symmetric one.
      
      Therefore, use the flow dissector to compute a flat symmetric hash
      over only the protocol, addresses and ports.  This hash does not get
      installed into and override the normal skb hash, so this change has
      no effect whatsoever on the rest of the stack.
      Reported-by: default avatarEric Leblond <eric@regit.org>
      Tested-by: default avatarEric Leblond <eric@regit.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      424848bd
    • Peter Zijlstra's avatar
      sched/fair: Fix cfs_rq avg tracking underflow · 43b1bfec
      Peter Zijlstra authored
      commit 89741892 upstream.
      
      As per commit:
      
        b7fa30c9 ("sched/fair: Fix post_init_entity_util_avg() serialization")
      
      > the code generated from update_cfs_rq_load_avg():
      >
      > 	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
      > 		s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
      > 		sa->load_avg = max_t(long, sa->load_avg - r, 0);
      > 		sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
      > 		removed_load = 1;
      > 	}
      >
      > turns into:
      >
      > ffffffff81087064:       49 8b 85 98 00 00 00    mov    0x98(%r13),%rax
      > ffffffff8108706b:       48 85 c0                test   %rax,%rax
      > ffffffff8108706e:       74 40                   je     ffffffff810870b0 <update_blocked_averages+0xc0>
      > ffffffff81087070:       4c 89 f8                mov    %r15,%rax
      > ffffffff81087073:       49 87 85 98 00 00 00    xchg   %rax,0x98(%r13)
      > ffffffff8108707a:       49 29 45 70             sub    %rax,0x70(%r13)
      > ffffffff8108707e:       4c 89 f9                mov    %r15,%rcx
      > ffffffff81087081:       bb 01 00 00 00          mov    $0x1,%ebx
      > ffffffff81087086:       49 83 7d 70 00          cmpq   $0x0,0x70(%r13)
      > ffffffff8108708b:       49 0f 49 4d 70          cmovns 0x70(%r13),%rcx
      >
      > Which you'll note ends up with sa->load_avg -= r in memory at
      > ffffffff8108707a.
      
      So I _should_ have looked at other unserialized users of ->load_avg,
      but alas. Luckily nikbor reported a similar /0 from task_h_load() which
      instantly triggered recollection of this here problem.
      
      Aside from the intermediate value hitting memory and causing problems,
      there's another problem: the underflow detection relies on the signed
      bit. This reduces the effective width of the variables, IOW its
      effectively the same as having these variables be of signed type.
      
      This patch changes to a different means of unsigned underflow
      detection to not rely on the signed bit. This allows the variables to
      use the 'full' unsigned range. And it does so with explicit LOAD -
      STORE to ensure any intermediate value will never be visible in
      memory, allowing these unserialized loads.
      
      Note: GCC generates crap code for this, might warrant a look later.
      
      Note2: I say 'full' above, if we end up at U*_MAX we'll still explode;
             maybe we should do clamping on add too.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yuyang Du <yuyang.du@intel.com>
      Cc: bsegall@google.com
      Cc: kernel@kyup.com
      Cc: morten.rasmussen@arm.com
      Cc: pjt@google.com
      Cc: steve.muckle@linaro.org
      Fixes: 9d89c257 ("sched/fair: Rewrite runnable load and utilization average tracking")
      Link: http://lkml.kernel.org/r/20160617091948.GJ30927@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      43b1bfec
    • Kirill A. Shutemov's avatar
      UBIFS: Implement ->migratepage() · 1e1f4ff7
      Kirill A. Shutemov authored
      commit 4ac1c17b upstream.
      
      During page migrations UBIFS might get confused
      and the following assert triggers:
      [  213.480000] UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)
      [  213.490000] CPU: 0 PID: 436 Comm: drm-stress-test Not tainted 4.4.4-00176-geaa802524636-dirty #1008
      [  213.490000] Hardware name: Allwinner sun4i/sun5i Families
      [  213.490000] [<c0015e70>] (unwind_backtrace) from [<c0012cdc>] (show_stack+0x10/0x14)
      [  213.490000] [<c0012cdc>] (show_stack) from [<c02ad834>] (dump_stack+0x8c/0xa0)
      [  213.490000] [<c02ad834>] (dump_stack) from [<c0236ee8>] (ubifs_set_page_dirty+0x44/0x50)
      [  213.490000] [<c0236ee8>] (ubifs_set_page_dirty) from [<c00fa0bc>] (try_to_unmap_one+0x10c/0x3a8)
      [  213.490000] [<c00fa0bc>] (try_to_unmap_one) from [<c00fadb4>] (rmap_walk+0xb4/0x290)
      [  213.490000] [<c00fadb4>] (rmap_walk) from [<c00fb1bc>] (try_to_unmap+0x64/0x80)
      [  213.490000] [<c00fb1bc>] (try_to_unmap) from [<c010dc28>] (migrate_pages+0x328/0x7a0)
      [  213.490000] [<c010dc28>] (migrate_pages) from [<c00d0cb0>] (alloc_contig_range+0x168/0x2f4)
      [  213.490000] [<c00d0cb0>] (alloc_contig_range) from [<c010ec00>] (cma_alloc+0x170/0x2c0)
      [  213.490000] [<c010ec00>] (cma_alloc) from [<c001a958>] (__alloc_from_contiguous+0x38/0xd8)
      [  213.490000] [<c001a958>] (__alloc_from_contiguous) from [<c001ad44>] (__dma_alloc+0x23c/0x274)
      [  213.490000] [<c001ad44>] (__dma_alloc) from [<c001ae08>] (arm_dma_alloc+0x54/0x5c)
      [  213.490000] [<c001ae08>] (arm_dma_alloc) from [<c035cecc>] (drm_gem_cma_create+0xb8/0xf0)
      [  213.490000] [<c035cecc>] (drm_gem_cma_create) from [<c035cf20>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
      [  213.490000] [<c035cf20>] (drm_gem_cma_create_with_handle) from [<c035d088>] (drm_gem_cma_dumb_create+0x3c/0x48)
      [  213.490000] [<c035d088>] (drm_gem_cma_dumb_create) from [<c0341ed8>] (drm_ioctl+0x12c/0x444)
      [  213.490000] [<c0341ed8>] (drm_ioctl) from [<c0121adc>] (do_vfs_ioctl+0x3f4/0x614)
      [  213.490000] [<c0121adc>] (do_vfs_ioctl) from [<c0121d30>] (SyS_ioctl+0x34/0x5c)
      [  213.490000] [<c0121d30>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)
      
      UBIFS is using PagePrivate() which can have different meanings across
      filesystems. Therefore the generic page migration code cannot handle this
      case correctly.
      We have to implement our own migration function which basically does a
      plain copy but also duplicates the page private flag.
      UBIFS is not a block device filesystem and cannot use buffer_migrate_page().
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      [rw: Massaged changelog, build fixes, etc...]
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Acked-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e1f4ff7
    • Richard Weinberger's avatar
      mm: Export migrate_page_move_mapping and migrate_page_copy · 4b1cb3c8
      Richard Weinberger authored
      commit 1118dce7 upstream.
      
      Export these symbols such that UBIFS can implement
      ->migratepage.
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Acked-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4b1cb3c8
    • James Hogan's avatar
      MIPS: KVM: Fix modular KVM under QEMU · 7ad26023
      James Hogan authored
      commit 797179bc upstream.
      
      Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never
      get a TLB refill exception in it when KVM is built as a module.
      
      This was observed to happen with the host MIPS kernel running under
      QEMU, due to a not entirely transparent optimisation in the QEMU TLB
      handling where TLB entries replaced with TLBWR are copied to a separate
      part of the TLB array. Code in those pages continue to be executable,
      but those mappings persist only until the next ASID switch, even if they
      are marked global.
      
      An ASID switch happens in __kvm_mips_vcpu_run() at exception level after
      switching to the guest exception base. Subsequent TLB mapped kernel
      instructions just prior to switching to the guest trigger a TLB refill
      exception, which enters the guest exception handlers without updating
      EPC. This appears as a guest triggered TLB refill on a host kernel
      mapped (host KSeg2) address, which is not handled correctly as user
      (guest) mode accesses to kernel (host) segments always generate address
      error exceptions.
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ad26023
    • Steve Capper's avatar
      ARM: 8579/1: mm: Fix definition of pmd_mknotpresent · 490a71c5
      Steve Capper authored
      commit 56530f5d upstream.
      
      Currently pmd_mknotpresent will use a zero entry to respresent an
      invalidated pmd.
      
      Unfortunately this definition clashes with pmd_none, thus it is
      possible for a race condition to occur if zap_pmd_range sees pmd_none
      whilst __split_huge_pmd_locked is running too with pmdp_invalidate
      just called.
      
      This patch fixes the race condition by modifying pmd_mknotpresent to
      create non-zero faulting entries (as is done in other architectures),
      removing the ambiguity with pmd_none.
      
      [catalin.marinas@arm.com: using L_PMD_SECT_VALID instead of PMD_TYPE_SECT]
      
      Fixes: 8d962507 ("ARM: mm: Transparent huge page support for LPAE systems.")
      Reported-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Signed-off-by: default avatarSteve Capper <steve.capper@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      490a71c5
    • Will Deacon's avatar
      ARM: 8578/1: mm: ensure pmd_present only checks the valid bit · 54cf0dde
      Will Deacon authored
      commit 62453188 upstream.
      
      In a subsequent patch, pmd_mknotpresent will clear the valid bit of the
      pmd entry, resulting in a not-present entry from the hardware's
      perspective. Unfortunately, pmd_present simply checks for a non-zero pmd
      value and will therefore continue to return true even after a
      pmd_mknotpresent operation. Since pmd_mknotpresent is only used for
      managing huge entries, this is only an issue for the 3-level case.
      
      This patch fixes the 3-level pmd_present implementation to take into
      account the valid bit. For bisectability, the change is made before the
      fix to pmd_mknotpresent.
      
      [catalin.marinas@arm.com: comment update regarding pmd_mknotpresent patch]
      
      Fixes: 8d962507 ("ARM: mm: Transparent huge page support for LPAE systems.")
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Steve Capper <Steve.Capper@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      54cf0dde
    • Fabio Estevam's avatar
      ARM: imx6ul: Fix Micrel PHY mask · 91ac7387
      Fabio Estevam authored
      commit 20c15226 upstream.
      
      The value used for Micrel PHY mask is not correct. Use the
      MICREL_PHY_ID_MASK definition instead.
      
      Thanks to Jiri Luznicky for proposing the fix at
      https://community.freescale.com/thread/387739
      
      Fixes: 709bc065 ("ARM: imx6ul: add fec MAC refrence clock and phy fixup init")
      Signed-off-by: default avatarFabio Estevam <fabio.estevam@nxp.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarShawn Guo <shawnguo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      91ac7387
    • Trond Myklebust's avatar
      NFS: Fix another OPEN_DOWNGRADE bug · b5d4a793
      Trond Myklebust authored
      commit e547f262 upstream.
      
      Olga Kornievskaia reports that the following test fails to trigger
      an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE.
      
      	fd0 = open(foo, RDRW)   -- should be open on the wire for "both"
      	fd1 = open(foo, RDONLY)  -- should be open on the wire for "read"
      	close(fd0) -- should trigger an open_downgrade
      	read(fd1)
      	close(fd1)
      
      The issue is that we're missing a check for whether or not the current
      state transitioned from an O_RDWR state as opposed to having transitioned
      from a combination of O_RDONLY and O_WRONLY.
      Reported-by: default avatarOlga Kornievskaia <aglo@umich.edu>
      Fixes: cd9288ff ("NFSv4: Fix another bug in the close/open_downgrade code")
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5d4a793
    • Al Viro's avatar
      make nfs_atomic_open() call d_drop() on all ->open_context() errors. · 44d86dbf
      Al Viro authored
      commit d20cb71d upstream.
      
      In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code"
      unconditional d_drop() after the ->open_context() had been removed.  It had
      been correct for success cases (there ->open_context() itself had been doing
      dcache manipulations), but not for error ones.  Only one of those (ENOENT)
      got a compensatory d_drop() added in that commit, but in fact it should've
      been done for all errors.  As it is, the case of O_CREAT non-exclusive open
      on a hashed negative dentry racing with e.g. symlink creation from another
      client ended up with ->open_context() getting an error and proceeding to
      call nfs_lookup().  On a hashed dentry, which would've instantly triggered
      BUG_ON() in d_materialise_unique() (or, these days, its equivalent in
      d_splice_alias()).
      Tested-by: default avatarOleg Drokin <green@linuxhacker.ru>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      44d86dbf
    • Ben Hutchings's avatar
      nfsd: check permissions when setting ACLs · 412cfeec
      Ben Hutchings authored
      commit 99965378 upstream.
      
      Use set_posix_acl, which includes proper permission checks, instead of
      calling ->set_acl directly.  Without this anyone may be able to grant
      themselves permissions to a file by setting the ACL.
      
      Lock the inode to make the new checks atomic with respect to set_acl.
      (Also, nfsd was the only caller of set_acl not locking the inode, so I
      suspect this may fix other races.)
      
      This also simplifies the code, and ensures our ACLs are checked by
      posix_acl_valid.
      
      The permission checks and the inode locking were lost with commit
      4ac7249e, which changed nfsd to use the set_acl inode operation directly
      instead of going through xattr handlers.
      Reported-by: default avatarDavid Sinquin <david@sinquin.eu>
      [agreunba@redhat.com: use set_posix_acl]
      Fixes: 4ac7249e
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      412cfeec
    • Andreas Gruenbacher's avatar
      posix_acl: Add set_posix_acl · c3fa141c
      Andreas Gruenbacher authored
      commit 485e71e8 upstream.
      
      Factor out part of posix_acl_xattr_set into a common function that takes
      a posix_acl, which nfsd can also call.
      
      The prototype already exists in include/linux/posix_acl.h.
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c3fa141c
    • Oleg Drokin's avatar
      nfsd: Extend the mutex holding region around in nfsd4_process_open2() · f78ffdc2
      Oleg Drokin authored
      commit 5cc1fb2a upstream.
      
      To avoid racing entry into nfs4_get_vfs_file().
      Make init_open_stateid() return with locked stateid to be unlocked
      by the caller.
      Signed-off-by: default avatarOleg Drokin <green@linuxhacker.ru>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f78ffdc2
    • Oleg Drokin's avatar
      nfsd: Always lock state exclusively. · 087f8fe0
      Oleg Drokin authored
      commit feb9dad5 upstream.
      
      It used to be the case that state had an rwlock that was locked for write
      by downgrades, but for read for upgrades (opens). Well, the problem is
      if there are two competing opens for the same state, they step on
      each other toes potentially leading to leaking file descriptors
      from the state structure, since access mode is a bitmap only set once.
      Signed-off-by: default avatarOleg Drokin <green@linuxhacker.ru>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      087f8fe0
    • J. Bruce Fields's avatar
      nfsd4/rpc: move backchannel create logic into rpc code · 58e9e70a
      J. Bruce Fields authored
      commit d50039ea upstream.
      
      Also simplify the logic a bit.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Acked-by: default avatarTrond Myklebust <trondmy@primarydata.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      58e9e70a
    • Tejun Heo's avatar
      writeback: use higher precision calculation in domain_dirty_limits() · 400850bd
      Tejun Heo authored
      commit 62a584fe upstream.
      
      As vm.dirty_[background_]bytes can't be applied verbatim to multiple
      cgroup writeback domains, they get converted to percentages in
      domain_dirty_limits() and applied the same way as
      vm.dirty_[background]ratio.  However, if the specified bytes is lower
      than 1% of available memory, the calculated ratios become zero and the
      writeback domain gets throttled constantly.
      
      Fix it by using per-PAGE_SIZE instead of percentage for ratio
      calculations.  Also, the updated DIV_ROUND_UP() usages now should
      yield 1/4096 (0.0244%) as the minimum ratio as long as the specified
      bytes are above zero.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarMiao Xie <miaoxie@huawei.com>
      Link: http://lkml.kernel.org/g/57333E75.3080309@huawei.com
      Fixes: 9fc3a43e ("writeback: separate out domain_dirty_limits()")
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      Adjusted comment based on Jan's suggestion.
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      400850bd
    • Lukasz Luba's avatar
      thermal: cpu_cooling: fix improper order during initialization · a519bfe6
      Lukasz Luba authored
      commit f840ab18 upstream.
      
      The freq_table array is not populated before calling
      thermal_of_cooling_register. The code which populates the freq table was
      introduced in commit f6859014.
      This should be done before registering new thermal cooling device.
      The log shows effects of this wrong decision.
      [    2.172614] cpu cpu1: Failed to get voltage for frequency 1984518656000: -34
      [    2.220863] cpu cpu0: Failed to get voltage for frequency 1984524416000: -34
      
      Fixes: f6859014 ("thermal: cpu_cooling: Store frequencies in descending order")
      Signed-off-by: default avatarLukasz Luba <lukasz.luba@arm.com>
      Acked-by: default avatarJavi Merino <javi.merino@arm.com>
      Acked-by: default avatarViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: default avatarZhang Rui <rui.zhang@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a519bfe6
    • Andy Lutomirski's avatar
      uvc: Forward compat ioctls to their handlers directly · f77ea5ca
      Andy Lutomirski authored
      commit a44323e2 upstream.
      
      The current code goes through a lot of indirection just to call a
      known handler.  Simplify it: just call the handlers directly.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f77ea5ca
    • Johan Hovold's avatar
      Revert "gpiolib: Split GPIO flags parsing and GPIO configuration" · 93f25db7
      Johan Hovold authored
      commit 85b03b30 upstream.
      
      This reverts commit 923b93e4.
      
      Make sure consumers do not overwrite gpio flags for pins that have
      already been claimed.
      
      While adding support for gpio drivers to refuse a request using
      unsupported flags, the order of when the requested flag was checked and
      the new flags were applied was reversed to that consumers could
      overwrite flags for already requested gpios.
      
      This not only affects device-tree setups where two drivers could request
      the same gpio using conflicting configurations, but also allowed user
      space to clear gpio flags for already claimed pins simply by attempting
      to export them through the sysfs interface. By for example clearing the
      FLAG_ACTIVE_LOW flag this way, user space could effectively change the
      polarity of a signal.
      
      Reverting this change obviously prevents gpio drivers from doing sanity
      checks on the flags in their request callbacks. Fortunately only one
      recently added driver (gpio-tps65218 in v4.6) appears to do this, and a
      follow up patch could restore this functionality through a different
      interface.
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93f25db7
    • Borislav Petkov's avatar
      x86/amd_nb: Fix boot crash on non-AMD systems · d9c5952f
      Borislav Petkov authored
      commit 1ead852d upstream.
      
      Fix boot crash that triggers if this driver is built into a kernel and
      run on non-AMD systems.
      
      AMD northbridges users call amd_cache_northbridges() and it returns
      a negative value to signal that we weren't able to cache/detect any
      northbridges on the system.
      
      At least, it should do so as all its callers expect it to do so. But it
      does return a negative value only when kmalloc() fails.
      
      Fix it to return -ENODEV if there are no NBs cached as otherwise, amd_nb
      users like amd64_edac, for example, which relies on it to know whether
      it should load or not, gets loaded on systems like Intel Xeons where it
      shouldn't.
      Reported-and-tested-by: default avatarTony Battersby <tonyb@cybernetics.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1466097230-5333-2-git-send-email-bp@alien8.de
      Link: https://lkml.kernel.org/r/5761BEB0.9000807@cybernetics.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9c5952f
    • Masami Hiramatsu's avatar
      kprobes/x86: Clear TF bit in fault on single-stepping · 66af3f62
      Masami Hiramatsu authored
      commit dcfc4724 upstream.
      
      Fix kprobe_fault_handler() to clear the TF (trap flag) bit of
      the flags register in the case of a fault fixup on single-stepping.
      
      If we put a kprobe on the instruction which caused a
      page fault (e.g. actual mov instructions in copy_user_*),
      that fault happens on the single-stepping buffer. In this
      case, kprobes resets running instance so that the CPU can
      retry execution on the original ip address.
      
      However, current code forgets to reset the TF bit. Since this
      fault happens with TF bit set for enabling single-stepping,
      when it retries, it causes a debug exception and kprobes
      can not handle it because it already reset itself.
      
      On the most of x86-64 platform, it can be easily reproduced
      by using kprobe tracer. E.g.
      
        # cd /sys/kernel/debug/tracing
        # echo p copy_user_enhanced_fast_string+5 > kprobe_events
        # echo 1 > events/kprobes/enable
      
      And you'll see a kernel panic on do_debug(), since the debug
      trap is not handled by kprobes.
      
      To fix this problem, we just need to clear the TF bit when
      resetting running kprobe.
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: default avatarAnanth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: systemtap@sourceware.org
      Link: http://lkml.kernel.org/r/20160611140648.25885.37482.stgit@devbox
      [ Updated the comments. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      66af3f62
    • H. Peter Anvin's avatar
      x86, build: copy ldlinux.c32 to image.iso · f7acd40e
      H. Peter Anvin authored
      commit 9c77679c upstream.
      
      For newer versions of Syslinux, we need ldlinux.c32 in addition to
      isolinux.bin to reside on the boot disk, so if the latter is found,
      copy it, too, to the isoimage tree.
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f7acd40e
    • Paolo Bonzini's avatar
      locking/static_key: Fix concurrent static_key_slow_inc() · 71ef2c11
      Paolo Bonzini authored
      commit 4c5ea0a9 upstream.
      
      The following scenario is possible:
      
          CPU 1                                   CPU 2
          static_key_slow_inc()
           atomic_inc_not_zero()
            -> key.enabled == 0, no increment
           jump_label_lock()
           atomic_inc_return()
            -> key.enabled == 1 now
                                                  static_key_slow_inc()
                                                   atomic_inc_not_zero()
                                                    -> key.enabled == 1, inc to 2
                                                   return
                                                  ** static key is wrong!
           jump_label_update()
           jump_label_unlock()
      
      Testing the static key at the point marked by (**) will follow the
      wrong path for jumps that have not been patched yet.  This can
      actually happen when creating many KVM virtual machines with userspace
      LAPIC emulation; just run several copies of the following program:
      
          #include <fcntl.h>
          #include <unistd.h>
          #include <sys/ioctl.h>
          #include <linux/kvm.h>
      
          int main(void)
          {
              for (;;) {
                  int kvmfd = open("/dev/kvm", O_RDONLY);
                  int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
                  close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
                  close(vmfd);
                  close(kvmfd);
              }
              return 0;
          }
      
      Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc() call.
      The static key's purpose is to skip NULL pointer checks and indeed one
      of the processes eventually dereferences NULL.
      
      As explained in the commit that introduced the bug:
      
        706249c2 ("locking/static_keys: Rework update logic")
      
      jump_label_update() needs key.enabled to be true.  The solution adopted
      here is to temporarily make key.enabled == -1, and use go down the
      slow path when key.enabled <= 0.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 706249c2 ("locking/static_keys: Rework update logic")
      Link: http://lkml.kernel.org/r/1466527937-69798-1-git-send-email-pbonzini@redhat.com
      [ Small stylistic edits to the changelog and the code. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      71ef2c11
    • Peter Zijlstra's avatar
      locking/qspinlock: Fix spin_unlock_wait() some more · a39e660a
      Peter Zijlstra authored
      commit 2c610022 upstream.
      
      While this prior commit:
      
        54cf809b ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
      
      ... fixes spin_is_locked() and spin_unlock_wait() for the usage
      in ipc/sem and netfilter, it does not in fact work right for the
      usage in task_work and futex.
      
      So while the 2 locks crossed problem:
      
      	spin_lock(A)		spin_lock(B)
      	if (!spin_is_locked(B)) spin_unlock_wait(A)
      	  foo()			foo();
      
      ... works with the smp_mb() injected by both spin_is_locked() and
      spin_unlock_wait(), this is not sufficient for:
      
      	flag = 1;
      	smp_mb();		spin_lock()
      	spin_unlock_wait()	if (!flag)
      				  // add to lockless list
      	// iterate lockless list
      
      ... because in this scenario, the store from spin_lock() can be delayed
      past the load of flag, uncrossing the variables and loosing the
      guarantee.
      
      This patch reworks spin_is_locked() and spin_unlock_wait() to work in
      both cases by exploiting the observation that while the lock byte
      store can be delayed, the contender must have registered itself
      visibly in other state contained in the word.
      
      It also allows for architectures to override both functions, as PPC
      and ARM64 have an additional issue for which we currently have no
      generic solution.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Giovanni Gherdovich <ggherdovich@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Waiman Long <waiman.long@hpe.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 54cf809b ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a39e660a
    • Chris Wilson's avatar
      locking/ww_mutex: Report recursive ww_mutex locking early · c7f47e59
      Chris Wilson authored
      commit 0422e83d upstream.
      
      Recursive locking for ww_mutexes was originally conceived as an
      exception. However, it is heavily used by the DRM atomic modesetting
      code. Currently, the recursive deadlock is checked after we have queued
      up for a busy-spin and as we never release the lock, we spin until
      kicked, whereupon the deadlock is discovered and reported.
      
      A simple solution for the now common problem is to move the recursive
      deadlock discovery to the first action when taking the ww_mutex.
      Suggested-by: default avatarMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1464293297-19777-1-git-send-email-chris@chris-wilson.co.ukSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7f47e59
    • Sergei Shtylyov's avatar
      of: irq: fix of_irq_get[_byname]() kernel-doc · c5f2e833
      Sergei Shtylyov authored
      commit 39935466 upstream.
      
      The kernel-doc for the of_irq_get[_byname]()  is clearly inadequate in
      describing the return values -- of_irq_get_byname() is documented better
      than of_irq_get() but it  still doesn't mention that 0 is returned iff
      irq_create_of_mapping() fails (it doesn't return an error code in this
      case). Document all possible return value variants, making the writing
      of the word "IRQ" consistent, while at it...
      
      Fixes: 9ec36caf ("of/irq: do irq resolution in platform_get_irq")
      Fixes: ad69674e ("of/irq: do irq resolution in platform_get_irq_byname()")
      Signed-off-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5f2e833
    • Wolfram Sang's avatar
      of: fix autoloading due to broken modalias with no 'compatible' · 6d58954b
      Wolfram Sang authored
      commit b3c0a4da upstream.
      
      Because of an improper dereference, a stray 'C' character was output to
      the modalias when no 'compatible' was specified. This is the case for
      some old PowerMac drivers which only set the 'name' property. Fix it to
      let them match again.
      Reported-by: default avatarMathieu Malaterre <malat@debian.org>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Tested-by: default avatarMathieu Malaterre <malat@debian.org>
      Cc: Philipp Zabel <p.zabel@pengutronix.de>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Fixes: 6543becf ("mod/file2alias: make modalias generation safe for cross compiling")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d58954b
    • Eric W. Biederman's avatar
      mnt: If fs_fully_visible fails call put_filesystem. · a400a793
      Eric W. Biederman authored
      commit 97c1df3e upstream.
      
      Add this trivial missing error handling.
      
      Fixes: 1b852bce ("mnt: Refactor the logic for mounting sysfs and proc in a user namespace")
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a400a793
    • Eric W. Biederman's avatar
      mnt: Account for MS_RDONLY in fs_fully_visible · 6256d2f5
      Eric W. Biederman authored
      commit 695e9df0 upstream.
      
      In rare cases it is possible for s_flags & MS_RDONLY to be set but
      MNT_READONLY to be clear.  This starting combination can cause
      fs_fully_visible to fail to ensure that the new mount is readonly.
      Therefore force MNT_LOCK_READONLY in the new mount if MS_RDONLY
      is set on the source filesystem of the mount.
      
      In general both MS_RDONLY and MNT_READONLY are set at the same for
      mounts so I don't expect any programs to care.  Nor do I expect
      MS_RDONLY to be set on proc or sysfs in the initial user namespace,
      which further decreases the likelyhood of problems.
      
      Which means this change should only affect system configurations by
      paranoid sysadmins who should welcome the additional protection
      as it keeps people from wriggling out of their policies.
      
      Fixes: 8c6cf9cc ("mnt: Modify fs_fully_visible to deal with locked ro nodev and atime")
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6256d2f5
    • Eric W. Biederman's avatar
      mnt: fs_fully_visible test the proper mount for MNT_LOCKED · 57eb6e3d
      Eric W. Biederman authored
      commit d71ed6c9 upstream.
      
      MNT_LOCKED implies on a child mount implies the child is locked to the
      parent.  So while looping through the children the children should be
      tested (not their parent).
      
      Typically an unshare of a mount namespace locks all mounts together
      making both the parent and the slave as locked but there are a few
      corner cases where other things work.
      
      Fixes: ceeb0e5d ("vfs: Ignore unlocked mounts in fs_fully_visible")
      Reported-by: default avatarSeth Forshee <seth.forshee@canonical.com>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      57eb6e3d
    • Oscar's avatar
      usb: common: otg-fsm: add license to usb-otg-fsm · 67799eb4
      Oscar authored
      commit ea1d39a3 upstream.
      
      Fix warning about tainted kernel because usb-otg-fsm has no license.
      WARNING: with this patch usb-otg-fsm module can be loaded
      but then the kernel will hang. Tested with a udoo quad board.
      Signed-off-by: default avatarOscar <oscar@naiandei.net>
      Signed-off-by: default avatarPeter Chen <peter.chen@nxp.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      67799eb4
    • Alan Stern's avatar
      USB: EHCI: declare hostpc register as zero-length array · 7577b854
      Alan Stern authored
      commit 7e8b3dfe upstream.
      
      The HOSTPC extension registers found in some EHCI implementations form
      a variable-length array, with one element for each port.  Therefore
      the hostpc field in struct ehci_regs should be declared as a
      zero-length array, not a single-element array.
      
      This fixes a problem reported by UBSAN.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Reported-by: default avatarWilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
      Tested-by: default avatarWilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7577b854
    • Arnd Bergmann's avatar
      usb: dwc2: fix regression on big-endian PowerPC/ARM systems · 7b74d56f
      Arnd Bergmann authored
      commit 23e34392 upstream.
      
      A patch that went into Linux-4.4 to fix big-endian mode on a Lantiq
      MIPS system unfortunately broke big-endian operation on PowerPC
      APM82181 as reported by Christian Lamparter, and likely other
      systems.
      
      It actually introduced multiple issues:
      
      - it broke big-endian ARM kernels: any machine that was working
        correctly with a little-endian kernel is no longer using byteswaps
        on big-endian kernels, which clearly breaks them.
      - On PowerPC the same thing must be true: if it was working before,
        using big-endian kernels is now broken. Unlike ARM, 32-bit PowerPC
        usually uses big-endian kernels, so they are likely all broken.
      - The barrier for dwc2_writel is on the wrong side of the __raw_writel(),
        so the MMIO no longer synchronizes with DMA operations.
      - On architectures that require specific CPU instructions for MMIO
        access, using the __raw_ variant may turn this into a pointer
        dereference that does not have the same effect as the readl/writel.
      
      This patch is a simple revert for all architectures other than MIPS,
      in the hope that we can more easily backport it to fix the regression
      on PowerPC and ARM systems without breaking the Lantiq system again.
      
      We should follow this up with a more elaborate change to add runtime
      detection of endianness, to make sure it also works on all other
      combinations of architectures and implementations of the usb-dwc2
      device. That patch however will be fairly large and not appropriate
      for backports to stable kernels.
      
      Felipe suggested a different approach, using an endianness switching
      register to always put the device into LE mode, but unfortunately
      the dwc2 hardware does not provide a generic way to do that. Also,
      I see no practical way of addressing the problem more generally by
      patching architecture specific code on MIPS.
      
      Fixes: 95c8bc36 ("usb: dwc2: Use platform endianness when accessing registers")
      Acked-by: default avatarJohn Youn <johnyoun@synopsys.com>
      Tested-by: default avatarChristian Lamparter <chunkeey@googlemail.com>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b74d56f
    • Cyril Bur's avatar
      powerpc/tm: Always reclaim in start_thread() for exec() class syscalls · 5a35d2f9
      Cyril Bur authored
      commit 8e96a87c upstream.
      
      Userspace can quite legitimately perform an exec() syscall with a
      suspended transaction. exec() does not return to the old process, rather
      it load a new one and starts that, the expectation therefore is that the
      new process starts not in a transaction. Currently exec() is not treated
      any differently to any other syscall which creates problems.
      
      Firstly it could allow a new process to start with a suspended
      transaction for a binary that no longer exists. This means that the
      checkpointed state won't be valid and if the suspended transaction were
      ever to be resumed and subsequently aborted (a possibility which is
      exceedingly likely as exec()ing will likely doom the transaction) the
      new process will jump to invalid state.
      
      Secondly the incorrect attempt to keep the transactional state while
      still zeroing state for the new process creates at least two TM Bad
      Things. The first triggers on the rfid to return to userspace as
      start_thread() has given the new process a 'clean' MSR but the suspend
      will still be set in the hardware MSR. The second TM Bad Thing triggers
      in __switch_to() as the processor is still transactionally suspended but
      __switch_to() wants to zero the TM sprs for the new process.
      
      This is an example of the outcome of calling exec() with a suspended
      transaction. Note the first 700 is likely the first TM bad thing
      decsribed earlier only the kernel can't report it as we've loaded
      userspace registers. c000000000009980 is the rfid in
      fast_exception_return()
      
        Bad kernel stack pointer 3fffcfa1a370 at c000000000009980
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        CPU: 0 PID: 2006 Comm: tm-execed Not tainted
        NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000
        REGS: c00000003ffefd40 TRAP: 0700   Not tainted
        MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]>  CR: 00000000  XER: 00000000
        CFAR: c0000000000098b4 SOFTE: 0
        PACATMSCRATCH: b00000010000d033
        GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000
        GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000
        NIP [c000000000009980] fast_exception_return+0xb0/0xb8
        LR [0000000000000000]           (null)
        Call Trace:
        Instruction dump:
        f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070
        e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b
      
        Kernel BUG at c000000000043e80 [verbose debug info unavailable]
        Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033)
        Oops: Unrecoverable exception, sig: 6 [#2]
        CPU: 0 PID: 2006 Comm: tm-execed Tainted: G      D
        task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000
        NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000
        REGS: c00000003ffef7e0 TRAP: 0700   Tainted: G      D
        MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]>  CR: 28002828  XER: 00000000
        CFAR: c000000000015a20 SOFTE: 0
        PACATMSCRATCH: b00000010000d033
        GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000
        GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000
        GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004
        GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000
        GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000
        GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000
        GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80
        NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c
        LR [c000000000015a24] __switch_to+0x1f4/0x420
        Call Trace:
        Instruction dump:
        7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
        4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
      
      This fixes CVE-2016-5828.
      
      Fixes: bc2a9408 ("powerpc: Hook in new transactional memory code")
      Signed-off-by: default avatarCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5a35d2f9
    • Michael Ellerman's avatar
      powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added · 044af1b0
      Michael Ellerman authored
      commit 2c2a63e3 upstream.
      
      The recent commit 7cc85103 ("powerpc/pseries: Add POWER8NVL support
      to ibm,client-architecture-support call") added a new PVR mask & value
      to the start of the ibm_architecture_vec[] array.
      
      However it missed the fact that further down in the array, we hard code
      the offset of one of the fields, and then at boot use that value to
      patch the value in the array. This means every update to the array must
      also update the #define, ugh.
      
      This means that on pseries machines we will misreport to firmware the
      number of cores we support, by a factor of threads_per_core.
      
      Fix it for now by updating the #define.
      
      Fixes: 7cc85103 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      044af1b0
    • Gavin Shan's avatar
      powerpc/pseries: Fix PCI config address for DDW · 3abd809e
      Gavin Shan authored
      commit 8a934efe upstream.
      
      In commit 8445a87f "powerpc/iommu: Remove the dependency on EEH
      struct in DDW mechanism", the PE address was replaced with the PCI
      config address in order to remove dependency on EEH. According to PAPR
      spec, firmware (pHyp or QEMU) should accept "xxBBSSxx" format PCI config
      address, not "xxxxBBSS" provided by the patch. Note that "BB" is PCI bus
      number and "SS" is the combination of slot and function number.
      
      This fixes the PCI address passed to DDW RTAS calls.
      
      Fixes: 8445a87f ("powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism")
      Reported-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Tested-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3abd809e
    • Guilherme G. Piccoli's avatar
      powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism · 6bd26f4c
      Guilherme G. Piccoli authored
      commit 8445a87f upstream.
      
      Commit 39baadbf ("powerpc/eeh: Remove eeh information from pci_dn")
      changed the pci_dn struct by removing its EEH-related members.
      As part of this clean-up, DDW mechanism was modified to read the device
      configuration address from eeh_dev struct.
      
      As a consequence, now if we disable EEH mechanism on kernel command-line
      for example, the DDW mechanism will fail, generating a kernel oops by
      dereferencing a NULL pointer (which turns to be the eeh_dev pointer).
      
      This patch just changes the configuration address calculation on DDW
      functions to a manual calculation based on pci_dn members instead of
      using eeh_dev-based address.
      
      No functional changes were made. This was tested on pSeries, both
      in PHyp and qemu guest.
      
      Fixes: 39baadbf ("powerpc/eeh: Remove eeh information from pci_dn")
      Reviewed-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6bd26f4c
    • Jason Gunthorpe's avatar
      IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHs · 75012a88
      Jason Gunthorpe authored
      commit 8c5122e4 upstream.
      
      When this code was reworked for IBoE support the order of assignments
      for the sl_tclass_flowlabel got flipped around resulting in
      TClass & FlowLabel being permanently set to 0 in the packet headers.
      
      This breaks IB routers that rely on these headers, but only affects
      kernel users - libmlx4 does this properly for user space.
      
      Fixes: fa417f7b ("IB/mlx4: Add support for IBoE")
      Signed-off-by: default avatarJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      75012a88
    • Bart Van Assche's avatar
      IB/cm: Fix a recently introduced locking bug · abb24acc
      Bart Van Assche authored
      commit 943f44d9 upstream.
      
      ib_cm_notify() can be called from interrupt context. Hence do not
      reenable interrupts unconditionally in cm_establish().
      
      This patch avoids that lockdep reports the following warning:
      
      WARNING: CPU: 0 PID: 23317 at kernel/locking/lockdep.c:2624 trace _hardirqs_on_caller+0x112/0x1b0
      DEBUG_LOCKS_WARN_ON(current->hardirq_context)
      Call Trace:
       <IRQ>  [<ffffffff812bd0e5>] dump_stack+0x67/0x92
       [<ffffffff81056f21>] __warn+0xc1/0xe0
       [<ffffffff81056f8a>] warn_slowpath_fmt+0x4a/0x50
       [<ffffffff810a5932>] trace_hardirqs_on_caller+0x112/0x1b0
       [<ffffffff810a59dd>] trace_hardirqs_on+0xd/0x10
       [<ffffffff815992c7>] _raw_spin_unlock_irq+0x27/0x40
       [<ffffffffa0382e9c>] ib_cm_notify+0x25c/0x290 [ib_cm]
       [<ffffffffa068fbc1>] srpt_qp_event+0xa1/0xf0 [ib_srpt]
       [<ffffffffa04efb97>] mlx4_ib_qp_event+0x67/0xd0 [mlx4_ib]
       [<ffffffffa034ec0a>] mlx4_qp_event+0x5a/0xc0 [mlx4_core]
       [<ffffffffa03365f8>] mlx4_eq_int+0x3d8/0xcf0 [mlx4_core]
       [<ffffffffa0336f9c>] mlx4_msi_x_interrupt+0xc/0x20 [mlx4_core]
       [<ffffffff810b0914>] handle_irq_event_percpu+0x64/0x100
       [<ffffffff810b09e4>] handle_irq_event+0x34/0x60
       [<ffffffff810b3a6a>] handle_edge_irq+0x6a/0x150
       [<ffffffff8101ad05>] handle_irq+0x15/0x20
       [<ffffffff8101a66c>] do_IRQ+0x5c/0x110
       [<ffffffff8159a2c9>] common_interrupt+0x89/0x89
       [<ffffffff81297a17>] blk_run_queue_async+0x37/0x40
       [<ffffffffa0163e53>] rq_completed+0x43/0x70 [dm_mod]
       [<ffffffffa0164896>] dm_softirq_done+0x176/0x280 [dm_mod]
       [<ffffffff812a26c2>] blk_done_softirq+0x52/0x90
       [<ffffffff8105bc1f>] __do_softirq+0x10f/0x230
       [<ffffffff8105bec8>] irq_exit+0xa8/0xb0
       [<ffffffff8103653e>] smp_trace_call_function_single_interrupt+0x2e/0x30
       [<ffffffff81036549>] smp_call_function_single_interrupt+0x9/0x10
       [<ffffffff8159a959>] call_function_single_interrupt+0x89/0x90
       <EOI>
      
      Fixes: commit be4b4993 (IB/cm: Do not queue work to a device that's going away)
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Cc: Erez Shitrit <erezsh@mellanox.com>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Cc: Nikolay Borisov <kernel@kyup.com>
      Acked-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      abb24acc
    • Tony Luck's avatar
      EDAC, sb_edac: Fix rank lookup on Broadwell · 7bf50606
      Tony Luck authored
      commit c7103f65 upstream.
      
      Broadwell made a small change to the rank target register moving the
      target rank ID field up from bits 16:19 to bits 20:23.
      
      Also found that the offset field grew by one bit in the IVY_BRIDGE to
      HASWELL transition, so fix the RIR_OFFSET() macro too.
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Cc: Aristeu Rozanski <arozansk@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: http://lkml.kernel.org/r/2943fb819b1f7e396681165db9c12bb3df0e0b16.1464735623.git.tony.luck@intel.comSigned-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7bf50606