1. 25 Sep, 2018 2 commits
    • powerpc/tm: Avoid possible userspace r1 corruption on reclaim · 96dc89d5
      Michael Neuling authored
      Currently we store the userspace r1 to PACATMSCRATCH before finally
      saving it to the thread struct.
      
      In theory an exception could be taken here (like a machine check or
      SLB miss) that could write PACATMSCRATCH and hence corrupt the
      userspace r1. The SLB fault currently doesn't touch PACATMSCRATCH, but
      others do.
      
      We've never actually seen this happen but it's theoretically
      possible. Either way, the code is fragile as it is.
      
      This patch saves r1 to the kernel stack (which can't fault) before we
      turn MSR[RI] back on. PACATMSCRATCH is still used but only with
      MSR[RI] off. We then copy r1 from the kernel stack to the thread
      struct once we have MSR[RI] back on.
      Suggested-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/tm: Fix userspace r13 corruption · cf13435b
      Michael Neuling authored
      When we treclaim we store the userspace checkpointed r13 to a scratch
      SPR and then later save the scratch SPR to the user thread struct.
      
      Unfortunately, this doesn't work as accessing the user thread struct
      can take an SLB fault and the SLB fault handler will write the same
      scratch SPRG that now contains the userspace r13.
      
      To fix this, we store r13 to the kernel stack (which can't fault)
      before we access the user thread struct.
      
      Found by running P8 guest + powervm + disable_1tb_segments + TM. Seen
      as a random userspace segfault with r13 looking like a kernel address.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Reviewed-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 24 Sep, 2018 1 commit
    • powerpc/pseries: Fix unitialized timer reset on migration · 8604895a
      Michael Bringmann authored
      After migration of a powerpc LPAR, the kernel executes code to
      update the system state to reflect new platform characteristics.
      
      Such changes include modifications to device tree properties provided
      to the system by PHYP. Property notifications received by the
      post_mobility_fixup() code are passed along to the kernel in general
      through a call to of_update_property() which in turn passes such
      events back to all modules through entries like the '.notifier_call'
      function within the NUMA module.
      
      When the NUMA module updates its state, it resets its event timer. If
      this occurs after a previous call to stop_topology_update() or on a
      system without VPHN enabled, the code runs into an uninitialized timer
      structure and crashes. This patch adds a safety check on the path
      leading to the problem code.
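
The kind of safety check described above can be sketched in userspace C. This is only an illustration, not the patch itself: the struct, the flag, and every function name ending in `_sketch` are hypothetical, and the flag stands in for whatever condition tells the kernel that the timer has been set up and polling is active.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a kernel timer. 'initialized' models whether
 * the timer has ever been set up. */
struct sketch_timer {
    bool initialized;
    unsigned long expires;
};

static struct sketch_timer topology_timer_sketch;

/* Assumed flag: true only between start and stop of topology polling. */
static bool polling_enabled_sketch;

static void start_polling_sketch(void)
{
    topology_timer_sketch.initialized = true;
    polling_enabled_sketch = true;
}

static void stop_polling_sketch(void)
{
    polling_enabled_sketch = false;
}

/* Re-arm the timer only when polling is active. Returns true if the timer
 * was re-armed and false if the reset was skipped. */
static bool reset_timer_sketch(unsigned long now, unsigned long delay)
{
    if (!polling_enabled_sketch)
        return false; /* never touch a timer that may be uninitialized */
    topology_timer_sketch.expires = now + delay;
    return true;
}
```

With this guard, a device tree notification that arrives before polling starts, or after it stops, leaves the timer untouched instead of re-arming it.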
      
      An example crash log is as follows.
      
        ibmvscsi 30000081: Re-enabling adapter!
        ------------[ cut here ]------------
        kernel BUG at kernel/time/timer.c:958!
        Oops: Exception in kernel mode, sig: 5 [#1]
        LE SMP NR_CPUS=2048 NUMA pSeries
        Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag inet_diag lockd unix_diag af_packet_diag netlink_diag grace fscache sunrpc xts vmx_crypto pseries_rng sg binfmt_misc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
        CPU: 11 PID: 3067 Comm: drmgr Not tainted 4.17.0+ #179
        ...
        NIP mod_timer+0x4c/0x400
        LR  reset_topology_timer+0x40/0x60
        Call Trace:
          0xc0000003f9407830 (unreliable)
          reset_topology_timer+0x40/0x60
          dt_update_callback+0x100/0x120
          notifier_call_chain+0x90/0x100
          __blocking_notifier_call_chain+0x60/0x90
          of_property_notify+0x90/0xd0
          of_update_property+0x104/0x150
          update_dt_property+0xdc/0x1f0
          pseries_devicetree_update+0x2d0/0x510
          post_mobility_fixup+0x7c/0xf0
          migration_store+0xa4/0xc0
          kobj_attr_store+0x30/0x60
          sysfs_kf_write+0x64/0xa0
          kernfs_fop_write+0x16c/0x240
          __vfs_write+0x40/0x200
          vfs_write+0xc8/0x240
          ksys_write+0x5c/0x100
          system_call+0x58/0x6c
      
      Fixes: 5d88aa85 ("powerpc/pseries: Update CPU maps when device tree is updated")
      Cc: stable@vger.kernel.org # v3.10+
      Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  3. 20 Sep, 2018 3 commits
    • powerpc/pkeys: Fix reading of ibm, processor-storage-keys property · c716a25b
      Thiago Jung Bauermann authored
      scan_pkey_feature() uses of_property_read_u32_array() to read the
      ibm,processor-storage-keys property and calls be32_to_cpu() on the
      value it gets. The problem is that of_property_read_u32_array() already
      returns the value converted to the CPU byte order.
      
      The value of pkeys_total ends up more or less sane because there's a min()
      call in pkey_initialize() which reduces pkeys_total to 32. So in practice
      the kernel ignores the fact that the hypervisor reserved one key for
      itself (the device tree advertises 31 keys in my test VM).
      
      This is wrong, but the effect in practice is that when a process tries to
      allocate the 32nd key, it gets an -EINVAL error instead of -ENOSPC which
      would indicate that there aren't any keys available.
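
The double conversion can be illustrated with a small userspace sketch. It assumes a little-endian CPU, where a big-endian-to-CPU conversion amounts to a byte swap; all helper names are hypothetical and none of them are the kernel's functions. The value 31 and the limit 32 come from the commit message.

```c
#include <stdint.h>

/* Assemble a 32-bit value from big-endian bytes. This models a property
 * accessor that already returns the value in CPU byte order. */
static uint32_t read_be32_sketch(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Unconditional byte swap. This models what a second big-endian-to-CPU
 * conversion does on a little-endian CPU. */
static uint32_t swap32_sketch(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Models the min() call that caps the number of keys. */
static uint32_t clamp_keys_sketch(uint32_t total, uint32_t max)
{
    return total < max ? total : max;
}
```

Reading the cell `{0, 0, 0, 31}` gives 31. Swapping that result again gives 0x1f000000, which the clamp reduces to 32, so the key reserved by the hypervisor is silently counted as available.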
      
      Fixes: cf43d3b2 ("powerpc: Enable pkey subsystem")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: fix csum_ipv6_magic() on little endian platforms · 85682a7e
      Christophe Leroy authored
      On little endian platforms, csum_ipv6_magic() keeps len and proto in
      CPU byte order. This generates bad results, leading to ICMPv6 packets
      from other hosts being dropped by powerpc64le platforms.
      
      In order to fix this, len and proto should be converted to network
      byte order, i.e. big-endian byte order. However, checksumming 0x12345678
      and 0x56341278 provides the exact same result, so it is enough to
      rotate the sum of len and proto by 1 byte.
      
      PPC32 only supports big-endian, so the fix is needed for PPC64 only.
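
The arithmetic behind this shortcut can be checked with a short userspace sketch. It is not the kernel's assembly; the helper names are hypothetical, and the fold is the usual ones' complement reduction of a 32-bit sum to 16 bits.

```c
#include <stdint.h>

/* Fold a 32-bit partial sum to 16 bits with end-around carry, as in the
 * ones' complement Internet checksum. */
static uint16_t fold32_sketch(uint32_t sum)
{
    sum = (sum & 0xffffu) + (sum >> 16);
    sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Rotate a 32-bit value left by one byte. */
static uint32_t rotl32_by_8_sketch(uint32_t v)
{
    return (v << 8) | (v >> 24);
}

/* Swap the two bytes of a 16-bit value. */
static uint16_t swap16_sketch(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

Both 0x12345678 and 0x56341278 fold to 0x68ac, because the two 16-bit halves contribute the same high bytes and low bytes in either arrangement. Rotating the 32-bit value by one byte before folding gives the byte-swapped 16-bit result (0xac68 for 0x12345678), which is why a rotation can stand in for a full byte-order conversion of len and proto.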
      
      Fixes: e9c4943a ("powerpc: Implement csum_ipv6_magic in assembly")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Reported-by: Xin Long <lucien.xin@gmail.com>
      Cc: <stable@vger.kernel.org> # 4.18+
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Tested-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv/ioda2: Reduce upper limit for DMA window size (again) · 7233b8ca
      Alexey Kardashevskiy authored
      mpe: This was fixed originally in commit d3d4ffaa
      ("powerpc/powernv/ioda2: Reduce upper limit for DMA window size"), but
      contrary to what the merge commit says was inadvertently lost by me in
      commit ce57c661 ("Merge branch 'topic/ppc-kvm' into next") which
      brought in changes that moved the code to a new file. So reapply it to
      the new file.
      
      Original commit message follows:
      
      We use PHB in mode1 which uses bit 59 to select a correct DMA window.
      However there is mode2 which uses bits 59:55 and allows up to 32 DMA
      windows per PE.
      
      Even though documentation does not clearly specify that, it seems that
      the actual hardware does not support bits 59:55 even in mode1, in
      other words we can create a window as big as 1<<58 but DMA simply
      won't work.
      
      This reduces the upper limit from 59 to 55 bits to let the userspace
      know about the hardware limits.
      
      Fixes: ce57c661 ("Merge branch 'topic/ppc-kvm' into next")
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  4. 18 Sep, 2018 1 commit
    • powerpc: Avoid code patching freed init sections · 51c3c62b
      Michael Neuling authored
      This stops us from doing code patching in init sections after they've
      been freed.
      
      In this chain:
        kvm_guest_init() ->
          kvm_use_magic_page() ->
            fault_in_pages_readable() ->
              __get_user() ->
                __get_user_nocheck() ->
                  barrier_nospec();
      
      We have a code patching location at barrier_nospec() and
      kvm_guest_init() is an init function. This whole chain gets inlined,
      so when we free the init section (hence kvm_guest_init()), this code
      goes away and hence should no longer be patched.
      
      We have seen this as userspace memory corruption when using a memory
      checker while doing partition migration testing on powervm (this
      starts the code patching post migration via
      /sys/kernel/mobility/migration). In theory, it could also happen when
      using /sys/kernel/debug/powerpc/barrier_nospec.
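
The idea of refusing to patch a freed init section can be sketched in userspace C as follows. This is a simplified model under stated assumptions: an array stands in for kernel text, its first four words play the role of the init section, and all names ending in `_sketch` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pretend kernel text. Words 0..3 model the init section. */
static uint32_t text_sketch[8];

/* Assumed flag: set once the init section has been freed. */
static bool init_freed_sketch;

/* True if addr lies inside the modelled init section. */
static bool in_init_sketch(const uint32_t *addr)
{
    return addr >= &text_sketch[0] && addr < &text_sketch[4];
}

/* Write one instruction word unless the target is in freed init memory.
 * Returns true if the word was written and false if it was skipped. */
static bool patch_sketch(uint32_t *addr, uint32_t instr)
{
    if (init_freed_sketch && in_init_sketch(addr))
        return false; /* the memory may now belong to someone else */
    *addr = instr;
    return true;
}
```

Before the init section is freed, every patch site is written as usual. Afterwards, sites inside the init range are skipped, so memory that has been handed back to the allocator is never overwritten.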
      
      Cc: stable@vger.kernel.org # 4.13+
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  5. 17 Sep, 2018 1 commit
  6. 10 Sep, 2018 1 commit
  7. 09 Sep, 2018 7 commits
  8. 08 Sep, 2018 6 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f8f65382
      Linus Torvalds authored
      Pull KVM fixes from Radim Krčmář:
       "ARM:
         - Fix a VFP corruption in 32-bit guest
         - Add missing cache invalidation for CoW pages
         - Two small cleanups
      
        s390:
         - Fallout from the hugetlbfs support: pfmf interpretation and locking
         - VSIE: fix keywrapping for nested guests
      
        PPC:
         - Fix a bug where pages might not get marked dirty, causing guest
           memory corruption on migration
         - Fix a bug causing reads from guest memory to use the wrong guest
           real address for very large HPT guests (>256G of memory), leading
           to failures in instruction emulation.
      
        x86:
         - Fix out of bound access from malicious pv ipi hypercalls
           (introduced in rc1)
         - Fix delivery of pending interrupts when entering a nested guest,
           preventing arbitrarily late injection
         - Sanitize kvm_stat output after destroying a guest
         - Fix infinite loop when emulating a nested guest page fault and
           improve the surrounding emulation code
         - Two minor cleanups"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
        KVM: LAPIC: Fix pv ipis out-of-bounds access
        KVM: nVMX: Fix loss of pending IRQ/NMI before entering L2
        arm64: KVM: Remove pgd_lock
        KVM: Remove obsolete kvm_unmap_hva notifier backend
        arm64: KVM: Only force FPEXC32_EL2.EN if trapping FPSIMD
        KVM: arm/arm64: Clean dcache to PoC when changing PTE due to CoW
        KVM: s390: Properly lock mm context allow_gmap_hpage_1m setting
        KVM: s390: vsie: copy wrapping keys to right place
        KVM: s390: Fix pfmf and conditional skey emulation
        tools/kvm_stat: re-animate display of dead guests
        tools/kvm_stat: indicate dead guests as such
        tools/kvm_stat: handle guest removals more gracefully
        tools/kvm_stat: don't reset stats when setting PID filter for debugfs
        tools/kvm_stat: fix updates for dead guests
        tools/kvm_stat: fix handling of invalid paths in debugfs provider
        tools/kvm_stat: fix python3 issues
        KVM: x86: Unexport x86_emulate_instruction()
        KVM: x86: Rename emulate_instruction() to kvm_emulate_instruction()
        KVM: x86: Do not re-{try,execute} after failed emulation in L2
        KVM: x86: Default to not allowing emulation retry in kvm_mmu_page_fault
        ...
    • Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 0f3aa48a
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A few more fixes that have trickled in:
      
         - MMC bus width fixup for some Allwinner platforms
      
         - Fix for NULL deref in ti-aemif when no platform data is passed in
      
         - Fix div by 0 in SCMI code
      
         - Add a missing module alias in a new RPi driver"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        memory: ti-aemif: fix a potential NULL-pointer dereference
        firmware: arm_scmi: fix divide by zero when sustained_perf_level is zero
        hwmon: rpi: add module alias to raspberrypi-hwmon
        arm64: allwinner: dts: h6: fix Pine H64 MMC bus width
    • Merge tag 'sunxi-fixes-for-4.19' of... · a132bb90
      Olof Johansson authored
      Merge tag 'sunxi-fixes-for-4.19' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
      
      Allwinner fixes for 4.19
      
      Just one fix for H6 mmc on the Pine H64: the mmc bus width was missing
      from the device tree. This was added in 4.19-rc1.
      
      * tag 'sunxi-fixes-for-4.19' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
        arm64: allwinner: dts: h6: fix Pine H64 MMC bus width
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • x86/mm: Use WRITE_ONCE() when setting PTEs · 9bc4f28a
      Nadav Amit authored
      When page-table entries are set, the compiler might optimize their
      assignment by using multiple instructions to set the PTE. This might
      turn into a security hazard if the user somehow manages to use the
      interim PTE. L1TF does not make our lives easier, making even an interim
      non-present PTE a security hazard.
      
      Using WRITE_ONCE() to set PTEs and friends should prevent this potential
      security hazard.
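
A minimal userspace sketch of the idea follows. It is not the kernel's WRITE_ONCE() macro, which is more general; the type and function names here are hypothetical. The point is only that a store through a volatile-qualified pointer asks the compiler to perform exactly one access of that type, which for an aligned, native-width value is in practice a single store instruction.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit page-table entry. */
typedef uint64_t pte_sketch_t;

/* Store val through a volatile-qualified pointer so the compiler emits one
 * access of this width instead of splitting or repeating the store. */
static inline void write_once_sketch(pte_sketch_t *ptep, pte_sketch_t val)
{
    *(volatile pte_sketch_t *)ptep = val;
}

/* A setter that uses the single-store helper rather than a plain
 * assignment such as '*ptep = val'. */
static void set_pte_sketch(pte_sketch_t *ptep, pte_sketch_t val)
{
    write_once_sketch(ptep, val);
}
```

With a plain assignment the compiler is free to build the new entry in several steps, which could expose a half-written value to a concurrent reader of the page table.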
      
      I skimmed the differences in the binary with and without this patch. The
      differences are (obviously) greater when CONFIG_PARAVIRT=n as more
      code optimizations are possible. For better or worse, the impact on the
      binary with this patch is pretty small. Skimming the code did not cause
      anything to jump out as a security hazard, but it seems that at least
      move_soft_dirty_pte() caused set_pte_at() to use multiple writes.
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180902181451.80520-1-namit@vmware.com
    • x86/apic/vector: Make error return value negative · 47b7360c
      Thomas Gleixner authored
      activate_managed() returns EINVAL instead of -EINVAL in case of
      error. While this is unlikely to happen, the positive return value would
      cause further malfunction at the call site.
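
Why the sign matters can be shown with a short sketch. It assumes the common kernel convention that callers treat negative return values as errors; the function names are hypothetical and do not reproduce the real call site.

```c
#include <errno.h>

/* Models the bug: a positive errno value is returned on failure. */
static int activate_buggy_sketch(int ok)
{
    return ok ? 0 : EINVAL;
}

/* Models the fix: the errno value is negated on failure. */
static int activate_fixed_sketch(int ok)
{
    return ok ? 0 : -EINVAL;
}

/* Assumed caller-side check: only negative values count as errors. */
static int caller_sees_failure_sketch(int ret)
{
    return ret < 0;
}
```

Under that convention the positive value from the buggy variant is not recognized as a failure, while the negated value from the fixed variant is.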
      
      Fixes: 2db1f959 ("x86/vector: Handle managed interrupts proper")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · d7b686eb
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
      
       - bugfixes for uniphier, i801, and xiic drivers
      
       - ID removal (never produced) for imx
      
       - one MAINTAINER addition
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: xiic: Record xilinx i2c with Zynq fragment
        i2c: xiic: Make the start and the byte count write atomic
        i2c: i801: fix DNV's SMBCTRL register offset
        i2c: imx-lpi2c: Remove mx8dv compatible entry
        dt-bindings: imx-lpi2c: Remove mx8dv compatible entry
        i2c: uniphier-f: issue STOP only for last message or I2C_M_STOP
        i2c: uniphier: issue STOP only for last message or I2C_M_STOP
  9. 07 Sep, 2018 18 commits