1. 22 Sep, 2020 1 commit
  2. 20 Sep, 2020 3 commits
  3. 18 Sep, 2020 2 commits
  4. 14 Sep, 2020 1 commit
  5. 12 Sep, 2020 8 commits
  6. 11 Sep, 2020 9 commits
    • Vitaly Kuznetsov's avatar
      KVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN · d831de17
      Vitaly Kuznetsov authored
      Even without in-kernel LAPIC we should allow writing '0' to
      MSR_KVM_ASYNC_PF_EN as we're not enabling the mechanism. In
      particular, QEMU with 'kernel-irqchip=off' fails to start
      a guest with
      
      qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0
      
      Fixes: 9d3c447c ("KVM: X86: Fix async pf caused null-ptr-deref")
      Reported-by: default avatarDr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200911093147.484565-1-vkuznets@redhat.com>
      [Actually commit the version proposed by Sean Christopherson. - Paolo]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      d831de17
    • David Rientjes's avatar
      KVM: SVM: Periodically schedule when unregistering regions on destroy · 7be74942
      David Rientjes authored
      There may be many encrypted regions that need to be unregistered when a
      SEV VM is destroyed.  This can lead to soft lockups.  For example, on a
      host running 4.15:
      
      watchdog: BUG: soft lockup - CPU#206 stuck for 11s! [t_virtual_machi:194348]
      CPU: 206 PID: 194348 Comm: t_virtual_machi
      RIP: 0010:free_unref_page_list+0x105/0x170
      ...
      Call Trace:
       [<0>] release_pages+0x159/0x3d0
       [<0>] sev_unpin_memory+0x2c/0x50 [kvm_amd]
       [<0>] __unregister_enc_region_locked+0x2f/0x70 [kvm_amd]
       [<0>] svm_vm_destroy+0xa9/0x200 [kvm_amd]
       [<0>] kvm_arch_destroy_vm+0x47/0x200
       [<0>] kvm_put_kvm+0x1a8/0x2f0
       [<0>] kvm_vm_release+0x25/0x30
       [<0>] do_exit+0x335/0xc10
       [<0>] do_group_exit+0x3f/0xa0
       [<0>] get_signal+0x1bc/0x670
       [<0>] do_signal+0x31/0x130
      
      Although the CLFLUSH is no longer issued on every encrypted region to be
      unregistered, there are no other changes that can prevent soft lockups for
      very large SEV VMs in the latest kernel.
      
      Periodically schedule if necessary.  This still holds kvm->lock across the
      resched, but since this only happens when the VM is destroyed this is
      assumed to be acceptable.
      Signed-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Message-Id: <alpine.DEB.2.23.453.2008251255240.2987727@chino.kir.corp.google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      7be74942
    • Huacai Chen's avatar
      KVM: MIPS: Change the definition of kvm type · 15e9e35c
      Huacai Chen authored
      MIPS defines two kvm types:
      
       #define KVM_VM_MIPS_TE          0
       #define KVM_VM_MIPS_VZ          1
      
      In Documentation/virt/kvm/api.rst it is said that "You probably want to
      use 0 as machine type", which implies that type 0 be the "automatic" or
      "default" type. And, in user-space libvirt use the null-machine (with
      type 0) to detect the kvm capability, which returns "KVM not supported"
      on a VZ platform.
      
      I try to fix it in QEMU but it is ugly:
      https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05629.html
      
      And Thomas Huth suggests me to change the definition of kvm type:
      https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg03281.html
      
      So I define like this:
      
       #define KVM_VM_MIPS_AUTO        0
       #define KVM_VM_MIPS_VZ          1
       #define KVM_VM_MIPS_TE          2
      
      Since VZ and TE cannot co-exists, using type 0 on a TE platform will
      still return success (so old user-space tools have no problems on new
      kernels); the advantage is that using type 0 on a VZ platform will not
      return failure. So, the only problem is "new user-space tools use type
      2 on old kernels", but if we treat this as a kernel bug, we can backport
      this patch to old stable kernels.
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Message-Id: <1599734031-28746-1-git-send-email-chenhc@lemote.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      15e9e35c
    • Lai Jiangshan's avatar
      kvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed · f6f6195b
      Lai Jiangshan authored
      When kvm_mmu_get_page() gets a page with unsynced children, the spt
      pagetable is unsynchronized with the guest pagetable. But the
      guest might not issue a "flush" operation on it when the pagetable
      entry is changed from zero or other cases. The hypervisor has the
      responsibility to synchronize the pagetables.
      
      KVM behaved as above for many years, But commit 8c8560b8
      ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes")
      inadvertently included a line of code to change it without giving any
      reason in the changelog. It is clear that the commit's intention was to
      change KVM_REQ_TLB_FLUSH -> KVM_REQ_TLB_FLUSH_CURRENT, so we don't
      needlessly flush other contexts; however, one of the hunks changed
      a nearby KVM_REQ_MMU_SYNC instead.  This patch changes it back.
      
      Link: https://lore.kernel.org/lkml/20200320212833.3507-26-sean.j.christopherson@intel.com/
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarLai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20200902135421.31158-1-jiangshanlai@gmail.com>
      fixes: 8c8560b8 ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      f6f6195b
    • Chenyi Qiang's avatar
      KVM: nVMX: Fix the update value of nested load IA32_PERF_GLOBAL_CTRL control · c6b177a3
      Chenyi Qiang authored
      A minor fix for the update of VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL field
      in exit_ctls_high.
      
      Fixes: 03a8871a ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
      VM-{Entry,Exit} control")
      Signed-off-by: default avatarChenyi Qiang <chenyi.qiang@intel.com>
      Reviewed-by: default avatarXiaoyao Li <xiaoyao.li@intel.com>
      Message-Id: <20200828085622.8365-5-chenyi.qiang@intel.com>
      Reviewed-by: default avatarJim Mattson <jmattson@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      c6b177a3
    • Rustam Kovhaev's avatar
      KVM: fix memory leak in kvm_io_bus_unregister_dev() · f6588660
      Rustam Kovhaev authored
      when kmalloc() fails in kvm_io_bus_unregister_dev(), before removing
      the bus, we should iterate over all other devices linked to it and call
      kvm_iodevice_destructor() for them
      
      Fixes: 90db1043 ("KVM: kvm_io_bus_unregister_dev() should never fail")
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: syzbot+f196caa45793d6374707@syzkaller.appspotmail.com
      Link: https://syzkaller.appspot.com/bug?extid=f196caa45793d6374707Signed-off-by: default avatarRustam Kovhaev <rkovhaev@gmail.com>
      Reviewed-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200907185535.233114-1-rkovhaev@gmail.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      f6588660
    • Haiwei Li's avatar
      KVM: Check the allocation of pv cpu mask · 0f990222
      Haiwei Li authored
      check the allocation of per-cpu __pv_cpu_mask. Initialize ops only when
      successful.
      Signed-off-by: default avatarHaiwei Li <lihaiwei@tencent.com>
      Message-Id: <d59f05df-e6d3-3d31-a036-cc25a2b2f33f@gmail.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      0f990222
    • Peter Shier's avatar
      KVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected · 43fea4e4
      Peter Shier authored
      When L2 uses PAE, L0 intercepts of L2 writes to CR0/CR3/CR4 call
      load_pdptrs to read the possibly updated PDPTEs from the guest
      physical address referenced by CR3.  It loads them into
      vcpu->arch.walk_mmu->pdptrs and sets VCPU_EXREG_PDPTR in
      vcpu->arch.regs_dirty.
      
      At the subsequent assumed reentry into L2, the mmu will call
      vmx_load_mmu_pgd which calls ept_load_pdptrs. ept_load_pdptrs sees
      VCPU_EXREG_PDPTR set in vcpu->arch.regs_dirty and loads
      VMCS02.GUEST_PDPTRn from vcpu->arch.walk_mmu->pdptrs[]. This all works
      if the L2 CRn write intercept always resumes L2.
      
      The resume path calls vmx_check_nested_events which checks for
      exceptions, MTF, and expired VMX preemption timers. If
      vmx_check_nested_events finds any of these conditions pending it will
      reflect the corresponding exit into L1. Live migration at this point
      would also cause a missed immediate reentry into L2.
      
      After L1 exits, vmx_vcpu_run calls vmx_register_cache_reset which
      clears VCPU_EXREG_PDPTR in vcpu->arch.regs_dirty.  When L2 next
      resumes, ept_load_pdptrs finds VCPU_EXREG_PDPTR clear in
      vcpu->arch.regs_dirty and does not load VMCS02.GUEST_PDPTRn from
      vcpu->arch.walk_mmu->pdptrs[]. prepare_vmcs02 will then load
      VMCS02.GUEST_PDPTRn from vmcs12->pdptr0/1/2/3 which contain the stale
      values stored at last L2 exit. A repro of this bug showed L2 entering
      triple fault immediately due to the bad VMCS02.GUEST_PDPTRn values.
      
      When L2 is in PAE paging mode add a call to ept_load_pdptrs before
      leaving L2. This will update VMCS02.GUEST_PDPTRn if they are dirty in
      vcpu->arch.walk_mmu->pdptrs[].
      
      Tested:
      kvm-unit-tests with new directed test: vmx_mtf_pdpte_test.
      Verified that test fails without the fix.
      
      Also ran Google internal VMM with an Ubuntu 16.04 4.4.0-83 guest running a
      custom hypervisor with a 32-bit Windows XP L2 guest using PAE. Prior to fix
      would repro readily. Ran 14 simultaneous L2s for 140 iterations with no
      failures.
      Signed-off-by: default avatarPeter Shier <pshier@google.com>
      Reviewed-by: default avatarJim Mattson <jmattson@google.com>
      Message-Id: <20200820230545.2411347-1-pshier@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      43fea4e4
    • Paolo Bonzini's avatar
      Merge tag 'kvmarm-fixes-5.9-1' of... · 1b67fd08
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for Linux 5.9, take #1
      
      - Multiple stolen time fixes, with a new capability to match x86
      - Fix for hugetlbfs mappings when PUD and PMD are the same level
      - Fix for hugetlbfs mappings when PTE mappings are enforced
        (dirty logging, for example)
      - Fix tracing output of 64bit values
      1b67fd08
  7. 07 Sep, 2020 7 commits
  8. 06 Sep, 2020 4 commits
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block · a8205e31
      Linus Torvalds authored
      Pull more io_uring fixes from Jens Axboe:
       "Two followup fixes. One is fixing a regression from this merge window,
        the other is two commits fixing cancelation of deferred requests.
      
        Both have gone through full testing, and both spawned a few new
        regression test additions to liburing.
      
         - Don't play games with const, properly store the output iovec and
           assign it as needed.
      
         - Deferred request cancelation fix (Pavel)"
      
      * tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block:
        io_uring: fix linked deferred ->files cancellation
        io_uring: fix cancel of deferred reqs with ->files
        io_uring: fix explicit async read/write mapping for large segments
      a8205e31
    • Linus Torvalds's avatar
      Merge tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 2ccdd9f8
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - three Intel VT-d fixes to fix address handling on 32bit, fix a NULL
         pointer dereference bug and serialize a hardware register access as
         required by the VT-d spec.
      
       - two patches for AMD IOMMU to force AMD GPUs into translation mode
         when memory encryption is active and disallow using IOMMUv2
         functionality.  This makes the AMDGPU driver work when memory
         encryption is active.
      
       - two more fixes for AMD IOMMU to fix updating the Interrupt Remapping
         Table Entries.
      
       - MAINTAINERS file update for the Qualcom IOMMU driver.
      
      * tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/vt-d: Handle 36bit addressing for x86-32
        iommu/amd: Do not use IOMMUv2 functionality when SME is active
        iommu/amd: Do not force direct mapping when SME is active
        iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
        iommu/amd: Restore IRTE.RemapEn bit after programming IRTE
        iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()
        iommu/vt-d: Serialize IOMMU GCMD register modifications
        MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers move
      2ccdd9f8
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 015b3155
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - more generic entry code ABI fallout
      
       - debug register handling bugfixes
      
       - fix vmalloc mappings on 32-bit kernels
      
       - kprobes instrumentation output fix on 32-bit kernels
      
       - fix over-eager WARN_ON_ONCE() on !SMAP hardware
      
       - NUMA debugging fix
      
       - fix Clang related crash on !RETPOLINE kernels
      
      * tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry: Unbreak 32bit fast syscall
        x86/debug: Allow a single level of #DB recursion
        x86/entry: Fix AC assertion
        tracing/kprobes, x86/ptrace: Fix regs argument order for i386
        x86, fakenuma: Fix invalid starting node ID
        x86/mm/32: Bring back vmalloc faulting on x86_32
        x86/cmdline: Disable jump tables for cmdline.c
      015b3155
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 68beef57
      Linus Torvalds authored
      Pull xen updates from Juergen Gross:
       "A small series for fixing a problem with Xen PVH guests when running
        as backends (e.g. as dom0).
      
        Mapping other guests' memory is now working via ZONE_DEVICE, thus not
        requiring to abuse the memory hotplug functionality for that purpose"
      
      * tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: add helpers to allocate unpopulated memory
        memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC
        xen/balloon: add header guard
      68beef57
  9. 05 Sep, 2020 5 commits
    • Pavel Begunkov's avatar
      io_uring: fix linked deferred ->files cancellation · c127a2a1
      Pavel Begunkov authored
      While looking for ->files in ->defer_list, consider that requests there
      may actually be links.
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      c127a2a1
    • Pavel Begunkov's avatar
      io_uring: fix cancel of deferred reqs with ->files · b7ddce3c
      Pavel Begunkov authored
      While trying to cancel requests with ->files, it also should look for
      requests in ->defer_list, otherwise it might end up hanging a thread.
      
      Cancel all requests in ->defer_list up to the last request there with
      matching ->files, that's needed to follow drain ordering semantics.
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      b7ddce3c
    • Linus Torvalds's avatar
      Merge tags 'auxdisplay-for-linus-v5.9-rc4', 'clang-format-for-linus-v5.9-rc4'... · dd9fb9bb
      Linus Torvalds authored
      Merge tags 'auxdisplay-for-linus-v5.9-rc4', 'clang-format-for-linus-v5.9-rc4' and 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux
      
      Pull misc fixes from Miguel Ojeda:
       "A trivial patch for auxdisplay:
      
         - Replace HTTP links with HTTPS ones (Alexander A. Klimov)
      
        The usual clang-format trivial update:
      
         - Update with the latest for_each macro list (Miguel Ojeda)
      
        And Luc requested me to pick a sparse fix on my queue, so here it goes
        along with other two trivial Compiler Attributes ones (also from Luc).
      
         - sparse: use static inline for __chk_{user,io}_ptr() (Luc Van
           Oostenryck)
      
         - Compiler Attributes: fix comment concerning GCC 4.6 (Luc Van
           Oostenryck)
      
         - Compiler Attributes: remove comment about sparse not supporting
           __has_attribute (Luc Van Oostenryck)"
      
      * tag 'auxdisplay-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
        auxdisplay: Replace HTTP links with HTTPS ones
      
      * tag 'clang-format-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
        clang-format: Update with the latest for_each macro list
      
      * tag 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
        sparse: use static inline for __chk_{user,io}_ptr()
        Compiler Attributes: fix comment concerning GCC 4.6
        Compiler Attributes: remove comment about sparse not supporting __has_attribute
      dd9fb9bb
    • Linus Torvalds's avatar
      Merge tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 70187f77
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
      
       - HSDK-4xd Dev system: perf driver updates for sampling interrupt
      
       - HSDK* Dev System: Ethernet broken [Evgeniy Didin]
      
       - HIGHMEM broken (2 memory banks) [Mike Rapoport]
      
       - show_regs() rewrite once and for all
      
       - Other minor fixes
      
      * tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id
        arc: fix memory initialization for systems with two memory banks
        irqchip/eznps: Fix build error for !ARC700 builds
        ARC: show_regs: fix r12 printing and simplify
        ARC: HSDK: wireup perf irq
        ARC: perf: don't bail setup if pct irq missing in device-tree
        ARC: pgalloc.h: delete a duplicated word + other fixes
      70187f77
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 7514c036
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "19 patches.
      
        Subsystems affected by this patch series: MAINTAINERS, ipc, fork,
        checkpatch, lib, and mm (memcg, slub, pagemap, madvise, migration,
        hugetlb)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        include/linux/log2.h: add missing () around n in roundup_pow_of_two()
        mm/khugepaged.c: fix khugepaged's request size in collapse_file
        mm/hugetlb: fix a race between hugetlb sysctl handlers
        mm/hugetlb: try preferred node first when alloc gigantic page from cma
        mm/migrate: preserve soft dirty in remove_migration_pte()
        mm/migrate: remove unnecessary is_zone_device_page() check
        mm/rmap: fixup copying of soft dirty and uffd ptes
        mm/migrate: fixup setting UFFD_WP flag
        mm: madvise: fix vma user-after-free
        checkpatch: fix the usage of capture group ( ... )
        fork: adjust sysctl_max_threads definition to match prototype
        ipc: adjust proc_ipc_sem_dointvec definition to match prototype
        mm: track page table modifications in __apply_to_page_range()
        MAINTAINERS: IA64: mark Status as Odd Fixes only
        MAINTAINERS: add LLVM maintainers
        MAINTAINERS: update Cavium/Marvell entries
        mm: slub: fix conversion of freelist_corrupted()
        mm: memcg: fix memcg reclaim soft lockup
        memcg: fix use-after-free in uncharge_batch
      7514c036