1. 27 May, 2021 16 commits
    • Axel Rasmussen's avatar
      KVM: selftests: allow different backing source types · 0368c2c1
      Axel Rasmussen authored
      Add an argument which lets us specify a different backing memory type
      for the test. The default is just to use anonymous, matching existing
      behavior.
      
      This is in preparation for testing UFFD minor faults. For that, we'll
      need to use a new backing memory type which is setup with MAP_SHARED.
      Signed-off-by: default avatarAxel Rasmussen <axelrasmussen@google.com>
      Message-Id: <20210519200339.829146-6-axelrasmussen@google.com>
      Reviewed-by: default avatarBen Gardon <bgardon@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      0368c2c1
    • Axel Rasmussen's avatar
      KVM: selftests: compute correct demand paging size · 32ffa4f7
      Axel Rasmussen authored
      This is a preparatory commit needed before we can use different kinds of
      backing pages for guest memory.
      
      Previously, we used perf_test_args.host_page_size, which is the host's
      native page size (commonly 4K). For VM_MEM_SRC_ANONYMOUS this turns out
      to be okay, but in a follow-up commit we want to allow using different
      kinds of backing memory.
      
      Take VM_MEM_SRC_ANONYMOUS_HUGETLB for example. Without this change, if
      we used that backing page type, when we issued a UFFDIO_COPY ioctl we'd
      only do so with 4K, rather than the full 2M of a backing hugepage. In
      this case, UFFDIO_COPY returns -EINVAL (__mcopy_atomic_hugetlb checks
      the size).
      Signed-off-by: default avatarAxel Rasmussen <axelrasmussen@google.com>
      Message-Id: <20210519200339.829146-5-axelrasmussen@google.com>
      Reviewed-by: default avatarBen Gardon <bgardon@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      32ffa4f7
    • Axel Rasmussen's avatar
      KVM: selftests: simplify setup_demand_paging error handling · 25408e5a
      Axel Rasmussen authored
      A small cleanup. Our caller writes:
      
        r = setup_demand_paging(...);
        if (r < 0) exit(-r);
      
      Since we're just going to exit anyway, instead of returning an error we
      can just re-use TEST_ASSERT. This makes the caller simpler, as well as
      the function itself - no need to write our branches, etc.
      Signed-off-by: default avatarAxel Rasmussen <axelrasmussen@google.com>
      Message-Id: <20210519200339.829146-3-axelrasmussen@google.com>
      Reviewed-by: default avatarBen Gardon <bgardon@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      25408e5a
    • David Matlack's avatar
      KVM: selftests: Print a message if /dev/kvm is missing · 2aab4b35
      David Matlack authored
      If a KVM selftest is run on a machine without /dev/kvm, it will exit
      silently. Make it easy to tell what's happening by printing an error
      message.
      
      Opportunistically consolidate all codepaths that open /dev/kvm into a
      single function so they all print the same message.
      
      This slightly changes the semantics of vm_is_unrestricted_guest() by
      changing a TEST_ASSERT() to exit(KSFT_SKIP). However
      vm_is_unrestricted_guest() is only called in one place
      (x86_64/mmio_warning_test.c) and that is to determine if the test should
      be skipped or not.
      Signed-off-by: default avatarDavid Matlack <dmatlack@google.com>
      Message-Id: <20210511202120.1371800-1-dmatlack@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      2aab4b35
    • Axel Rasmussen's avatar
      KVM: selftests: trivial comment/logging fixes · c887d6a1
      Axel Rasmussen authored
      Some trivial fixes I found while touching related code in this series,
      factored out into a separate commit for easier reviewing:
      
      - s/gor/got/ and add a newline in demand_paging_test.c
      - s/backing_src/src_type/ in a comment to be consistent with the real
        function signature in kvm_util.c
      Signed-off-by: default avatarAxel Rasmussen <axelrasmussen@google.com>
      Message-Id: <20210519200339.829146-2-axelrasmussen@google.com>
      Reviewed-by: default avatarBen Gardon <bgardon@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      c887d6a1
    • David Matlack's avatar
      KVM: selftests: Fix hang in hardware_disable_test · a10453c0
      David Matlack authored
      If /dev/kvm is not available then hardware_disable_test will hang
      indefinitely because the child process exits before posting to the
      semaphore for which the parent is waiting.
      
      Fix this by making the parent periodically check if the child has
      exited. We have to be careful to forward the child's exit status to
      preserve a KSFT_SKIP status.
      
      I considered just checking for /dev/kvm before creating the child
      process, but there are so many other reasons why the child could exit
      early that it seemed better to handle that as general case.
      
      Tested:
      
      $ ./hardware_disable_test
      /dev/kvm not available, skipping test
      $ echo $?
      4
      $ modprobe kvm_intel
      $ ./hardware_disable_test
      $ echo $?
      0
      Signed-off-by: default avatarDavid Matlack <dmatlack@google.com>
      Message-Id: <20210514230521.2608768-1-dmatlack@google.com>
      Reviewed-by: default avatarAndrew Jones <drjones@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      a10453c0
    • David Matlack's avatar
      KVM: selftests: Ignore CPUID.0DH.1H in get_cpuid_test · 50bc913d
      David Matlack authored
      Similar to CPUID.0DH.0H this entry depends on the vCPU's XCR0 register
      and IA32_XSS MSR. Since this test does not control for either before
      assigning the vCPU's CPUID, these entries will not necessarily match
      the supported CPUID exposed by KVM.
      
      This fixes get_cpuid_test on Cascade Lake CPUs.
      Suggested-by: default avatarJim Mattson <jmattson@google.com>
      Signed-off-by: default avatarDavid Matlack <dmatlack@google.com>
      Message-Id: <20210519211345.3944063-1-dmatlack@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      50bc913d
    • David Matlack's avatar
      KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn() · ef4c9f4f
      David Matlack authored
      vm_get_max_gfn() casts vm->max_gfn from a uint64_t to an unsigned int,
      which causes the upper 32-bits of the max_gfn to get truncated.
      
      Nobody noticed until now likely because vm_get_max_gfn() is only used
      as a mechanism to create a memslot in an unused region of the guest
      physical address space (the top), and the top of the 32-bit physical
      address space was always good enough.
      
      This fix reveals a bug in memslot_modification_stress_test which was
      trying to create a dummy memslot past the end of guest physical memory.
      Fix that by moving the dummy memslot lower.
      
      Fixes: 52200d0d ("KVM: selftests: Remove duplicate guest mode handling")
      Reviewed-by: default avatarVenkatesh Srinivas <venkateshs@chromium.org>
      Signed-off-by: default avatarDavid Matlack <dmatlack@google.com>
      Message-Id: <20210521173828.1180619-1-dmatlack@google.com>
      Reviewed-by: default avatarAndrew Jones <drjones@redhat.com>
      Reviewed-by: default avatarPeter Xu <peterx@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      ef4c9f4f
    • Maciej S. Szmigiero's avatar
      KVM: selftests: add a memslot-related performance benchmark · cad347fa
      Maciej S. Szmigiero authored
      This benchmark contains the following tests:
      * Map test, where the host unmaps guest memory while the guest writes to
      it (maps it).
      
      The test is designed in a way to make the unmap operation on the host
      take a negligible amount of time in comparison with the mapping
      operation in the guest.
      
      The test area is actually split in two: the first half is being mapped
      by the guest while the second half in being unmapped by the host.
      Then a guest <-> host sync happens and the areas are reversed.
      
      * Unmap test which is broadly similar to the above map test, but it is
      designed in an opposite way: to make the mapping operation in the guest
      take a negligible amount of time in comparison with the unmap operation
      on the host.
      This test is available in two variants: with per-page unmap operation
      or a chunked one (using 2 MiB chunk size).
      
      * Move active area test which involves moving the last (highest gfn)
      memslot a bit back and forth on the host while the guest is
      concurrently writing around the area being moved (including over the
      moved memslot).
      
      * Move inactive area test which is similar to the previous move active
      area test, but now guest writes all happen outside of the area being
      moved.
      
      * Read / write test in which the guest writes to the beginning of each
      page of the test area while the host writes to the middle of each such
      page.
      Then each side checks the values the other side has written.
      This particular test is not expected to give different results depending
      on particular memslots implementation, it is meant as a rough sanity
      check and to provide insight on the spread of test results expected.
      
      Each test performs its operation in a loop until a test period ends
      (this is 5 seconds by default, but it is configurable).
      Then the total count of loops done is divided by the actual elapsed
      time to give the test result.
      
      The tests have a configurable memslot cap with the "-s" test option, by
      default the system maximum is used.
      Each test is repeated a particular number of times (by default 20
      times), the best result achieved is printed.
      
      The test memory area is divided equally between memslots, the reminder
      is added to the last memslot.
      The test area size does not depend on the number of memslots in use.
      
      The tests also measure the time that it took to add all these memslots.
      The best result from the tests that use the whole test area is printed
      after all the requested tests are done.
      
      In general, these tests are designed to use as much memory as possible
      (within reason) while still doing 100+ loops even on high memslot counts
      with the default test length.
      Increasing the test runtime makes it increasingly more likely that some
      event will happen on the system during the test run, which might lower
      the test result.
      Signed-off-by: default avatarMaciej S. Szmigiero <maciej.szmigiero@oracle.com>
      Reviewed-by: default avatarAndrew Jones <drjones@redhat.com>
      Message-Id: <8d31bb3d92bc8fa33a9756fa802ee14266ab994e.1618253574.git.maciej.szmigiero@oracle.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      cad347fa
    • Maciej S. Szmigiero's avatar
      KVM: selftests: Keep track of memslots more efficiently · 22721a56
      Maciej S. Szmigiero authored
      The KVM selftest framework was using a simple list for keeping track of
      the memslots currently in use.
      This resulted in lookups and adding a single memslot being O(n), the
      later due to linear scanning of the existing memslot set to check for
      the presence of any conflicting entries.
      
      Before this change, benchmarking high count of memslots was more or less
      impossible as pretty much all the benchmark time was spent in the
      selftest framework code.
      
      We can simply use a rbtree for keeping track of both of gfn and hva.
      We don't need an interval tree for hva here as we can't have overlapping
      memslots because we allocate a completely new memory chunk for each new
      memslot.
      Signed-off-by: default avatarMaciej S. Szmigiero <maciej.szmigiero@oracle.com>
      Reviewed-by: default avatarAndrew Jones <drjones@redhat.com>
      Message-Id: <b12749d47ee860468240cf027412c91b76dbe3db.1618253574.git.maciej.szmigiero@oracle.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      22721a56
    • Paolo Bonzini's avatar
      selftests: kvm: fix potential issue with ELF loading · a13534d6
      Paolo Bonzini authored
      vm_vaddr_alloc() sets up GVA to GPA mapping page by page; therefore, GPAs
      may not be continuous if same memslot is used for data and page table allocation.
      
      kvm_vm_elf_load() however expects a continuous range of HVAs (and thus GPAs)
      because it does not try to read file data page by page.  Fix this mismatch
      by allocating memory in one step.
      Reported-by: default avatarZhenzhong Duan <zhenzhong.duan@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      a13534d6
    • Zhenzhong Duan's avatar
      selftests: kvm: make allocation of extra memory take effect · 39fe2fc9
      Zhenzhong Duan authored
      The extra memory pages is missed to be allocated during VM creating.
      perf_test_util and kvm_page_table_test use it to alloc extra memory
      currently.
      
      Fix it by adding extra_mem_pages to the total memory calculation before
      allocate.
      Signed-off-by: default avatarZhenzhong Duan <zhenzhong.duan@intel.com>
      Message-Id: <20210512043107.30076-1-zhenzhong.duan@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      39fe2fc9
    • Wanpeng Li's avatar
      KVM: X86: hyper-v: Task srcu lock when accessing kvm_memslots() · da6d63a0
      Wanpeng Li authored
         WARNING: suspicious RCU usage
         5.13.0-rc1 #4 Not tainted
         -----------------------------
         ./include/linux/kvm_host.h:710 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
        rcu_scheduler_active = 2, debug_locks = 1
         1 lock held by hyperv_clock/8318:
          #0: ffffb6b8cb05a7d8 (&hv->hv_lock){+.+.}-{3:3}, at: kvm_hv_invalidate_tsc_page+0x3e/0xa0 [kvm]
      
        stack backtrace:
        CPU: 3 PID: 8318 Comm: hyperv_clock Not tainted 5.13.0-rc1 #4
        Call Trace:
         dump_stack+0x87/0xb7
         lockdep_rcu_suspicious+0xce/0xf0
         kvm_write_guest_page+0x1c1/0x1d0 [kvm]
         kvm_write_guest+0x50/0x90 [kvm]
         kvm_hv_invalidate_tsc_page+0x79/0xa0 [kvm]
         kvm_gen_update_masterclock+0x1d/0x110 [kvm]
         kvm_arch_vm_ioctl+0x2a7/0xc50 [kvm]
         kvm_vm_ioctl+0x123/0x11d0 [kvm]
         __x64_sys_ioctl+0x3ed/0x9d0
         do_syscall_64+0x3d/0x80
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      kvm_memslots() will be called by kvm_write_guest(), so we should take the srcu lock.
      
      Fixes: e880c6ea (KVM: x86: hyper-v: Prevent using not-yet-updated TSC page by secondary CPUs)
      Reviewed-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Message-Id: <1621339235-11131-4-git-send-email-wanpengli@tencent.com>
      Reviewed-by: default avatarSean Christopherson <seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      da6d63a0
    • Wanpeng Li's avatar
      KVM: X86: Fix vCPU preempted state from guest's point of view · 1eff0ada
      Wanpeng Li authored
      Commit 66570e96 (kvm: x86: only provide PV features if enabled in guest's
      CPUID) avoids to access pv tlb shootdown host side logic when this pv feature
      is not exposed to guest, however, kvm_steal_time.preempted not only leveraged
      by pv tlb shootdown logic but also mitigate the lock holder preemption issue.
      From guest's point of view, vCPU is always preempted since we lose the reset
      of kvm_steal_time.preempted before vmentry if pv tlb shootdown feature is not
      exposed. This patch fixes it by clearing kvm_steal_time.preempted before
      vmentry.
      
      Fixes: 66570e96 (kvm: x86: only provide PV features if enabled in guest's CPUID)
      Reviewed-by: default avatarSean Christopherson <seanjc@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Message-Id: <1621339235-11131-3-git-send-email-wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      1eff0ada
    • Wanpeng Li's avatar
      KVM: X86: Bail out of direct yield in case of under-committed scenarios · 72b268a8
      Wanpeng Li authored
      In case of under-committed scenarios, vCPUs can be scheduled easily;
      kvm_vcpu_yield_to adds extra overhead, and it is also common to see
      when vcpu->ready is true but yield later failing due to p->state is
      TASK_RUNNING.
      
      Let's bail out in such scenarios by checking the length of current cpu
      runqueue, which can be treated as a hint of under-committed instead of
      guarantee of accuracy. 30%+ of directed-yield attempts can now avoid
      the expensive lookups in kvm_sched_yield() in an under-committed scenario.
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Message-Id: <1621339235-11131-2-git-send-email-wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      72b268a8
    • Wanpeng Li's avatar
      KVM: PPC: exit halt polling on need_resched() · 6bd5b743
      Wanpeng Li authored
      This is inspired by commit 262de410 (kvm: exit halt polling on
      need_resched() as well). Due to PPC implements an arch specific halt
      polling logic, we have to the need_resched() check there as well. This
      patch adds a helper function that can be shared between book3s and generic
      halt-polling loops.
      Reviewed-by: default avatarDavid Matlack <dmatlack@google.com>
      Reviewed-by: default avatarVenkatesh Srinivas <venkateshs@chromium.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Venkatesh Srinivas <venkateshs@chromium.org>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: David Matlack <dmatlack@google.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Message-Id: <1621339235-11131-1-git-send-email-wanpengli@tencent.com>
      [Make the function inline. - Paolo]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      6bd5b743
  2. 24 May, 2021 3 commits
  3. 17 May, 2021 1 commit
  4. 15 May, 2021 7 commits
  5. 09 May, 2021 10 commits
    • Linus Torvalds's avatar
      Linux 5.13-rc1 · 6efb943b
      Linus Torvalds authored
      6efb943b
    • Linus Torvalds's avatar
      fbmem: fix horribly incorrect placement of __maybe_unused · 6dae40ae
      Linus Torvalds authored
      Commit b9d79e4c ("fbmem: Mark proc_fb_seq_ops as __maybe_unused")
      places the '__maybe_unused' in an entirely incorrect location between
      the "struct" keyword and the structure name.
      
      It's a wonder that gcc accepts that silently, but clang quite reasonably
      warns about it:
      
          drivers/video/fbdev/core/fbmem.c:736:21: warning: attribute declaration must precede definition [-Wignored-attributes]
          static const struct __maybe_unused seq_operations proc_fb_seq_ops = {
                              ^
      
      Fix it.
      
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6dae40ae
    • Linus Torvalds's avatar
      Merge tag 'drm-next-2021-05-10' of git://anongit.freedesktop.org/drm/drm · efc58a96
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Bit later than usual, I queued them all up on Friday then promptly
        forgot to write the pull request email. This is mainly amdgpu fixes,
        with some radeon/msm/fbdev and one i915 gvt fix thrown in.
      
        amdgpu:
         - MPO hang workaround
         - Fix for concurrent VM flushes on vega/navi
         - dcefclk is not adjustable on navi1x and newer
         - MST HPD debugfs fix
         - Suspend/resumes fixes
         - Register VGA clients late in case driver fails to load
         - Fix GEM leak in user framebuffer create
         - Add support for polaris12 with 32 bit memory interface
         - Fix duplicate cursor issue when using overlay
         - Fix corruption with tiled surfaces on VCN3
         - Add BO size and stride check to fix BO size verification
      
        radeon:
         - Fix off-by-one in power state parsing
         - Fix possible memory leak in power state parsing
      
        msm:
         - NULL ptr dereference fix
      
        fbdev:
         - procfs disabled warning fix
      
        i915:
         - gvt: Fix a possible division by zero in vgpu display rate
           calculation"
      
      * tag 'drm-next-2021-05-10' of git://anongit.freedesktop.org/drm/drm:
        drm/amdgpu: Use device specific BO size & stride check.
        drm/amdgpu: Init GFX10_ADDR_CONFIG for VCN v3 in DPG mode.
        drm/amd/pm: initialize variable
        drm/radeon: Avoid power table parsing memory leaks
        drm/radeon: Fix off-by-one power_state index heap overwrite
        drm/amd/display: Fix two cursor duplication when using overlay
        drm/amdgpu: add new MC firmware for Polaris12 32bit ASIC
        fbmem: Mark proc_fb_seq_ops as __maybe_unused
        drm/msm/dpu: Delete bonkers code
        drm/i915/gvt: Prevent divided by zero when calculating refresh rate
        amdgpu: fix GEM obj leak in amdgpu_display_user_framebuffer_create
        drm/amdgpu: Register VGA clients after init can no longer fail
        drm/amdgpu: Handling of amdgpu_device_resume return value for graceful teardown
        drm/amdgpu: fix r initial values
        drm/amd/display: fix wrong statement in mst hpd debugfs
        amdgpu/pm: set pp_dpm_dcefclk to readonly on NAVI10 and newer gpus
        amdgpu/pm: Prevent force of DCEFCLK on NAVI10 and SIENNA_CICHLID
        drm/amdgpu: fix concurrent VM flushes on Vega/Navi v2
        drm/amd/display: Reject non-zero src_y and src_x for video planes
      efc58a96
    • Linus Torvalds's avatar
      Merge tag 'block-5.13-2021-05-09' of git://git.kernel.dk/linux-block · 506c3079
      Linus Torvalds authored
      Pull block fix from Jens Axboe:
       "Turns out the bio max size change still has issues, so let's get it
        reverted for 5.13-rc1. We'll shake out the issues there and defer it
        to 5.14 instead"
      
      * tag 'block-5.13-2021-05-09' of git://git.kernel.dk/linux-block:
        Revert "bio: limit bio max size"
      506c3079
    • Linus Torvalds's avatar
      Merge tag '5.13-rc-smb3-part3' of git://git.samba.org/sfrench/cifs-2.6 · 0a55a1fb
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Three small SMB3 chmultichannel related changesets (also for stable)
        from the SMB3 test event this week.
      
        The other fixes are still in review/testing"
      
      * tag '5.13-rc-smb3-part3' of git://git.samba.org/sfrench/cifs-2.6:
        smb3: if max_channels set to more than one channel request multichannel
        smb3: do not attempt multichannel to server which does not support it
        smb3: when mounting with multichannel include it in requested capabilities
      0a55a1fb
    • Linus Torvalds's avatar
      Merge tag 'sched-urgent-2021-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9819f682
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
       "A set of scheduler updates:
      
         - Prevent PSI state corruption when schedule() races with cgroup
           move.
      
           A recent commit combined two PSI callbacks to reduce the number of
           cgroup tree updates, but missed that schedule() can drop rq::lock
           for load balancing, which opens the race window for
           cgroup_move_task() which then observes half updated state.
      
           The fix is to solely use task::ps_flags instead of looking at the
           potentially mismatching scheduler state
      
         - Prevent an out-of-bounds access in uclamp caused bu a rounding
           division which can lead to an off-by-one error exceeding the
           buckets array size.
      
         - Prevent unfairness caused by missing load decay when a task is
           attached to a cfs runqueue.
      
           The old load of the task was attached to the runqueue and never
           removed. Fix it by enforcing the load update through the hierarchy
           for unthrottled run queue instances.
      
         - A documentation fix fot the 'sched_verbose' command line option"
      
      * tag 'sched-urgent-2021-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Fix unfairness caused by missing load decay
        sched: Fix out-of-bound access in uclamp
        psi: Fix psi state corruption when schedule() races with cgroup move
        sched,doc: sched_debug_verbose cmdline should be sched_verbose
      9819f682
    • Linus Torvalds's avatar
      Merge tag 'locking-urgent-2021-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 732a27a0
      Linus Torvalds authored
      Pull locking fixes from Thomas Gleixner:
       "A set of locking related fixes and updates:
      
         - Two fixes for the futex syscall related to the timeout handling.
      
           FUTEX_LOCK_PI does not support the FUTEX_CLOCK_REALTIME bit and
           because it's not set the time namespace adjustment for clock
           MONOTONIC is applied wrongly.
      
           FUTEX_WAIT cannot support the FUTEX_CLOCK_REALTIME bit because its
           always a relative timeout.
      
         - Cleanups in the futex syscall entry points which became obvious
           when the two timeout handling bugs were fixed.
      
         - Cleanup of queued_write_lock_slowpath() as suggested by Linus
      
         - Fixup of the smp_call_function_single_async() prototype"
      
      * tag 'locking-urgent-2021-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        futex: Make syscall entry points less convoluted
        futex: Get rid of the val2 conditional dance
        futex: Do not apply time namespace adjustment on FUTEX_LOCK_PI
        Revert 337f1304 ("futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op")
        locking/qrwlock: Cleanup queued_write_lock_slowpath()
        smp: Fix smp_call_function_single_async prototype
      732a27a0
    • Linus Torvalds's avatar
      Merge tag 'perf_urgent_for_v5.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 85bbba1c
      Linus Torvalds authored
      Pull x86 perf fix from Borislav Petkov:
       "Handle power-gating of AMD IOMMU perf counters properly when they are
        used"
      
      * tag 'perf_urgent_for_v5.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating
      85bbba1c
    • Linus Torvalds's avatar
      Merge tag 'x86_urgent_for_v5.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dd3e4012
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
       "A bunch of things accumulated for x86 in the last two weeks:
      
         - Fix guest vtime accounting so that ticks happening while the guest
           is running can also be accounted to it. Along with a consolidation
           to the guest-specific context tracking helpers.
      
         - Provide for the host NMI handler running after a VMX VMEXIT to be
           able to run on the kernel stack correctly.
      
         - Initialize MSR_TSC_AUX when RDPID is supported and not RDTSCP (virt
           relevant - real hw supports both)
      
         - A code generation improvement to TASK_SIZE_MAX through the use of
           alternatives
      
         - The usual misc and related cleanups and improvements"
      
      * tag 'x86_urgent_for_v5.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        KVM: x86: Consolidate guest enter/exit logic to common helpers
        context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain
        context_tracking: Consolidate guest enter/exit wrappers
        sched/vtime: Move guest enter/exit vtime accounting to vtime.h
        sched/vtime: Move vtime accounting external declarations above inlines
        KVM: x86: Defer vtime accounting 'til after IRQ handling
        context_tracking: Move guest exit vtime accounting to separate helpers
        context_tracking: Move guest exit context tracking to separate helpers
        KVM/VMX: Invoke NMI non-IST entry instead of IST entry
        x86/cpu: Remove write_tsc() and write_rdtscp_aux() wrappers
        x86/cpu: Initialize MSR_TSC_AUX if RDTSCP *or* RDPID is supported
        x86/resctrl: Fix init const confusion
        x86: Delete UD0, UD1 traces
        x86/smpboot: Remove duplicate includes
        x86/cpu: Use alternative to generate the TASK_SIZE_MAX constant
      dd3e4012
    • Jens Axboe's avatar
      Revert "bio: limit bio max size" · 35c820e7
      Jens Axboe authored
      This reverts commit cd2c7545.
      
      Alex reports that the commit causes corruption with LUKS on ext4. Revert
      it for now so that this can be investigated properly.
      
      Link: https://lore.kernel.org/linux-block/1620493841.bxdq8r5haw.none@localhost/Reported-by: default avatarAlex Xu (Hello71) <alex_y_xu@yahoo.ca>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      35c820e7
  6. 08 May, 2021 3 commits
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-5.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · b7415964
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - A fix to avoid over-allocating the kernel's mapping on !MMU systems,
         which could lead to up to 2MiB of lost memory
      
       - The SiFive address extension errata only manifest on rv64, they are
         now disabled on rv32 where they are unnecessary
      
       - A pair of late-landing cleanups
      
      * tag 'riscv-for-linus-5.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: remove unused handle_exception symbol
        riscv: Consistify protect_kernel_linear_mapping_text_rodata() use
        riscv: enable SiFive errata CIP-453 and CIP-1200 Kconfig only if CONFIG_64BIT=y
        riscv: Only extend kernel reservation if mapped read-only
      b7415964
    • Linus Torvalds's avatar
      drm/i915/display: fix compiler warning about array overrun · fec4d427
      Linus Torvalds authored
      intel_dp_check_mst_status() uses a 14-byte array to read the DPRX Event
      Status Indicator data, but then passes that buffer at offset 10 off as
      an argument to drm_dp_channel_eq_ok().
      
      End result: there are only 4 bytes remaining of the buffer, yet
      drm_dp_channel_eq_ok() wants a 6-byte buffer.  gcc-11 correctly warns
      about this case:
      
        drivers/gpu/drm/i915/display/intel_dp.c: In function ‘intel_dp_check_mst_status’:
        drivers/gpu/drm/i915/display/intel_dp.c:3491:22: warning: ‘drm_dp_channel_eq_ok’ reading 6 bytes from a region of size 4 [-Wstringop-overread]
         3491 |                     !drm_dp_channel_eq_ok(&esi[10], intel_dp->lane_count)) {
              |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        drivers/gpu/drm/i915/display/intel_dp.c:3491:22: note: referencing argument 1 of type ‘const u8 *’ {aka ‘const unsigned char *’}
        In file included from drivers/gpu/drm/i915/display/intel_dp.c:38:
        include/drm/drm_dp_helper.h:1466:6: note: in a call to function ‘drm_dp_channel_eq_ok’
         1466 | bool drm_dp_channel_eq_ok(const u8 link_status[DP_LINK_STATUS_SIZE],
              |      ^~~~~~~~~~~~~~~~~~~~
             6:14 elapsed
      
      This commit just extends the original array by 2 zero-initialized bytes,
      avoiding the warning.
      
      There may be some underlying bug in here that caused this confusion, but
      this is at least no worse than the existing situation that could use
      random data off the stack.
      
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fec4d427
    • Linus Torvalds's avatar
      Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 07db0563
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "This is a set of minor fixes in various drivers (qla2xxx, ufs,
        scsi_debug, lpfc) one doc fix and a fairly large update to the fnic
        driver to remove the open coded iteration functions in favour of the
        scsi provided ones"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: fnic: Use scsi_host_busy_iter() to traverse commands
        scsi: fnic: Kill 'exclude_id' argument to fnic_cleanup_io()
        scsi: scsi_debug: Fix cmd_per_lun, set to max_queue
        scsi: ufs: core: Narrow down fast path in system suspend path
        scsi: ufs: core: Cancel rpm_dev_flush_recheck_work during system suspend
        scsi: ufs: core: Do not put UFS power into LPM if link is broken
        scsi: qla2xxx: Prevent PRLI in target mode
        scsi: qla2xxx: Add marginal path handling support
        scsi: target: tcmu: Return from tcmu_handle_completions() if cmd_id not found
        scsi: ufs: core: Fix a typo in ufs-sysfs.c
        scsi: lpfc: Fix bad memory access during VPD DUMP mailbox command
        scsi: lpfc: Fix DMA virtual address ptr assignment in bsg
        scsi: lpfc: Fix illegal memory access on Abort IOCBs
        scsi: blk-mq: Fix build warning when making htmldocs
      07db0563