1. 09 Oct, 2023 2 commits
  2. 06 Oct, 2023 1 commit
    • David Woodhouse's avatar
      KVM: x86: Refine calculation of guest wall clock to use a single TSC read · 5d6d6a7d
      David Woodhouse authored
      When populating the guest's PV wall clock information, KVM currently does
      a simple 'kvm_get_real_ns() - get_kvmclock_ns(kvm)'. This is an antipattern
      which should be avoided; when working with the relationship between two
      clocks, it's never correct to obtain one of them "now" and then the other
      at a slightly different "now" after an unspecified period of preemption
      (which might not even be under the control of the kernel, if this is an
      L1 hosting an L2 guest under nested virtualization).
      
      Add a kvm_get_wall_clock_epoch() function to return the guest wall clock
      epoch in nanoseconds using the same method as __get_kvmclock() — by using
      kvm_get_walltime_and_clockread() to calculate both the wall clock and KVM
      clock time from a *single* TSC reading.
      
      The condition using get_cpu_tsc_khz() is equivalent to the version in
      __get_kvmclock() which separately checks for the CONSTANT_TSC feature or
      the per-CPU cpu_tsc_khz. Which is what get_cpu_tsc_khz() does anyway.
      Signed-off-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
      Link: https://lore.kernel.org/r/bfc6d3d7cfb88c47481eabbf5a30a264c58c7789.camel@infradead.orgSigned-off-by: default avatarSean Christopherson <seanjc@google.com>
      5d6d6a7d
  3. 04 Oct, 2023 2 commits
  4. 28 Sep, 2023 1 commit
  5. 27 Sep, 2023 2 commits
  6. 23 Sep, 2023 6 commits
    • Paolo Bonzini's avatar
      Merge tag 'kvm-riscv-fixes-6.6-1' of https://github.com/kvm-riscv/linux into HEAD · 5804c19b
      Paolo Bonzini authored
      KVM/riscv fixes for 6.6, take #1
      
      - Fix KVM_GET_REG_LIST API for ISA_EXT registers
      - Fix reading ISA_EXT register of a missing extension
      - Fix ISA_EXT register handling in get-reg-list test
      - Fix filtering of AIA registers in get-reg-list test
      5804c19b
    • Tom Lendacky's avatar
      KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX · 916e3e5f
      Tom Lendacky authored
      When the TSC_AUX MSR is virtualized, the TSC_AUX value is swap type "B"
      within the VMSA. This means that the guest value is loaded on VMRUN and
      the host value is restored from the host save area on #VMEXIT.
      
      Since the value is restored on #VMEXIT, the KVM user return MSR support
      for TSC_AUX can be replaced by populating the host save area with the
      current host value of TSC_AUX. And, since TSC_AUX is not changed by Linux
      post-boot, the host save area can be set once in svm_hardware_enable().
      This eliminates the two WRMSR instructions associated with the user return
      MSR support.
      Signed-off-by: default avatarTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <d381de38eb0ab6c9c93dda8503b72b72546053d7.1694811272.git.thomas.lendacky@amd.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      916e3e5f
    • Tom Lendacky's avatar
      KVM: SVM: Fix TSC_AUX virtualization setup · e0096d01
      Tom Lendacky authored
      The checks for virtualizing TSC_AUX occur during the vCPU reset processing
      path. However, at the time of initial vCPU reset processing, when the vCPU
      is first created, not all of the guest CPUID information has been set. In
      this case the RDTSCP and RDPID feature support for the guest is not in
      place and so TSC_AUX virtualization is not established.
      
      This continues for each vCPU created for the guest. On the first boot of
      an AP, vCPU reset processing is executed as a result of an APIC INIT
      event, this time with all of the guest CPUID information set, resulting
      in TSC_AUX virtualization being enabled, but only for the APs. The BSP
      always sees a TSC_AUX value of 0 which probably went unnoticed because,
      at least for Linux, the BSP TSC_AUX value is 0.
      
      Move the TSC_AUX virtualization enablement out of the init_vmcb() path and
      into the vcpu_after_set_cpuid() path to allow for proper initialization of
      the support after the guest CPUID information has been set.
      
      With the TSC_AUX virtualization support now in the vcpu_set_after_cpuid()
      path, the intercepts must be either cleared or set based on the guest
      CPUID input.
      
      Fixes: 296d5a17 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
      Signed-off-by: default avatarTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <4137fbcb9008951ab5f0befa74a0399d2cce809a.1694811272.git.thomas.lendacky@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      e0096d01
    • Paolo Bonzini's avatar
      KVM: SVM: INTERCEPT_RDTSCP is never intercepted anyway · e8d93d5d
      Paolo Bonzini authored
      svm_recalc_instruction_intercepts() is always called at least once
      before the vCPU is started, so the setting or clearing of the RDTSCP
      intercept can be dropped from the TSC_AUX virtualization support.
      
      Extracted from a patch by Tom Lendacky.
      
      Cc: stable@vger.kernel.org
      Fixes: 296d5a17 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      e8d93d5d
    • Sean Christopherson's avatar
      KVM: x86/mmu: Stop zapping invalidated TDP MMU roots asynchronously · 0df9dab8
      Sean Christopherson authored
      Stop zapping invalidate TDP MMU roots via work queue now that KVM
      preserves TDP MMU roots until they are explicitly invalidated.  Zapping
      roots asynchronously was effectively a workaround to avoid stalling a vCPU
      for an extended during if a vCPU unloaded a root, which at the time
      happened whenever the guest toggled CR0.WP (a frequent operation for some
      guest kernels).
      
      While a clever hack, zapping roots via an unbound worker had subtle,
      unintended consequences on host scheduling, especially when zapping
      multiple roots, e.g. as part of a memslot.  Because the work of zapping a
      root is no longer bound to the task that initiated the zap, things like
      the CPU affinity and priority of the original task get lost.  Losing the
      affinity and priority can be especially problematic if unbound workqueues
      aren't affined to a small number of CPUs, as zapping multiple roots can
      cause KVM to heavily utilize the majority of CPUs in the system, *beyond*
      the CPUs KVM is already using to run vCPUs.
      
      When deleting a memslot via KVM_SET_USER_MEMORY_REGION, the async root
      zap can result in KVM occupying all logical CPUs for ~8ms, and result in
      high priority tasks not being scheduled in in a timely manner.  In v5.15,
      which doesn't preserve unloaded roots, the issues were even more noticeable
      as KVM would zap roots more frequently and could occupy all CPUs for 50ms+.
      
      Consuming all CPUs for an extended duration can lead to significant jitter
      throughout the system, e.g. on ChromeOS with virtio-gpu, deleting memslots
      is a semi-frequent operation as memslots are deleted and recreated with
      different host virtual addresses to react to host GPU drivers allocating
      and freeing GPU blobs.  On ChromeOS, the jitter manifests as audio blips
      during games due to the audio server's tasks not getting scheduled in
      promptly, despite the tasks having a high realtime priority.
      
      Deleting memslots isn't exactly a fast path and should be avoided when
      possible, and ChromeOS is working towards utilizing MAP_FIXED to avoid the
      memslot shenanigans, but KVM is squarely in the wrong.  Not to mention
      that removing the async zapping eliminates a non-trivial amount of
      complexity.
      
      Note, one of the subtle behaviors hidden behind the async zapping is that
      KVM would zap invalidated roots only once (ignoring partial zaps from
      things like mmu_notifier events).  Preserve this behavior by adding a flag
      to identify roots that are scheduled to be zapped versus roots that have
      already been zapped but not yet freed.
      
      Add a comment calling out why kvm_tdp_mmu_invalidate_all_roots() can
      encounter invalid roots, as it's not at all obvious why zapping
      invalidated roots shouldn't simply zap all invalid roots.
      Reported-by: default avatarPattara Teerapong <pteerapong@google.com>
      Cc: David Stevens <stevensd@google.com>
      Cc: Yiwei Zhang<zzyiwei@google.com>
      Cc: Paul Hsia <paulhsia@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20230916003916.2545000-4-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      0df9dab8
    • Paolo Bonzini's avatar
      KVM: x86/mmu: Do not filter address spaces in for_each_tdp_mmu_root_yield_safe() · 441a5dfc
      Paolo Bonzini authored
      All callers except the MMU notifier want to process all address spaces.
      Remove the address space ID argument of for_each_tdp_mmu_root_yield_safe()
      and switch the MMU notifier to use __for_each_tdp_mmu_root_yield_safe().
      
      Extracted out of a patch by Sean Christopherson <seanjc@google.com>
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      441a5dfc
  7. 21 Sep, 2023 5 commits
  8. 20 Sep, 2023 1 commit
    • Sean Christopherson's avatar
      KVM: selftests: Assert that vasprintf() is successful · 7c329bbd
      Sean Christopherson authored
      Assert that vasprintf() succeeds as the "returned" string is undefined
      on failure.  Checking the result also eliminates the only warning with
      default options in KVM selftests, i.e. is the only thing getting in the
      way of compile with -Werror.
      
        lib/test_util.c: In function ‘strdup_printf’:
        lib/test_util.c:390:9: error: ignoring return value of ‘vasprintf’
        declared with attribute ‘warn_unused_result’ [-Werror=unused-result]
        390 |         vasprintf(&str, fmt, ap);
            |         ^~~~~~~~~~~~~~~~~~~~~~~~
      
      Don't bother capturing the return value, allegedly vasprintf() can only
      fail due to a memory allocation failure.
      
      Fixes: dfaf20af ("KVM: arm64: selftests: Replace str_with_index with strdup_printf")
      Cc: Andrew Jones <ajones@ventanamicro.com>
      Cc: Haibo Xu <haibo1.xu@intel.com>
      Cc: Anup Patel <anup@brainfault.org>
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Reviewed-by: default avatarAndrew Jones <ajones@ventanamicro.com>
      Tested-by: default avatarAndrew Jones <ajones@ventanamicro.com>
      Message-Id: <20230914010636.1391735-1-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      7c329bbd
  9. 17 Sep, 2023 11 commits
  10. 16 Sep, 2023 9 commits
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v6.6' of... · f0b0d403
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix kernel-devel RPM and linux-headers Deb package
      
       - Fix too long argument list error in 'make modules_install'
      
      * tag 'kbuild-fixes-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: avoid long argument lists in make modules_install
        kbuild: fix kernel-devel RPM package and linux-headers Deb package
      f0b0d403
    • Linus Torvalds's avatar
      vm: fix move_vma() memory accounting being off · 3cec5049
      Linus Torvalds authored
      Commit 408579cd ("mm: Update do_vmi_align_munmap() return
      semantics") seems to have updated one of the callers of do_vmi_munmap()
      incorrectly: it used to check for the error case (which didn't
      change: negative means error).
      
      That commit changed the check to the success case (which did change:
      before that commit, 0 was success, and 1 was "success and lock
      downgraded".  After the change, it's always 0 for success, and the lock
      will have been released if requested).
      
      This didn't change any actual VM behavior _except_ for memory accounting
      when 'VM_ACCOUNT' was set on the vma.  Which made the wrong return value
      test fairly subtle, since everything continues to work.
      
      Or rather - it continues to work but the "Committed memory" accounting
      goes all wonky (Committed_AS value in /proc/meminfo), and depending on
      settings that then causes problems much much later as the VM relies on
      bogus statistics for its heuristics.
      
      Revert that one line of the change back to the original logic.
      
      Fixes: 408579cd ("mm: Update do_vmi_align_munmap() return semantics")
      Reported-by: default avatarChristoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
      Reported-bisected-and-tested-by: default avatarMichael Labiuk <michael.labiuk@virtuozzo.com>
      Cc: Bagas Sanjaya <bagasdotme@gmail.com>
      Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
      Link: https://lore.kernel.org/all/1694366957@msgid.manchmal.in-ulm.de/Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3cec5049
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · ad8a69f3
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "16 small(ish) fixes all in drivers.
      
        The major fixes are in pm8001 (fixes MSI-X issue going back to its
        origin), the qla2xxx endianness fix, which fixes a bug on big endian
        and the lpfc ones which can cause an oops on module removal without
        them"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: lpfc: Prevent use-after-free during rmmod with mapped NVMe rports
        scsi: lpfc: Early return after marking final NLP_DROPPED flag in dev_loss_tmo
        scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file()
        scsi: target: core: Fix target_cmd_counter leak
        scsi: pm8001: Setup IRQs on resume
        scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command
        scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command
        scsi: ufs: core: Poll HCS.UCRDY before issuing a UIC command
        scsi: ufs: core: Move __ufshcd_send_uic_cmd() outside host_lock
        scsi: qedf: Add synchronization between I/O completions and abort
        scsi: target: Replace strlcpy() with strscpy()
        scsi: qla2xxx: Fix NULL vs IS_ERR() bug for debugfs_create_dir()
        scsi: qla2xxx: Use raw_smp_processor_id() instead of smp_processor_id()
        scsi: qla2xxx: Correct endianness for rqstlen and rsplen
        scsi: ppa: Fix accidentally reversed conditions for 16-bit and 32-bit EPP
        scsi: megaraid_sas: Fix deadlock on firmware crashdump
      ad8a69f3
    • Linus Torvalds's avatar
      Merge tag 'ata-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata · cc3e5afc
      Linus Torvalds authored
      Pull ata fixes from Damien Le Moal:
      
       - Fix link power management transitions to disallow unsupported states
         (Niklas)
      
       - A small string handling fix for the sata_mv driver (Christophe)
      
       - Clear port pending interrupts before reset, as per AHCI
         specifications (Szuying).
      
         Followup fixes for this one are to not clear ATA_PFLAG_EH_PENDING in
         ata_eh_reset() to allow EH to continue on with other actions recorded
         with error interrupts triggered before EH completes. And an
         additional fix to avoid thawing a port twice in EH (Niklas)
      
       - Small code style fixes in the pata_parport driver to silence the
         build bot as it keeps complaining about bad indentation (me)
      
       - A fix for the recent CDL code to avoid fetching sense data for
         successful commands when not necessary for correct operation (Niklas)
      
      * tag 'ata-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
        ata: libata-core: fetch sense data for successful commands iff CDL enabled
        ata: libata-eh: do not thaw the port twice in ata_eh_reset()
        ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()
        ata: pata_parport: Fix code style issues
        ata: libahci: clear pending interrupt status
        ata: sata_mv: Fix incorrect string length computation in mv_dump_mem()
        ata: libata: disallow dev-initiated LPM transitions to unsupported states
      cc3e5afc
    • Linus Torvalds's avatar
      Merge tag 'usb-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · cce67b6b
      Linus Torvalds authored
      Pull USB fix from Greg KH:
       "Here is a single USB fix for a much-reported regression for 6.6-rc1.
      
        It resolves a crash in the typec debugfs code for many systems. It's
        been in linux-next with no reported issues, and many people have
        reported it resolving their problem with 6.6-rc1"
      
      * tag 'usb-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: typec: ucsi: Fix NULL pointer dereference
      cce67b6b
    • Linus Torvalds's avatar
      Merge tag 'driver-core-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · 205d0494
      Linus Torvalds authored
      Pull driver core fixes from Greg KH:
       "Here is a single driver core fix for a much-reported-by-sysbot issue
        that showed up in 6.6-rc1. It's been submitted by many people, all in
        the same way, so it obviously fixes things for them all.
      
        Also in here is a single documentation update adding riscv to the
        embargoed hardware document in case there are any future issues with
        that processor family.
      
        Both of these have been in linux-next with no reported problems"
      
      * tag 'driver-core-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        Documentation: embargoed-hardware-issues.rst: Add myself for RISC-V
        driver core: return an error when dev_set_name() hasn't happened
      205d0494
    • Linus Torvalds's avatar
      Merge tag 'char-misc-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · fd455e77
      Linus Torvalds authored
      Pull char/misc fix from Greg KH:
       "Here is a single patch for 6.6-rc2 that reverts a 6.5 change for the
        comedi subsystem that has ended up being incorrect and caused drivers
        that were working for people to be unable to be able to be selected to
        build at all.
      
        To fix this, the Kconfig change needs to be reverted and a future set
        of fixes for the ioport dependancies will show up in 6.7-rc1 (there's
        no rush for them.)
      
        This has been in linux-next with no reported issues"
      
      * tag 'char-misc-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        Revert "comedi: add HAS_IOPORT dependencies"
      fd455e77
    • Linus Torvalds's avatar
      Merge tag 'i2c-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · c37f8efc
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "The main thing is the removal of 'probe_new' because all i2c client
        drivers are converted now. Thanks Uwe, this marks the end of a long
        conversion process.
      
        Other than that, we have a few Kconfig updates and driver bugfixes"
      
      * tag 'i2c-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: cadence: Fix the kernel-doc warnings
        i2c: aspeed: Reset the i2c controller when timeout occurs
        i2c: I2C_MLXCPLD on ARM64 should depend on ACPI
        i2c: Make I2C_ATR invisible
        i2c: Drop legacy callback .probe_new()
        w1: ds2482: Switch back to use struct i2c_driver's .probe()
      c37f8efc
    • Niklas Cassel's avatar
      ata: libata-core: fetch sense data for successful commands iff CDL enabled · 5e35a9ac
      Niklas Cassel authored
      Currently, we fetch sense data for a _successful_ command if either:
      1) Command was NCQ and ATA_DFLAG_CDL_ENABLED flag set (flag
         ATA_DFLAG_CDL_ENABLED will only be set if the Successful NCQ command
         sense data supported bit is set); or
      2) Command was non-NCQ and regular sense data reporting is enabled.
      
      This means that case 2) will trigger for a non-NCQ command which has
      ATA_SENSE bit set, regardless if CDL is enabled or not.
      
      This decision was by design. If the device reports that it has sense data
      available, it makes sense to fetch that sense data, since the sk/asc/ascq
      could be important information regardless if CDL is enabled or not.
      
      However, the fetching of sense data for a successful command is done via
      ATA EH. Considering how intricate the ATA EH is, we really do not want to
      invoke ATA EH unless absolutely needed.
      
      Before commit 18bd7718 ("scsi: ata: libata: Handle completion of CDL
      commands using policy 0xD") we never fetched sense data for successful
      commands.
      
      In order to not invoke the ATA EH unless absolutely necessary, even if the
      device claims support for sense data reporting, only fetch sense data for
      successful (NCQ and non-NCQ commands) commands that are using CDL.
      
      [Damien] Modified the check to test the qc flag ATA_QCFLAG_HAS_CDL
      instead of the device support for CDL, which is implied for commands
      using CDL.
      
      Fixes: 3ac873c7 ("ata: libata-core: fix when to fetch sense data for successful commands")
      Signed-off-by: default avatarNiklas Cassel <niklas.cassel@wdc.com>
      Signed-off-by: default avatarDamien Le Moal <dlemoal@kernel.org>
      5e35a9ac