1. 11 Jul, 2023 3 commits
    • KVM: arm64: Disable preemption in kvm_arch_hardware_enable() · 970dee09
      Marc Zyngier authored
      Since 0bf50497 ("KVM: Drop kvm_count_lock and instead protect
      kvm_usage_count with kvm_lock"), hotplugging back a CPU whilst
      a guest is running results in a number of ugly splats as most
      of this code expects to run with preemption disabled, which isn't
      the case anymore.
      
      While the context is preemptible, it isn't migratable, which should
      be enough. But we have plenty of preemptible() checks all over
      the place, and our per-CPU accessors also disable preemption.
      
      Since this affects released versions, let's do the easy fix first,
      disabling preemption in kvm_arch_hardware_enable(). We can always
      revisit this with a more invasive fix in the future.
      
      Fixes: 0bf50497 ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
      Reported-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Tested-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/aeab7562-2d39-e78e-93b1-4711f8cc3fa5@arm.com
      Cc: stable@vger.kernel.org # v6.3, v6.4
      Link: https://lore.kernel.org/r/20230703163548.1498943-1-maz@kernel.org
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
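The approach described in this commit message can be illustrated with a small userspace model. This is only a sketch, not the kernel's code: `preempt_count`, `splats`, `cpu_hyp_init_model()` and the two `hardware_enable_*()` functions are hypothetical stand-ins, and the real `preempt_disable()`/`preempt_enable()` do considerably more than adjust a counter.

```c
/* Userspace model only; nothing here is the kernel's implementation. */
#include <assert.h>
#include <stdbool.h>

static int preempt_count;   /* models the preemption-disable nesting depth */
static int splats;          /* counts warnings the kernel would have printed */

static void preempt_disable(void) { preempt_count++; }

static void preempt_enable(void)
{
    assert(preempt_count > 0);  /* unbalanced enable would be a bug */
    preempt_count--;
}

static bool preemptible(void) { return preempt_count == 0; }

/* Hypothetical stand-in for code that expects preemption to be off. */
static void cpu_hyp_init_model(void)
{
    if (preemptible())
        splats++;   /* the kernel would emit a warning here */
}

/* Before the fix: the callee runs in a preemptible context. */
static int hardware_enable_unfixed(void)
{
    cpu_hyp_init_model();
    return 0;
}

/* After the fix: the body is bracketed by disable/enable. */
static int hardware_enable_fixed(void)
{
    preempt_disable();
    cpu_hyp_init_model();
    preempt_enable();
    return 0;
}
```

In this model the unfixed path records one warning per call, while the fixed path records none and leaves the counter balanced on return.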
    • KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm · fa729bc7
      Sudeep Holla authored
      Currently there is no synchronisation between the finalize_pkvm() and
      kvm_arm_init() initcalls. finalize_pkvm() proceeds happily even if
      kvm_arm_init() fails, resulting in the following warning on all the CPUs
      and eventually a HYP panic:
      
        | kvm [1]: IPA Size Limit: 48 bits
        | kvm [1]: Failed to init hyp memory protection
        | kvm [1]: error initializing Hyp mode: -22
        |
        | <snip>
        |
        | WARNING: CPU: 0 PID: 0 at arch/arm64/kvm/pkvm.c:226 _kvm_host_prot_finalize+0x30/0x50
        | Modules linked in:
        | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0 #237
        | Hardware name: FVP Base RevC (DT)
        | pstate: 634020c5 (nZCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
        | pc : _kvm_host_prot_finalize+0x30/0x50
        | lr : __flush_smp_call_function_queue+0xd8/0x230
        |
        | Call trace:
        |  _kvm_host_prot_finalize+0x3c/0x50
        |  on_each_cpu_cond_mask+0x3c/0x6c
        |  pkvm_drop_host_privileges+0x4c/0x78
        |  finalize_pkvm+0x3c/0x5c
        |  do_one_initcall+0xcc/0x240
        |  do_initcall_level+0x8c/0xac
        |  do_initcalls+0x54/0x94
        |  do_basic_setup+0x1c/0x28
        |  kernel_init_freeable+0x100/0x16c
        |  kernel_init+0x20/0x1a0
        |  ret_from_fork+0x10/0x20
        | Failed to finalize Hyp protection: -22
        |     dtb=fvp-base-revc.dtb
        | kvm [95]: nVHE hyp BUG at: arch/arm64/kvm/hyp/nvhe/mem_protect.c:540!
        | kvm [95]: nVHE call trace:
        | kvm [95]:  [<ffff800081052984>] __kvm_nvhe_hyp_panic+0xac/0xf8
        | kvm [95]:  [<ffff800081059644>] __kvm_nvhe_handle_host_mem_abort+0x1a0/0x2ac
        | kvm [95]:  [<ffff80008105511c>] __kvm_nvhe_handle_trap+0x4c/0x160
        | kvm [95]:  [<ffff8000810540fc>] __kvm_nvhe___skip_pauth_save+0x4/0x4
        | kvm [95]: ---[ end nVHE call trace ]---
        | kvm [95]: Hyp Offset: 0xfffe8db00ffa0000
        | Kernel panic - not syncing: HYP panic:
        | PS:a34023c9 PC:0000f250710b973c ESR:00000000f2000800
        | FAR:ffff000800cb00d0 HPFAR:000000000880cb00 PAR:0000000000000000
        | VCPU:0000000000000000
        | CPU: 3 PID: 95 Comm: kworker/u16:2 Tainted: G        W          6.4.0 #237
        | Hardware name: FVP Base RevC (DT)
        | Workqueue: rpciod rpc_async_schedule
        | Call trace:
        |  dump_backtrace+0xec/0x108
        |  show_stack+0x18/0x2c
        |  dump_stack_lvl+0x50/0x68
        |  dump_stack+0x18/0x24
        |  panic+0x138/0x33c
        |  nvhe_hyp_panic_handler+0x100/0x184
        |  new_slab+0x23c/0x54c
        |  ___slab_alloc+0x3e4/0x770
        |  kmem_cache_alloc_node+0x1f0/0x278
        |  __alloc_skb+0xdc/0x294
        |  tcp_stream_alloc_skb+0x2c/0xf0
        |  tcp_sendmsg_locked+0x3d0/0xda4
        |  tcp_sendmsg+0x38/0x5c
        |  inet_sendmsg+0x44/0x60
        |  sock_sendmsg+0x1c/0x34
        |  xprt_sock_sendmsg+0xdc/0x274
        |  xs_tcp_send_request+0x1ac/0x28c
        |  xprt_transmit+0xcc/0x300
        |  call_transmit+0x78/0x90
        |  __rpc_execute+0x114/0x3d8
        |  rpc_async_schedule+0x28/0x48
        |  process_one_work+0x1d8/0x314
        |  worker_thread+0x248/0x474
        |  kthread+0xfc/0x184
        |  ret_from_fork+0x10/0x20
        | SMP: stopping secondary CPUs
        | Kernel Offset: 0x57c5cb460000 from 0xffff800080000000
        | PHYS_OFFSET: 0x80000000
        | CPU features: 0x00000000,1035b7a3,ccfe773f
        | Memory Limit: none
        | ---[ end Kernel panic - not syncing: HYP panic:
        | PS:a34023c9 PC:0000f250710b973c ESR:00000000f2000800
        | FAR:ffff000800cb00d0 HPFAR:000000000880cb00 PAR:0000000000000000
        | VCPU:0000000000000000 ]---
      
      Fix it by checking for the successful initialisation of kvm_arm_init()
      in finalize_pkvm() before proceeding any further.
      
      Fixes: 87727ba2 ("KVM: arm64: Ensure CPU PMU probes before pKVM host de-privilege")
      Cc: Will Deacon <will@kernel.org>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: James Morse <james.morse@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Zenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20230704193243.3300506-1-sudeep.holla@arm.com
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
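The ordering problem and the guard described in this commit message can be sketched as a userspace model. The names `init_done`, `finalize_calls`, `kvm_arm_init_model()` and `finalize_pkvm_model()` are hypothetical and chosen for illustration; the actual patch may track initialisation state differently. The `-EINVAL` return mirrors the `-22` seen in the log above.

```c
/* Userspace model only; names are illustrative, not the kernel's. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static bool init_done;       /* set only when the first initcall succeeds */
static int finalize_calls;   /* how often host de-privileging was attempted */

/* Models the earlier initcall; 'fail' forces the error path. */
static int kvm_arm_init_model(bool fail)
{
    if (fail)
        return -EINVAL;      /* matches the -22 in the log */
    init_done = true;
    return 0;
}

/* Models the later initcall with the guard in place. */
static int finalize_pkvm_model(void)
{
    if (!init_done)
        return 0;            /* nothing was set up, so nothing to finalize */
    finalize_calls++;        /* stands in for dropping host privileges */
    return 0;
}
```

With the guard, a failed first initcall turns the second one into a no-op instead of letting it act on state that was never initialised.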
    • KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits · fe769e6c
      Marc Zyngier authored
      It recently appeared that, when running VHE, there is a notable
      difference between using CNTKCTL_EL1 and CNTHCTL_EL2, despite what
      the architecture documents:
      
      - When accessed from EL2, bits [19:18] and [16:10] of CNTKCTL_EL1 have
        the same assignment as CNTHCTL_EL2
      - When accessed from EL1, bits [19:18] and [16:10] are RES0
      
      It is all OK, until you factor in NV, where the EL2 guest runs at EL1.
      In this configuration, CNTKCTL_EL1 doesn't trap, nor ends up in
      the VNCR page. This means that any write from the guest affecting
      CNTHCTL_EL2 using CNTKCTL_EL1 ends up losing some state. Not good.
      
      The fix is obvious: don't use CNTKCTL_EL1 if you want to change bits
      that are not part of the EL1 definition of CNTKCTL_EL1, and use
      CNTHCTL_EL2 instead. This doesn't change anything for a bare-metal OS,
      and fixes it when running under NV. The NV hypervisor will itself
      have to work harder to merge the two accessors.
      
      Note that there is a pending update to the architecture to address
      this issue by making the affected bits UNKNOWN when CNTKCTL_EL1 is
      used from EL2 with VHE enabled.
      
      Fixes: c605ee24 ("KVM: arm64: timers: Allow physical offset without CNTPOFF_EL2")
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Cc: stable@vger.kernel.org # v6.4
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Link: https://lore.kernel.org/r/20230627140557.544885-1-maz@kernel.org
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
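The rule stated in this commit message (use CNTHCTL_EL2 whenever the bits being changed fall in [19:18] or [16:10]) can be expressed as a mask test. The sketch below is a userspace illustration under that reading; the `BITS()` macro, `EL2_ONLY_BITS`, the enum and `pick_reg()` are hypothetical names, not kernel definitions, and no system register is actually accessed.

```c
/* Userspace illustration only; no system registers are touched. */
#include <assert.h>
#include <stdint.h>

/* Contiguous 64-bit mask covering bits hi..lo inclusive. */
#define BITS(hi, lo) \
    ((~UINT64_C(0) >> (63 - (hi))) & (~UINT64_C(0) << (lo)))

/* Bits the message lists as RES0 when CNTKCTL_EL1 is accessed from EL1. */
#define EL2_ONLY_BITS (BITS(19, 18) | BITS(16, 10))

enum timer_ctl_reg { USE_CNTKCTL_EL1, USE_CNTHCTL_EL2 };

/* Pick the accessor according to which bits the caller wants to change. */
static enum timer_ctl_reg pick_reg(uint64_t bits_to_change)
{
    if (bits_to_change & EL2_ONLY_BITS)
        return USE_CNTHCTL_EL2;
    return USE_CNTKCTL_EL1;
}
```

Any request that touches even one bit in those two ranges is routed to CNTHCTL_EL2; requests confined to the remaining bits keep using CNTKCTL_EL1.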
  2. 09 Jul, 2023 10 commits
  3. 08 Jul, 2023 27 commits