    KVM: x86: Forcibly leave nested virt when SMM state is toggled · f7e57078
    Sean Christopherson authored
    Forcibly leave nested virtualization operation if userspace toggles SMM
    state via KVM_SET_VCPU_EVENTS or KVM_SYNC_X86_EVENTS.  If userspace
    forces the vCPU out of SMM while it's post-VMXON and then injects an SMI,
    vmx_enter_smm() will overwrite vmx->nested.smm.vmxon and end up with both
    vmxon=false and smm.vmxon=false, but all other nVMX state allocated.
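
The sequence described above can be replayed with a small toy model. This is only a sketch: `struct toy_vcpu` and every `toy_*` function are hypothetical names invented for illustration, not KVM's real data structures or code paths. The model tracks the two flags named in the text plus one boolean standing in for "other nVMX state allocated".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the relevant vCPU state; NOT the kernel's real layout. */
struct toy_vcpu {
	bool in_smm;           /* vCPU is currently in SMM */
	bool vmxon;            /* models vmx->nested.vmxon */
	bool smm_vmxon;        /* models vmx->nested.smm.vmxon */
	bool nested_allocated; /* models the remaining nVMX allocations */
};

/* Models the guest executing VMXON: nested state gets allocated. */
static void toy_vmxon(struct toy_vcpu *v)
{
	assert(v != NULL);
	v->vmxon = true;
	v->nested_allocated = true;
}

/* Models SMI delivery: stash vmxon in smm_vmxon, then clear vmxon. */
static void toy_enter_smm(struct toy_vcpu *v)
{
	assert(v != NULL);
	v->smm_vmxon = v->vmxon;
	v->vmxon = false;
	v->in_smm = true;
}

/* Models forcibly leaving nested operation: drop all nested state. */
static void toy_leave_nested(struct toy_vcpu *v)
{
	v->vmxon = false;
	v->smm_vmxon = false;
	v->nested_allocated = false;
}

/*
 * Models userspace setting the SMM flag directly.  With force_leave set,
 * nested operation is torn down whenever the SMM state actually changes.
 */
static void toy_set_smm(struct toy_vcpu *v, bool smm, bool force_leave)
{
	if (v->in_smm == smm)
		return;
	if (force_leave)
		toy_leave_nested(v);
	v->in_smm = smm;
}

/* True if nested state is allocated but neither flag records VMXON. */
static bool toy_is_inconsistent(const struct toy_vcpu *v)
{
	return v->nested_allocated && !v->vmxon && !v->smm_vmxon;
}

/* Replays: VMXON, SMI, userspace forces the vCPU out of SMM, second SMI. */
static bool toy_replay(bool force_leave)
{
	struct toy_vcpu v = {0};

	toy_vmxon(&v);
	toy_enter_smm(&v);                   /* vmxon=false, smm_vmxon=true */
	toy_set_smm(&v, false, force_leave); /* userspace toggles SMM off */
	toy_enter_smm(&v);                   /* overwrites smm_vmxon */
	return toy_is_inconsistent(&v);
}
```

In this model, `toy_replay(false)` ends with both flags clear while the nested state is still allocated, and `toy_replay(true)` does not, because the nested state is dropped at the moment the SMM flag is toggled.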
    
    Don't attempt to gracefully handle the transition as (a) most transitions
    are nonsensical, e.g. forcing SMM while L2 is running, (b) there isn't
    sufficient information to handle all transitions, e.g. SVM wants access
    to the SMRAM save state, and (c) KVM_SET_VCPU_EVENTS must precede
    KVM_SET_NESTED_STATE during state restore as the latter disallows putting
    the vCPU into L2 if SMM is active, and disallows tagging the vCPU as
    being post-VMXON in SMM if SMM is not active.
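
The ordering constraint in (c) follows from two consistency checks on the nested state, which can be sketched as below. The flag names and `toy_check_nested_state()` are hypothetical and exist only for this illustration; the kernel's real flags and validation logic are more involved.

```c
#include <stdbool.h>

/* Hypothetical flags for this sketch; not the kernel's real definitions. */
enum {
	TOY_NESTED_GUEST_MODE = 1 << 0, /* restore the vCPU into L2 */
	TOY_NESTED_SMM_VMXON  = 1 << 1, /* vCPU was post-VMXON when it entered SMM */
};

/*
 * Returns 0 if the requested nested state is compatible with the vCPU's
 * current SMM state, -1 otherwise.  Because both checks depend on whether
 * SMM is active, the SMM state has to be set first.
 */
static int toy_check_nested_state(bool vcpu_in_smm, unsigned int flags)
{
	/* L2 cannot be entered while SMM is active. */
	if (vcpu_in_smm && (flags & TOY_NESTED_GUEST_MODE))
		return -1;

	/* "post-VMXON in SMM" is meaningless if SMM is not active. */
	if (!vcpu_in_smm && (flags & TOY_NESTED_SMM_VMXON))
		return -1;

	return 0;
}
```

If the nested state were restored before the SMM state, a vCPU that was saved while in SMM and post-VMXON would fail the second check, since SMM would not yet be marked active.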
    
    Abuse of KVM_SET_VCPU_EVENTS manifests as a WARN and memory leak in nVMX
    due to failure to free vmcs01's shadow VMCS, but the bug goes far beyond
    just a memory leak, e.g. toggling SMM on while L2 is active puts the vCPU
    in an architecturally impossible state.
    
      WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline]
      WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656
      Modules linked in:
      CPU: 1 PID: 3606 Comm: syz-executor725 Not tainted 5.17.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline]
      RIP: 0010:free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656
      Code: <0f> 0b eb b3 e8 8f 4d 9f 00 e9 f7 fe ff ff 48 89 df e8 92 4d 9f 00
      Call Trace:
       <TASK>
       kvm_arch_vcpu_destroy+0x72/0x2f0 arch/x86/kvm/x86.c:11123
       kvm_vcpu_destroy arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 [inline]
       kvm_destroy_vcpus+0x11f/0x290 arch/x86/kvm/../../../virt/kvm/kvm_main.c:460
       kvm_free_vcpus arch/x86/kvm/x86.c:11564 [inline]
       kvm_arch_destroy_vm+0x2e8/0x470 arch/x86/kvm/x86.c:11676
       kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1217 [inline]
       kvm_put_kvm+0x4fa/0xb00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1250
       kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1273
       __fput+0x286/0x9f0 fs/file_table.c:311
       task_work_run+0xdd/0x1a0 kernel/task_work.c:164
       exit_task_work include/linux/task_work.h:32 [inline]
       do_exit+0xb29/0x2a30 kernel/exit.c:806
       do_group_exit+0xd2/0x2f0 kernel/exit.c:935
       get_signal+0x4b0/0x28c0 kernel/signal.c:2862
       arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
       handle_signal_work kernel/entry/common.c:148 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
       exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
       __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
       syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
       do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x44/0xae
       </TASK>
    
    Cc: stable@vger.kernel.org
    Reported-by: syzbot+8112db3ab20e70d50c31@syzkaller.appspotmail.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20220125220358.2091737-1-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>