    KVM: Unconditionally get a ref to /dev/kvm module when creating a VM · 405294f2
    Sean Christopherson authored
    Unconditionally get a reference to the /dev/kvm module when creating a VM
    instead of using try_module_get(), which will fail if the module is in
    the process of being forcefully unloaded.  The error handling when
    try_module_get() fails doesn't properly unwind all that has been done,
    e.g. doesn't call kvm_arch_pre_destroy_vm() and doesn't remove the VM
    from the global list.  Not removing VMs from the global list tends to be
    fatal, e.g. leads to use-after-free explosions.
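
    The failure mode described above can be illustrated with a small
    userspace model.  This is only a sketch: struct toy_vm, vm_list and
    every toy_* helper are hypothetical stand-ins, not KVM's real data
    structures, and "freeing" is modeled by clearing a flag so that the
    example never actually touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VM object; not the real struct kvm. */
struct toy_vm {
    struct toy_vm *next;  /* link in the global VM list */
    bool alive;           /* false models "this memory has been freed" */
};

static struct toy_vm *vm_list;  /* models the global list of VMs */

static void toy_vm_list_add(struct toy_vm *vm)
{
    vm->next = vm_list;
    vm_list = vm;
}

static void toy_vm_list_del(struct toy_vm *vm)
{
    struct toy_vm **pp = &vm_list;

    while (*pp && *pp != vm)
        pp = &(*pp)->next;
    if (*pp)
        *pp = vm->next;
}

/*
 * Models VM creation where a late step may fail after the VM is already
 * on the global list.  With full_unwind == false the error path "frees"
 * the VM but leaves it on the list, which is the bug being described.
 */
static int toy_create_vm(struct toy_vm *vm, bool late_step_fails,
                         bool full_unwind)
{
    assert(vm != NULL);

    vm->alive = true;
    toy_vm_list_add(vm);

    if (!late_step_fails)
        return 0;

    if (full_unwind)
        toy_vm_list_del(vm);
    vm->alive = false;  /* stands in for freeing the VM */
    return -1;
}

/* Counts list entries that point at "freed" VMs, i.e. future UAF sites. */
static size_t toy_count_stale_entries(void)
{
    size_t n = 0;

    for (struct toy_vm *vm = vm_list; vm; vm = vm->next)
        if (!vm->alive)
            n++;
    return n;
}
```

    In this model, a failed creation with full_unwind == false leaves one
    stale entry on vm_list, and any later walk of the list would touch it.
    Making the late step infallible removes the error path altogether.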
    
    The obvious alternative would be to add proper unwinding, but the
    justification for using try_module_get(), "rmmod --wait", is completely
    bogus as support for "rmmod --wait", i.e. delete_module() without
    O_NONBLOCK, was removed by commit 3f2b9c9c ("module: remove rmmod
    --wait option.") nearly a decade ago.
    
    It's still possible for try_module_get() to fail due to the module dying
    (more like being killed), as the module will be tagged MODULE_STATE_GOING
    by "rmmod --force", i.e. delete_module(..., O_TRUNC), but playing nice
    with forced unloading is an exercise in futility and gives a false sense
    of security.  Using try_module_get() only prevents acquiring _new_
    references.  It doesn't magically put the references held by other VMs,
    and forced unloading doesn't wait, i.e. "rmmod --force" on KVM is all but
    guaranteed to cause spectacular fireworks; the window where KVM will fail
    try_module_get() is tiny compared to the window where KVM is building and
    running the VM with an elevated module refcount.
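
    A rough userspace model of the refcounting argument, assuming a
    deliberately simplified picture of the kernel's try_module_get() and
    __module_get() helpers.  struct toy_module and all toy_* names are
    hypothetical, and the real helpers use atomics and more state than
    shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the module states relevant to this argument. */
enum toy_module_state {
    TOY_MODULE_LIVE,
    TOY_MODULE_GOING,
};

struct toy_module {
    enum toy_module_state state;
    int refcnt;
};

/* Models try_module_get(): refuses new references once the module is going. */
static bool toy_try_module_get(struct toy_module *m)
{
    if (m->state == TOY_MODULE_GOING)
        return false;
    m->refcnt++;
    return true;
}

/* Models __module_get(): takes the reference unconditionally. */
static void toy_module_get(struct toy_module *m)
{
    m->refcnt++;
}

static void toy_module_put(struct toy_module *m)
{
    assert(m->refcnt > 0);
    m->refcnt--;
}

/*
 * Models a forced unload: the module is marked as going away, but nothing
 * waits for, or drops, the references that existing users already hold.
 */
static void toy_force_unload(struct toy_module *m)
{
    m->state = TOY_MODULE_GOING;
}
```

    In the model, a reference taken before toy_force_unload() is still
    held afterwards; only the next toy_try_module_get() call fails.  That
    is why refusing new references does nothing for VMs that already exist.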
    
    Addressing KVM's inability to play nice with "rmmod --force" is firmly
    out-of-scope.  Forcefully unloading any module taints the kernel (for
    obvious reasons) _and_ requires the kernel to be built with
    CONFIG_MODULE_FORCE_UNLOAD=y, which is off by default and comes with the
    amusing disclaimer that it's "mainly for kernel developers and desperate
    users".  In other words, KVM is free to scoff at bug reports due to using
    "rmmod --force" while VMs may be running.
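
    As an aside for triaging such reports, a forced unload sets the
    TAINT_FORCED_RMMOD flag, which is bit 3 of the kernel's taint mask.
    The helper below is a hypothetical sketch of checking that bit.  It
    assumes the caller has already obtained the mask as an integer, e.g.
    from /proc/sys/kernel/tainted; the function and macro names are made
    up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit index of TAINT_FORCED_RMMOD in the kernel's taint mask. */
#define TOY_TAINT_FORCED_RMMOD_BIT 3

/* Returns true if the given taint mask records a forced module unload. */
static bool toy_tainted_by_forced_rmmod(unsigned long taint_mask)
{
    return (taint_mask >> TOY_TAINT_FORCED_RMMOD_BIT) & 1UL;
}

/* A few sanity checks on the bit arithmetic. */
static void toy_taint_selfcheck(void)
{
    assert(toy_tainted_by_forced_rmmod(1UL << 3));
    assert(!toy_tainted_by_forced_rmmod(0UL));
}
```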
    
    Fixes: 5f6de5cb ("KVM: Prevent module exit until all VMs are freed")
    Cc: stable@vger.kernel.org
    Cc: David Matlack <dmatlack@google.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20220816053937.2477106-3-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_main.c 150 KB