    KVM: PPC: Book3S HV: Fix accounting of stolen time · c7b67670
    Paul Mackerras authored
    Currently the code that accounts stolen time tends to overestimate the
    stolen time, and will sometimes report more stolen time in a DTL
    (dispatch trace log) entry than has elapsed since the last DTL entry.
    This can cause guests to underflow the user or system time measured
    for some tasks, leading to ridiculous CPU percentages and total runtimes
    being reported by top and other utilities.
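
    The underflow can be illustrated with a small userspace sketch.  The
    helper names below are hypothetical and this is not the guest's actual
    accounting code; it only assumes that the guest derives a task's time
    for an interval by subtracting the reported stolen time from the
    elapsed time using unsigned 64-bit arithmetic.

```c
#include <stdint.h>

/*
 * Hypothetical guest-side helper: time charged to a task over one
 * interval, given the elapsed time and the stolen time reported for it.
 * If stolen > elapsed, the unsigned subtraction wraps around to a huge
 * value.
 */
uint64_t naive_task_time(uint64_t elapsed, uint64_t stolen)
{
    return elapsed - stolen;
}

/*
 * Defensive variant, shown only for contrast: never charge less than
 * zero.  This is not the fix described here, which makes the host stop
 * over-reporting stolen time in the first place.
 */
uint64_t clamped_task_time(uint64_t elapsed, uint64_t stolen)
{
    return stolen > elapsed ? 0 : elapsed - stolen;
}
```

    With elapsed = 100 and stolen = 120, the naive version wraps to a
    value just below 2^64, which is the kind of result that shows up as an
    absurd runtime in top.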
    
    In addition, the current code was designed for the previous policy where
    a vcore would only run when all the vcpus in it were runnable, and so
    only counted stolen time on a per-vcore basis.  Now that a vcore can
    run while some of the vcpus in it are doing other things in the kernel
    (e.g. handling a page fault), we also need to count as stolen the time
    when a vcpu task is preempted while it is not running as part of a vcore.
    
    To do this, we bring back the BUSY_IN_HOST vcpu state and extend the
    vcpu_load/put functions to count preemption time while the vcpu is
    in that state.  Handling the transitions between the RUNNING and
    BUSY_IN_HOST states requires checking and updating two variables
    (accumulated time stolen and time last preempted), so we add a new
    spinlock, vcpu->arch.tbacct_lock.  This protects both the per-vcpu
    stolen/preempt-time variables, and the per-vcore variables while this
    vcpu is running the vcore.
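
    The load/put accounting described above can be sketched as a userspace
    model.  This is a simplified illustration, not the code in this patch:
    the struct, the model_* function names, the TB_NIL marker and the C11
    atomic_flag standing in for the spinlock are all assumptions made for
    the example, and timestamps are passed in rather than read from the
    timebase.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TB_NIL UINT64_MAX   /* assumed marker: "not currently preempted" */

enum model_state { MODEL_NOT_READY, MODEL_BUSY_IN_HOST, MODEL_RUNNING };

struct vcpu_model {
    atomic_flag tbacct_lock;   /* stands in for vcpu->arch.tbacct_lock */
    enum model_state state;
    uint64_t busy_stolen;      /* accumulated time stolen */
    uint64_t busy_preempt;     /* time last preempted, or TB_NIL */
};

static void model_lock(struct vcpu_model *v)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&v->tbacct_lock,
                                             memory_order_acquire))
        ;
}

static void model_unlock(struct vcpu_model *v)
{
    atomic_flag_clear_explicit(&v->tbacct_lock, memory_order_release);
}

void model_vcpu_init(struct vcpu_model *v)
{
    atomic_flag_clear(&v->tbacct_lock);
    v->state = MODEL_NOT_READY;
    v->busy_stolen = 0;
    v->busy_preempt = TB_NIL;
}

/* Task is being scheduled out: remember when, if busy in the host. */
void model_vcpu_put(struct vcpu_model *v, uint64_t now)
{
    model_lock(v);
    if (v->state == MODEL_BUSY_IN_HOST)
        v->busy_preempt = now;
    model_unlock(v);
}

/* Task is being scheduled back in: charge the preempted interval. */
void model_vcpu_load(struct vcpu_model *v, uint64_t now)
{
    model_lock(v);
    if (v->state == MODEL_BUSY_IN_HOST && v->busy_preempt != TB_NIL) {
        assert(now >= v->busy_preempt);
        v->busy_stolen += now - v->busy_preempt;
        v->busy_preempt = TB_NIL;
    }
    model_unlock(v);
}
```

    Both variables are read and written under the same lock, so a reader
    never sees a preempt timestamp that has already been folded into the
    accumulated total.  In this model, preemption in any state other than
    BUSY_IN_HOST leaves the per-vcpu total unchanged.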
    
    Finally, we no longer count time spent in userspace as stolen time.
    The task could be executing in userspace on behalf of the vcpu, or
    it could be preempted, or the vcpu could be genuinely stopped.  Since
    we have no way of dividing up the time between these cases, we don't
    count any of it as stolen.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
book3s_hv.c 48.4 KB