    x86/resctrl: Queue mon_event_read() instead of sending an IPI · 09909e09
    James Morse authored
    Intel is blessed with an abundance of monitors, one per RMID, that can be
    read from any CPU in the domain. MPAM's monitors reside in the MMIO MSC,
    and the number implemented is up to the manufacturer. This means that when
    there are fewer monitors than needed, they need to be allocated and freed.
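
The allocate-and-free requirement above can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct mon_pool`, `mon_alloc()` and `mon_free()` are hypothetical names invented for illustration, not the kernel's or MPAM driver's interfaces, and the pool is capped at 32 monitors to keep the bitmap trivial.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a small pool of hardware monitors (at most 32). */
struct mon_pool {
	unsigned int num_mons;	/* how many monitors the hardware implements */
	uint32_t in_use;	/* bit i set => monitor i is allocated */
};

/* Claim the lowest free monitor; return its index, or -1 if none is free. */
static int mon_alloc(struct mon_pool *pool)
{
	assert(pool->num_mons <= 32);

	for (unsigned int i = 0; i < pool->num_mons; i++) {
		uint32_t bit = UINT32_C(1) << i;

		if (!(pool->in_use & bit)) {
			pool->in_use |= bit;
			return (int)i;
		}
	}
	return -1;
}

/* Return a previously allocated monitor to the pool. */
static void mon_free(struct mon_pool *pool, int idx)
{
	assert(idx >= 0 && (unsigned int)idx < pool->num_mons);
	pool->in_use &= ~(UINT32_C(1) << idx);
}
```

With two monitors in the pool, a third allocation fails until one of the first two is freed, which is the situation that forces monitors to be shared between groups.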
    
    MPAM's CSU monitors are used to back the 'llc_occupancy' monitor file. The
    CSU counter is allowed to return 'not ready' for a small number of
    microseconds after programming. To allow one CSU hardware monitor to be
    used for multiple control or monitor groups, the CPU accessing the
    monitor needs to be able to block when configuring and reading the
    counter.
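
As a rough illustration of why the reader has to be able to block, the following userspace sketch models a counter that reports 'not ready' for its first few reads after programming. Everything here is an assumption made for the example: `struct csu_model`, `csu_try_read()`, `csu_read_blocking()` and the `wait_fn` callback are hypothetical and do not correspond to real MPAM or resctrl functions. In a kernel context the wait step would be a sleep, which is exactly what IPI context does not permit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a CSU monitor that needs time to settle. */
struct csu_model {
	unsigned int not_ready_reads;	/* reads that will still fail */
	uint64_t value;			/* value reported once ready */
};

/* One non-blocking attempt: false means the counter said 'not ready'. */
static bool csu_try_read(struct csu_model *mon, uint64_t *out)
{
	if (mon->not_ready_reads > 0) {
		mon->not_ready_reads--;
		return false;
	}
	*out = mon->value;
	return true;
}

/*
 * Retry up to max_tries times, calling wait_fn (if given) between attempts.
 * Returns 0 on success and -1 if the counter never became ready.
 */
static int csu_read_blocking(struct csu_model *mon, unsigned int max_tries,
			     void (*wait_fn)(void), uint64_t *out)
{
	assert(mon != NULL && out != NULL);

	for (unsigned int i = 0; i < max_tries; i++) {
		if (csu_try_read(mon, out))
			return 0;
		if (wait_fn)
			wait_fn();	/* stands in for sleeping a few microseconds */
	}
	return -1;
}
```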
    
    Worse, the domain may be broken up into slices, and the MMIO accesses
    for each slice may need performing from different CPUs.
    
    These two details mean MPAM's monitor code needs to be able to sleep, and
    to IPI another CPU in the domain to read from a resource that has been
    sliced.
    
    mon_event_read() already invokes mon_event_count() via IPI, which means
    neither sleeping nor sending a further IPI is possible there. On systems
    using nohz_full, some CPUs need to be interrupted to run kernel work as
    they otherwise stay in user-space running realtime workloads. Interrupting
    these CPUs should be avoided, and work scheduled on them may never
    complete.
    
    Change mon_event_read() to pick a housekeeping CPU (one that is not using
    nohz_full), schedule mon_event_count() on it, and wait. If all the CPUs
    in a domain are using nohz_full, then an IPI is used as the fallback.
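
The selection policy described above can be sketched as a userspace model. This is not the patch's code: CPU masks are reduced to a 64-bit word, and `cpumask_model_t`, `pick_housekeeping_cpu()`, `plan_counter_read()` and the `read_path` enum are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_model_t;

#define NO_CPU (-1)

enum read_path {
	READ_VIA_WORK,	/* schedule the read on a housekeeping CPU and wait */
	READ_VIA_IPI,	/* fallback: every CPU in the domain is nohz_full */
};

struct read_plan {
	enum read_path path;
	int cpu;
};

/* Lowest-numbered CPU in 'domain' that is not in 'nohz_full', or NO_CPU. */
static int pick_housekeeping_cpu(cpumask_model_t domain,
				 cpumask_model_t nohz_full)
{
	cpumask_model_t candidates = domain & ~nohz_full;

	for (int cpu = 0; cpu < 64; cpu++)
		if (candidates & (UINT64_C(1) << cpu))
			return cpu;
	return NO_CPU;
}

/* Decide how a counter read for 'domain' would be dispatched. */
static struct read_plan plan_counter_read(cpumask_model_t domain,
					  cpumask_model_t nohz_full)
{
	struct read_plan plan;

	assert(domain != 0);	/* a domain always contains at least one CPU */

	plan.cpu = pick_housekeeping_cpu(domain, nohz_full);
	if (plan.cpu != NO_CPU) {
		plan.path = READ_VIA_WORK;
		return plan;
	}

	/* No housekeeping CPU: fall back to an IPI to any CPU in the domain. */
	plan.cpu = pick_housekeeping_cpu(domain, 0);
	plan.path = READ_VIA_IPI;
	return plan;
}
```

For a domain of CPUs 0-3 with CPUs 0-1 in nohz_full, the model schedules the read on CPU 2. With all four CPUs in nohz_full it falls back to the IPI path.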
    
    This function is only used in response to a user-space filesystem request
    (not the timing sensitive overflow code).
    
    This allows MPAM to hide the slice behaviour from resctrl, and to keep
    the monitor-allocation in monitor.c. When the IPI fallback is used on
    machines where MPAM needs to make an access on multiple CPUs, the counter
    read will always fail.

    Signed-off-by: James Morse <james.morse@arm.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
    Reviewed-by: Peter Newman <peternewman@google.com>
    Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
    Reviewed-by: Babu Moger <babu.moger@amd.com>
    Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
    Tested-by: Peter Newman <peternewman@google.com>
    Tested-by: Babu Moger <babu.moger@amd.com>
    Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
    Link: https://lore.kernel.org/r/20240213184438.16675-14-james.morse@arm.com
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>