    x86/resctrl: Allow resctrl_arch_rmid_read() to sleep · 6fde1424
    James Morse authored
    MPAM's cache occupancy counters can take a little while to settle once the
    monitor has been configured. The maximum settling time is described to the
    driver via a firmware table. The value could be large enough that it makes
    sense to sleep. To avoid exposing this to resctrl, it should be hidden behind
    MPAM's resctrl_arch_rmid_read().
    
    resctrl_arch_rmid_read() may be called via IPI meaning it is unable to sleep.
    In this case, it should return an error if it needs to sleep. This will only
    affect MPAM platforms where the cache occupancy counter isn't available
    immediately, nohz_full is in use, and there are no housekeeping CPUs in the
    necessary domain.
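
    The "return an error rather than sleep" rule can be sketched as a small
    userspace model. This is an illustration only, not the MPAM driver: the
    model_* names, the can_sleep flag, and the choice of -EIO as the error code
    are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a monitor whose value needs time to settle. */
struct model_monitor {
    bool settled;        /* true once the counter value is valid */
    uint64_t value;      /* occupancy reported after settling */
    unsigned int sleeps; /* how many times a reader had to wait */
};

/* Stand-in for sleeping until the maximum settling time has passed. */
static void model_wait_for_settle(struct model_monitor *mon)
{
    mon->sleeps++;
    mon->settled = true;
}

/*
 * Model of an arch read helper. 'can_sleep' is false when the caller runs
 * from an IPI. Returns 0 and fills *val on success, or -EIO (an assumed
 * error code) when the value is not ready and the caller cannot wait.
 */
static int model_rmid_read(struct model_monitor *mon, bool can_sleep,
                           uint64_t *val)
{
    assert(mon && val);

    if (!mon->settled) {
        if (!can_sleep)
            return -EIO;
        model_wait_for_settle(mon);
    }

    *val = mon->value;
    return 0;
}
```

    In this model a caller that cannot sleep only sees the error while the
    counter is still settling. Once a sleepable caller has waited, later reads
    succeed from either context.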
    
    There are three callers of resctrl_arch_rmid_read(). Two of them,
    __mon_event_count() and __check_limbo(), are called from a non-migratable
    context. mon_event_read() invokes __mon_event_count() using
    smp_call_on_cpu(), which adds work to the target CPU's workqueue.
    rdtgroup_mutex is held, meaning this cannot race with the resctrl cpuhp
    callback. __check_limbo() is invoked via schedule_delayed_work_on(), which
    also adds work to a per-CPU workqueue.
    
    The remaining caller is add_rmid_to_limbo(), which is called in response to
    a user-space syscall that frees an RMID. This opportunistically reads the LLC
    occupancy counter on the current domain to see if the RMID is over the dirty
    threshold. This has to disable preemption to avoid reading the wrong domain's
    value. Disabling preemption here prevents resctrl_arch_rmid_read() from
    sleeping.
    
    add_rmid_to_limbo() walks each domain, but only reads the counter on one
    domain. If the system has more than one domain, the RMID will always be added
    to the limbo list. If the RMID's usage was not over the threshold, it will be
    removed from the list when __check_limbo() runs.  Make this the default
    behaviour. Free RMIDs are always added to the limbo list for each domain.
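
    The resulting behaviour can be modelled in userspace as follows. This is a
    simplified sketch, not the kernel code: the model_* names, the fixed domain
    and RMID counts, and the per-RMID busy counter are assumptions chosen to
    keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_DOMAINS 2u
#define MODEL_NUM_RMIDS   8u

/* Hypothetical per-domain state: limbo membership and current occupancy. */
struct model_domain {
    bool in_limbo[MODEL_NUM_RMIDS];
    uint64_t occupancy[MODEL_NUM_RMIDS];
};

struct model_state {
    struct model_domain dom[MODEL_NUM_DOMAINS];
    unsigned int busy[MODEL_NUM_RMIDS]; /* domains still holding the RMID */
    bool free_list[MODEL_NUM_RMIDS];    /* true once the RMID is reusable */
    uint64_t threshold;                 /* dirty threshold for reuse */
};

/* A freed RMID goes to limbo on every domain; no counter is read here. */
static void model_add_rmid_to_limbo(struct model_state *s, unsigned int rmid)
{
    assert(rmid < MODEL_NUM_RMIDS);

    s->free_list[rmid] = false;
    s->busy[rmid] = 0;
    for (unsigned int d = 0; d < MODEL_NUM_DOMAINS; d++) {
        s->dom[d].in_limbo[rmid] = true;
        s->busy[rmid]++;
    }
}

/* Periodic per-domain check: release RMIDs that are not over the threshold. */
static void model_check_limbo(struct model_state *s, unsigned int d)
{
    assert(d < MODEL_NUM_DOMAINS);

    for (unsigned int rmid = 0; rmid < MODEL_NUM_RMIDS; rmid++) {
        if (!s->dom[d].in_limbo[rmid])
            continue;
        if (s->dom[d].occupancy[rmid] > s->threshold)
            continue; /* still dirty on this domain, stays in limbo */

        s->dom[d].in_limbo[rmid] = false;
        if (--s->busy[rmid] == 0)
            s->free_list[rmid] = true;
    }
}
```

    In the model, an RMID becomes reusable only after the periodic check has
    run on every domain and found its occupancy at or below the threshold,
    which is why even a clean RMID is not free immediately after it is
    released.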
    
    The user visible effect of this is that a clean RMID is not available for
    re-allocation immediately after 'rmdir()' completes. This behaviour was never
    portable as it never happened on a machine with multiple domains.
    
    Removing this path allows resctrl_arch_rmid_read() to sleep if it's called with
    interrupts unmasked. Document that this is the expected behaviour, and add
    a might_sleep() annotation to catch changes that won't work on arm64.

    Signed-off-by: James Morse <james.morse@arm.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
    Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
    Reviewed-by: Babu Moger <babu.moger@amd.com>
    Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
    Tested-by: Peter Newman <peternewman@google.com>
    Tested-by: Babu Moger <babu.moger@amd.com>
    Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
    Link: https://lore.kernel.org/r/20240213184438.16675-15-james.morse@arm.com
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>