    sched/psi: Allow unprivileged polling of N*2s period · d82caa27
    Domenico Cerasuolo authored
    PSI offers 2 mechanisms to get information about a specific resource
    pressure. One is reading from /proc/pressure/<resource>, which gives
    average pressures aggregated every 2s. The other is creating a pollable
    fd for a specific resource and cgroup.
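For illustration, the first mechanism can be consumed with a few lines of C. This is a minimal sketch, assuming the usual pressure line layout (`some avg10=... avg60=... avg300=... total=...`); the `struct psi_avgs` type and the `psi_parse_line()` helper are hypothetical names, not part of this patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one line of a pressure file. */
struct psi_avgs {
	double avg10, avg60, avg300;	/* stall percentages over 10s/60s/300s */
	unsigned long long total_us;	/* cumulative stall time in microseconds */
};

/*
 * Parse one line such as
 *   "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
 * Returns 0 when the line parses and its first word equals @kind
 * ("some" or "full"), -1 otherwise.
 */
int psi_parse_line(const char *line, const char *kind, struct psi_avgs *out)
{
	char name[8];

	if (sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		   name, &out->avg10, &out->avg60, &out->avg300,
		   &out->total_us) != 5)
		return -1;

	return strcmp(name, kind) == 0 ? 0 : -1;
}
```

A caller would read each line of a file such as /proc/pressure/memory with fgets() and pass it to the helper.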
    
    Trigger creation requires CAP_SYS_RESOURCE and lets the caller pick a
    specific time window and threshold, spawning an RT thread to aggregate
    the data.
    
    Systemd would like to provide containers the option to monitor pressure
    on their own cgroup and sub-cgroups. For example, if systemd launches a
    container that itself then launches services, the container should have
    the ability to poll() for pressure in individual services. But neither
    the container nor the services are privileged.
    
    This patch implements a mechanism to allow unprivileged users to create
    pressure triggers. The difference from privileged trigger creation is
    that unprivileged triggers must have a time window that is a multiple
    of 2s. This avoids unrestricted spawning of RT threads; instead, such
    triggers use the same aggregation mechanism already used for the
    averages, which runs independently of any triggers.

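The userspace side of an unprivileged trigger could look roughly like the sketch below. It assumes the trigger string format `<some|full> <stall us> <window us>` used by the existing PSI trigger interface; the helper names (`psi_window_ok_unprivileged()`, `psi_format_trigger()`, `psi_wait_event()`) and the `PSI_AVGS_PERIOD_US` macro are hypothetical and only illustrate the "multiple of 2s" rule described above. The kernel remains the authority on which windows it accepts.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumption: the averages are aggregated every 2s, as stated above. */
#define PSI_AVGS_PERIOD_US 2000000ULL

/* True if @window_us is a non-zero multiple of the 2s averaging period. */
bool psi_window_ok_unprivileged(uint64_t window_us)
{
	return window_us != 0 && window_us % PSI_AVGS_PERIOD_US == 0;
}

/*
 * Build a trigger string such as "some 150000 2000000".
 * Returns the string length, or -1 if the window is not a multiple
 * of 2s or the buffer is too small.
 */
int psi_format_trigger(char *buf, size_t len, const char *kind,
		       uint64_t stall_us, uint64_t window_us)
{
	int n;

	if (!psi_window_ok_unprivileged(window_us))
		return -1;

	n = snprintf(buf, len, "%s %llu %llu", kind,
		     (unsigned long long)stall_us,
		     (unsigned long long)window_us);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Register @trigger on the pressure file at @path and wait for one event.
 * Returns 1 on an event, 0 on timeout, -1 on error.
 */
int psi_wait_event(const char *path, const char *trigger, int timeout_ms)
{
	struct pollfd pfd = { .events = POLLPRI };
	int ret;

	pfd.fd = open(path, O_RDWR | O_NONBLOCK);
	if (pfd.fd < 0)
		return -1;

	/* Writing the trigger string (with its NUL) registers the trigger. */
	if (write(pfd.fd, trigger, strlen(trigger) + 1) < 0) {
		close(pfd.fd);
		return -1;
	}

	ret = poll(&pfd, 1, timeout_ms);
	if (ret > 0 && (pfd.revents & POLLERR))
		ret = -1;	/* the monitored file or cgroup went away */

	close(pfd.fd);
	return ret;
}
```

A container service would format a trigger with a 2s, 4s, ... window and pass it to psi_wait_event() together with the path of its own pressure file, for example a cgroup's memory.pressure.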
    Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Link: https://lore.kernel.org/r/20230330105418.77061-5-cerasuolodomenico@gmail.com
psi.rst 6.63 KB