    cgroup/rstat: add cgroup_rstat_cpu_lock helpers and tracepoints · 21c38a3b
    Jesper Dangaard Brouer authored
    Add lock helpers and tracepoints for the per-CPU lock
    cgroup_rstat_cpu_lock. They closely resemble the helpers added for the
    global cgroup_rstat_lock in commit fc29e04a ("cgroup/rstat: add
    cgroup_rstat_lock helpers and tracepoints").
    
    Based on production workloads, we observe that the fast-path "update"
    function cgroup_rstat_updated() is invoked around 3 million times per
    second, while the "flush" function cgroup_rstat_flush_locked(), which
    walks each possible CPU, can see periodic spikes of 700 invocations per
    second.
    
    For this reason, the tracepoints for this per-CPU lock are split into
    normal and fastpath versions. This makes it feasible to continuously
    monitor the non-fastpath tracepoint in production to detect lock
    contention issues. The reason for monitoring is that the lock disables
    IRQs, which can disturb e.g. softirq processing on the local CPUs
    involved. When the global cgroup_rstat_lock stops disabling IRQs (e.g.
    converted to a mutex), this per-CPU lock becomes the next bottleneck that
    can introduce latency variations.
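
The split between normal and fastpath events can be illustrated with a minimal userspace sketch. It is not the kernel code: it assumes a try-then-block acquisition as one way for a helper to know whether it had to wait, uses a pthread mutex in place of the raw spinlock (so no IRQ disabling is modelled), and uses plain counters in place of tracepoints. The names `lock_stats`, `sketch_cpu_lock` and `sketch_cpu_unlock` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counters standing in for the separate trace events. */
struct lock_stats {
    atomic_ulong contended;          /* normal path had to wait */
    atomic_ulong contended_fastpath; /* fast path had to wait */
    atomic_ulong locked;             /* normal path acquired the lock */
    atomic_ulong locked_fastpath;    /* fast path acquired the lock */
};

/*
 * Acquire the lock and return true if the caller had to wait for it.
 * The caller must not already hold the lock.
 */
static bool sketch_cpu_lock(pthread_mutex_t *lock, struct lock_stats *st,
                            bool fast_path)
{
    /* A failed trylock is treated as contention. */
    bool contended = pthread_mutex_trylock(lock) != 0;

    if (contended) {
        /* Record contention on the counter that matches the caller. */
        atomic_fetch_add(fast_path ? &st->contended_fastpath
                                   : &st->contended, 1);
        pthread_mutex_lock(lock); /* fall back to a blocking acquire */
    }
    atomic_fetch_add(fast_path ? &st->locked_fastpath : &st->locked, 1);
    return contended;
}

/* Release the lock taken by sketch_cpu_lock(). */
static void sketch_cpu_unlock(pthread_mutex_t *lock)
{
    int rc = pthread_mutex_unlock(lock);

    assert(rc == 0);
    (void)rc; /* avoid an unused-variable warning when NDEBUG is set */
}
```

In this sketch, a monitor that only reads `contended` and `locked` never touches the counters bumped by the high-rate fast path, which mirrors the reason given above for keeping the two sets of events separate.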
    
    A practical bpftrace script for monitoring contention latency:
    
     bpftrace -e '
       tracepoint:cgroup:cgroup_rstat_cpu_lock_contended {
         @start[tid]=nsecs; @cnt[probe]=count()}
       tracepoint:cgroup:cgroup_rstat_cpu_locked {
         if (args->contended) {
           @wait_ns=hist(nsecs-@start[tid]); delete(@start[tid]);}
         @cnt[probe]=count()}
       interval:s:1 {time("%H:%M:%S "); print(@wait_ns); print(@cnt); clear(@cnt);}'

    Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
    Signed-off-by: Tejun Heo <tj@kernel.org>