    timers: Implement the hierarchical pull model · 7ee98877
    Anna-Maria Behnsen authored
    Placing timers at enqueue time on a target CPU based on dubious heuristics
    does not make any sense:
    
     1) Most timer wheel timers are canceled or rearmed before they expire.
    
     2) The heuristics to predict which CPU will be busy when the timer expires
        are wrong by definition.
    
    So placing the timers at enqueue wastes precious cycles.
    
    The proper solution to this problem is to always queue the timers on the
    local CPU and allow the non-pinned timers to be pulled onto a busy CPU at
    expiry time.
    
    Therefore split the timer storage into local pinned and global timers:
    Local pinned timers are always expired on the CPU on which they have been
    queued. Global timers can be expired on any CPU.
    
    As long as a CPU is busy it expires both local and global timers. When a
    CPU goes idle it arms its timer for the first expiring local timer. If the
    first expiring pinned (local) timer is before the first expiring movable
    (global) timer, then no action is required because the CPU will wake up
    before the first movable timer expires. If the first expiring movable
    timer is before the first expiring pinned (local) timer, then this timer
    is queued into an idle timerqueue and eventually expired by another
    active CPU.
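
The idle-entry decision described above can be sketched in user-space C. This is a minimal illustration, not the kernel implementation: `struct cpu_timers`, `struct idle_decision`, `cpu_going_idle()` and the `NO_TIMER` sentinel are hypothetical names, and treating equal expiry times as "no action required" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no timer queued". */
#define NO_TIMER UINT64_MAX

/* Hypothetical per-CPU view: earliest expiry of each timer class. */
struct cpu_timers {
    uint64_t next_local;   /* first expiring pinned (local) timer */
    uint64_t next_global;  /* first expiring movable (global) timer */
};

struct idle_decision {
    uint64_t wakeup;       /* expiry the idle CPU arms its own timer for */
    bool queue_global;     /* hand the global timer to the idle timerqueue? */
};

/*
 * Decide what a CPU does when it goes idle. It always arms for its first
 * local timer. The first global timer only needs to be queued for remote
 * expiry if it would expire before the CPU wakes up on its own.
 * Assumption: on a tie the CPU wakes up in time, so nothing is queued.
 */
static struct idle_decision cpu_going_idle(const struct cpu_timers *t)
{
    struct idle_decision d;

    assert(t != NULL);
    d.wakeup = t->next_local;
    d.queue_global = t->next_global != NO_TIMER &&
                     t->next_global < t->next_local;
    return d;
}
```

With a local timer at 100 and a global timer at 200, nothing is queued; with the local timer at 300 instead, the global timer is handed over for remote expiry.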
    
    To avoid global locking the timerqueues are implemented as a hierarchy. The
    lowest level of the hierarchy holds the CPUs. The CPUs are assigned to
    groups of 8, which are separated per node. If more than one CPU group
    exists, then a second level in the hierarchy collects the groups. Depending
    on the size of the system more than 2 levels may be required. Each group
    has a "migrator" which checks the timerqueue during the tick for remote
    expirable timers.
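
To get a feel for how the hierarchy grows with system size, the following sketch counts the levels needed when every group holds at most 8 children. It is a simplification under stated assumptions: the per-node separation mentioned above is ignored, and `GROUP_SIZE` and `hierarchy_levels()` are hypothetical names, not kernel symbols.

```c
#include <assert.h>

/* Group fan-out taken from the text: CPUs are collected in groups of 8. */
#define GROUP_SIZE 8u

/*
 * Number of hierarchy levels needed for ncpus CPUs, assuming a single
 * node (NUMA separation is ignored in this sketch). Level 1 groups the
 * CPUs; each further level groups the groups of the level below until a
 * single top-level group remains.
 */
static unsigned int hierarchy_levels(unsigned int ncpus)
{
    unsigned int levels = 1;
    unsigned int groups;

    assert(ncpus > 0);

    /* Round up: a partially filled group still counts as a group. */
    groups = (ncpus + GROUP_SIZE - 1) / GROUP_SIZE;
    while (groups > 1) {
        groups = (groups + GROUP_SIZE - 1) / GROUP_SIZE;
        levels++;
    }
    return levels;
}
```

Under these assumptions up to 8 CPUs fit into one level, up to 64 CPUs need two levels, and 65 to 512 CPUs need three.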
    
    If the last CPU in a group goes idle it reports the first expiring event in
    the group up to the next group(s) in the hierarchy. If the last CPU in the
    system goes idle it arms its timer for the first system-wide expiring
    timer to ensure that no timer event is missed.
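
The upward reporting can be illustrated with a small, lock-free sketch. It is only a model of the idea: `struct tgroup`, `report_idle()` and `NO_EVENT` are hypothetical names, each group tracks just a count of active children and its earliest event, and the locking and migrator handover of a real implementation are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no event". */
#define NO_EVENT UINT64_MAX

/* Hypothetical group node: its children are CPUs or lower-level groups. */
struct tgroup {
    struct tgroup *parent;  /* NULL for the top-level group */
    unsigned int active;    /* number of children that are still active */
    uint64_t next_event;    /* earliest global event queued in this group */
};

/*
 * A child of grp goes idle and reports its first global event.
 * If another child is still active, that child's side expires the event
 * and propagation stops. If the group became fully idle, its earliest
 * event is reported to the parent. When even the top-level group is
 * idle, the system-wide earliest event is returned so that the caller
 * can arm its own timer for it.
 */
static uint64_t report_idle(struct tgroup *grp, uint64_t event)
{
    assert(grp != NULL);

    while (grp) {
        if (event < grp->next_event)
            grp->next_event = event;

        assert(grp->active > 0);
        grp->active--;
        if (grp->active > 0)
            return NO_EVENT;    /* an active child remains in this group */

        event = grp->next_event;
        grp = grp->parent;
    }
    return event;               /* last CPU in the system went idle */
}
```

A return value other than `NO_EVENT` means the caller was the last active CPU and has to wake up for that expiry itself.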
    
    Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
    Link: https://lore.kernel.org/r/20240222103710.32582-1-anna-maria@linutronix.de