    fs: kernfs: add poll file operation · 147e1a97
    Johannes Weiner authored
    Patch series "psi: pressure stall monitors", v3.
    
    Android is adopting psi to detect and remedy memory pressure that
    results in stuttering and decreased responsiveness on mobile devices.
    
    Psi gives us the stall information, but because we're dealing with
    latencies in the millisecond range, periodically reading the pressure
    files to detect stalls in a timely fashion is not feasible.  Psi also
    doesn't aggregate its averages at a high enough frequency right now.
    
    This patch series extends the psi interface such that users can
    configure sensitive latency thresholds and use poll() and friends to be
    notified when these are breached.
    
    As high-frequency aggregation is costly, it implements an aggregation
    method that is optimized for fast, short-interval averaging, and makes
    the aggregation frequency adaptive, such that high-frequency updates
    only happen while monitored stall events are actively occurring.
    
    With these patches applied, Android can monitor for, and ward off,
    mounting memory shortages before they cause problems for the user.  For
    example, using memory stall monitors in the userspace low memory killer
    daemon (lmkd), we can detect mounting pressure and kill less important
    processes before the device becomes visibly sluggish.
    
    In our memory stress testing, psi memory monitors produce roughly 10x
    fewer false positives than vmpressure signals.  Having the ability to
    specify multiple triggers for the same psi metric allows other parts of
    the Android framework to monitor the memory state of the device and act
    accordingly.
    
    The new interface is straightforward.  The user opens one of the
    pressure files for writing and writes a trigger description into the
    file descriptor.  The description defines the stall state (some or
    full) and the maximum stall time over a given window of time, both in
    microseconds.  E.g.:
    
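            /* Signal when stall time exceeds 100ms of a 1s window */
            char trigger[] = "full 100000 1000000";
            struct pollfd pfd;

            fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
            write(fd, trigger, sizeof(trigger));
            pfd.fd = fd;
            pfd.events = POLLPRI;   /* trigger events arrive as POLLPRI */
            while (poll(&pfd, 1, -1) >= 0) {
                    ...
            }
            close(fd);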
    
    When the monitored stall state is entered, psi adapts its aggregation
    frequency according to what the configured time window requires in order
    to emit event signals in a timely fashion.  Once the stalling subsides,
    aggregation reverts back to normal.
    
    The trigger is associated with the open file descriptor.  To stop
    monitoring, the user only needs to close the file descriptor and the
    trigger is discarded.
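
    The lifecycle described above (format a trigger, write it to a freshly
    opened pressure file, poll, close) can be sketched as a few userspace
    helpers.  This is a minimal sketch, not code from the series: the
    function names psi_format_trigger(), psi_register_trigger() and
    psi_wait_event() are hypothetical, the argument checks are assumed
    sanity checks rather than the kernel's own validation, and waiting on
    POLLPRI is an assumption about how events are reported.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Build a trigger description such as "full 100000 1000000".
 * Both times are in microseconds.  Returns the string length, or -1 if
 * the arguments look invalid or the buffer is too small.
 * (Hypothetical helper; the checks are assumptions, not kernel rules.)
 */
int psi_format_trigger(char *buf, size_t len, const char *state,
                       unsigned long stall_us, unsigned long window_us)
{
        int n;

        if (strcmp(state, "some") != 0 && strcmp(state, "full") != 0)
                return -1;
        /* A stall cannot last longer than the window it is measured in. */
        if (window_us == 0 || stall_us > window_us)
                return -1;

        n = snprintf(buf, len, "%s %lu %lu", state, stall_us, window_us);
        if (n < 0 || (size_t)n >= len)
                return -1;
        return n;
}

/*
 * Open a pressure file and register the trigger on the new fd.
 * Returns the fd, or -1 on failure.  The trigger lives as long as the
 * fd does, so close(fd) is all that is needed to stop monitoring.
 */
int psi_register_trigger(const char *path, const char *trigger)
{
        size_t len = strlen(trigger) + 1;   /* include the trailing NUL */
        int fd = open(path, O_RDWR | O_NONBLOCK);

        if (fd < 0)
                return -1;
        if (write(fd, trigger, len) != (ssize_t)len) {
                close(fd);
                return -1;
        }
        return fd;
}

/*
 * Block until something happens on the fd.
 * Returns 1 if the trigger fired, 0 if poll() woke up for another
 * reason, and -1 on error.
 */
int psi_wait_event(int fd)
{
        struct pollfd pfd = { .fd = fd, .events = POLLPRI };

        if (poll(&pfd, 1, -1) < 0)
                return -1;
        if (pfd.revents & POLLERR)
                return -1;
        return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

    A caller would format the trigger, pass it to psi_register_trigger()
    with a path such as "/proc/pressure/memory", loop on psi_wait_event(),
    and close the returned fd when monitoring is no longer needed.
    Because each trigger belongs to its own fd, several independent
    monitors can watch the same pressure file.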
    
    Patches 1-4 prepare the psi code for polling support.  Patch 5
    implements the adaptive polling logic, the pressure growth detection
    optimized for short intervals, and hooks up write() and poll() on the
    pressure files.
    
    The patches were developed in collaboration with Johannes Weiner.
    
    This patch (of 5):
    
    Kernfs has a standardized poll/notification mechanism for waking all
    pollers on all fds when a filesystem node changes.  To allow polling for
    custom events, add a .poll callback that can override the default.
    
    This is in preparation for pollable cgroup pressure files which have
    per-fd trigger configurations.
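
    The override described above can be illustrated with a small
    userspace model.  This is only a sketch of the dispatch pattern, not
    the kernfs code: every type, constant and function name below
    (node_ops, open_file, generic_poll, fop_poll, trigger_poll, EV_*) is a
    hypothetical stand-in, and the default path is only loosely modeled on
    a node-wide change counter.

```c
#include <stddef.h>

/* Userspace stand-ins for kernel types; all names are hypothetical. */
#define EV_READABLE 0x1u
#define EV_ERROR    0x2u
#define EV_PRIORITY 0x4u

struct open_file;

struct node_ops {
        /* Optional: when NULL, the generic notification path is used. */
        unsigned int (*poll)(struct open_file *of);
};

struct node {
        const struct node_ops *ops;
        int event_count;            /* bumped whenever the node changes */
};

struct open_file {
        struct node *node;
        int seen_event_count;       /* snapshot taken by this fd */
        int trigger_fired;          /* per-fd state for a custom poller */
};

/* Default behaviour: every fd on the node sees a node-wide change. */
unsigned int generic_poll(struct open_file *of)
{
        if (of->seen_event_count != of->node->event_count)
                return EV_READABLE | EV_ERROR | EV_PRIORITY;
        return EV_READABLE;
}

/* Dispatch: prefer the node's own .poll, else fall back to the default. */
unsigned int fop_poll(struct open_file *of)
{
        const struct node_ops *ops = of->node->ops;

        if (ops && ops->poll)
                return ops->poll(of);
        return generic_poll(of);
}

/* Example override: report an event only for the fd whose trigger fired. */
unsigned int trigger_poll(struct open_file *of)
{
        return of->trigger_fired ? EV_PRIORITY : 0;
}
```

    With the default path, a change to the node wakes every poller on
    every fd.  With an override such as trigger_poll(), two fds open on
    the same node can report different results, which is what per-fd
    trigger configurations require.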
    
    Link: http://lkml.kernel.org/r/20190124211518.244221-2-surenb@google.com
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Suren Baghdasaryan <surenb@google.com>
    Cc: Dennis Zhou <dennis@kernel.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Li Zefan <lizefan@huawei.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Tejun Heo <tj@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>