    migrate_pages: avoid blocking for IO in MIGRATE_SYNC_LIGHT · 4bb6dc79
    Douglas Anderson authored
    The MIGRATE_SYNC_LIGHT mode is intended to block for things that will
    finish quickly but not for things that will take a long time.  Exactly how
    long is too long is not well defined, but waits of tens of milliseconds are
    likely non-ideal.
    
    When putting a Chromebook under memory pressure (opening over 90 tabs on a
    4GB machine) it was fairly easy to see delays waiting for some locks in
    the kcompactd code path of > 100 ms.  While the laptop wasn't amazingly
    usable in this state, it was still limping along and this state isn't
    something artificial.  Sometimes we simply end up with a lot of memory
    pressure.
    
    Putting the same Chromebook under memory pressure while it was running
    Android apps (though not stressing them) showed a much worse result (NOTE:
    this was on an older kernel but the codepaths here are similar).  Android
    apps on ChromeOS currently run from a 128K-block, zlib-compressed,
    loopback-mounted squashfs disk.  If we get a page fault from something
    backed by the squashfs filesystem we could end up holding a folio lock
    while reading enough from disk to decompress 128K (and then decompressing
    it using the somewhat slow zlib algorithms).  That reading goes through
    the ext4 subsystem (because it's a loopback mount) before eventually
    ending up in the block subsystem.  This extra jaunt adds extra overhead. 
    Without much work I could see cases where we ended up blocked on a folio
    lock for over a second.  With more extreme memory pressure I could see up
    to 25 seconds.
    
    We considered adding a timeout in the case of MIGRATE_SYNC_LIGHT for the
    two locks that were seen to be slow [1] and that generated much
    discussion.  After discussion, it was decided that we should avoid waiting
    for the two locks during MIGRATE_SYNC_LIGHT if they were being held for
    IO.  We'll continue with the unbounded wait for the more full SYNC modes.
    
    With this change, I couldn't see any slow waits on these locks with my
    previous testcases.
    
    NOTE: The reason I started digging into this originally isn't because some
    benchmark had gone awry, but because we've received in-the-field crash
    reports where we have a hung task waiting on the page lock (which is the
    equivalent code path on old kernels).  While the root cause of those
    crashes is likely unrelated and won't be fixed by this patch, analyzing
    those crash reports did point out these very long waits seemed like
    something good to fix.  With this patch we should no longer hang waiting
    on these locks, but presumably the system will still be in bad shape and
    hang somewhere else.
    
    [1] https://lore.kernel.org/r/20230421151135.v2.1.I2b71e11264c5c214bc59744b9e13e4c353bc5714@changeid
    
    Link: https://lkml.kernel.org/r/20230428135414.v3.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Suggested-by: Matthew Wilcox <willy@infradead.org>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: Hillf Danton <hdanton@sina.com>
    Cc: Gao Xiang <hsiangkao@linux.alibaba.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Christian Brauner <brauner@kernel.org>
    Cc: Huang Ying <ying.huang@intel.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Yu Zhao <yuzhao@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>