    erofs: kill hooked chains to avoid loops on deduplicated compressed images · 967c28b2
    Gao Xiang authored
    After heavily stressing EROFS with several images which include a
    hand-crafted image of repeated patterns for more than 46 days, I found
    two chains could be linked with each other almost simultaneously and
    form a loop so that the entire loop won't be submitted.  As a
    consequence, the corresponding file pages will remain locked forever.
    
    It can _only_ be observed on data-deduplicated compressed images.
    For example, consider two chains with five pclusters in total:
        Chain 1:  2->3->4->5    -- The tail pcluster is 5;
        Chain 2:  5->1->2       -- The tail pcluster is 2.
    
    Chain 2 could link to Chain 1 with pcluster 5; and Chain 1 could link
    to Chain 2 at the same time with pcluster 2.
    
    Since hooked chains are all linked locklessly now, I have no idea how
    to simply avoid the race.  Instead, let's avoid hooked chains completely
    until I can work out a proper fix and end users tell us that they need
    to be added back.
    
    Actually, hooked chains do occur with multi-threaded workloads (even
    more often on deduplicated compressed images), yet I'm not sure how
    the overall system impact of dropping this optimization compares with
    its implementation complexity.
    
    Fixes: 267f2492 ("erofs: introduce multi-reference pclusters (fully-referenced)")
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Yue Hu <huyue2@coolpad.com>
    Link: https://lore.kernel.org/r/20230526201459.128169-4-hsiangkao@linux.alibaba.com
zdata.c 48.1 KB