    md/raid5: fix atomicity violation in raid5_cache_count · dfd2bf43
    Gui-Dong Han authored
    In raid5_cache_count():
        if (conf->max_nr_stripes < conf->min_nr_stripes)
            return 0;
        return conf->max_nr_stripes - conf->min_nr_stripes;
    The current check is ineffective, as the values could change immediately
    after being checked.
    
    In raid5_set_cache_size():
        ...
        conf->min_nr_stripes = size;
        ...
        while (size > conf->max_nr_stripes)
            conf->min_nr_stripes = conf->max_nr_stripes;
        ...
    
    raid5_set_cache_size() writes intermediate values, so when it runs
    concurrently with raid5_cache_count(), the latter can read an
    inconsistent pair of conf->max_nr_stripes and conf->min_nr_stripes.
    Because the fields are re-read after the check, conf->min_nr_stripes
    may exceed conf->max_nr_stripes at the subtraction, potentially causing
    an integer overflow.
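
The interleaving described above can be replayed deterministically in a
single thread. The sketch below is a user-space illustration, not kernel
code: struct conf_replay and racy_cache_count_replay() are hypothetical
names, and the field type (int) and return type (unsigned long) are
assumptions made for the example.

```c
/* Hypothetical stand-in for the two fields read by raid5_cache_count(). */
struct conf_replay {
	int max_nr_stripes;
	int min_nr_stripes;
};

/*
 * Replays one unlucky interleaving in a single thread: the check reads the
 * fields, the writer then stores an intermediate value, and the subtraction
 * reads the fields a second time.
 */
static unsigned long racy_cache_count_replay(struct conf_replay *conf,
					     int new_size)
{
	if (conf->max_nr_stripes < conf->min_nr_stripes)
		return 0;

	/* Writer step: min_nr_stripes is raised before max_nr_stripes grows. */
	conf->min_nr_stripes = new_size;

	/* Second read: the earlier check no longer covers these values. */
	return conf->max_nr_stripes - conf->min_nr_stripes;
}
```

With max_nr_stripes = 10, min_nr_stripes = 5 and a writer that raises
min_nr_stripes to 20, the check passes, yet the function returns
(unsigned long)-10 instead of 0.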
    
    This possible bug was found by an experimental static analysis tool
    developed by our team. The tool analyzes the locking APIs to extract
    function pairs that can be concurrently executed, and then analyzes the
    instructions in the paired functions to identify possible concurrency bugs
    including data races and atomicity violations. The above possible bug was
    reported when our tool analyzed the source code of Linux 6.2.
    
    To resolve this issue, introduce local variables 'min_stripes' and
    'max_stripes' in raid5_cache_count() so that the check and the
    subtraction operate on the same values. Adding locks in
    raid5_cache_count() alone would not resolve the atomicity violation, as
    raid5_set_cache_size() may leave intermediate values in
    conf->min_nr_stripes while unlocked. With this patch applied, our tool no
    longer reports the bug with the x86_64 allyesconfig kernel configuration.
    Due to the lack of associated hardware, we could not test the patch at
    runtime and have verified it only against the code logic.
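
A minimal user-space sketch of the local-variable approach follows. It is
not the patch itself: struct conf_snapshot, snapshot_cache_count() and the
LOAD_ONCE_INT() macro are hypothetical names, and using a volatile access
to force a single load of each field is an assumption of this sketch.

```c
/* Hypothetical stand-in for the two fields read by raid5_cache_count(). */
struct conf_snapshot {
	int max_nr_stripes;
	int min_nr_stripes;
};

/* Assumption of this sketch: a volatile access yields exactly one load. */
#define LOAD_ONCE_INT(x) (*(const volatile int *)&(x))

static unsigned long snapshot_cache_count(struct conf_snapshot *conf)
{
	/* Read each field once; later writes cannot change these copies. */
	int max_stripes = LOAD_ONCE_INT(conf->max_nr_stripes);
	int min_stripes = LOAD_ONCE_INT(conf->min_nr_stripes);

	/* The check and the subtraction now see the same pair of values. */
	if (max_stripes < min_stripes)
		return 0;
	return max_stripes - min_stripes;
}
```

The two copies may still come from different moments of a concurrent
update, but the comparison guarantees that the returned difference is
never negative.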
    
    Fixes: edbe83ab ("md/raid5: allow the stripe_cache to grow and shrink.")
    Cc: stable@vger.kernel.org
    Signed-off-by: Gui-Dong Han <2045gemini@gmail.com>
    Reviewed-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Song Liu <song@kernel.org>
    Link: https://lore.kernel.org/r/20240112071017.16313-1-2045gemini@gmail.com
    Signed-off-by: Song Liu <song@kernel.org>