    RAS/AMD/FMPM: Safely handle saved records of various sizes · 9b195439
    Yazen Ghannam authored
    Currently, the size of the locally cached FRU record structures is
    based on the module parameter "max_nr_entries".
    
    This creates issues when restoring records if a user changes the
    parameter.
    
    If the number of entries is reduced, then old, larger records will not
    be restored. The opportunity to take action on the saved data is missed.
    Also, new records will be created and written to storage, even as the old
    records remain in storage, resulting in wasted space.
    
    If the number of entries is increased, then the length of the old,
    smaller records will not be adjusted. This causes a checksum failure,
    which leads to the old record being cleared from storage. Again, the
    opportunity to take action on the saved data is missed.
    
    Allocate the temporary record with the maximum possible size based on
    the current maximum number of supported entries (255). This allows the
    ERST read operation to succeed if max_nr_entries has been increased.
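
    As a rough illustration of this sizing rule, the user-space sketch below
    computes a worst-case record length from a fixed header plus 255 entries
    and allocates a zeroed temporary buffer of that size. The struct layouts
    and the names rec_hdr, poison_desc, rec_len() and alloc_tmp_record() are
    hypothetical stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the record header and one saved entry. */
struct rec_hdr {
	uint32_t checksum;
	uint32_t record_length;
	uint32_t nr_entries;
};

struct poison_desc {
	uint64_t timestamp;
	uint64_t addr;
};

/* Largest entry count the record format supports (from the text above). */
#define MAX_SUPPORTED_ENTRIES 255u

/* Length in bytes of a record holding nr_entries entries. */
static size_t rec_len(unsigned int nr_entries)
{
	assert(nr_entries <= MAX_SUPPORTED_ENTRIES);
	return sizeof(struct rec_hdr) + nr_entries * sizeof(struct poison_desc);
}

/*
 * Allocate a zeroed temporary buffer sized for the worst case, so that a
 * record saved under any previous max_nr_entries value fits in it.
 */
static void *alloc_tmp_record(void)
{
	return calloc(1, rec_len(MAX_SUPPORTED_ENTRIES));
}
```

    Sizing the temporary buffer independently of max_nr_entries means the
    read step never has to guess how large a previously saved record is.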
    
    Warn the user if a saved record exceeds the expected size and fail to
    load the module. This allows the user to adjust the module parameter
    without losing data or the opportunity to restore larger records.
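
    The check described above might look roughly like the following
    user-space sketch. The header and entry sizes, the helper names
    saved_entries() and check_saved_len(), and the message wording are
    assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define HDR_LEN   16u	/* hypothetical record header size, in bytes */
#define ENTRY_LEN 24u	/* hypothetical size of one saved entry, in bytes */

/* Derive the number of entries in a saved record from its length. */
static unsigned int saved_entries(size_t saved_len)
{
	assert(saved_len >= HDR_LEN);
	return (unsigned int)((saved_len - HDR_LEN) / ENTRY_LEN);
}

/*
 * Return 0 if the saved record fits in the current record length.
 * Otherwise warn with the entry count the user should configure and
 * return -EINVAL so the caller can abort loading without touching the
 * saved data.
 */
static int check_saved_len(size_t saved_len, size_t max_rec_len)
{
	unsigned int nr;

	if (saved_len <= max_rec_len)
		return 0;

	nr = saved_entries(saved_len);
	fprintf(stderr, "Saved record found with %u entries.\n", nr);
	fprintf(stderr, "Please increase max_nr_entries to %u.\n", nr);

	return -EINVAL;
}
```

    Failing early, rather than clearing or truncating the record, leaves the
    stored data intact for a later load with a larger parameter value.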
    
    Increase the size of a saved record up to the current max_rec_len. The
    checksum will be recalculated, and the updated record will be written to
    storage.
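
    A minimal user-space sketch of growing a restored record and refreshing
    its checksum follows. It assumes a simple byte-sum checksum that covers
    the length field and assumes the buffer is zero-padded past the old
    length. The struct rec layout and the helpers sum_bytes(),
    rec_is_valid() and grow_record() are hypothetical and are not taken
    from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record: checksum, length, then room for entries. */
struct rec {
	uint32_t checksum;
	uint32_t record_length;
	uint8_t  entries[56];
};

/* Sum every byte after the checksum field, up to len bytes of the record. */
static uint32_t sum_bytes(const struct rec *r, uint32_t len)
{
	const uint8_t *buf = (const uint8_t *)r;
	uint32_t sum = 0;

	for (uint32_t i = sizeof(r->checksum); i < len; i++)
		sum += buf[i];

	return sum;
}

/* A record is valid when the stored checksum cancels the byte sum. */
static int rec_is_valid(const struct rec *r)
{
	return (uint32_t)(r->checksum + sum_bytes(r, r->record_length)) == 0;
}

/*
 * Grow a restored record to max_rec_len and recompute its checksum.
 * Returns 1 if the record changed and should be rewritten to storage,
 * 0 if it already had the current length.
 */
static int grow_record(struct rec *r, uint32_t max_rec_len)
{
	assert(r->record_length <= max_rec_len);
	assert(max_rec_len <= sizeof(*r));

	if (r->record_length == max_rec_len)
		return 0;

	r->record_length = max_rec_len;
	r->checksum = 0u - sum_bytes(r, max_rec_len);

	return 1;
}
```

    In this toy layout, changing the length field without recomputing the
    checksum makes rec_is_valid() fail, which is why the two updates are done
    together before the record is written back.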
    
    Fixes: 6f15e617 ("RAS: Introduce a FRU memory poison manager")
    Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Tested-by: Muralidhara M K <muralidhara.mk@amd.com>
    Link: https://lore.kernel.org/r/20240319113322.280096-3-yazen.ghannam@amd.com