    rslib: Fix remaining decoder flaws · 991305de
    Ferdinand Blomqvist authored
    The decoder is flawed in the following ways:
    
    - The decoder sometimes fails silently, i.e. it announces success but
      returns a word that is not a codeword.
    
    - The return value of the decoder is incoherent with respect to how
      fixed erasures are counted. If the word to be decoded is a codeword,
      then the decoder always returns zero even if some erasures are given.
      On the other hand, if the word to be decoded contains errors, then the
      number of erasures is always included in the count of corrected
      symbols. So erasures without symbol corruption are counted
      inconsistently. This probably doesn't affect anyone using the
      decoder, but it contradicts the documentation.
    
    - The error positions returned in eras_pos include all erasures, but the
      corrections are only set in the correction buffer if there actually is
      a symbol error. So if there are erasures without symbol corruption,
      then the corresponding entries of the correction buffer are never
      written: they hold stale values unless the caller zeroed the buffer
      before calling the decoder.
    
    - When correcting data in-place the decoder does not correct errors in
      the parity. On the other hand, when returning the errors in correction
      buffers, errors in the parity are included.
    
    The respective fixes are:
    
    - The syndrome of a codeword is always zero, and the syndrome is linear,
      i.e., S(x+e) = S(x) + S(e). So compute the syndrome for the error and
      check whether it equals the syndrome of the received word. If it does,
      then we have decoded to a valid codeword, otherwise we know that we
      have an uncorrectable error. Fortunately, some unrecoverable error
      conditions can be detected earlier in the decoding, which saves some
      processing power.
    
    - Simply count and return the number of symbols actually corrected.
    
    - Make sure to only return positions where symbols were corrected.
    
    - Also fix errors in parity when correcting in-place. Another option
      would be to completely disregard errors in the parity, but then the
      interface makes it impossible to write tests that test for silent
      failures.
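
    The syndrome check from the first fix can be illustrated with a small,
    self-contained sketch. It is not the code in decode_rs.c: the field
    GF(16), the four roots alpha^1..alpha^4, and every function name below
    are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <string.h>

#define MM      4       /* bits per symbol (illustrative choice) */
#define NN      15      /* symbols per block, 2^MM - 1 */
#define NROOTS  4       /* number of parity symbols and syndromes */
#define GFPOLY  0x13    /* x^4 + x + 1, primitive over GF(2) */

static unsigned char alpha_to[NN];      /* alpha_to[i] = alpha^i */
static unsigned char index_of[NN + 1];  /* index_of[alpha^i] = i */

static void gf_init(void)
{
	unsigned int sr = 1;

	for (int i = 0; i < NN; i++) {
		alpha_to[i] = (unsigned char)sr;
		index_of[sr] = (unsigned char)i;
		sr <<= 1;
		if (sr & (1u << MM))
			sr ^= GFPOLY;
	}
}

static unsigned char gf_mul(unsigned char a, unsigned char b)
{
	if (!a || !b)
		return 0;
	return alpha_to[(index_of[a] + index_of[b]) % NN];
}

/* s[j] = w(alpha^(j+1)) by Horner's rule; w[0] is the highest-degree term. */
static void syndrome(const unsigned char *w, unsigned char *s)
{
	for (int j = 0; j < NROOTS; j++) {
		unsigned char acc = 0;

		for (int i = 0; i < NN; i++)
			acc = gf_mul(acc, alpha_to[j + 1]) ^ w[i];
		s[j] = acc;
	}
}

/* Build a nonzero codeword: the generator polynomial itself. */
static void make_codeword(unsigned char *c)
{
	unsigned char g[NROOTS + 1] = { 1 };

	for (int j = 0; j < NROOTS; j++) {
		unsigned char root = alpha_to[j + 1];

		/* g(x) *= (x + root); minus equals plus in characteristic 2 */
		for (int k = j + 1; k > 0; k--)
			g[k] = g[k - 1] ^ gf_mul(g[k], root);
		g[0] = gf_mul(g[0], root);
	}
	memset(c, 0, NN);
	for (int k = 0; k <= NROOTS; k++)
		c[NN - 1 - k] = g[k];
}

/* By linearity, r - e is a codeword exactly when S(e) == S(r). */
static int error_explains_syndrome(const unsigned char *r,
				   const unsigned char *e)
{
	unsigned char sr[NROOTS], se[NROOTS];

	syndrome(r, sr);
	syndrome(e, se);
	return memcmp(sr, se, NROOTS) == 0;
}

/* Returns 1 if the true error is accepted and a wrong guess is rejected. */
int syndrome_check_demo(void)
{
	unsigned char c[NN], r[NN], good[NN] = { 0 }, bad[NN] = { 0 };
	unsigned char s[NROOTS], zero[NROOTS] = { 0 };

	gf_init();
	make_codeword(c);
	syndrome(c, s);
	assert(memcmp(s, zero, NROOTS) == 0);	/* codeword: zero syndrome */

	good[2] = 7; good[9] = 1;	/* the error actually applied */
	bad[2] = 7;  bad[5] = 3;	/* a wrong error estimate */
	for (int i = 0; i < NN; i++)
		r[i] = c[i] ^ good[i];

	return error_explains_syndrome(r, good) &&
	       !error_explains_syndrome(r, bad);
}
```

    In this sketch a wrong error estimate is rejected because subtracting
    it from the received word does not yield a codeword, which is exactly
    the silent-failure case described above.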
    
    Other changes:
    
    - Only fill the correction buffer and error position buffer if both of
      them are provided. Otherwise correct in place. Previously the error
      position buffer was always populated with the positions of the
      corrected errors, irrespective of whether a correction buffer was
      supplied or not. The rationale for this change is that there seem to
      be two use cases for the decoder: correct in-place or use the
      correction buffers. The caller does not need the positions of the
      corrected errors when in-place correction is used. If in-place
      correction is not used, then both the correction buffer and error
      position buffer need to be populated.
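
    The counting, position, and buffer rules described above can be
    sketched as follows. This is a hypothetical helper, not the kernel's
    decoder: the name report_or_apply(), its parameters, and the convention
    that positions at or beyond len refer to parity symbols are all
    assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's decode_rs(): loc[i] is a symbol
 * position in [0, len + nroots) and mag[i] the error value found there.
 * A zero magnitude means the position was an erasure whose symbol
 * happened to be correct, so nothing was corrected.
 */
static int report_or_apply(const int *loc, const uint16_t *mag, int count,
			   uint8_t *data, uint16_t *par, int len,
			   int *eras_pos, uint16_t *corr)
{
	int fixed = 0;

	/* Either both output buffers, or data and parity, must be given. */
	assert((corr && eras_pos) || (data && par));

	for (int i = 0; i < count; i++) {
		if (!mag[i])
			continue;	/* do not count or report this one */

		if (corr && eras_pos) {
			/* Buffer mode: report position and value only. */
			eras_pos[fixed] = loc[i];
			corr[fixed] = mag[i];
		} else if (loc[i] < len) {
			/* In-place mode: fix a data symbol. */
			data[loc[i]] ^= (uint8_t)mag[i];
		} else {
			/* In-place mode: fix a parity symbol as well. */
			par[loc[i] - len] ^= mag[i];
		}
		fixed++;
	}

	/* Number of symbols actually corrected. */
	return fixed;
}
```

    With this shape the return value, the reported positions, and the
    corrections always describe the same set of symbols, regardless of
    which mode the caller selects.
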
    Signed-off-by: Ferdinand Blomqvist <ferdinand.blomqvist@gmail.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lkml.kernel.org/r/20190620141039.9874-8-ferdinand.blomqvist@gmail.com
decode_rs.c 8.03 KB