    random: use linear min-entropy accumulation crediting · c5704490
    30e37ec5 ("random: account for entropy loss due to overwrites")
    assumed that adding new entropy to the LFSR pool probabilistically
    cancelled out old entropy there, so entropy was credited asymptotically,
    approximating Shannon entropy of independent sources (rather than a
    stronger min-entropy notion) using 1/8th fractional bits and replacing
    a constant 2-2/√e term (~0.786938) with 3/4 (0.75) to slightly
    underestimate it. This wasn't superb, but it was perhaps better than
    nothing, so that's what was done. Exactly which entropy was being
    cancelled out each time, and how much, is hard to tell, though as I
    showed with the attack code in my previous commit, a motivated
    adversary with sufficient information can actually cancel out
    everything.
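
    For reference, the old crediting was roughly of the following shape
    (a simplified sketch, not the exact code that was in random.c, and
    the names here are illustrative; the real version worked on larger
    chunks and capped each step at half the pool, but the
    3/4-of-remaining-headroom idea is the same):

        #define POOL_FRACBITS (POOL_BITS << 3)  /* pool size in 1/8th bits */

        /*
         * Old-style asymptotic credit, in 1/8th-bit fractions: each
         * credit only counts for 3/4 of the proportional remaining
         * headroom, so the total approaches POOL_FRACBITS but never
         * reaches it in a straight line.
         */
        static unsigned int old_credit_frac(unsigned int entropy_frac,
                                            unsigned int nfrac)
        {
                unsigned int add = nfrac * 3 * (POOL_FRACBITS - entropy_frac) /
                                   (4 * POOL_FRACBITS);

                return min(POOL_FRACBITS, entropy_frac + add);
        }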
    
    Since we're no longer using an LFSR for entropy accumulation, this
    probabilistic cancellation is no longer relevant. Rather, we're now
    using a computational hash function as the accumulator and we've
    switched to working in the random oracle model, under which we can
    revisit the question of min-entropy accumulation; this is treated in
    detail in <https://eprint.iacr.org/2019/198>.
    
    Consider a long input bit string that is built by concatenating various
    smaller independent input bit strings. Each one of these inputs has a
    designated min-entropy, which is what we're passing to
    credit_entropy_bits(h). When we pass the concatenation of these to a
    random oracle, it means that an adversary trying to receive back the
    same reply as us would need to become certain about each part of the
    concatenated bit string we passed in, which means guessing each of
    those inputs, each carrying its designated min-entropy h. That means
    we can estimate the accumulation by simply adding up the h values
    passed in calls to credit_entropy_bits(h);
    there's no probabilistic cancellation at play like there was said to be
    for the LFSR. Incidentally, this is also what other entropy
    accumulators based on computational hash functions do.
    
    So this commit replaces credit_entropy_bits(h) with essentially `total =
    min(POOL_BITS, total + h)`, done with a cmpxchg loop as before.
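
    In kernel terms, the new credit path is roughly the following (a
    sketch assuming an input_pool.entropy_count field and the usual
    READ_ONCE/cmpxchg/min_t helpers; see random.c for the exact code):

        /*
         * Saturating, lockless credit: add nbits, capped at POOL_BITS,
         * retrying the cmpxchg if another CPU updated the count first.
         */
        static void credit_entropy_bits(size_t nbits)
        {
                unsigned int new, orig, add;

                if (!nbits)
                        return;

                add = min_t(size_t, nbits, POOL_BITS);

                do {
                        orig = READ_ONCE(input_pool.entropy_count);
                        new = min_t(unsigned int, POOL_BITS, orig + add);
                } while (cmpxchg(&input_pool.entropy_count, orig, new) != orig);
        }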
    
    What if we're wrong and the above is nonsense? It's not, but let's
    assume we don't want the actual _behavior_ of the code to change much.
    Currently that behavior is to not extract from the input pool until
    it has 128 bits of entropy in it. With the old algorithm, we'd hit that
    magic 128 number after roughly 256 calls to credit_entropy_bits(1). So,
    we can retain more or less the old behavior by waiting to extract from
    the input pool until it hits 256 bits of entropy using the new code. For
    people concerned about this change, it means that there's not that much
    practical behavioral change. And for folks actually trying to model
    the behavior rigorously, it means that we have an even higher margin
    against attacks.
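
    As a sanity check on that "roughly 256" figure, the old crediting
    can be simulated directly (a hypothetical userspace test, not kernel
    code, assuming the 256-bit pool implied by the POOL_BITS cap above):

        #include <stdio.h>

        #define POOL_BITS 256

        int main(void)
        {
                double entropy = 0.0;
                unsigned int calls = 0;

                /* Old scheme: each 1-bit credit only adds 3/4 of the
                 * proportional remaining headroom. */
                while (entropy < 128.0) {
                        entropy += 0.75 * (POOL_BITS - entropy) / POOL_BITS;
                        ++calls;
                }
                printf("128 of %u bits credited after %u calls\n",
                       POOL_BITS, calls);
                return 0;
        }

    With these simplified numbers it lands a bit under 256 calls, which
    is consistent with keeping the old "wait for roughly 256 one-bit
    credits" behavior by raising the extraction threshold to 256 bits.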
    
    Cc: Theodore Ts'o <tytso@mit.edu>
    Cc: Dominik Brodowski <linux@dominikbrodowski.net>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Eric Biggers <ebiggers@google.com>
    Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>