random: vary jitter iterations based on cycle counter speed

Currently, we do the jitter dance if two consecutive reads to the cycle counter return different values. If they do, then we consider the cycle counter to be fast enough that one trip through the scheduler will yield one "bit" of credited entropy. If those two reads return the same value, then we assume the cycle counter is too slow to show meaningful differences. This methodology is flawed for a variety of reasons, one of which Eric posted a patch to fix in [1]. The issue that patch solves is that on a system with a slow counter, you might be [un]lucky and read the counter _just_ before it changes, so that the second cycle counter you read differs from the first, even though there's usually quite a large period of time in between the two. For example: | real time | cycle counter | | --------- | ------------- | | 3 | 5 | | 4 | 5 | | 5 | 5 | | 6 | 5 | | 7 | 5 | <--- a | 8 | 6 | <--- b | 9 | 6 | <--- c If we read the counter at (a) and compare it to (b), we might be fooled into thinking that it's a fast counter, when in reality it is not. The solution in [1] is to also compare counter (b) to counter (c), on the theory that if the counter is _actually_ slow, and (a)!=(b), then certainly (b)==(c). This helps solve this particular issue, in one sense, but in another sense, it mostly functions to disallow jitter entropy on these systems, rather than simply taking more samples in that case. Instead, this patch takes a different approach. Right now we assume that a difference in one set of consecutive samples means one "bit" of credited entropy per scheduler trip. We can extend this so that a difference in two sets of consecutive samples means one "bit" of credited entropy per /two/ scheduler trips, and three for three, and four for four. In other words, we can increase the amount of jitter "work" we require for each "bit", depending on how slow the cycle counter is. So this patch takes whole bunch of samples, sees how many of them are different, and divides to find the amount of work required per "bit", and also requires that at least some minimum of them are different in order to attempt any jitter entropy. Note that this approach is still far from perfect. It's not a real statistical estimate on how much these samples vary; it's not a real-time analysis of the relevant input data. That remains a project for another time. However, it makes the same (partly flawed) assumptions as the code that's there now, so it's probably not worse than the status quo, and it handles the issue Eric mentioned in [1]. But, again, it's probably a far cry from whatever a really robust version of this would be. [1] https://lore.kernel.org/lkml/20220421233152.58522-1-ebiggers@kernel.org/ https://lore.kernel.org/lkml/20220421192939.250680-1-ebiggers@kernel.org/ Cc: Eric Biggers <ebiggers@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

random: vary jitter iterations based on cycle counter speed
Currently, we do the jitter dance if two consecutive reads to the cycle counter return different values. If they do, then we consider the cycle counter to be fast enough that one trip through the scheduler will yield one "bit" of credited entropy. If those two reads return the same value, then we assume the cycle counter is too slow to show meaningful differences. This methodology is flawed for a variety of reasons, one of which Eric posted a patch to fix in [1]. The issue that patch solves is that on a system with a slow counter, you might be [un]lucky and read the counter _just_ before it changes, so that the second cycle counter you read differs from the first, even though there's usually quite a large period of time in between the two. For example: | real time | cycle counter | | --------- | ------------- | | 3 | 5 | | 4 | 5 | | 5 | 5 | | 6 | 5 | | 7 | 5 | <--- a | 8 | 6 | <--- b | 9 | 6 | <--- c If we read the counter at (a) and compare it to (b), we might be fooled into thinking that it's a fast counter, when in reality it is not. The solution in [1] is to also compare counter (b) to counter (c), on the theory that if the counter is _actually_ slow, and (a)!=(b), then certainly (b)==(c). This helps solve this particular issue, in one sense, but in another sense, it mostly functions to disallow jitter entropy on these systems, rather than simply taking more samples in that case. Instead, this patch takes a different approach. Right now we assume that a difference in one set of consecutive samples means one "bit" of credited entropy per scheduler trip. We can extend this so that a difference in two sets of consecutive samples means one "bit" of credited entropy per /two/ scheduler trips, and three for three, and four for four. In other words, we can increase the amount of jitter "work" we require for each "bit", depending on how slow the cycle counter is. So this patch takes whole bunch of samples, sees how many of them are different, and divides to find the amount of work required per "bit", and also requires that at least some minimum of them are different in order to attempt any jitter entropy. Note that this approach is still far from perfect. It's not a real statistical estimate on how much these samples vary; it's not a real-time analysis of the relevant input data. That remains a project for another time. However, it makes the same (partly flawed) assumptions as the code that's there now, so it's probably not worse than the status quo, and it handles the issue Eric mentioned in [1]. But, again, it's probably a far cry from whatever a really robust version of this would be. [1] https://lore.kernel.org/lkml/20220421233152.58522-1-ebiggers@kernel.org/ https://lore.kernel.org/lkml/20220421192939.250680-1-ebiggers@kernel.org/ Cc: Eric Biggers <ebiggers@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
78c768e6 · Jason A. Donenfeld · 4b758eda · 78c768e6
Commit 78c768e6 authored Apr 22, 2022 by Jason A. Donenfeld
Hide whitespace changes
Inline Side-by-side

Showing with 26 additions and 10 deletions

drivers/char/random.c drivers/char/random.c +26 -10

No files found.
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1354,6 +1354,12 @@ void add_interrupt_randomness(int irq)
 }
 EXPORT_SYMBOL_GPL(add_interrupt_randomness);

+struct entropy_timer_state {
+	unsigned long entropy;
+	struct timer_list timer;
+	unsigned int samples, samples_per_bit;
+};
+
 /*
 * Each time the timer fires, we expect that we got an unpredictable
 * jump in the cycle counter. Even if the timer is running on another
@@ -1367,9 +1373,14 @@ EXPORT_SYMBOL_GPL(add_interrupt_randomness);
 *
 * So the re-arming always happens in the entropy loop itself.
 */
-static void entropy_timer(struct timer_list *t)
+static void entropy_timer(struct timer_list *timer)
 {
-	credit_entropy_bits(1);
+	struct entropy_timer_state *state = container_of(timer, struct entropy_timer_state, timer);
+
+	if (++state->samples == state->samples_per_bit) {
+		credit_entropy_bits(1);
+		state->samples = 0;
+	}
 }

 /*
@@ -1378,17 +1389,22 @@ static void entropy_timer(struct timer_list *t)
 */
 static void try_to_generate_entropy(void)
 {
-	struct {
-		unsigned long entropy;
-		struct timer_list timer;
-	} stack;
-
-	stack.entropy = random_get_entropy();
+	enum { NUM_TRIAL_SAMPLES = 8192, MAX_SAMPLES_PER_BIT = 32 };
+	struct entropy_timer_state stack;
+	unsigned int i, num_different = 0;
+	unsigned long last = random_get_entropy();

-	/* Slow counter - or none. Don't even bother */
-	if (stack.entropy == random_get_entropy())
+	for (i = 0; i < NUM_TRIAL_SAMPLES - 1; ++i) {
+		stack.entropy = random_get_entropy();
+		if (stack.entropy != last)
+			++num_different;
+		last = stack.entropy;
+	}
+	stack.samples_per_bit = DIV_ROUND_UP(NUM_TRIAL_SAMPLES, num_different + 1);
+	if (stack.samples_per_bit > MAX_SAMPLES_PER_BIT)
 		return;

+	stack.samples = 0;
 	timer_setup_on_stack(&stack.timer, entropy_timer, 0);
 	while (!crng_ready() && !signal_pending(current)) {
 		if (!timer_pending(&stack.timer))