Commit 8adc0486 authored by Linus Torvalds

Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:

 - Huawei reported that when they updated their kernel from 4.4 to
   something much newer, some userspace code they had broke, the culprit
   being the accidental removal of O_NONBLOCK from /dev/random way back
   in 5.6. It's been gone for over 2 years now and this is the first
   we've heard of it, but userspace breakage is userspace breakage, so
   O_NONBLOCK is now back.

 - Use randomness from hardware RNGs much more often during early boot,
   at the same interval that crng reseeds are done, from Dominik.

 - A semantic change in hardware RNG throttling, so that the hwrng
   framework can properly feed random.c with randomness from hardware
   RNGs that aren't specifically marked as creditable.

   A related patch coming to you via Herbert's hwrng tree depends on
   this one, not to compile, but just to function properly, so you may
   want to merge this PULL before that one.

 - A fix to clamp credited bits from the interrupts pool to the size of
   the pool sample. This is mainly just a theoretical fix, as it'd be
   pretty hard to exceed it in practice.

 - Oracle reported that InfiniBand TCP latency regressed by around
   10-15% after a change a few cycles ago made at the request of the RT
   folks, in which we hoisted a somewhat rare operation (1 in 1024
   times) out of the hard IRQ handler and into a workqueue, a pretty
   common and boring pattern.

   It turns out, though, that scheduling a worker from there has
   overhead of its own, whereas scheduling a timer on that same CPU for
   the next jiffy amortizes better and doesn't incur the same overhead.

   I also eliminated a cache miss by moving the work_struct (and
   subsequently, the timer_list) to below a critical cache line, so that
   the more critical members that are accessed on every hard IRQ aren't
   split between two cache lines.

 - The boot-time initialization of the RNG has been split into two
   approximate phases: what we can accomplish before timekeeping is
   possible and what we can accomplish after.

   This winds up being useful so that we can use RDRAND to seed the RNG
   before CONFIG_SLAB_FREELIST_RANDOM=y systems initialize slabs, in
   addition to other early uses of randomness. The effect is that
   systems with RDRAND (or a bootloader seed) will never see any
   warnings at all when setting CONFIG_WARN_ALL_UNSEEDED_RANDOM=y. And
   kfence benefits from getting a better seed of its own.

 - Small systems without much entropy sometimes wind up putting some
   truncated serial number read from flash into hostname, so contribute
   utsname changes to the RNG, without crediting.

 - Add smaller batches to serve requests for smaller integers, and make
   use of them when people ask for random numbers bounded by a given
   compile-time constant. This has positive effects all over the tree,
   most notably in networking and kfence.

 - The original jitter algorithm intended (I believe) to schedule the
   timer for the next jiffy, not the next-next jiffy, yet it used
   mod_timer(jiffies + 1), which will fire on the next-next jiffy,
   instead of what I believe was intended, mod_timer(jiffies), which
   will fire on the next jiffy. So fix that.

 - Fix a comment typo, from William.

* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: clear new batches when bringing new CPUs online
  random: fix typos in get_random_bytes() comment
  random: schedule jitter credit for next jiffy, not in two jiffies
  prandom: make use of smaller types in prandom_u32_max
  random: add 8-bit and 16-bit batches
  utsname: contribute changes to RNG
  random: use init_utsname() instead of utsname()
  kfence: use better stack hash seed
  random: split initialization into early step and later step
  random: use expired timer rather than wq for mixing fast pool
  random: avoid reading two cache lines on irq randomness
  random: clamp credited irq bits to maximum mixed
  random: throttle hwrng writes if no entropy is credited
  random: use hwgenerator randomness more frequently at early boot
  random: restore O_NONBLOCK support
parents 52abb27a a890d1c6
@@ -712,8 +712,8 @@ static const struct memdev {
 #endif
 	[5] = { "zero", 0666, &zero_fops, FMODE_NOWAIT },
 	[7] = { "full", 0666, &full_fops, 0 },
-	[8] = { "random", 0666, &random_fops, 0 },
-	[9] = { "urandom", 0666, &urandom_fops, 0 },
+	[8] = { "random", 0666, &random_fops, FMODE_NOWAIT },
+	[9] = { "urandom", 0666, &urandom_fops, FMODE_NOWAIT },
 #ifdef CONFIG_PRINTK
 	[11] = { "kmsg", 0644, &kmsg_fops, 0 },
 #endif
......
@@ -96,8 +96,8 @@ MODULE_PARM_DESC(ratelimit_disable, "Disable random ratelimit suppression");
 /*
  * Returns whether or not the input pool has been seeded and thus guaranteed
  * to supply cryptographically secure random numbers. This applies to: the
- * /dev/urandom device, the get_random_bytes function, and the get_random_{u32,
- * ,u64,int,long} family of functions.
+ * /dev/urandom device, the get_random_bytes function, and the get_random_{u8,
+ * u16,u32,u64,int,long} family of functions.
  *
  * Returns: true if the input pool has been seeded.
  *          false if the input pool has not been seeded.
@@ -119,9 +119,9 @@ static void try_to_generate_entropy(void);
 /*
  * Wait for the input pool to be seeded and thus guaranteed to supply
  * cryptographically secure random numbers. This applies to: the /dev/urandom
- * device, the get_random_bytes function, and the get_random_{u32,u64,int,long}
- * family of functions. Using any of these functions without first calling
- * this function forfeits the guarantee of security.
+ * device, the get_random_bytes function, and the get_random_{u8,u16,u32,u64,
+ * int,long} family of functions. Using any of these functions without first
+ * calling this function forfeits the guarantee of security.
  *
  * Returns: 0 if the input pool has been seeded.
  *          -ERESTARTSYS if the function was interrupted by a signal.
@@ -157,6 +157,8 @@ EXPORT_SYMBOL(wait_for_random_bytes);
  * There are a few exported interfaces for use by other drivers:
  *
  *	void get_random_bytes(void *buf, size_t len)
+ *	u8 get_random_u8()
+ *	u16 get_random_u16()
  *	u32 get_random_u32()
  *	u64 get_random_u64()
  *	unsigned int get_random_int()
@@ -164,10 +166,10 @@ EXPORT_SYMBOL(wait_for_random_bytes);
  *
  * These interfaces will return the requested number of random bytes
  * into the given buffer or as a return value. This is equivalent to
- * a read from /dev/urandom. The u32, u64, int, and long family of
- * functions may be higher performance for one-off random integers,
- * because they do a bit of buffering and do not invoke reseeding
- * until the buffer is emptied.
+ * a read from /dev/urandom. The u8, u16, u32, u64, int, and long
+ * family of functions may be higher performance for one-off random
+ * integers, because they do a bit of buffering and do not invoke
+ * reseeding until the buffer is emptied.
  *
  *********************************************************************/
@@ -260,25 +262,23 @@ static void crng_fast_key_erasure(u8 key[CHACHA_KEY_SIZE],
 }
 /*
- * Return whether the crng seed is considered to be sufficiently old
- * that a reseeding is needed. This happens if the last reseeding
- * was CRNG_RESEED_INTERVAL ago, or during early boot, at an interval
+ * Return the interval until the next reseeding, which is normally
+ * CRNG_RESEED_INTERVAL, but during early boot, it is at an interval
  * proportional to the uptime.
  */
-static bool crng_has_old_seed(void)
+static unsigned int crng_reseed_interval(void)
 {
 	static bool early_boot = true;
-	unsigned long interval = CRNG_RESEED_INTERVAL;
 	if (unlikely(READ_ONCE(early_boot))) {
 		time64_t uptime = ktime_get_seconds();
 		if (uptime >= CRNG_RESEED_INTERVAL / HZ * 2)
 			WRITE_ONCE(early_boot, false);
 		else
-			interval = max_t(unsigned int, CRNG_RESEED_START_INTERVAL,
+			return max_t(unsigned int, CRNG_RESEED_START_INTERVAL,
 					 (unsigned int)uptime / 2 * HZ);
 	}
-	return time_is_before_jiffies(READ_ONCE(base_crng.birth) + interval);
+	return CRNG_RESEED_INTERVAL;
 }
 /*
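
To make the early-boot schedule in the hunk above concrete, here is a minimal userspace sketch of the arithmetic in crng_reseed_interval(). The values HZ = 1000, CRNG_RESEED_START_INTERVAL = HZ and CRNG_RESEED_INTERVAL = 60 * HZ are assumptions for illustration (the diff does not show these definitions), the name model_reseed_interval() is hypothetical, and the uptime is passed in as a parameter rather than read from ktime_get_seconds().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; the diff does not show these definitions. */
#define HZ 1000u
#define CRNG_RESEED_START_INTERVAL HZ
#define CRNG_RESEED_INTERVAL (60u * HZ)

/*
 * Userspace model of crng_reseed_interval(): returns the reseed interval in
 * jiffies for a given uptime in seconds. The early_boot latch mirrors the
 * static flag in the diff: once cleared, the full interval is always used.
 */
static unsigned int model_reseed_interval(int64_t uptime_seconds)
{
	static bool early_boot = true;

	if (early_boot) {
		if (uptime_seconds >= CRNG_RESEED_INTERVAL / HZ * 2) {
			early_boot = false;
		} else {
			unsigned int scaled = (unsigned int)uptime_seconds / 2 * HZ;

			/* Equivalent of max_t(unsigned int, START_INTERVAL, scaled). */
			return scaled > CRNG_RESEED_START_INTERVAL ?
			       scaled : CRNG_RESEED_START_INTERVAL;
		}
	}
	return CRNG_RESEED_INTERVAL;
}
```

Under these assumed constants, an uptime of 10 seconds yields an interval of 5 seconds' worth of jiffies, and any uptime of 120 seconds or more yields the full 60-second interval and permanently clears the early-boot flag.
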
@@ -320,7 +320,7 @@ static void crng_make_state(u32 chacha_state[CHACHA_STATE_WORDS],
 	 * If the base_crng is old enough, we reseed, which in turn bumps the
 	 * generation counter that we check below.
 	 */
-	if (unlikely(crng_has_old_seed()))
+	if (unlikely(time_is_before_jiffies(READ_ONCE(base_crng.birth) + crng_reseed_interval())))
 		crng_reseed();
 	local_lock_irqsave(&crngs.lock, flags);
@@ -384,11 +384,11 @@ static void _get_random_bytes(void *buf, size_t len)
 }
 /*
- * This function is the exported kernel interface. It returns some
- * number of good random numbers, suitable for key generation, seeding
- * TCP sequence numbers, etc. In order to ensure that the randomness
- * by this function is okay, the function wait_for_random_bytes()
- * should be called and return 0 at least once at any point prior.
+ * This function is the exported kernel interface. It returns some number of
+ * good random numbers, suitable for key generation, seeding TCP sequence
+ * numbers, etc. In order to ensure that the randomness returned by this
+ * function is okay, the function wait_for_random_bytes() should be called and
+ * return 0 at least once at any point prior.
  */
 void get_random_bytes(void *buf, size_t len)
 {
@@ -506,8 +506,10 @@ type get_random_ ##type(void) \
 } \
 EXPORT_SYMBOL(get_random_ ##type);
-DEFINE_BATCHED_ENTROPY(u64)
+DEFINE_BATCHED_ENTROPY(u8)
+DEFINE_BATCHED_ENTROPY(u16)
 DEFINE_BATCHED_ENTROPY(u32)
+DEFINE_BATCHED_ENTROPY(u64)
 #ifdef CONFIG_SMP
 /*
@@ -522,6 +524,8 @@ int __cold random_prepare_cpu(unsigned int cpu)
 	 * randomness.
 	 */
 	per_cpu_ptr(&crngs, cpu)->generation = ULONG_MAX;
+	per_cpu_ptr(&batched_entropy_u8, cpu)->position = UINT_MAX;
+	per_cpu_ptr(&batched_entropy_u16, cpu)->position = UINT_MAX;
 	per_cpu_ptr(&batched_entropy_u32, cpu)->position = UINT_MAX;
 	per_cpu_ptr(&batched_entropy_u64, cpu)->position = UINT_MAX;
 	return 0;
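
The batching idea behind the new u8 and u16 interfaces, and the reason for setting position to UINT_MAX when a CPU comes online, can be sketched in userspace as follows. This is a simplified model, not the kernel's DEFINE_BATCHED_ENTROPY macro: the buffer size BATCH_BYTES, the struct batch_u8 layout, and the stub_refill() counter pattern are all assumptions made for illustration, and the per-CPU locking and generation tracking of the real code are left out.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical batch size; the real buffer size is not shown in the diff. */
#define BATCH_BYTES 96

struct batch_u8 {
	uint8_t entropy[BATCH_BYTES];
	unsigned int position;	/* any value >= BATCH_BYTES forces a refill */
	unsigned int refills;	/* bookkeeping for this example only */
};

/* Stand-in for the bulk generator: fills the buffer with a counter pattern. */
static void stub_refill(struct batch_u8 *b)
{
	for (size_t i = 0; i < BATCH_BYTES; i++)
		b->entropy[i] = (uint8_t)(i + b->refills);
	b->refills++;
}

/* Hand out one byte, refilling the whole batch only when it is exhausted. */
static uint8_t batch_get_u8(struct batch_u8 *b)
{
	uint8_t ret;

	if (b->position >= BATCH_BYTES) {
		stub_refill(b);
		b->position = 0;
	}
	ret = b->entropy[b->position];
	b->entropy[b->position] = 0;	/* wipe each value once it is handed out */
	b->position++;
	return ret;
}

/* Analogue of the position = UINT_MAX reset shown in random_prepare_cpu(). */
static void batch_invalidate(struct batch_u8 *b)
{
	b->position = UINT_MAX;
}
```

In this model, one refill serves BATCH_BYTES single-byte requests, whereas serving the same requests from a u32 batch would consume four bytes each. Calling batch_invalidate() discards whatever is left in the buffer, so the next request starts from freshly generated output.
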
@@ -774,18 +778,13 @@ static int random_pm_notification(struct notifier_block *nb, unsigned long actio
 static struct notifier_block pm_notifier = { .notifier_call = random_pm_notification };
 /*
- * The first collection of entropy occurs at system boot while interrupts
- * are still turned off. Here we push in latent entropy, RDSEED, a timestamp,
- * utsname(), and the command line. Depending on the above configuration knob,
- * RDSEED may be considered sufficient for initialization. Note that much
- * earlier setup may already have pushed entropy into the input pool by the
- * time we get here.
+ * This is called extremely early, before time keeping functionality is
+ * available, but arch randomness is. Interrupts are not yet enabled.
  */
-int __init random_init(const char *command_line)
+void __init random_init_early(const char *command_line)
 {
-	ktime_t now = ktime_get_real();
-	size_t i, longs, arch_bits;
 	unsigned long entropy[BLAKE2S_BLOCK_SIZE / sizeof(long)];
+	size_t i, longs, arch_bits;
 #if defined(LATENT_ENTROPY_PLUGIN)
 	static const u8 compiletime_seed[BLAKE2S_BLOCK_SIZE] __initconst __latent_entropy;
@@ -805,34 +804,49 @@ int __init random_init(const char *command_line)
 			i += longs;
 			continue;
 		}
-		entropy[0] = random_get_entropy();
-		_mix_pool_bytes(entropy, sizeof(*entropy));
 		arch_bits -= sizeof(*entropy) * 8;
 		++i;
 	}
-	_mix_pool_bytes(&now, sizeof(now));
-	_mix_pool_bytes(utsname(), sizeof(*(utsname())));
+	_mix_pool_bytes(init_utsname(), sizeof(*(init_utsname())));
 	_mix_pool_bytes(command_line, strlen(command_line));
+	/* Reseed if already seeded by earlier phases. */
+	if (crng_ready())
+		crng_reseed();
+	else if (trust_cpu)
+		_credit_init_bits(arch_bits);
+}
+/*
+ * This is called a little bit after the prior function, and now there is
+ * access to timestamps counters. Interrupts are not yet enabled.
+ */
+void __init random_init(void)
+{
+	unsigned long entropy = random_get_entropy();
+	ktime_t now = ktime_get_real();
+	_mix_pool_bytes(&now, sizeof(now));
+	_mix_pool_bytes(&entropy, sizeof(entropy));
 	add_latent_entropy();
 	/*
-	 * If we were initialized by the bootloader before jump labels are
-	 * initialized, then we should enable the static branch here, where
+	 * If we were initialized by the cpu or bootloader before jump labels
+	 * are initialized, then we should enable the static branch here, where
 	 * it's guaranteed that jump labels have been initialized.
 	 */
 	if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
 		crng_set_ready(NULL);
 	/* Reseed if already seeded by earlier phases. */
 	if (crng_ready())
 		crng_reseed();
-	else if (trust_cpu)
-		_credit_init_bits(arch_bits);
 	WARN_ON(register_pm_notifier(&pm_notifier));
-	WARN(!random_get_entropy(), "Missing cycle counter and fallback timer; RNG "
+	WARN(!entropy, "Missing cycle counter and fallback timer; RNG "
 		       "entropy collection will consequently suffer.");
-	return 0;
 }
 /*
@@ -866,11 +880,11 @@ void add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy)
 	credit_init_bits(entropy);
 	/*
-	 * Throttle writing to once every CRNG_RESEED_INTERVAL, unless
-	 * we're not yet initialized.
+	 * Throttle writing to once every reseed interval, unless we're not yet
+	 * initialized or no entropy is credited.
 	 */
-	if (!kthread_should_stop() && crng_ready())
-		schedule_timeout_interruptible(CRNG_RESEED_INTERVAL);
+	if (!kthread_should_stop() && (crng_ready() || !entropy))
+		schedule_timeout_interruptible(crng_reseed_interval());
 }
 EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
@@ -920,20 +934,23 @@ EXPORT_SYMBOL_GPL(unregister_random_vmfork_notifier);
 #endif
 struct fast_pool {
-	struct work_struct mix;
 	unsigned long pool[4];
 	unsigned long last;
 	unsigned int count;
+	struct timer_list mix;
 };
+static void mix_interrupt_randomness(struct timer_list *work);
 static DEFINE_PER_CPU(struct fast_pool, irq_randomness) = {
 #ifdef CONFIG_64BIT
 #define FASTMIX_PERM SIPHASH_PERMUTATION
-	.pool = { SIPHASH_CONST_0, SIPHASH_CONST_1, SIPHASH_CONST_2, SIPHASH_CONST_3 }
+	.pool = { SIPHASH_CONST_0, SIPHASH_CONST_1, SIPHASH_CONST_2, SIPHASH_CONST_3 },
 #else
 #define FASTMIX_PERM HSIPHASH_PERMUTATION
-	.pool = { HSIPHASH_CONST_0, HSIPHASH_CONST_1, HSIPHASH_CONST_2, HSIPHASH_CONST_3 }
+	.pool = { HSIPHASH_CONST_0, HSIPHASH_CONST_1, HSIPHASH_CONST_2, HSIPHASH_CONST_3 },
 #endif
+	.mix = __TIMER_INITIALIZER(mix_interrupt_randomness, 0)
 };
 /*
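
The cache-line effect of moving the mix member to the end of struct fast_pool can be checked with offsetof(). The sketch below is a userspace model under several assumptions: a 64-byte cache line, a structure that starts on a cache-line boundary, 64-bit longs (modelled with fixed-width types), and a hypothetical 32-byte struct deferred_stub standing in for work_struct or timer_list, whose real sizes are not given in the diff.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u	/* assumed cache line size */

/* Hypothetical 32-byte stand-in for work_struct / timer_list. */
struct deferred_stub {
	uint64_t opaque[4];
};

/* Layout before the change: the deferred-work member comes first. */
struct fast_pool_before {
	struct deferred_stub mix;
	uint64_t pool[4];
	uint64_t last;
	uint32_t count;
};

/* Layout after the change: hot members first, deferred-work member last. */
struct fast_pool_after {
	uint64_t pool[4];
	uint64_t last;
	uint32_t count;
	struct deferred_stub mix;
};

/* Number of cache lines covered by the byte range [first, last]. */
static unsigned int lines_spanned(size_t first, size_t last)
{
	return (unsigned int)(last / CACHE_LINE - first / CACHE_LINE + 1);
}

/* Cache lines touched by pool, last and count in the old layout. */
static unsigned int hot_lines_before(void)
{
	return lines_spanned(offsetof(struct fast_pool_before, pool),
			     offsetof(struct fast_pool_before, count) +
			     sizeof(uint32_t) - 1);
}

/* Cache lines touched by pool, last and count in the new layout. */
static unsigned int hot_lines_after(void)
{
	return lines_spanned(offsetof(struct fast_pool_after, pool),
			     offsetof(struct fast_pool_after, count) +
			     sizeof(uint32_t) - 1);
}
```

Under these assumptions the members read on every interrupt occupy bytes 32 to 75 in the old layout, which crosses a line boundary, and bytes 0 to 43 in the new one, which does not.
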
@@ -975,7 +992,7 @@ int __cold random_online_cpu(unsigned int cpu)
 }
 #endif
-static void mix_interrupt_randomness(struct work_struct *work)
+static void mix_interrupt_randomness(struct timer_list *work)
 {
 	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
 	/*
@@ -1006,7 +1023,7 @@ static void mix_interrupt_randomness(struct work_struct *work)
 	local_irq_enable();
 	mix_pool_bytes(pool, sizeof(pool));
-	credit_init_bits(max(1u, (count & U16_MAX) / 64));
+	credit_init_bits(clamp_t(unsigned int, (count & U16_MAX) / 64, 1, sizeof(pool) * 8));
 	memzero_explicit(pool, sizeof(pool));
 }
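
The difference between the old max() rule and the new clamp_t() rule in the hunk above can be seen with a small userspace sketch. The helper names clamp_uint(), credited_bits_old() and credited_bits_new() are hypothetical; the pool is modelled as four unsigned longs to match the pool[4] member of struct fast_pool.

```c
#include <stdint.h>

/* Size in bits of a pool sample made of four unsigned longs. */
#define POOL_BITS (4u * sizeof(unsigned long) * 8u)

static unsigned int clamp_uint(unsigned int v, unsigned int lo, unsigned int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Old rule: at least 1 bit, with no upper bound. */
static unsigned int credited_bits_old(unsigned int count)
{
	unsigned int bits = (count & UINT16_MAX) / 64;

	return bits > 1u ? bits : 1u;
}

/* New rule: at least 1 bit, at most the number of bits in the pool sample. */
static unsigned int credited_bits_new(unsigned int count)
{
	return clamp_uint((count & UINT16_MAX) / 64, 1u, (unsigned int)POOL_BITS);
}
```

With a 16-bit count the old rule could credit up to 1023 bits for a single sample, which is more than a 256-bit (or, with 32-bit longs, 128-bit) sample can hold; the new rule caps the credit at the sample size.
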
@@ -1029,10 +1046,11 @@ void add_interrupt_randomness(int irq)
 	if (new_count < 1024 && !time_is_before_jiffies(fast_pool->last + HZ))
 		return;
-	if (unlikely(!fast_pool->mix.func))
-		INIT_WORK(&fast_pool->mix, mix_interrupt_randomness);
 	fast_pool->count |= MIX_INFLIGHT;
-	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);
+	if (!timer_pending(&fast_pool->mix)) {
+		fast_pool->mix.expires = jiffies;
+		add_timer_on(&fast_pool->mix, raw_smp_processor_id());
+	}
 }
 EXPORT_SYMBOL_GPL(add_interrupt_randomness);
@@ -1191,7 +1209,7 @@ static void __cold entropy_timer(struct timer_list *timer)
  */
 static void __cold try_to_generate_entropy(void)
 {
-	enum { NUM_TRIAL_SAMPLES = 8192, MAX_SAMPLES_PER_BIT = HZ / 30 };
+	enum { NUM_TRIAL_SAMPLES = 8192, MAX_SAMPLES_PER_BIT = HZ / 15 };
 	struct entropy_timer_state stack;
 	unsigned int i, num_different = 0;
 	unsigned long last = random_get_entropy();
@@ -1210,7 +1228,7 @@ static void __cold try_to_generate_entropy(void)
 	timer_setup_on_stack(&stack.timer, entropy_timer, 0);
 	while (!crng_ready() && !signal_pending(current)) {
 		if (!timer_pending(&stack.timer))
-			mod_timer(&stack.timer, jiffies + 1);
+			mod_timer(&stack.timer, jiffies);
 		mix_pool_bytes(&stack.entropy, sizeof(stack.entropy));
 		schedule();
 		stack.entropy = random_get_entropy();
@@ -1347,6 +1365,11 @@ static ssize_t random_read_iter(struct kiocb *kiocb, struct iov_iter *iter)
 {
 	int ret;
+	if (!crng_ready() &&
+	    ((kiocb->ki_flags & (IOCB_NOWAIT | IOCB_NOIO)) ||
+	     (kiocb->ki_filp->f_flags & O_NONBLOCK)))
+		return -EAGAIN;
 	ret = wait_for_random_bytes();
 	if (ret != 0)
 		return ret;
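
From userspace, the restored behaviour in the hunk above would look roughly like the sketch below: a read on /dev/random opened with O_NONBLOCK either returns data or fails with EAGAIN while the RNG is not yet initialized, instead of blocking. The helper name read_random_nonblock() is hypothetical, and the example assumes a Linux system on which /dev/random is present and readable.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to read len bytes from /dev/random without blocking.
 * Returns the number of bytes read, or a negative errno value;
 * -EAGAIN means the RNG is not yet initialized.
 */
static ssize_t read_random_nonblock(void *buf, size_t len)
{
	ssize_t n;
	int saved_errno;
	int fd = open("/dev/random", O_RDONLY | O_NONBLOCK);

	if (fd < 0)
		return -errno;
	n = read(fd, buf, len);
	saved_errno = errno;	/* preserve errno across close() */
	close(fd);
	return n < 0 ? -saved_errno : n;
}
```

A caller that receives -EAGAIN can retry later or wait for the descriptor with poll(2), rather than stalling in read().
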
......
@@ -12,11 +12,13 @@
 #include <linux/percpu.h>
 #include <linux/random.h>
+/* Deprecated: use get_random_u32 instead. */
 static inline u32 prandom_u32(void)
 {
 	return get_random_u32();
 }
+/* Deprecated: use get_random_bytes instead. */
 static inline void prandom_bytes(void *buf, size_t nbytes)
 {
 	return get_random_bytes(buf, nbytes);
@@ -37,17 +39,20 @@ void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state);
  * prandom_u32_max - returns a pseudo-random number in interval [0, ep_ro)
  * @ep_ro: right open interval endpoint
  *
- * Returns a pseudo-random number that is in interval [0, ep_ro). Note
- * that the result depends on PRNG being well distributed in [0, ~0U]
- * u32 space. Here we use maximally equidistributed combined Tausworthe
- * generator, that is, prandom_u32(). This is useful when requesting a
- * random index of an array containing ep_ro elements, for example.
+ * Returns a pseudo-random number that is in interval [0, ep_ro). This is
+ * useful when requesting a random index of an array containing ep_ro elements,
+ * for example. The result is somewhat biased when ep_ro is not a power of 2,
+ * so do not use this for cryptographic purposes.
  *
  * Returns: pseudo-random number in interval [0, ep_ro)
  */
 static inline u32 prandom_u32_max(u32 ep_ro)
 {
-	return (u32)(((u64) prandom_u32() * ep_ro) >> 32);
+	if (__builtin_constant_p(ep_ro <= 1U << 8) && ep_ro <= 1U << 8)
+		return (get_random_u8() * ep_ro) >> 8;
+	if (__builtin_constant_p(ep_ro <= 1U << 16) && ep_ro <= 1U << 16)
+		return (get_random_u16() * ep_ro) >> 16;
+	return ((u64)get_random_u32() * ep_ro) >> 32;
 }
 /*
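
The multiply-and-shift mapping used by prandom_u32_max() above, and the bias its new comment warns about, can be illustrated in userspace. In this sketch the random input is passed in as a parameter so the result is deterministic; in the kernel it would come from get_random_u8(), get_random_u16() or get_random_u32(). The helper names scale_u8(), scale_u16(), scale_u32() and count_u8_inputs() are hypothetical.

```c
#include <stdint.h>

/* Map a uniform 8-bit value r into [0, ceil), for ceil <= 1 << 8. */
static uint32_t scale_u8(uint8_t r, uint32_t ceil)
{
	return ((uint32_t)r * ceil) >> 8;
}

/* Map a uniform 16-bit value r into [0, ceil), for ceil <= 1 << 16. */
static uint32_t scale_u16(uint16_t r, uint32_t ceil)
{
	return ((uint32_t)r * ceil) >> 16;
}

/* Map a uniform 32-bit value r into [0, ceil), for any 32-bit ceil. */
static uint32_t scale_u32(uint32_t r, uint32_t ceil)
{
	return (uint32_t)(((uint64_t)r * ceil) >> 32);
}

/* Count how many of the 256 possible 8-bit inputs map to a given outcome. */
static unsigned int count_u8_inputs(uint32_t ceil, uint32_t outcome)
{
	unsigned int n = 0;

	for (unsigned int r = 0; r < 256; r++)
		if (scale_u8((uint8_t)r, ceil) == outcome)
			n++;
	return n;
}
```

For a bound of 3, the 256 possible inputs split 86/85/85 across the three outcomes, so 0 is very slightly more likely than 1 or 2. For a power-of-two bound such as 4, every outcome receives exactly 64 inputs and the mapping is unbiased.
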
......
@@ -38,6 +38,8 @@ static inline int unregister_random_vmfork_notifier(struct notifier_block *nb) {
 #endif
 void get_random_bytes(void *buf, size_t len);
+u8 get_random_u8(void);
+u16 get_random_u16(void);
 u32 get_random_u32(void);
 u64 get_random_u64(void);
 static inline unsigned int get_random_int(void)
@@ -72,7 +74,8 @@ static inline unsigned long get_random_canary(void)
 	return get_random_long() & CANARY_MASK;
 }
-int __init random_init(const char *command_line);
+void __init random_init_early(const char *command_line);
+void __init random_init(void);
 bool rng_is_initialized(void);
 int wait_for_random_bytes(void);
@@ -93,6 +96,8 @@ static inline int get_random_bytes_wait(void *buf, size_t nbytes)
 	*out = get_random_ ## name(); \
 	return 0; \
 }
+declare_get_random_var_wait(u8, u8)
+declare_get_random_var_wait(u16, u16)
 declare_get_random_var_wait(u32, u32)
 declare_get_random_var_wait(u64, u32)
 declare_get_random_var_wait(int, unsigned int)
......
@@ -976,6 +976,9 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
 		parse_args("Setting extra init args", extra_init_args,
 			   NULL, 0, -1, -1, NULL, set_init_arg);
+	/* Architectural and non-timekeeping rng init, before allocator init */
+	random_init_early(command_line);
 	/*
 	 * These use large bootmem allocations and must precede
 	 * kmem_cache_init()
@@ -1035,17 +1038,13 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
 	hrtimers_init();
 	softirq_init();
 	timekeeping_init();
-	kfence_init();
 	time_init();
-	/*
-	 * For best initial stack canary entropy, prepare it after:
-	 * - setup_arch() for any UEFI RNG entropy and boot cmdline access
-	 * - timekeeping_init() for ktime entropy used in random_init()
-	 * - time_init() for making random_get_entropy() work on some platforms
-	 * - random_init() to initialize the RNG from from early entropy sources
-	 */
-	random_init(command_line);
+	/* This must be after timekeeping is initialized */
+	random_init();
+	/* These make use of the fully initialized rng */
+	kfence_init();
 	boot_init_stack_canary();
 	perf_event_init();
......
@@ -25,6 +25,7 @@
 #include <linux/times.h>
 #include <linux/posix-timers.h>
 #include <linux/security.h>
+#include <linux/random.h>
 #include <linux/suspend.h>
 #include <linux/tty.h>
 #include <linux/signal.h>
@@ -1366,6 +1367,7 @@ SYSCALL_DEFINE2(sethostname, char __user *, name, int, len)
 	if (!copy_from_user(tmp, name, len)) {
 		struct new_utsname *u;
+		add_device_randomness(tmp, len);
 		down_write(&uts_sem);
 		u = utsname();
 		memcpy(u->nodename, tmp, len);
@@ -1419,6 +1421,7 @@ SYSCALL_DEFINE2(setdomainname, char __user *, name, int, len)
 	if (!copy_from_user(tmp, name, len)) {
 		struct new_utsname *u;
+		add_device_randomness(tmp, len);
 		down_write(&uts_sem);
 		u = utsname();
 		memcpy(u->domainname, tmp, len);
......
@@ -8,6 +8,7 @@
 #include <linux/export.h>
 #include <linux/uts.h>
 #include <linux/utsname.h>
+#include <linux/random.h>
 #include <linux/sysctl.h>
 #include <linux/wait.h>
 #include <linux/rwsem.h>
@@ -57,6 +58,7 @@ static int proc_do_uts_string(struct ctl_table *table, int write,
 		 * theoretically be incorrect if there are two parallel writes
 		 * at non-zero offsets to the same sysctl.
 		 */
+		add_device_randomness(tmp_data, sizeof(tmp_data));
 		down_write(&uts_sem);
 		memcpy(get_uts(table), tmp_data, sizeof(tmp_data));
 		up_write(&uts_sem);
......
@@ -864,7 +864,7 @@ static void kfence_init_enable(void)
 void __init kfence_init(void)
 {
-	stack_hash_seed = (u32)random_get_entropy();
+	stack_hash_seed = get_random_u32();
 	/* Setting kfence_sample_interval to 0 on boot disables KFENCE. */
 	if (!kfence_sample_interval)
......