Commit c903327d authored by Linus Torvalds

Merge tag 'printk-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:
 "This is the "last" part of the support for the new nbcon consoles.
  Where "nbcon" stays for "No Big console lock CONsoles" aka not under
  the console_lock.

  New callbacks are added to struct console (see the sketch after this
  list):

   - write_thread() for flushing nbcon consoles in task context.

   - write_atomic() for flushing nbcon consoles in atomic context,
     including NMI.

   - con->device_lock() and device_unlock() for taking the
     driver-specific lock, for example, port->lock.
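
  For illustration, a minimal sketch (not part of this merge; all
  foo_* names are invented) of how a hypothetical serial driver might
  populate the new slots. The write callback bodies are sketched
  further below:

      static struct uart_port foo_port;       /* hypothetical port */

      static void foo_write_atomic(struct console *con,
                                   struct nbcon_write_context *wctxt);
      static void foo_write_thread(struct console *con,
                                   struct nbcon_write_context *wctxt);

      /* Take the driver lock; the irqsave spinlock also disables migration. */
      static void foo_device_lock(struct console *con, unsigned long *flags)
      {
              __uart_port_lock_irqsave(&foo_port, flags);
      }

      static void foo_device_unlock(struct console *con, unsigned long flags)
      {
              __uart_port_unlock_irqrestore(&foo_port, flags);
      }

      static struct console foo_console = {
              .name           = "foo",
              .flags          = CON_PRINTBUFFER | CON_NBCON,
              .write_atomic   = foo_write_atomic,
              .write_thread   = foo_write_thread,
              .device_lock    = foo_device_lock,
              .device_unlock  = foo_device_unlock,
      };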

  New printk-specific kthreads are created:

   - per-console kthreads which are responsible for flushing normal
     priority messages on nbcon consoles.

   - a kthread which is responsible for flushing normal priority
     messages on all consoles when CONFIG_PREEMPT_RT is enabled.

  The new callbacks are called under a special per-console lock which
  was already added back in v6.7. It distinguishes three priorities:
  normal, emergency, and panic. A context with a higher priority can
  take over ownership when it is safe, even in the middle of handling
  a record. The panic context can do so even when it is not safe, but
  only for the final desperate flush before entering the infinite
  loop.
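
  Continuing the hypothetical foo driver sketch from above, the
  ownership rules translate into the following pattern inside a write
  callback (foo_hw_*() are invented register helpers):

      static void foo_write_atomic(struct console *con,
                                   struct nbcon_write_context *wctxt)
      {
              unsigned int ier;

              /* A takeover is not safe while the port is reprogrammed. */
              if (!nbcon_enter_unsafe(wctxt))
                      return;                 /* ownership lost; back out */

              ier = foo_hw_read_ier();        /* save interrupt state */
              foo_hw_write_ier(0);            /* mask TX interrupts */

              foo_hw_put_chars(wctxt->outbuf, wctxt->len);

              foo_hw_write_ier(ier);          /* restore interrupt state */

              /* Fails if ownership was lost; the caller handles it. */
              nbcon_exit_unsafe(wctxt);
      }

  A foo_write_thread() would follow the same rules, but is called in
  task context under device_lock().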

  The new lock helps to flush the messages directly in emergency and
  panic contexts. But it is not enough in all situations:

   - console_lock() is still needed for synchronization against boot
     consoles.

   - con->device_lock() is needed for synchronization against other
     operations on the same HW, e.g. serial port speed setting or
     non-printk related read/write.

  The dependency on con->device_lock() is mutual. Any code taking the
  driver-specific lock has to acquire the related nbcon console context
  as well. For example, see the new uart_port_lock() API. It provides
  the necessary synchronization against emergency and panic contexts
  where the messages are flushed only under the new per-console lock.
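
  For example, a non-printk driver path might look like this (a
  sketch; foo_hw_write_divisor() is an invented helper). The wrapper
  acquires the nbcon console context together with port->lock:

      static void foo_set_baud(struct uart_port *port, unsigned int div)
      {
              unsigned long flags;

              /* Also acquires the nbcon console context, if any. */
              uart_port_lock_irqsave(port, &flags);
              foo_hw_write_divisor(port, div);
              uart_port_unlock_irqrestore(port, flags);
      }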

  Perhaps surprisingly, quite a tricky part is deciding how to flush
  the consoles in various situations. The decision has to take into
  account:

   - message priority:    normal, emergency, panic

   - scheduling context:  task, atomic, deferred_legacy

   - registered consoles: boot, legacy, nbcon

   - kthreads running:    early boot, suspend, shutdown, panic

   - caller:              printk(), pr_flush(), printk_flush_in_panic(),
                          console_unlock(), console_start(), ...

  The primary decision is made in printk_get_console_flush_type(). It
  provides a hint about what the caller should do, as sketched below:

   - flush nbcon consoles directly or via the kthread

   - call the legacy loop (console_unlock()) directly or via irq_work
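
  A sketch of the resulting caller pattern, assuming the helpers that
  appear later in this diff (the real call sites, e.g. in
  vprintk_emit(), differ in details):

      static void foo_emit_pending(void)
      {
              struct console_flush_type ft;

              printk_get_console_flush_type(&ft);

              if (ft.nbcon_atomic)            /* flush nbcon consoles here */
                      nbcon_atomic_flush_pending();
              if (ft.nbcon_offload)           /* or wake the kthreads */
                      nbcon_kthreads_wake();

              if (ft.legacy_direct) {         /* run the legacy loop */
                      if (console_trylock())
                              console_unlock();
              }
              if (ft.legacy_offload)          /* or defer it via irq_work */
                      defer_console_output();
      }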

  The existing behavior is preserved for the legacy consoles. The only
  exception is that they are no longer flushed directly from printk()
  in panic() before CPUs are stopped. This behavior change happens
  only when at least one nbcon console is registered. The motivation
  is to increase the chance of producing the crash dump. The legacy
  consoles might create a deadlock, in contrast with nbcon consoles.
  The nbcon consoles should allow the messages to be seen even when
  the crash dump fails.

  There are three possible ways in which nbcon consoles are flushed:

   - The per-nbcon-console kthread is responsible for flushing messages
     added with the normal priority. This is the default mode.

   - The legacy loop, aka console_unlock(), is used when there is still
     a boot console registered. There is no easy way to match an early
     console driver with an nbcon console driver, and console_lock()
     provides the only reliable serialization at the moment.

     The legacy loop uses either con->write_atomic() or
     con->write_thread() callbacks depending on whether it is allowed to
     schedule. The atomic variant has to be used from printk().

   - In other situations, the messages are flushed directly using
     write_atomic(), which can be called in any context, including NMI.
     It is primarily needed during early boot or shutdown, in emergency
     situations, and in panic.

  The emergency priority is used by code called within
  nbcon_cpu_emergency_enter()/exit(). At the moment, it is used in
  four situations: WARN(), Oops, lockdep, and RCU stall reports.
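
  The pattern, visible in the lockdep hunks later in this diff, is to
  bracket the whole report in one emergency section:

      nbcon_cpu_emergency_enter();
      pr_warn("WARNING: ...\n");
      lockdep_print_held_locks(curr);
      dump_stack();
      nbcon_cpu_emergency_exit();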

  Finally, there are no nbcon consoles at the moment. It means that
  these changes should _not_ modify the existing behavior. The only
  exception is CONFIG_PREEMPT_RT, which forces offloading of the
  legacy loop, for normal priority contexts, into the dedicated
  kthread"

* tag 'printk-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (54 commits)
  printk: Avoid false positive lockdep report for legacy printing
  printk: nbcon: Assign nice -20 for printing threads
  printk: Implement legacy printer kthread for PREEMPT_RT
  tty: sysfs: Add nbcon support for 'active'
  proc: Add nbcon support for /proc/consoles
  proc: consoles: Add notation to c_start/c_stop
  printk: nbcon: Show replay message on takeover
  printk: Provide helper for message prepending
  printk: nbcon: Rely on kthreads for normal operation
  printk: nbcon: Use thread callback if in task context for legacy
  printk: nbcon: Relocate nbcon_atomic_emit_one()
  printk: nbcon: Introduce printer kthreads
  printk: nbcon: Init @nbcon_seq to highest possible
  printk: nbcon: Add context to usable() and emit()
  printk: Flush console on unregister_console()
  printk: Fail pr_flush() if before SYSTEM_SCHEDULING
  printk: nbcon: Add function for printers to reacquire ownership
  printk: nbcon: Use raw_cpu_ptr() instead of open coding
  printk: Use the BITS_PER_LONG macro
  lockdep: Mark emergency sections in lockdep splats
  ...
parents daa394f0 daeed159
......@@ -423,11 +423,11 @@ static int univ8250_console_setup(struct console *co, char *options)
port = &serial8250_ports[co->index].port;
/* link port to console */
port->cons = co;
uart_port_set_cons(port, co);
retval = serial8250_console_setup(port, options, false);
if (retval != 0)
port->cons = NULL;
uart_port_set_cons(port, NULL);
return retval;
}
......@@ -485,7 +485,7 @@ static int univ8250_console_match(struct console *co, char *name, int idx,
continue;
co->index = i;
port->cons = co;
uart_port_set_cons(port, co);
return serial8250_console_setup(port, options, true);
}
......
......@@ -2480,7 +2480,7 @@ static int pl011_console_match(struct console *co, char *name, int idx,
continue;
co->index = i;
port->cons = co;
uart_port_set_cons(port, co);
return pl011_console_setup(co, options);
}
......
......@@ -3176,8 +3176,15 @@ static int serial_core_add_one_port(struct uart_driver *drv, struct uart_port *u
state->uart_port = uport;
uport->state = state;
/*
* If this port is in use as a console then the spinlock is already
* initialised.
*/
if (!uart_console_registered(uport))
uart_port_spin_lock_init(uport);
state->pm_state = UART_PM_STATE_UNDEFINED;
uport->cons = drv->cons;
uart_port_set_cons(uport, drv->cons);
uport->minor = drv->tty_driver->minor_start + uport->line;
uport->name = kasprintf(GFP_KERNEL, "%s%d", drv->dev_name,
drv->tty_driver->name_base + uport->line);
......@@ -3186,13 +3193,6 @@ static int serial_core_add_one_port(struct uart_driver *drv, struct uart_port *u
goto out;
}
/*
* If this port is in use as a console then the spinlock is already
* initialised.
*/
if (!uart_console_registered(uport))
uart_port_spin_lock_init(uport);
if (uport->cons && uport->dev)
of_console_check(uport->dev->of_node, uport->cons->name, uport->line);
......
......@@ -3573,7 +3573,7 @@ static ssize_t show_cons_active(struct device *dev,
for_each_console(c) {
if (!c->device)
continue;
if (!c->write)
if (!(c->flags & CON_NBCON) && !c->write)
continue;
if ((c->flags & CON_ENABLED) == 0)
continue;
......
......@@ -21,6 +21,7 @@ static int show_console_dev(struct seq_file *m, void *v)
{ CON_ENABLED, 'E' },
{ CON_CONSDEV, 'C' },
{ CON_BOOT, 'B' },
{ CON_NBCON, 'N' },
{ CON_PRINTBUFFER, 'p' },
{ CON_BRL, 'b' },
{ CON_ANYTIME, 'a' },
......@@ -58,8 +59,8 @@ static int show_console_dev(struct seq_file *m, void *v)
seq_printf(m, "%s%d", con->name, con->index);
seq_pad(m, ' ');
seq_printf(m, "%c%c%c (%s)", con->read ? 'R' : '-',
con->write ? 'W' : '-', con->unblank ? 'U' : '-',
flags);
((con->flags & CON_NBCON) || con->write) ? 'W' : '-',
con->unblank ? 'U' : '-', flags);
if (dev)
seq_printf(m, " %4d:%d", MAJOR(dev), MINOR(dev));
......@@ -68,6 +69,7 @@ static int show_console_dev(struct seq_file *m, void *v)
}
static void *c_start(struct seq_file *m, loff_t *pos)
__acquires(&console_mutex)
{
struct console *con;
loff_t off = 0;
......@@ -94,6 +96,7 @@ static void *c_next(struct seq_file *m, void *v, loff_t *pos)
}
static void c_stop(struct seq_file *m, void *v)
__releases(&console_mutex)
{
console_list_unlock();
}
......
......@@ -16,7 +16,9 @@
#include <linux/atomic.h>
#include <linux/bits.h>
#include <linux/irq_work.h>
#include <linux/rculist.h>
#include <linux/rcuwait.h>
#include <linux/types.h>
#include <linux/vesa.h>
......@@ -303,7 +305,7 @@ struct nbcon_write_context {
/**
* struct console - The console descriptor structure
* @name: The name of the console driver
* @write: Write callback to output messages (Optional)
* @write: Legacy write callback to output messages (Optional)
* @read: Read callback for console input (Optional)
* @device: The underlying TTY device driver (Optional)
* @unblank: Callback to unblank the console (Optional)
......@@ -320,10 +322,14 @@ struct nbcon_write_context {
* @data: Driver private data
* @node: hlist node for the console list
*
* @write_atomic: Write callback for atomic context
* @nbcon_state: State for nbcon consoles
* @nbcon_seq: Sequence number of the next record for nbcon to print
* @nbcon_device_ctxt: Context available for non-printing operations
* @nbcon_prev_seq: Seq num the previous nbcon owner was assigned to print
* @pbufs: Pointer to nbcon private buffer
* @kthread: Printer kthread for this console
* @rcuwait: RCU-safe wait object for @kthread waking
* @irq_work: Defer @kthread waking to IRQ work context
*/
struct console {
char name[16];
......@@ -345,11 +351,121 @@ struct console {
struct hlist_node node;
/* nbcon console specific members */
bool (*write_atomic)(struct console *con,
struct nbcon_write_context *wctxt);
/**
* @write_atomic:
*
* NBCON callback to write out text in any context. (Optional)
*
* This callback is called with the console already acquired. However,
* a higher priority context is allowed to take it over by default.
*
* The callback must call nbcon_enter_unsafe() and nbcon_exit_unsafe()
* around any code where the takeover is not safe, for example, when
* manipulating the serial port registers.
*
* nbcon_enter_unsafe() will fail if the context has lost the console
* ownership in the meantime. In this case, the callback is no longer
* allowed to go forward. It must back out immediately and carefully.
* The buffer content is also no longer trusted since it no longer
* belongs to the context.
*
* The callback should allow the takeover whenever it is safe. It
* increases the chance to see messages when the system is in trouble.
* If the driver must reacquire ownership in order to finalize or
* revert hardware changes, nbcon_reacquire_nobuf() can be used.
* However, on reacquire the buffer content is no longer available. A
* reacquire cannot be used to resume printing.
*
* The callback can be called from any context (including NMI).
* Therefore it must avoid usage of any locking and instead rely
* on the console ownership for synchronization.
*/
void (*write_atomic)(struct console *con, struct nbcon_write_context *wctxt);
/**
* @write_thread:
*
* NBCON callback to write out text in task context.
*
* This callback must be called only in task context with both
* device_lock() and the nbcon console acquired with
* NBCON_PRIO_NORMAL.
*
* The same rules for console ownership verification and unsafe
* section handling apply as with write_atomic().
*
* The console ownership handling is necessary for synchronization
* against write_atomic() which is synchronized only via the context.
*
* The device_lock() provides the primary serialization for operations
* on the device. It might be as relaxed (mutex)[*] or as tight
* (disabled preemption and interrupts) as needed. It allows
* the kthread to operate in the least restrictive mode[**].
*
* [*] Standalone nbcon_context_try_acquire() is not safe with
* the preemption enabled, see nbcon_owner_matches(). But it
* can be safe when always called in the preemptive context
* under the device_lock().
*
* [**] The device_lock() makes sure that nbcon_context_try_acquire()
* would never need to spin which is important especially with
* PREEMPT_RT.
*/
void (*write_thread)(struct console *con, struct nbcon_write_context *wctxt);
/**
* @device_lock:
*
* NBCON callback to begin synchronization with driver code.
*
* Console drivers typically must deal with access to the hardware
* via user input/output (such as an interactive login shell) and
* output of kernel messages via printk() calls. This callback is
* called by the printk-subsystem whenever it needs to synchronize
* with hardware access by the driver. It should be implemented to
* use whatever synchronization mechanism the driver is using for
* itself (for example, the port lock for uart serial consoles).
*
* The callback is always called from task context. It may use any
* synchronization method required by the driver.
*
* IMPORTANT: The callback MUST disable migration. The console driver
* may be using a synchronization mechanism that already takes
* care of this (such as spinlocks). Otherwise this function must
* explicitly call migrate_disable().
*
* The flags argument is provided as a convenience to the driver. It
* will be passed again to device_unlock(). It can be ignored if the
* driver does not need it.
*/
void (*device_lock)(struct console *con, unsigned long *flags);
/**
* @device_unlock:
*
* NBCON callback to finish synchronization with driver code.
*
* It is the counterpart to device_lock().
*
* This callback is always called from task context. It must
* appropriately re-enable migration (depending on how device_lock()
* disabled migration).
*
* The flags argument is the value of the same variable that was
* passed to device_lock().
*/
void (*device_unlock)(struct console *con, unsigned long flags);
atomic_t __private nbcon_state;
atomic_long_t __private nbcon_seq;
struct nbcon_context __private nbcon_device_ctxt;
atomic_long_t __private nbcon_prev_seq;
struct printk_buffers *pbufs;
struct task_struct *kthread;
struct rcuwait rcuwait;
struct irq_work irq_work;
};
#ifdef CONFIG_LOCKDEP
......@@ -378,28 +494,34 @@ extern void console_list_unlock(void) __releases(console_mutex);
extern struct hlist_head console_list;
/**
* console_srcu_read_flags - Locklessly read the console flags
* console_srcu_read_flags - Locklessly read flags of a possibly registered
* console
* @con: struct console pointer of console to read flags from
*
* This function provides the necessary READ_ONCE() and data_race()
* notation for locklessly reading the console flags. The READ_ONCE()
* in this function matches the WRITE_ONCE() when @flags are modified
* for registered consoles with console_srcu_write_flags().
* Locklessly reading @con->flags provides a consistent read value because
* there is at most one CPU modifying @con->flags and that CPU is using only
* read-modify-write operations to do so.
*
* Requires console_srcu_read_lock to be held, which implies that @con might
* be a registered console. The purpose of holding console_srcu_read_lock is
* to guarantee that the console state is valid (CON_SUSPENDED/CON_ENABLED)
* and that no exit/cleanup routines will run if the console is currently
* undergoing unregistration.
*
* Only use this function to read console flags when locklessly
* iterating the console list via srcu.
* If the caller is holding the console_list_lock or it is _certain_ that
* @con is not and will not become registered, the caller may read
* @con->flags directly instead.
*
* Context: Any context.
* Return: The current value of the @con->flags field.
*/
static inline short console_srcu_read_flags(const struct console *con)
{
WARN_ON_ONCE(!console_srcu_read_lock_is_held());
/*
* Locklessly reading console->flags provides a consistent
* read value because there is at most one CPU modifying
* console->flags and that CPU is using only read-modify-write
* operations to do so.
* The READ_ONCE() matches the WRITE_ONCE() when @flags are modified
* for registered consoles with console_srcu_write_flags().
*/
return data_race(READ_ONCE(con->flags));
}
......@@ -477,13 +599,19 @@ static inline bool console_is_registered(const struct console *con)
hlist_for_each_entry(con, &console_list, node)
#ifdef CONFIG_PRINTK
extern void nbcon_cpu_emergency_enter(void);
extern void nbcon_cpu_emergency_exit(void);
extern bool nbcon_can_proceed(struct nbcon_write_context *wctxt);
extern bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt);
extern bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt);
extern void nbcon_reacquire_nobuf(struct nbcon_write_context *wctxt);
#else
static inline void nbcon_cpu_emergency_enter(void) { }
static inline void nbcon_cpu_emergency_exit(void) { }
static inline bool nbcon_can_proceed(struct nbcon_write_context *wctxt) { return false; }
static inline bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt) { return false; }
static inline bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt) { return false; }
static inline void nbcon_reacquire_nobuf(struct nbcon_write_context *wctxt) { }
#endif
extern int console_set_on_cmdline;
......
......@@ -9,6 +9,8 @@
#include <linux/ratelimit_types.h>
#include <linux/once_lite.h>
struct console;
extern const char linux_banner[];
extern const char linux_proc_banner[];
......@@ -161,15 +163,16 @@ int _printk(const char *fmt, ...);
*/
__printf(1, 2) __cold int _printk_deferred(const char *fmt, ...);
extern void __printk_safe_enter(void);
extern void __printk_safe_exit(void);
extern void __printk_deferred_enter(void);
extern void __printk_deferred_exit(void);
/*
* The printk_deferred_enter/exit macros are available only as a hack for
* some code paths that need to defer all printk console printing. Interrupts
* must be disabled for the deferred duration.
*/
#define printk_deferred_enter __printk_safe_enter
#define printk_deferred_exit __printk_safe_exit
#define printk_deferred_enter() __printk_deferred_enter()
#define printk_deferred_exit() __printk_deferred_exit()
/*
* Please don't use printk_ratelimit(), because it shares ratelimiting state
......@@ -197,6 +200,10 @@ extern asmlinkage void dump_stack_lvl(const char *log_lvl) __cold;
extern asmlinkage void dump_stack(void) __cold;
void printk_trigger_flush(void);
void console_try_replay_all(void);
void printk_legacy_allow_panic_sync(void);
extern bool nbcon_device_try_acquire(struct console *con);
extern void nbcon_device_release(struct console *con);
void nbcon_atomic_flush_unsafe(void);
#else
static inline __printf(1, 0)
int vprintk(const char *s, va_list args)
......@@ -279,6 +286,24 @@ static inline void printk_trigger_flush(void)
static inline void console_try_replay_all(void)
{
}
static inline void printk_legacy_allow_panic_sync(void)
{
}
static inline bool nbcon_device_try_acquire(struct console *con)
{
return false;
}
static inline void nbcon_device_release(struct console *con)
{
}
static inline void nbcon_atomic_flush_unsafe(void)
{
}
#endif
bool this_cpu_in_panic(void);
......
......@@ -11,6 +11,8 @@
#include <linux/compiler.h>
#include <linux/console.h>
#include <linux/interrupt.h>
#include <linux/lockdep.h>
#include <linux/printk.h>
#include <linux/spinlock.h>
#include <linux/sched.h>
#include <linux/tty.h>
......@@ -590,6 +592,95 @@ struct uart_port {
void *private_data; /* generic platform data pointer */
};
/*
* Only for console->device_lock()/_unlock() callbacks and internal
* port lock wrapper synchronization.
*/
static inline void __uart_port_lock_irqsave(struct uart_port *up, unsigned long *flags)
{
spin_lock_irqsave(&up->lock, *flags);
}
/*
* Only for console->device_lock()/_unlock() callbacks and internal
* port lock wrapper synchronization.
*/
static inline void __uart_port_unlock_irqrestore(struct uart_port *up, unsigned long flags)
{
spin_unlock_irqrestore(&up->lock, flags);
}
/**
* uart_port_set_cons - Safely set the @cons field for a uart
* @up: The uart port to set
* @con: The new console to set to
*
* This function must be used to set @up->cons. It uses the port lock to
* synchronize with the port lock wrappers in order to ensure that the console
* cannot change or disappear while another context is holding the port lock.
*/
static inline void uart_port_set_cons(struct uart_port *up, struct console *con)
{
unsigned long flags;
__uart_port_lock_irqsave(up, &flags);
up->cons = con;
__uart_port_unlock_irqrestore(up, flags);
}
/* Only for internal port lock wrapper usage. */
static inline bool __uart_port_using_nbcon(struct uart_port *up)
{
lockdep_assert_held_once(&up->lock);
if (likely(!uart_console(up)))
return false;
/*
* @up->cons is only modified under the port lock. Therefore it is
* certain that it cannot disappear here.
*
* @up->cons->node is added/removed from the console list under the
* port lock. Therefore it is certain that the registration status
* cannot change here, thus @up->cons->flags can be read directly.
*/
if (hlist_unhashed_lockless(&up->cons->node) ||
!(up->cons->flags & CON_NBCON) ||
!up->cons->write_atomic) {
return false;
}
return true;
}
/* Only for internal port lock wrapper usage. */
static inline bool __uart_port_nbcon_try_acquire(struct uart_port *up)
{
if (!__uart_port_using_nbcon(up))
return true;
return nbcon_device_try_acquire(up->cons);
}
/* Only for internal port lock wrapper usage. */
static inline void __uart_port_nbcon_acquire(struct uart_port *up)
{
if (!__uart_port_using_nbcon(up))
return;
while (!nbcon_device_try_acquire(up->cons))
cpu_relax();
}
/* Only for internal port lock wrapper usage. */
static inline void __uart_port_nbcon_release(struct uart_port *up)
{
if (!__uart_port_using_nbcon(up))
return;
nbcon_device_release(up->cons);
}
/**
* uart_port_lock - Lock the UART port
* @up: Pointer to UART port structure
......@@ -597,6 +688,7 @@ struct uart_port {
static inline void uart_port_lock(struct uart_port *up)
{
spin_lock(&up->lock);
__uart_port_nbcon_acquire(up);
}
/**
......@@ -606,6 +698,7 @@ static inline void uart_port_lock(struct uart_port *up)
static inline void uart_port_lock_irq(struct uart_port *up)
{
spin_lock_irq(&up->lock);
__uart_port_nbcon_acquire(up);
}
/**
......@@ -616,6 +709,7 @@ static inline void uart_port_lock_irq(struct uart_port *up)
static inline void uart_port_lock_irqsave(struct uart_port *up, unsigned long *flags)
{
spin_lock_irqsave(&up->lock, *flags);
__uart_port_nbcon_acquire(up);
}
/**
......@@ -626,7 +720,15 @@ static inline void uart_port_lock_irqsave(struct uart_port *up, unsigned long *f
*/
static inline bool uart_port_trylock(struct uart_port *up)
{
return spin_trylock(&up->lock);
if (!spin_trylock(&up->lock))
return false;
if (!__uart_port_nbcon_try_acquire(up)) {
spin_unlock(&up->lock);
return false;
}
return true;
}
/**
......@@ -638,7 +740,15 @@ static inline bool uart_port_trylock(struct uart_port *up)
*/
static inline bool uart_port_trylock_irqsave(struct uart_port *up, unsigned long *flags)
{
return spin_trylock_irqsave(&up->lock, *flags);
if (!spin_trylock_irqsave(&up->lock, *flags))
return false;
if (!__uart_port_nbcon_try_acquire(up)) {
spin_unlock_irqrestore(&up->lock, *flags);
return false;
}
return true;
}
/**
......@@ -647,6 +757,7 @@ static inline bool uart_port_trylock_irqsave(struct uart_port *up, unsigned long
*/
static inline void uart_port_unlock(struct uart_port *up)
{
__uart_port_nbcon_release(up);
spin_unlock(&up->lock);
}
......@@ -656,6 +767,7 @@ static inline void uart_port_unlock(struct uart_port *up)
*/
static inline void uart_port_unlock_irq(struct uart_port *up)
{
__uart_port_nbcon_release(up);
spin_unlock_irq(&up->lock);
}
......@@ -666,6 +778,7 @@ static inline void uart_port_unlock_irq(struct uart_port *up)
*/
static inline void uart_port_unlock_irqrestore(struct uart_port *up, unsigned long flags)
{
__uart_port_nbcon_release(up);
spin_unlock_irqrestore(&up->lock, flags);
}
......
......@@ -56,6 +56,7 @@
#include <linux/kprobes.h>
#include <linux/lockdep.h>
#include <linux/context_tracking.h>
#include <linux/console.h>
#include <asm/sections.h>
......@@ -573,8 +574,10 @@ static struct lock_trace *save_trace(void)
if (!debug_locks_off_graph_unlock())
return NULL;
nbcon_cpu_emergency_enter();
print_lockdep_off("BUG: MAX_STACK_TRACE_ENTRIES too low!");
dump_stack();
nbcon_cpu_emergency_exit();
return NULL;
}
......@@ -887,11 +890,13 @@ look_up_lock_class(const struct lockdep_map *lock, unsigned int subclass)
if (unlikely(subclass >= MAX_LOCKDEP_SUBCLASSES)) {
instrumentation_begin();
debug_locks_off();
nbcon_cpu_emergency_enter();
printk(KERN_ERR
"BUG: looking up invalid subclass: %u\n", subclass);
printk(KERN_ERR
"turning off the locking correctness validator.\n");
dump_stack();
nbcon_cpu_emergency_exit();
instrumentation_end();
return NULL;
}
......@@ -968,11 +973,13 @@ static bool assign_lock_key(struct lockdep_map *lock)
else {
/* Debug-check: all keys must be persistent! */
debug_locks_off();
nbcon_cpu_emergency_enter();
pr_err("INFO: trying to register non-static key.\n");
pr_err("The code is fine but needs lockdep annotation, or maybe\n");
pr_err("you didn't initialize this object before use?\n");
pr_err("turning off the locking correctness validator.\n");
dump_stack();
nbcon_cpu_emergency_exit();
return false;
}
......@@ -1316,8 +1323,10 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
return NULL;
}
nbcon_cpu_emergency_enter();
print_lockdep_off("BUG: MAX_LOCKDEP_KEYS too low!");
dump_stack();
nbcon_cpu_emergency_exit();
return NULL;
}
nr_lock_classes++;
......@@ -1349,11 +1358,13 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
if (verbose(class)) {
graph_unlock();
nbcon_cpu_emergency_enter();
printk("\nnew class %px: %s", class->key, class->name);
if (class->name_version > 1)
printk(KERN_CONT "#%d", class->name_version);
printk(KERN_CONT "\n");
dump_stack();
nbcon_cpu_emergency_exit();
if (!graph_lock()) {
return NULL;
......@@ -1392,8 +1403,10 @@ static struct lock_list *alloc_list_entry(void)
if (!debug_locks_off_graph_unlock())
return NULL;
nbcon_cpu_emergency_enter();
print_lockdep_off("BUG: MAX_LOCKDEP_ENTRIES too low!");
dump_stack();
nbcon_cpu_emergency_exit();
return NULL;
}
nr_list_entries++;
......@@ -2039,6 +2052,8 @@ static noinline void print_circular_bug(struct lock_list *this,
depth = get_lock_depth(target);
nbcon_cpu_emergency_enter();
print_circular_bug_header(target, depth, check_src, check_tgt);
parent = get_lock_parent(target);
......@@ -2057,6 +2072,8 @@ static noinline void print_circular_bug(struct lock_list *this,
printk("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
static noinline void print_bfs_bug(int ret)
......@@ -2569,6 +2586,8 @@ print_bad_irq_dependency(struct task_struct *curr,
if (!debug_locks_off_graph_unlock() || debug_locks_silent)
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("=====================================================\n");
pr_warn("WARNING: %s-safe -> %s-unsafe lock order detected\n",
......@@ -2618,11 +2637,13 @@ print_bad_irq_dependency(struct task_struct *curr,
pr_warn(" and %s-irq-unsafe lock:\n", irqclass);
next_root->trace = save_trace();
if (!next_root->trace)
return;
goto out;
print_shortest_lock_dependencies(forwards_entry, next_root);
pr_warn("\nstack backtrace:\n");
dump_stack();
out:
nbcon_cpu_emergency_exit();
}
static const char *state_names[] = {
......@@ -2987,6 +3008,8 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
if (!debug_locks_off_graph_unlock() || debug_locks_silent)
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("============================================\n");
pr_warn("WARNING: possible recursive locking detected\n");
......@@ -3009,6 +3032,8 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
pr_warn("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
/*
......@@ -3606,6 +3631,8 @@ static void print_collision(struct task_struct *curr,
struct held_lock *hlock_next,
struct lock_chain *chain)
{
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("============================\n");
pr_warn("WARNING: chain_key collision\n");
......@@ -3622,6 +3649,8 @@ static void print_collision(struct task_struct *curr,
pr_warn("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
#endif
......@@ -3712,8 +3741,10 @@ static inline int add_chain_cache(struct task_struct *curr,
if (!debug_locks_off_graph_unlock())
return 0;
nbcon_cpu_emergency_enter();
print_lockdep_off("BUG: MAX_LOCKDEP_CHAINS too low!");
dump_stack();
nbcon_cpu_emergency_exit();
return 0;
}
chain->chain_key = chain_key;
......@@ -3730,8 +3761,10 @@ static inline int add_chain_cache(struct task_struct *curr,
if (!debug_locks_off_graph_unlock())
return 0;
nbcon_cpu_emergency_enter();
print_lockdep_off("BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!");
dump_stack();
nbcon_cpu_emergency_exit();
return 0;
}
......@@ -3970,6 +4003,8 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
if (!debug_locks_off() || debug_locks_silent)
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("================================\n");
pr_warn("WARNING: inconsistent lock state\n");
......@@ -3998,6 +4033,8 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
pr_warn("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
/*
......@@ -4032,6 +4069,8 @@ print_irq_inversion_bug(struct task_struct *curr,
if (!debug_locks_off_graph_unlock() || debug_locks_silent)
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("========================================================\n");
pr_warn("WARNING: possible irq lock inversion dependency detected\n");
......@@ -4072,11 +4111,13 @@ print_irq_inversion_bug(struct task_struct *curr,
pr_warn("\nthe shortest dependencies between 2nd lock and 1st lock:\n");
root->trace = save_trace();
if (!root->trace)
return;
goto out;
print_shortest_lock_dependencies(other, root);
pr_warn("\nstack backtrace:\n");
dump_stack();
out:
nbcon_cpu_emergency_exit();
}
/*
......@@ -4153,6 +4194,8 @@ void print_irqtrace_events(struct task_struct *curr)
{
const struct irqtrace_events *trace = &curr->irqtrace;
nbcon_cpu_emergency_enter();
printk("irq event stamp: %u\n", trace->irq_events);
printk("hardirqs last enabled at (%u): [<%px>] %pS\n",
trace->hardirq_enable_event, (void *)trace->hardirq_enable_ip,
......@@ -4166,6 +4209,8 @@ void print_irqtrace_events(struct task_struct *curr)
printk("softirqs last disabled at (%u): [<%px>] %pS\n",
trace->softirq_disable_event, (void *)trace->softirq_disable_ip,
(void *)trace->softirq_disable_ip);
nbcon_cpu_emergency_exit();
}
static int HARDIRQ_verbose(struct lock_class *class)
......@@ -4686,10 +4731,12 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
* We must printk outside of the graph_lock:
*/
if (ret == 2) {
nbcon_cpu_emergency_enter();
printk("\nmarked lock as {%s}:\n", usage_str[new_bit]);
print_lock(this);
print_irqtrace_events(curr);
dump_stack();
nbcon_cpu_emergency_exit();
}
return ret;
......@@ -4730,6 +4777,8 @@ print_lock_invalid_wait_context(struct task_struct *curr,
if (debug_locks_silent)
return 0;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("=============================\n");
pr_warn("[ BUG: Invalid wait context ]\n");
......@@ -4749,6 +4798,8 @@ print_lock_invalid_wait_context(struct task_struct *curr,
pr_warn("stack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
return 0;
}
......@@ -4956,6 +5007,8 @@ print_lock_nested_lock_not_held(struct task_struct *curr,
if (debug_locks_silent)
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("==================================\n");
pr_warn("WARNING: Nested lock was not taken\n");
......@@ -4976,6 +5029,8 @@ print_lock_nested_lock_not_held(struct task_struct *curr,
pr_warn("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
static int __lock_is_held(const struct lockdep_map *lock, int read);
......@@ -5024,11 +5079,13 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
debug_class_ops_inc(class);
if (very_verbose(class)) {
nbcon_cpu_emergency_enter();
printk("\nacquire class [%px] %s", class->key, class->name);
if (class->name_version > 1)
printk(KERN_CONT "#%d", class->name_version);
printk(KERN_CONT "\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
/*
......@@ -5155,6 +5212,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
#endif
if (unlikely(curr->lockdep_depth >= MAX_LOCK_DEPTH)) {
debug_locks_off();
nbcon_cpu_emergency_enter();
print_lockdep_off("BUG: MAX_LOCK_DEPTH too low!");
printk(KERN_DEBUG "depth: %i max: %lu!\n",
curr->lockdep_depth, MAX_LOCK_DEPTH);
......@@ -5162,6 +5220,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
lockdep_print_held_locks(current);
debug_show_all_locks();
dump_stack();
nbcon_cpu_emergency_exit();
return 0;
}
......@@ -5181,6 +5240,8 @@ static void print_unlock_imbalance_bug(struct task_struct *curr,
if (debug_locks_silent)
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("=====================================\n");
pr_warn("WARNING: bad unlock balance detected!\n");
......@@ -5197,6 +5258,8 @@ static void print_unlock_imbalance_bug(struct task_struct *curr,
pr_warn("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
static noinstr int match_held_lock(const struct held_lock *hlock,
......@@ -5901,6 +5964,8 @@ static void print_lock_contention_bug(struct task_struct *curr,
if (debug_locks_silent)
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("=================================\n");
pr_warn("WARNING: bad contention detected!\n");
......@@ -5917,6 +5982,8 @@ static void print_lock_contention_bug(struct task_struct *curr,
pr_warn("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
static void
......@@ -6536,6 +6603,8 @@ print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
if (debug_locks_silent)
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("=========================\n");
pr_warn("WARNING: held lock freed!\n");
......@@ -6548,6 +6617,8 @@ print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
pr_warn("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
static inline int not_in_range(const void* mem_from, unsigned long mem_len,
......@@ -6594,6 +6665,8 @@ static void print_held_locks_bug(void)
if (debug_locks_silent)
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("====================================\n");
pr_warn("WARNING: %s/%d still has locks held!\n",
......@@ -6603,6 +6676,8 @@ static void print_held_locks_bug(void)
lockdep_print_held_locks(current);
pr_warn("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
}
void debug_check_no_locks_held(void)
......@@ -6660,6 +6735,7 @@ asmlinkage __visible void lockdep_sys_exit(void)
if (unlikely(curr->lockdep_depth)) {
if (!debug_locks_off())
return;
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("================================================\n");
pr_warn("WARNING: lock held when returning to user space!\n");
......@@ -6668,6 +6744,7 @@ asmlinkage __visible void lockdep_sys_exit(void)
pr_warn("%s/%d is leaving the kernel with locks still held!\n",
curr->comm, curr->pid);
lockdep_print_held_locks(curr);
nbcon_cpu_emergency_exit();
}
/*
......@@ -6684,6 +6761,7 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
bool rcu = warn_rcu_enter();
/* Note: the following can be executed concurrently, so be careful. */
nbcon_cpu_emergency_enter();
pr_warn("\n");
pr_warn("=============================\n");
pr_warn("WARNING: suspicious RCU usage\n");
......@@ -6722,6 +6800,7 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
lockdep_print_held_locks(curr);
pr_warn("\nstack backtrace:\n");
dump_stack();
nbcon_cpu_emergency_exit();
warn_rcu_exit(rcu);
}
EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
......@@ -374,6 +374,8 @@ void panic(const char *fmt, ...)
panic_other_cpus_shutdown(_crash_kexec_post_notifiers);
printk_legacy_allow_panic_sync();
/*
* Run any panic handlers, including those that might need to
* add information to the kmsg dump output.
......@@ -463,6 +465,7 @@ void panic(const char *fmt, ...)
* Explicitly flush the kernel log buffer one last time.
*/
console_flush_on_panic(CONSOLE_FLUSH_PENDING);
nbcon_atomic_flush_unsafe();
local_irq_enable();
for (i = 0; ; i += PANIC_TIMER_STEP) {
......@@ -682,6 +685,7 @@ bool oops_may_print(void)
*/
void oops_enter(void)
{
nbcon_cpu_emergency_enter();
tracing_off();
/* can't trust the integrity of the kernel anymore: */
debug_locks_off();
......@@ -704,6 +708,7 @@ void oops_exit(void)
{
do_oops_enter_exit();
print_oops_end_marker();
nbcon_cpu_emergency_exit();
kmsg_dump(KMSG_DUMP_OOPS);
}
......@@ -715,6 +720,8 @@ struct warn_args {
void __warn(const char *file, int line, void *caller, unsigned taint,
struct pt_regs *regs, struct warn_args *args)
{
nbcon_cpu_emergency_enter();
disable_trace_on_warning();
if (file)
......@@ -750,6 +757,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
/* Just a warning, don't kill lockdep. */
add_taint(taint, LOCKDEP_STILL_OK);
nbcon_cpu_emergency_exit();
}
#ifdef CONFIG_BUG
......
......@@ -2,11 +2,12 @@
/*
* internal.h - printk internal definitions
*/
#include <linux/percpu.h>
#include <linux/console.h>
#include "printk_ringbuffer.h"
#include <linux/percpu.h>
#include <linux/types.h>
#if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
struct ctl_table;
void __init printk_sysctl_init(void);
int devkmsg_sysctl_set_loglvl(const struct ctl_table *table, int write,
void *buffer, size_t *lenp, loff_t *ppos);
......@@ -20,6 +21,19 @@ int devkmsg_sysctl_set_loglvl(const struct ctl_table *table, int write,
(con->flags & CON_BOOT) ? "boot" : "", \
con->name, con->index, ##__VA_ARGS__)
/*
* Identify if legacy printing is forced in a dedicated kthread. If
* true, all printing via console lock occurs within a dedicated
* legacy printer thread. The only exception is on panic, after the
* nbcon consoles have had their chance to print the panic messages
* first.
*/
#ifdef CONFIG_PREEMPT_RT
# define force_legacy_kthread() (true)
#else
# define force_legacy_kthread() (false)
#endif
#ifdef CONFIG_PRINTK
#ifdef CONFIG_PRINTK_CALLER
......@@ -43,7 +57,11 @@ enum printk_info_flags {
LOG_CONT = 8, /* text is a fragment of a continuation line */
};
struct printk_ringbuffer;
struct dev_printk_info;
extern struct printk_ringbuffer *prb;
extern bool printk_kthreads_running;
__printf(4, 0)
int vprintk_store(int facility, int level,
......@@ -53,6 +71,9 @@ int vprintk_store(int facility, int level,
__printf(1, 0) int vprintk_default(const char *fmt, va_list args);
__printf(1, 0) int vprintk_deferred(const char *fmt, va_list args);
void __printk_safe_enter(void);
void __printk_safe_exit(void);
bool printk_percpu_data_ready(void);
#define printk_safe_enter_irqsave(flags) \
......@@ -68,15 +89,85 @@ bool printk_percpu_data_ready(void);
} while (0)
void defer_console_output(void);
bool is_printk_legacy_deferred(void);
u16 printk_parse_prefix(const char *text, int *level,
enum printk_info_flags *flags);
void console_lock_spinning_enable(void);
int console_lock_spinning_disable_and_check(int cookie);
u64 nbcon_seq_read(struct console *con);
void nbcon_seq_force(struct console *con, u64 seq);
bool nbcon_alloc(struct console *con);
void nbcon_init(struct console *con);
void nbcon_free(struct console *con);
enum nbcon_prio nbcon_get_default_prio(void);
void nbcon_atomic_flush_pending(void);
bool nbcon_legacy_emit_next_record(struct console *con, bool *handover,
int cookie, bool use_atomic);
bool nbcon_kthread_create(struct console *con);
void nbcon_kthread_stop(struct console *con);
void nbcon_kthreads_wake(void);
/*
* Check if the given console is currently capable and allowed to print
* records. Note that this function does not consider the current context,
* which can also play a role in deciding if @con can be used to print
* records.
*/
static inline bool console_is_usable(struct console *con, short flags, bool use_atomic)
{
if (!(flags & CON_ENABLED))
return false;
if ((flags & CON_SUSPENDED))
return false;
if (flags & CON_NBCON) {
/* The write_atomic() callback is optional. */
if (use_atomic && !con->write_atomic)
return false;
/*
* For the !use_atomic case, @printk_kthreads_running is not
* checked because the write_thread() callback is also used
* via the legacy loop when the printer threads are not
* available.
*/
} else {
if (!con->write)
return false;
}
/*
* Console drivers may assume that per-cpu resources have been
* allocated. So unless they're explicitly marked as being able to
* cope (CON_ANYTIME) don't call them until this CPU is officially up.
*/
if (!cpu_online(raw_smp_processor_id()) && !(flags & CON_ANYTIME))
return false;
return true;
}
/**
* nbcon_kthread_wake - Wake up a console printing thread
* @con: Console to operate on
*/
static inline void nbcon_kthread_wake(struct console *con)
{
/*
* Guarantee any new records can be seen by tasks preparing to wait
* before this context checks if the rcuwait is empty.
*
* The full memory barrier in rcuwait_wake_up() pairs with the full
* memory barrier within set_current_state() of
* ___rcuwait_wait_event(), which is called after prepare_to_rcuwait()
* adds the waiter but before it has checked the wait condition.
*
* This pairs with nbcon_kthread_func:A.
*/
rcuwait_wake_up(&con->rcuwait); /* LMM(nbcon_kthread_wake:A) */
}
#else
......@@ -84,6 +175,8 @@ void nbcon_free(struct console *con);
#define PRINTK_MESSAGE_MAX 0
#define PRINTKRB_RECORD_MAX 0
#define printk_kthreads_running (false)
/*
* In !PRINTK builds we still export console_sem
* semaphore and some of console functions (console_unlock()/etc.), so
......@@ -93,14 +186,119 @@ void nbcon_free(struct console *con);
#define printk_safe_exit_irqrestore(flags) local_irq_restore(flags)
static inline bool printk_percpu_data_ready(void) { return false; }
static inline void defer_console_output(void) { }
static inline bool is_printk_legacy_deferred(void) { return false; }
static inline u64 nbcon_seq_read(struct console *con) { return 0; }
static inline void nbcon_seq_force(struct console *con, u64 seq) { }
static inline bool nbcon_alloc(struct console *con) { return false; }
static inline void nbcon_init(struct console *con) { }
static inline void nbcon_free(struct console *con) { }
static inline enum nbcon_prio nbcon_get_default_prio(void) { return NBCON_PRIO_NONE; }
static inline void nbcon_atomic_flush_pending(void) { }
static inline bool nbcon_legacy_emit_next_record(struct console *con, bool *handover,
int cookie, bool use_atomic) { return false; }
static inline void nbcon_kthread_wake(struct console *con) { }
static inline void nbcon_kthreads_wake(void) { }
static inline bool console_is_usable(struct console *con, short flags,
bool use_atomic) { return false; }
#endif /* CONFIG_PRINTK */
extern bool have_boot_console;
extern bool have_nbcon_console;
extern bool have_legacy_console;
extern bool legacy_allow_panic_sync;
/**
* struct console_flush_type - Define available console flush methods
* @nbcon_atomic: Flush directly using nbcon_atomic() callback
* @nbcon_offload: Offload flush to printer thread
* @legacy_direct: Call the legacy loop in this context
* @legacy_offload: Offload the legacy loop into IRQ or legacy thread
*
* Note that the legacy loop also flushes the nbcon consoles.
*/
struct console_flush_type {
bool nbcon_atomic;
bool nbcon_offload;
bool legacy_direct;
bool legacy_offload;
};
/*
* Identify which console flushing methods should be used in the context of
* the caller.
*/
static inline void printk_get_console_flush_type(struct console_flush_type *ft)
{
memset(ft, 0, sizeof(*ft));
switch (nbcon_get_default_prio()) {
case NBCON_PRIO_NORMAL:
if (have_nbcon_console && !have_boot_console) {
if (printk_kthreads_running)
ft->nbcon_offload = true;
else
ft->nbcon_atomic = true;
}
/* Legacy consoles are flushed directly when possible. */
if (have_legacy_console || have_boot_console) {
if (!is_printk_legacy_deferred())
ft->legacy_direct = true;
else
ft->legacy_offload = true;
}
break;
case NBCON_PRIO_EMERGENCY:
if (have_nbcon_console && !have_boot_console)
ft->nbcon_atomic = true;
/* Legacy consoles are flushed directly when possible. */
if (have_legacy_console || have_boot_console) {
if (!is_printk_legacy_deferred())
ft->legacy_direct = true;
else
ft->legacy_offload = true;
}
break;
case NBCON_PRIO_PANIC:
/*
* In panic, the nbcon consoles will directly print. But
* only allowed if there are no boot consoles.
*/
if (have_nbcon_console && !have_boot_console)
ft->nbcon_atomic = true;
if (have_legacy_console || have_boot_console) {
/*
* This is the same decision as NBCON_PRIO_NORMAL
* except that offloading never occurs in panic.
*
* Note that console_flush_on_panic() will flush
* legacy consoles anyway, even if unsafe.
*/
if (!is_printk_legacy_deferred())
ft->legacy_direct = true;
/*
* In panic, if nbcon atomic printing occurs,
* the legacy consoles must remain silent until
* explicitly allowed.
*/
if (ft->nbcon_atomic && !legacy_allow_panic_sync)
ft->legacy_direct = false;
}
break;
default:
WARN_ON_ONCE(1);
break;
}
}
extern struct printk_buffers printk_shared_pbufs;
/**
......@@ -135,4 +333,5 @@ bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
#ifdef CONFIG_PRINTK
void console_prepend_dropped(struct printk_message *pmsg, unsigned long dropped);
void console_prepend_replay(struct printk_message *pmsg);
#endif
......@@ -2,11 +2,25 @@
// Copyright (C) 2022 Linutronix GmbH, John Ogness
// Copyright (C) 2022 Intel, Thomas Gleixner
#include <linux/kernel.h>
#include <linux/atomic.h>
#include <linux/bug.h>
#include <linux/console.h>
#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/export.h>
#include <linux/init.h>
#include <linux/irqflags.h>
#include <linux/kthread.h>
#include <linux/minmax.h>
#include <linux/percpu.h>
#include <linux/preempt.h>
#include <linux/slab.h>
#include <linux/smp.h>
#include <linux/stddef.h>
#include <linux/string.h>
#include <linux/types.h>
#include "internal.h"
#include "printk_ringbuffer.h"
/*
* Printk console printing implementation for consoles that do not
* depend on the legacy style console_lock mechanism.
......@@ -172,9 +186,6 @@ void nbcon_seq_force(struct console *con, u64 seq)
u64 valid_seq = max_t(u64, seq, prb_first_valid_seq(prb));
atomic_long_set(&ACCESS_PRIVATE(con, nbcon_seq), __u64seq_to_ulseq(valid_seq));
/* Clear con->seq since nbcon consoles use con->nbcon_seq instead. */
con->seq = 0;
}
/**
......@@ -231,6 +242,13 @@ static int nbcon_context_try_acquire_direct(struct nbcon_context *ctxt,
struct nbcon_state new;
do {
/*
* Panic does not imply that the console is owned. However, it
* is critical that non-panic CPUs during panic are unable to
* acquire ownership in order to satisfy the assumptions of
* nbcon_waiter_matches(). In particular, the assumption that
* lower priorities are ignored during panic.
*/
if (other_cpu_in_panic())
return -EPERM;
......@@ -262,18 +280,29 @@ static bool nbcon_waiter_matches(struct nbcon_state *cur, int expected_prio)
/*
* The request context is well defined by the @req_prio because:
*
* - Only a context with a higher priority can take over the request.
* - Only a context with a priority higher than the owner can become
* a waiter.
* - Only a context with a priority higher than the waiter can
* directly take over the request.
* - There are only three priorities.
* - Only one CPU is allowed to request PANIC priority.
* - Lower priorities are ignored during panic() until reboot.
*
* As a result, the following scenario is *not* possible:
*
* 1. Another context with a higher priority directly takes ownership.
* 2. The higher priority context releases the ownership.
* 3. A lower priority context takes the ownership.
* 4. Another context with the same priority as this context
* 1. This context is currently a waiter.
* 2. Another context with a higher priority than this context
* directly takes ownership.
* 3. The higher priority context releases the ownership.
* 4. Another lower priority context takes the ownership.
* 5. Another context with the same priority as this context
* creates a request and starts waiting.
*
* Event #1 implies this context is EMERGENCY.
* Event #2 implies the new context is PANIC.
* Event #3 occurs when panic() has flushed the console.
* Events #4 and #5 are not possible due to the other_cpu_in_panic()
* check in nbcon_context_try_acquire_direct().
*/
return (cur->req_prio == expected_prio);
......@@ -531,6 +560,7 @@ static struct printk_buffers panic_nbcon_pbufs;
* nbcon_context_try_acquire - Try to acquire nbcon console
* @ctxt: The context of the caller
*
* Context: Under @ctxt->con->device_lock() or local_irq_save().
* Return: True if the console was acquired. False otherwise.
*
* If the caller allowed an unsafe hostile takeover, on success the
......@@ -538,7 +568,6 @@ static struct printk_buffers panic_nbcon_pbufs;
* in an unsafe state. Otherwise, on success the caller may assume
* the console is not in an unsafe state.
*/
__maybe_unused
static bool nbcon_context_try_acquire(struct nbcon_context *ctxt)
{
unsigned int cpu = smp_processor_id();
......@@ -581,11 +610,29 @@ static bool nbcon_owner_matches(struct nbcon_state *cur, int expected_cpu,
int expected_prio)
{
/*
* Since consoles can only be acquired by higher priorities,
* owning contexts are uniquely identified by @prio. However,
* since contexts can unexpectedly lose ownership, it is
* possible that later another owner appears with the same
* priority. For this reason @cpu is also needed.
* A similar function, nbcon_waiter_matches(), only deals with
* EMERGENCY and PANIC priorities. However, this function must also
* deal with the NORMAL priority, which requires additional checks
* and constraints.
*
* For the case where preemption and interrupts are disabled, it is
* enough to also verify that the owning CPU has not changed.
*
* For the case where preemption or interrupts are enabled, an
* external synchronization method *must* be used. In particular,
* the driver-specific locking mechanism used in device_lock()
* (including disabling migration) should be used. It prevents
* scenarios such as:
*
* 1. [Task A] owns a context with NBCON_PRIO_NORMAL on [CPU X] and
* is scheduled out.
* 2. Another context takes over the lock with NBCON_PRIO_EMERGENCY
* and releases it.
* 3. [Task B] acquires a context with NBCON_PRIO_NORMAL on [CPU X]
* and is scheduled out.
* 4. [Task A] gets running on [CPU X] and sees that the console is
* still owned by a task on [CPU X] with NBON_PRIO_NORMAL. Thus
* [Task A] thinks it is the owner when it is not.
*/
if (cur->prio != expected_prio)
......@@ -784,6 +831,19 @@ static bool __nbcon_context_update_unsafe(struct nbcon_context *ctxt, bool unsaf
return nbcon_context_can_proceed(ctxt, &cur);
}
static void nbcon_write_context_set_buf(struct nbcon_write_context *wctxt,
char *buf, unsigned int len)
{
struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
struct console *con = ctxt->console;
struct nbcon_state cur;
wctxt->outbuf = buf;
wctxt->len = len;
nbcon_state_read(con, &cur);
wctxt->unsafe_takeover = cur.unsafe_takeover;
}
/**
* nbcon_enter_unsafe - Enter an unsafe region in the driver
* @wctxt: The write context that was handed to the write function
......@@ -799,8 +859,12 @@ static bool __nbcon_context_update_unsafe(struct nbcon_context *ctxt, bool unsaf
bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt)
{
struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
bool is_owner;
return nbcon_context_enter_unsafe(ctxt);
is_owner = nbcon_context_enter_unsafe(ctxt);
if (!is_owner)
nbcon_write_context_set_buf(wctxt, NULL, 0);
return is_owner;
}
EXPORT_SYMBOL_GPL(nbcon_enter_unsafe);
......@@ -819,14 +883,47 @@ EXPORT_SYMBOL_GPL(nbcon_enter_unsafe);
bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt)
{
struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
bool ret;
return nbcon_context_exit_unsafe(ctxt);
ret = nbcon_context_exit_unsafe(ctxt);
if (!ret)
nbcon_write_context_set_buf(wctxt, NULL, 0);
return ret;
}
EXPORT_SYMBOL_GPL(nbcon_exit_unsafe);
/**
* nbcon_reacquire_nobuf - Reacquire a console after losing ownership
* while printing
* @wctxt: The write context that was handed to the write callback
*
* Since ownership can be lost at any time due to handover or takeover, a
* printing context _must_ be prepared to back out immediately and
* carefully. However, there are scenarios where the printing context must
* reacquire ownership in order to finalize or revert hardware changes.
*
* This function allows a printing context to reacquire ownership using the
* same priority as its previous ownership.
*
* Note that after a successful reacquire the printing context will have no
* output buffer because that has been lost. This function cannot be used to
* resume printing.
*/
void nbcon_reacquire_nobuf(struct nbcon_write_context *wctxt)
{
struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
while (!nbcon_context_try_acquire(ctxt))
cpu_relax();
nbcon_write_context_set_buf(wctxt, NULL, 0);
}
EXPORT_SYMBOL_GPL(nbcon_reacquire_nobuf);
/**
* nbcon_emit_next_record - Emit a record in the acquired context
* @wctxt: The write context that will be handed to the write function
* @use_atomic: True if the write_atomic() callback is to be used
*
* Return: True if this context still owns the console. False if
* ownership was handed over or taken.
......@@ -840,8 +937,7 @@ EXPORT_SYMBOL_GPL(nbcon_exit_unsafe);
* When true is returned, @wctxt->ctxt.backlog indicates whether there are
* still records pending in the ringbuffer.
*/
__maybe_unused
static bool nbcon_emit_next_record(struct nbcon_write_context *wctxt)
static bool nbcon_emit_next_record(struct nbcon_write_context *wctxt, bool use_atomic)
{
struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
struct console *con = ctxt->console;
......@@ -852,7 +948,22 @@ static bool nbcon_emit_next_record(struct nbcon_write_context *wctxt)
unsigned long con_dropped;
struct nbcon_state cur;
unsigned long dropped;
bool done;
unsigned long ulseq;
/*
* This function should never be called for consoles that have not
* implemented the necessary callback for writing: i.e. legacy
* consoles and, when atomic, nbcon consoles with no write_atomic().
* Handle it as if ownership was lost and try to continue.
*
* Note that for nbcon consoles the write_thread() callback is
* mandatory and was already checked in nbcon_alloc().
*/
if (WARN_ON_ONCE((use_atomic && !con->write_atomic) ||
!(console_srcu_read_flags(con) & CON_NBCON))) {
nbcon_context_release(ctxt);
return false;
}
/*
* The printk buffers are filled within an unsafe section. This
......@@ -878,6 +989,29 @@ static bool nbcon_emit_next_record(struct nbcon_write_context *wctxt)
if (dropped && !is_extended)
console_prepend_dropped(&pmsg, dropped);
/*
* If the previous owner was assigned the same record, this context
* has taken over ownership and is replaying the record. Prepend a
* message to let the user know the record is replayed.
*/
ulseq = atomic_long_read(&ACCESS_PRIVATE(con, nbcon_prev_seq));
if (__ulseq_to_u64seq(prb, ulseq) == pmsg.seq) {
console_prepend_replay(&pmsg);
} else {
/*
* Ensure this context is still the owner before trying to
* update @nbcon_prev_seq. Otherwise the value in @ulseq may
* not be from the previous owner and instead be some later
* value from the context that took over ownership.
*/
nbcon_state_read(con, &cur);
if (!nbcon_context_can_proceed(ctxt, &cur))
return false;
atomic_long_try_cmpxchg(&ACCESS_PRIVATE(con, nbcon_prev_seq), &ulseq,
__u64seq_to_ulseq(pmsg.seq));
}
if (!nbcon_context_exit_unsafe(ctxt))
return false;
......@@ -886,22 +1020,27 @@ static bool nbcon_emit_next_record(struct nbcon_write_context *wctxt)
goto update_con;
/* Initialize the write context for driver callbacks. */
wctxt->outbuf = &pmsg.pbufs->outbuf[0];
wctxt->len = pmsg.outbuf_len;
nbcon_state_read(con, &cur);
wctxt->unsafe_takeover = cur.unsafe_takeover;
nbcon_write_context_set_buf(wctxt, &pmsg.pbufs->outbuf[0], pmsg.outbuf_len);
if (use_atomic)
con->write_atomic(con, wctxt);
else
con->write_thread(con, wctxt);
if (!wctxt->outbuf) {
/*
* Ownership was lost and reacquired by the driver. Handle it
* as if ownership was lost.
*/
nbcon_context_release(ctxt);
return false;
}
/*
* Ownership may have been lost but _not_ reacquired by the driver.
* This case is detected and handled when entering unsafe to update
* dropped/seq values.
*/
/*
* Since any dropped message was successfully output, reset the
......@@ -928,54 +1067,650 @@ static bool nbcon_emit_next_record(struct nbcon_write_context *wctxt)
return nbcon_context_exit_unsafe(ctxt);
}
/*
* nbcon_emit_one - Print one record for an nbcon console using the
* specified callback
* @wctxt: An initialized write context struct to use for this context
* @use_atomic: True if the write_atomic() callback is to be used
*
* Return: True, when a record has been printed and there are still
* pending records. The caller might want to continue flushing.
*
* False, when there is no pending record, or when the console
* context cannot be acquired, or the ownership has been lost.
* The caller should give up. Either the job is done, cannot be
* done, or will be handled by the owning context.
*
* This is an internal helper to handle the locking of the console before
* calling nbcon_emit_next_record().
*/
static bool nbcon_emit_one(struct nbcon_write_context *wctxt, bool use_atomic)
{
struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
struct console *con = ctxt->console;
unsigned long flags;
bool ret = false;
if (!use_atomic) {
con->device_lock(con, &flags);
/*
* Ensure this stays on the CPU to make handover and
* takeover possible.
*/
cant_migrate();
}
if (!nbcon_context_try_acquire(ctxt))
goto out;
/*
* nbcon_emit_next_record() returns false when the console was
* handed over or taken over. In both cases the context is no
* longer valid.
*
* The higher priority printing context takes over responsibility
* to print the pending records.
*/
if (!nbcon_emit_next_record(wctxt, use_atomic))
goto out;
nbcon_context_release(ctxt);
ret = ctxt->backlog;
out:
if (!use_atomic)
con->device_unlock(con, flags);
return ret;
}
/**
* nbcon_kthread_should_wakeup - Check whether a printer thread should wake up
* @con: Console to operate on
* @ctxt: The nbcon context from nbcon_context_try_acquire()
*
* Return: True if the thread should shut down or if the console is
* allowed to print and a record is available. False otherwise.
*
* After the thread wakes up, it must first check if it should shut down before
* attempting any printing.
*/
static bool nbcon_kthread_should_wakeup(struct console *con, struct nbcon_context *ctxt)
{
bool ret = false;
short flags;
int cookie;
if (kthread_should_stop())
return true;
cookie = console_srcu_read_lock();
flags = console_srcu_read_flags(con);
if (console_is_usable(con, flags, false)) {
/* Bring the sequence in @ctxt up to date */
ctxt->seq = nbcon_seq_read(con);
ret = prb_read_valid(prb, ctxt->seq, NULL);
}
console_srcu_read_unlock(cookie);
return ret;
}
/**
* nbcon_kthread_func - The printer thread function
* @__console: Console to operate on
*
* Return: 0
*/
static int nbcon_kthread_func(void *__console)
{
struct console *con = __console;
struct nbcon_write_context wctxt = {
.ctxt.console = con,
.ctxt.prio = NBCON_PRIO_NORMAL,
};
struct nbcon_context *ctxt = &ACCESS_PRIVATE(&wctxt, ctxt);
short con_flags;
bool backlog;
int cookie;
wait_for_event:
/*
* Guarantee this task is visible on the rcuwait before
* checking the wake condition.
*
* The full memory barrier within set_current_state() of
* ___rcuwait_wait_event() pairs with the full memory
* barrier within rcuwait_has_sleeper().
*
* This pairs with rcuwait_has_sleeper:A and nbcon_kthread_wake:A.
*/
rcuwait_wait_event(&con->rcuwait,
nbcon_kthread_should_wakeup(con, ctxt),
TASK_INTERRUPTIBLE); /* LMM(nbcon_kthread_func:A) */
do {
if (kthread_should_stop())
return 0;
backlog = false;
/*
* Keep the srcu read lock around the entire operation so that
* synchronize_srcu() can guarantee that the kthread stopped
* or suspended printing.
*/
cookie = console_srcu_read_lock();
con_flags = console_srcu_read_flags(con);
if (console_is_usable(con, con_flags, false))
backlog = nbcon_emit_one(&wctxt, false);
console_srcu_read_unlock(cookie);
cond_resched();
} while (backlog);
goto wait_for_event;
}
/**
* nbcon_irq_work - irq work to wake console printer thread
* @irq_work: The irq work to operate on
*/
static void nbcon_irq_work(struct irq_work *irq_work)
{
struct console *con = container_of(irq_work, struct console, irq_work);
nbcon_kthread_wake(con);
}
static inline bool rcuwait_has_sleeper(struct rcuwait *w)
{
/*
* Guarantee any new records can be seen by tasks preparing to wait
* before this context checks if the rcuwait is empty.
*
* This full memory barrier pairs with the full memory barrier within
* set_current_state() of ___rcuwait_wait_event(), which is called
* after prepare_to_rcuwait() adds the waiter but before it has
* checked the wait condition.
*
* This pairs with nbcon_kthread_func:A.
*/
smp_mb(); /* LMM(rcuwait_has_sleeper:A) */
return rcuwait_active(w);
}
/**
* nbcon_kthreads_wake - Wake up printing threads using irq_work
*/
void nbcon_kthreads_wake(void)
{
struct console *con;
int cookie;
if (!printk_kthreads_running)
return;
cookie = console_srcu_read_lock();
for_each_console_srcu(con) {
if (!(console_srcu_read_flags(con) & CON_NBCON))
continue;
/*
* Only schedule irq_work if the printing thread is
* actively waiting. If not waiting, the thread will
* notice by itself that it has work to do.
*/
if (rcuwait_has_sleeper(&con->rcuwait))
irq_work_queue(&con->irq_work);
}
console_srcu_read_unlock(cookie);
}
/*
* nbcon_kthread_stop - Stop a console printer thread
* @con: Console to operate on
*/
void nbcon_kthread_stop(struct console *con)
{
lockdep_assert_console_list_lock_held();
if (!con->kthread)
return;
kthread_stop(con->kthread);
con->kthread = NULL;
}
/**
* nbcon_kthread_create - Create a console printer thread
* @con: Console to operate on
*
* Return: True if the kthread was started or already exists.
* Otherwise false and @con must not be registered.
*
* This function is called when it will be expected that nbcon consoles are
* flushed using the kthread. The messages printed with NBCON_PRIO_NORMAL
* will no longer be flushed by the legacy loop. This is why failure must
* be fatal for console registration.
*
* If @con was already registered and this function fails, @con must be
* unregistered before the global state variable @printk_kthreads_running
* can be set.
*/
bool nbcon_kthread_create(struct console *con)
{
struct task_struct *kt;
lockdep_assert_console_list_lock_held();
if (con->kthread)
return true;
kt = kthread_run(nbcon_kthread_func, con, "pr/%s%d", con->name, con->index);
if (WARN_ON(IS_ERR(kt))) {
con_printk(KERN_ERR, con, "failed to start printing thread\n");
return false;
}
con->kthread = kt;
/*
* It is important that console printing threads are scheduled
* shortly after a printk call and with generous runtime budgets.
*/
sched_set_normal(con->kthread, -20);
return true;
}
/* Track the nbcon emergency nesting per CPU. */
static DEFINE_PER_CPU(unsigned int, nbcon_pcpu_emergency_nesting);
static unsigned int early_nbcon_pcpu_emergency_nesting __initdata;
/**
* nbcon_get_cpu_emergency_nesting - Get the per CPU emergency nesting pointer
*
* Context: For reading, any context. For writing, any context which cannot
* be migrated to another CPU.
* Return: Either a pointer to the per CPU emergency nesting counter of
* the current CPU or to the init data during early boot.
*
* The function is safe for reading per-CPU variables in any context because
* preemption is disabled if the current CPU is in the emergency state. See
* also nbcon_cpu_emergency_enter().
*/
static __ref unsigned int *nbcon_get_cpu_emergency_nesting(void)
{
/*
* The value of __printk_percpu_data_ready gets set in normal
* context and before SMP initialization. As a result it could
* never change while inside an nbcon emergency section.
*/
if (!printk_percpu_data_ready())
return &early_nbcon_pcpu_emergency_nesting;
return raw_cpu_ptr(&nbcon_pcpu_emergency_nesting);
}
/**
* nbcon_get_default_prio - The appropriate nbcon priority to use for nbcon
* printing on the current CPU
*
* Context: Any context.
* Return: The nbcon_prio to use for acquiring an nbcon console in this
* context for printing.
*
* The function is safe for reading per-CPU data in any context because
* preemption is disabled if the current CPU is in the emergency or panic
* state.
*/
enum nbcon_prio nbcon_get_default_prio(void)
{
unsigned int *cpu_emergency_nesting;
if (this_cpu_in_panic())
return NBCON_PRIO_PANIC;
cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
if (*cpu_emergency_nesting)
return NBCON_PRIO_EMERGENCY;
return NBCON_PRIO_NORMAL;
}
/**
* nbcon_legacy_emit_next_record - Print one record for an nbcon console
* in legacy contexts
* @con: The console to print on
* @handover: Will be set to true if a printk waiter has taken over the
* console_lock, in which case the caller is no longer holding
* both the console_lock and the SRCU read lock. Otherwise it
* is set to false.
* @cookie: The cookie from the SRCU read lock.
* @use_atomic: Set true when called in an atomic or unknown context.
* It affects which nbcon callback will be used: write_atomic()
* or write_thread().
*
* When false, the write_thread() callback is used and would be
* called in a preemptible context unless disabled by the
* device_lock. The legacy handover is not allowed in this mode.
*
* Context: Any context except NMI.
* Return: True, when a record has been printed and there are still
* pending records. The caller might want to continue flushing.
*
* False, when there is no pending record, or when the console
* context cannot be acquired, or the ownership has been lost.
* The caller should give up. Either the job is done, cannot be
* done, or will be handled by the owning context.
*
* This function is meant to be called by console_flush_all() to print records
* on nbcon consoles from legacy context (printing via console unlocking).
* Essentially it is the nbcon version of console_emit_next_record().
*/
bool nbcon_legacy_emit_next_record(struct console *con, bool *handover,
int cookie, bool use_atomic)
{
struct nbcon_write_context wctxt = { };
struct nbcon_context *ctxt = &ACCESS_PRIVATE(&wctxt, ctxt);
unsigned long flags;
bool progress;
ctxt->console = con;
ctxt->prio = nbcon_get_default_prio();
if (use_atomic) {
/*
* In an atomic or unknown context, use the same procedure as
* in console_emit_next_record(). It allows handovers.
*/
printk_safe_enter_irqsave(flags);
console_lock_spinning_enable();
stop_critical_timings();
}
progress = nbcon_emit_one(&wctxt, use_atomic);
if (use_atomic) {
start_critical_timings();
*handover = console_lock_spinning_disable_and_check(cookie);
printk_safe_exit_irqrestore(flags);
} else {
/* Non-atomic does not perform legacy spinning handovers. */
*handover = false;
}
return progress;
}
/**
* __nbcon_atomic_flush_pending_con - Flush specified nbcon console using its
* write_atomic() callback
* @con: The nbcon console to flush
* @stop_seq: Flush up until this record
* @allow_unsafe_takeover: True, to allow unsafe hostile takeovers
*
* Return: 0 if @con was flushed up to @stop_seq. Otherwise, error code on
* failure.
*
* Errors:
*
* -EPERM: Unable to acquire console ownership.
*
* -EAGAIN: Another context took over ownership while printing.
*
* -ENOENT: A record before @stop_seq is not available.
*
* If flushing up to @stop_seq was not successful, it only makes sense for the
* caller to try again when -EAGAIN was returned. When -EPERM is returned,
* this context is not allowed to acquire the console. When -ENOENT is
* returned, it cannot be expected that the unfinalized record will become
* available.
*/
static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
bool allow_unsafe_takeover)
{
struct nbcon_write_context wctxt = { };
struct nbcon_context *ctxt = &ACCESS_PRIVATE(&wctxt, ctxt);
int err = 0;
ctxt->console = con;
ctxt->spinwait_max_us = 2000;
ctxt->prio = nbcon_get_default_prio();
ctxt->allow_unsafe_takeover = allow_unsafe_takeover;
if (!nbcon_context_try_acquire(ctxt))
return -EPERM;
while (nbcon_seq_read(con) < stop_seq) {
/*
* nbcon_emit_next_record() returns false when the console was
* handed over or taken over. In both cases the context is no
* longer valid.
*/
if (!nbcon_emit_next_record(&wctxt, true))
return -EAGAIN;
if (!ctxt->backlog) {
/* Are there reserved but not yet finalized records? */
if (nbcon_seq_read(con) < stop_seq)
err = -ENOENT;
break;
}
}
nbcon_context_release(ctxt);
return err;
}
/**
* nbcon_atomic_flush_pending_con - Flush specified nbcon console using its
* write_atomic() callback
* @con: The nbcon console to flush
* @stop_seq: Flush up until this record
* @allow_unsafe_takeover: True, to allow unsafe hostile takeovers
*
* This will stop flushing before @stop_seq if another context has ownership.
* That context is then responsible for the flushing. Likewise, if new records
* are added while this context was flushing and there is no other context
* to handle the printing, this context must also flush those records.
*/
static void nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
bool allow_unsafe_takeover)
{
struct console_flush_type ft;
unsigned long flags;
int err;
again:
/*
* Atomic flushing does not use console driver synchronization (i.e.
* it does not hold the port lock for uart consoles). Therefore IRQs
* must be disabled to avoid being interrupted and then calling into
* a driver that will deadlock trying to acquire console ownership.
*/
local_irq_save(flags);
err = __nbcon_atomic_flush_pending_con(con, stop_seq, allow_unsafe_takeover);
local_irq_restore(flags);
/*
* If there was a new owner (-EPERM, -EAGAIN), that context is
* responsible for completing.
*
* Do not wait for records not yet finalized (-ENOENT) to avoid a
* possible deadlock. They will either get flushed by the writer or
* eventually skipped on the panic CPU.
*/
if (err)
return;
/*
* If flushing was successful but more records are available, this
* context must flush those remaining records if the printer thread
* is not available to do it.
*/
printk_get_console_flush_type(&ft);
if (!ft.nbcon_offload &&
prb_read_valid(prb, nbcon_seq_read(con), NULL)) {
stop_seq = prb_next_reserve_seq(prb);
goto again;
}
}
/**
* __nbcon_atomic_flush_pending - Flush all nbcon consoles using their
* write_atomic() callback
* @stop_seq: Flush up until this record
* @allow_unsafe_takeover: True, to allow unsafe hostile takeovers
*/
static void __nbcon_atomic_flush_pending(u64 stop_seq, bool allow_unsafe_takeover)
{
struct console *con;
int cookie;
cookie = console_srcu_read_lock();
for_each_console_srcu(con) {
short flags = console_srcu_read_flags(con);
if (!(flags & CON_NBCON))
continue;
if (!console_is_usable(con, flags, true))
continue;
if (nbcon_seq_read(con) >= stop_seq)
continue;
nbcon_atomic_flush_pending_con(con, stop_seq, allow_unsafe_takeover);
}
console_srcu_read_unlock(cookie);
}
/**
* nbcon_atomic_flush_pending - Flush all nbcon consoles using their
* write_atomic() callback
*
* Flush the backlog up through the currently newest record. Any new
* records added while flushing will not be flushed if there is another
* context available to handle the flushing. This is to avoid one CPU
* printing unbounded because other CPUs continue to add records.
*/
void nbcon_atomic_flush_pending(void)
{
__nbcon_atomic_flush_pending(prb_next_reserve_seq(prb), false);
}
/**
* nbcon_atomic_flush_unsafe - Flush all nbcon consoles using their
* write_atomic() callback and allowing unsafe hostile takeovers
*
* Flush the backlog up through the currently newest record. Unsafe hostile
* takeovers will be performed, if necessary.
*/
void nbcon_atomic_flush_unsafe(void)
{
__nbcon_atomic_flush_pending(prb_next_reserve_seq(prb), true);
}
/**
* nbcon_cpu_emergency_enter - Enter an emergency section where printk()
* messages for that CPU are flushed directly
*
* Context: Any context. Disables preemption.
*
* When within an emergency section, printk() calls will attempt to flush any
* pending messages in the ringbuffer.
*/
void nbcon_cpu_emergency_enter(void)
{
unsigned int *cpu_emergency_nesting;
preempt_disable();
cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
(*cpu_emergency_nesting)++;
}
/**
* nbcon_cpu_emergency_exit - Exit an emergency section
*
* Context: Within an emergency section. Enables preemption.
*/
void nbcon_cpu_emergency_exit(void)
{
unsigned int *cpu_emergency_nesting;
cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
if (!WARN_ON_ONCE(*cpu_emergency_nesting == 0))
(*cpu_emergency_nesting)--;
preempt_enable();
}
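As a usage sketch, mirroring the RCU stall-warning hunks at the end of this diff: a subsystem brackets a multi-line warning dump in an emergency section so that each printk() is flushed directly rather than offloaded to the kthreads (the lockup message is only an example):

	nbcon_cpu_emergency_enter();
	pr_warn("watchdog: BUG: soft lockup - CPU stuck\n");
	/* ... dump stacks and related state ... */
	nbcon_cpu_emergency_exit();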
/**
* nbcon_alloc - Allocate and init the nbcon console specific data
* @con: Console to initialize
*
* Return: True if the console was fully allocated and initialized.
* Otherwise @con must not be registered.
*
* When allocation and init was successful, the console must be properly
* freed using nbcon_free() once it is no longer needed.
*/
bool nbcon_alloc(struct console *con)
{
struct nbcon_state state = { };
/* The write_thread() callback is mandatory. */
if (WARN_ON(!con->write_thread))
return false;
rcuwait_init(&con->rcuwait);
init_irq_work(&con->irq_work, nbcon_irq_work);
atomic_long_set(&ACCESS_PRIVATE(con, nbcon_prev_seq), -1UL);
nbcon_state_set(con, &state);
/*
* Initialize @nbcon_seq to the highest possible sequence number so
* that practically speaking it will have nothing to print until a
* desired initial sequence number has been set via nbcon_seq_force().
*/
atomic_long_set(&ACCESS_PRIVATE(con, nbcon_seq), ULSEQ_MAX(prb));
if (con->flags & CON_BOOT) {
/*
* Boot console printing is synchronized with legacy console
* printing, so boot consoles can share the same global printk
* buffers.
*/
con->pbufs = &printk_shared_pbufs;
} else {
con->pbufs = kmalloc(sizeof(*con->pbufs), GFP_KERNEL);
if (!con->pbufs) {
con_printk(KERN_ERR, con, "failed to allocate printing buffer\n");
return false;
}
if (printk_kthreads_running) {
if (!nbcon_kthread_create(con)) {
kfree(con->pbufs);
con->pbufs = NULL;
return false;
}
}
}
return true;
}
/**
......@@ -986,6 +1721,9 @@ void nbcon_free(struct console *con)
{
struct nbcon_state state = { };
if (printk_kthreads_running)
nbcon_kthread_stop(con);
nbcon_state_set(con, &state);
/* Boot consoles share global printk buffers. */
......@@ -994,3 +1732,85 @@ void nbcon_free(struct console *con)
con->pbufs = NULL;
}
/**
* nbcon_device_try_acquire - Try to acquire nbcon console and enter unsafe
* section
* @con: The nbcon console to acquire
*
* Context: Under the locking mechanism implemented in
* @con->device_lock() including disabling migration.
* Return: True if the console was acquired. False otherwise.
*
* Console drivers will usually use their own internal synchronization
* mechanism to synchronize between console printing and non-printing
* activities (such as setting baud rates). However, nbcon console drivers
* supporting atomic consoles may also want to mark unsafe sections when
* performing non-printing activities in order to synchronize against their
* write_atomic() callback.
*
* This function acquires the nbcon console using priority NBCON_PRIO_NORMAL
* and marks it unsafe for handover/takeover.
*/
bool nbcon_device_try_acquire(struct console *con)
{
struct nbcon_context *ctxt = &ACCESS_PRIVATE(con, nbcon_device_ctxt);
cant_migrate();
memset(ctxt, 0, sizeof(*ctxt));
ctxt->console = con;
ctxt->prio = NBCON_PRIO_NORMAL;
if (!nbcon_context_try_acquire(ctxt))
return false;
if (!nbcon_context_enter_unsafe(ctxt))
return false;
return true;
}
EXPORT_SYMBOL_GPL(nbcon_device_try_acquire);
/**
* nbcon_device_release - Exit unsafe section and release the nbcon console
* @con: The nbcon console acquired in nbcon_device_try_acquire()
*/
void nbcon_device_release(struct console *con)
{
struct nbcon_context *ctxt = &ACCESS_PRIVATE(con, nbcon_device_ctxt);
struct console_flush_type ft;
int cookie;
if (!nbcon_context_exit_unsafe(ctxt))
return;
nbcon_context_release(ctxt);
/*
* This context must flush any new records added while the console
* was locked if the printer thread is not available to do it. The
* console_srcu_read_lock must be taken to ensure the console is
* usable throughout flushing.
*/
cookie = console_srcu_read_lock();
printk_get_console_flush_type(&ft);
if (console_is_usable(con, console_srcu_read_flags(con), true) &&
!ft.nbcon_offload &&
prb_read_valid(prb, nbcon_seq_read(con), NULL)) {
/*
* If nbcon_atomic flushing is not available, fall back to
* using the legacy loop.
*/
if (ft.nbcon_atomic) {
__nbcon_atomic_flush_pending_con(con, prb_next_reserve_seq(prb), false);
} else if (ft.legacy_direct) {
if (console_trylock())
console_unlock();
} else if (ft.legacy_offload) {
printk_trigger_flush();
}
}
console_srcu_read_unlock(cookie);
}
EXPORT_SYMBOL_GPL(nbcon_device_release);
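Drivers normally reach these two functions through their device-lock wrappers (for example the uart_port_lock() API). A minimal sketch of the expected pattern, with my_set_baud() as a hypothetical non-printing operation:

static void my_set_baud(struct console *con, unsigned int baud)
{
	unsigned long flags;

	con->device_lock(con, &flags);	/* e.g. takes port->lock */
	if (nbcon_device_try_acquire(con)) {
		/* ... reprogram the hardware, an unsafe section ... */
		nbcon_device_release(con);
	}
	con->device_unlock(con, flags);
}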
......@@ -34,6 +34,7 @@
#include <linux/security.h>
#include <linux/memblock.h>
#include <linux/syscalls.h>
#include <linux/syscore_ops.h>
#include <linux/vmcore_info.h>
#include <linux/ratelimit.h>
#include <linux/kmsg_dump.h>
......@@ -282,6 +283,7 @@ EXPORT_SYMBOL(console_list_unlock);
* Return: A cookie to pass to console_srcu_read_unlock().
*/
int console_srcu_read_lock(void)
__acquires(&console_srcu)
{
return srcu_read_lock_nmisafe(&console_srcu);
}
......@@ -295,6 +297,7 @@ EXPORT_SYMBOL(console_srcu_read_lock);
* Counterpart to console_srcu_read_lock()
*/
void console_srcu_read_unlock(int cookie)
__releases(&console_srcu)
{
srcu_read_unlock_nmisafe(&console_srcu, cookie);
}
......@@ -461,14 +464,43 @@ static int console_msg_format = MSG_FORMAT_DEFAULT;
/* syslog_lock protects syslog_* variables and write access to clear_seq. */
static DEFINE_MUTEX(syslog_lock);
/*
* Specifies if a legacy console is registered. If legacy consoles are
* present, it is necessary to perform the console lock/unlock dance
* whenever console flushing should occur.
*/
bool have_legacy_console;
/*
* Specifies if an nbcon console is registered. If nbcon consoles are present,
* synchronous printing of legacy consoles will not occur during panic until
* the backtrace has been stored to the ringbuffer.
*/
bool have_nbcon_console;
/*
* Specifies if a boot console is registered. If boot consoles are present,
* nbcon consoles cannot print simultaneously and must be synchronized by
* the console lock. This is because boot consoles and nbcon consoles may
* have mapped the same hardware.
*/
bool have_boot_console;
/* See printk_legacy_allow_panic_sync() for details. */
bool legacy_allow_panic_sync;
#ifdef CONFIG_PRINTK
DECLARE_WAIT_QUEUE_HEAD(log_wait);
static DECLARE_WAIT_QUEUE_HEAD(legacy_wait);
/* All 3 protected by @syslog_lock. */
/* the next printk record to read by syslog(READ) or /proc/kmsg */
static u64 syslog_seq;
static size_t syslog_partial;
static bool syslog_time;
/* True when _all_ printer threads are available for printing. */
bool printk_kthreads_running;
struct latched_seq {
seqcount_latch_t latch;
u64 val[2];
......@@ -1850,7 +1882,7 @@ static bool console_waiter;
* there may be a waiter spinning (like a spinlock). Also it must be
* ready to hand over the lock at the end of the section.
*/
static void console_lock_spinning_enable(void)
void console_lock_spinning_enable(void)
{
/*
* Do not use spinning in panic(). The panic CPU wants to keep the lock.
......@@ -1889,7 +1921,7 @@ static void console_lock_spinning_enable(void)
*
* Return: 1 if the lock rights were passed, 0 otherwise.
*/
static int console_lock_spinning_disable_and_check(int cookie)
int console_lock_spinning_disable_and_check(int cookie)
{
int waiter;
......@@ -2300,12 +2332,30 @@ int vprintk_store(int facility, int level,
return ret;
}
/*
* This acts as a one-way switch to allow legacy consoles to print from
* the printk() caller context on a panic CPU. It also attempts to flush
* the legacy consoles in this context.
*/
void printk_legacy_allow_panic_sync(void)
{
struct console_flush_type ft;
legacy_allow_panic_sync = true;
printk_get_console_flush_type(&ft);
if (ft.legacy_direct) {
if (console_trylock())
console_unlock();
}
}
asmlinkage int vprintk_emit(int facility, int level,
const struct dev_printk_info *dev_info,
const char *fmt, va_list args)
{
struct console_flush_type ft;
int printed_len;
/* Suppress unimportant messages after panic happens */
if (unlikely(suppress_printk))
......@@ -2319,17 +2369,26 @@ asmlinkage int vprintk_emit(int facility, int level,
if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
return 0;
printk_get_console_flush_type(&ft);
/* If called from the scheduler, we can not call up(). */
if (level == LOGLEVEL_SCHED) {
level = LOGLEVEL_DEFAULT;
ft.legacy_offload |= ft.legacy_direct;
ft.legacy_direct = false;
}
printk_delay(level);
printed_len = vprintk_store(facility, level, dev_info, fmt, args);
if (ft.nbcon_atomic)
nbcon_atomic_flush_pending();
if (ft.nbcon_offload)
nbcon_kthreads_wake();
if (ft.legacy_direct) {
/*
* The caller may be holding system-critical or
* timing-sensitive locks. Disable preemption during
......@@ -2349,7 +2408,7 @@ asmlinkage int vprintk_emit(int facility, int level,
preempt_enable();
}
if (ft.legacy_offload)
defer_console_output();
else
wake_up_klogd();
......@@ -2678,6 +2737,7 @@ void suspend_console(void)
void resume_console(void)
{
struct console_flush_type ft;
struct console *con;
if (!console_suspend_enabled)
......@@ -2695,6 +2755,12 @@ void resume_console(void)
*/
synchronize_srcu(&console_srcu);
printk_get_console_flush_type(&ft);
if (ft.nbcon_offload)
nbcon_kthreads_wake();
if (ft.legacy_offload)
defer_console_output();
pr_flush(1000, true);
}
......@@ -2709,10 +2775,16 @@ void resume_console(void)
*/
static int console_cpu_notify(unsigned int cpu)
{
struct console_flush_type ft;
if (!cpuhp_tasks_frozen) {
printk_get_console_flush_type(&ft);
if (ft.nbcon_atomic)
nbcon_atomic_flush_pending();
if (ft.legacy_direct) {
/* If trylock fails, someone else is doing the printing */
if (console_trylock())
console_unlock();
}
}
return 0;
}
......@@ -2766,36 +2838,6 @@ int is_console_locked(void)
}
EXPORT_SYMBOL(is_console_locked);
static void __console_unlock(void)
{
console_locked = 0;
......@@ -2805,30 +2847,31 @@ static void __console_unlock(void)
#ifdef CONFIG_PRINTK
/*
* Prepend the message in @pmsg->pbufs->outbuf. This is achieved by shifting
* the existing message over and inserting the scratchbuf message.
*
* @pmsg is the original printk message.
* @fmt is the printf format of the message which will prepend the existing one.
*
* If there is not enough space in @pmsg->pbufs->outbuf, the existing
* message text will be sufficiently truncated.
*
* If @pmsg->pbufs->outbuf is modified, @pmsg->outbuf_len is updated.
*/
__printf(2, 3)
static void console_prepend_message(struct printk_message *pmsg, const char *fmt, ...)
{
struct printk_buffers *pbufs = pmsg->pbufs;
const size_t scratchbuf_sz = sizeof(pbufs->scratchbuf);
const size_t outbuf_sz = sizeof(pbufs->outbuf);
char *scratchbuf = &pbufs->scratchbuf[0];
char *outbuf = &pbufs->outbuf[0];
va_list args;
size_t len;
va_start(args, fmt);
len = vscnprintf(scratchbuf, scratchbuf_sz, fmt, args);
va_end(args);
/*
* Make sure outbuf is sufficiently large before prepending.
......@@ -2850,6 +2893,30 @@ void console_prepend_dropped(struct printk_message *pmsg, unsigned long dropped)
pmsg->outbuf_len += len;
}
/*
* Prepend the message in @pmsg->pbufs->outbuf with a "dropped message".
* @pmsg->outbuf_len is updated appropriately.
*
* @pmsg is the printk message to prepend.
*
* @dropped is the dropped count to report in the dropped message.
*/
void console_prepend_dropped(struct printk_message *pmsg, unsigned long dropped)
{
console_prepend_message(pmsg, "** %lu printk messages dropped **\n", dropped);
}
/*
* Prepend the message in @pmsg->pbufs->outbuf with a "replay message".
* @pmsg->outbuf_len is updated appropriately.
*
* @pmsg is the printk message to prepend.
*/
void console_prepend_replay(struct printk_message *pmsg)
{
console_prepend_message(pmsg, "** replaying previous printk message **\n");
}
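For example, after a takeover forces a replay, the next output on the console would read (an illustrative transcript, not captured output):

** replaying previous printk message **
<the interrupted record, emitted again from its first byte>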
/*
* Read and format the specified record (or a later record if the specified
* record is not available).
......@@ -2915,6 +2982,34 @@ bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
return true;
}
/*
* Legacy console printing from printk() caller context does not respect
* raw_spinlock/spinlock nesting. For !PREEMPT_RT the lockdep warning is a
* false positive. For PREEMPT_RT the false positive condition does not
* occur.
*
* This map is used to temporarily establish LD_WAIT_SLEEP context for the
* console write() callback when legacy printing to avoid false positive
* lockdep complaints, thus allowing lockdep to continue to function for
* real issues.
*/
#ifdef CONFIG_PREEMPT_RT
static inline void printk_legacy_allow_spinlock_enter(void) { }
static inline void printk_legacy_allow_spinlock_exit(void) { }
#else
static DEFINE_WAIT_OVERRIDE_MAP(printk_legacy_map, LD_WAIT_SLEEP);
static inline void printk_legacy_allow_spinlock_enter(void)
{
lock_map_acquire_try(&printk_legacy_map);
}
static inline void printk_legacy_allow_spinlock_exit(void)
{
lock_map_release(&printk_legacy_map);
}
#endif /* CONFIG_PREEMPT_RT */
/*
* Used as the printk buffers for non-panic, serialized console printing.
* This is for legacy (!CON_NBCON) as well as all boot (CON_BOOT) consoles.
......@@ -2964,31 +3059,46 @@ static bool console_emit_next_record(struct console *con, bool *handover, int co
con->dropped = 0;
}
if (force_legacy_kthread() && !panic_in_progress()) {
/*
* With forced threading this function is in a task context
* (either legacy kthread or get_init_console_seq()). There
* is no need for concern about printk reentrance, handovers,
* or lockdep complaints.
*/
con->write(con, outbuf, pmsg.outbuf_len);
con->seq = pmsg.seq + 1;
} else {
/*
* While actively printing out messages, if another printk()
* were to occur on another CPU, it may wait for this one to
* finish. This task can not be preempted if there is a
* waiter waiting to take over.
*
* Interrupts are disabled because the hand over to a waiter
* must not be interrupted until the hand over is completed
* (@console_waiter is cleared).
*/
printk_safe_enter_irqsave(flags);
console_lock_spinning_enable();
/* Do not trace print latency. */
stop_critical_timings();
printk_legacy_allow_spinlock_enter();
con->write(con, outbuf, pmsg.outbuf_len);
printk_legacy_allow_spinlock_exit();
start_critical_timings();
con->seq = pmsg.seq + 1;
*handover = console_lock_spinning_disable_and_check(cookie);
printk_safe_exit_irqrestore(flags);
}
skip:
return true;
}
......@@ -3001,6 +3111,8 @@ static bool console_emit_next_record(struct console *con, bool *handover, int co
return false;
}
static inline void printk_kthreads_check_locked(void) { }
#endif /* CONFIG_PRINTK */
/*
......@@ -3028,6 +3140,7 @@ static bool console_emit_next_record(struct console *con, bool *handover, int co
*/
static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handover)
{
struct console_flush_type ft;
bool any_usable = false;
struct console *con;
bool any_progress;
......@@ -3039,15 +3152,34 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove
do {
any_progress = false;
printk_get_console_flush_type(&ft);
cookie = console_srcu_read_lock();
for_each_console_srcu(con) {
short flags = console_srcu_read_flags(con);
u64 printk_seq;
bool progress;
if (!console_is_usable(con))
/*
* console_flush_all() is only responsible for nbcon
* consoles when the nbcon consoles cannot print via
* their atomic or threaded flushing.
*/
if ((flags & CON_NBCON) && (ft.nbcon_atomic || ft.nbcon_offload))
continue;
if (!console_is_usable(con, flags, !do_cond_resched))
continue;
any_usable = true;
progress = console_emit_next_record(con, handover, cookie);
if (flags & CON_NBCON) {
progress = nbcon_legacy_emit_next_record(con, handover, cookie,
!do_cond_resched);
printk_seq = nbcon_seq_read(con);
} else {
progress = console_emit_next_record(con, handover, cookie);
printk_seq = con->seq;
}
/*
* If a handover has occurred, the SRCU read lock
......@@ -3057,8 +3189,8 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove
return false;
/* Track the next of the highest seq flushed. */
if (printk_seq > *next_seq)
*next_seq = printk_seq;
if (!progress)
continue;
......@@ -3081,19 +3213,7 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove
return false;
}
static void __console_flush_and_unlock(void)
{
bool do_cond_resched;
bool handover;
......@@ -3137,6 +3257,29 @@ void console_unlock(void)
*/
} while (prb_read_valid(prb, next_seq, NULL) && console_trylock());
}
/**
* console_unlock - unblock the legacy console subsystem from printing
*
* Releases the console_lock which the caller holds to block printing of
* the legacy console subsystem.
*
* While the console_lock was held, console output may have been buffered
* by printk(). If this is the case, console_unlock() emits the output on
* legacy consoles prior to releasing the lock.
*
* console_unlock() may be called from any context.
*/
void console_unlock(void)
{
struct console_flush_type ft;
printk_get_console_flush_type(&ft);
if (ft.legacy_direct)
__console_flush_and_unlock();
else
__console_unlock();
}
EXPORT_SYMBOL(console_unlock);
/**
......@@ -3259,6 +3402,7 @@ static void __console_rewind_all(void)
*/
void console_flush_on_panic(enum con_flush_mode mode)
{
struct console_flush_type ft;
bool handover;
u64 next_seq;
......@@ -3282,7 +3426,13 @@ void console_flush_on_panic(enum con_flush_mode mode)
if (mode == CONSOLE_REPLAY_ALL)
__console_rewind_all();
printk_get_console_flush_type(&ft);
if (ft.nbcon_atomic)
nbcon_atomic_flush_pending();
/* Flush legacy consoles once allowed, even when dangerous. */
if (legacy_allow_panic_sync)
console_flush_all(false, &next_seq, &handover);
}
/*
......@@ -3339,13 +3489,236 @@ EXPORT_SYMBOL(console_stop);
void console_start(struct console *console)
{
struct console_flush_type ft;
bool is_nbcon;
console_list_lock();
console_srcu_write_flags(console, console->flags | CON_ENABLED);
is_nbcon = console->flags & CON_NBCON;
console_list_unlock();
/*
* Ensure that all SRCU list walks have completed. The related
* printing context must be able to see it is enabled so that
* it is guaranteed to wake up and resume printing.
*/
synchronize_srcu(&console_srcu);
printk_get_console_flush_type(&ft);
if (is_nbcon && ft.nbcon_offload)
nbcon_kthread_wake(console);
else if (ft.legacy_offload)
defer_console_output();
__pr_flush(console, 1000, true);
}
EXPORT_SYMBOL(console_start);
#ifdef CONFIG_PRINTK
static int unregister_console_locked(struct console *console);
/* True when system boot is far enough to create printer threads. */
static bool printk_kthreads_ready __ro_after_init;
static struct task_struct *printk_legacy_kthread;
static bool legacy_kthread_should_wakeup(void)
{
struct console_flush_type ft;
struct console *con;
bool ret = false;
int cookie;
if (kthread_should_stop())
return true;
printk_get_console_flush_type(&ft);
cookie = console_srcu_read_lock();
for_each_console_srcu(con) {
short flags = console_srcu_read_flags(con);
u64 printk_seq;
/*
* The legacy printer thread is only responsible for nbcon
* consoles when the nbcon consoles cannot print via their
* atomic or threaded flushing.
*/
if ((flags & CON_NBCON) && (ft.nbcon_atomic || ft.nbcon_offload))
continue;
if (!console_is_usable(con, flags, false))
continue;
if (flags & CON_NBCON) {
printk_seq = nbcon_seq_read(con);
} else {
/*
* It is safe to read @seq because only this
* thread context updates @seq.
*/
printk_seq = con->seq;
}
if (prb_read_valid(prb, printk_seq, NULL)) {
ret = true;
break;
}
}
console_srcu_read_unlock(cookie);
return ret;
}
static int legacy_kthread_func(void *unused)
{
for (;;) {
wait_event_interruptible(legacy_wait, legacy_kthread_should_wakeup());
if (kthread_should_stop())
break;
console_lock();
__console_flush_and_unlock();
}
return 0;
}
static bool legacy_kthread_create(void)
{
struct task_struct *kt;
lockdep_assert_console_list_lock_held();
kt = kthread_run(legacy_kthread_func, NULL, "pr/legacy");
if (WARN_ON(IS_ERR(kt))) {
pr_err("failed to start legacy printing thread\n");
return false;
}
printk_legacy_kthread = kt;
/*
* It is important that console printing threads are scheduled
* shortly after a printk call and with generous runtime budgets.
*/
sched_set_normal(printk_legacy_kthread, -20);
return true;
}
/**
* printk_kthreads_shutdown - shutdown all threaded printers
*
* On system shutdown all threaded printers are stopped. This allows printk
* to transition back to atomic printing, thus providing a robust mechanism
* for the final shutdown/reboot messages to be output.
*/
static void printk_kthreads_shutdown(void)
{
struct console *con;
console_list_lock();
if (printk_kthreads_running) {
printk_kthreads_running = false;
for_each_console(con) {
if (con->flags & CON_NBCON)
nbcon_kthread_stop(con);
}
/*
* The threads may have been stopped while printing a
* backlog. Flush any records left over.
*/
nbcon_atomic_flush_pending();
}
console_list_unlock();
}
static struct syscore_ops printk_syscore_ops = {
.shutdown = printk_kthreads_shutdown,
};
/*
* If appropriate, start nbcon kthreads and set @printk_kthreads_running.
* If any kthreads fail to start, those consoles are unregistered.
*
* Must be called under console_list_lock().
*/
static void printk_kthreads_check_locked(void)
{
struct hlist_node *tmp;
struct console *con;
lockdep_assert_console_list_lock_held();
if (!printk_kthreads_ready)
return;
if (have_legacy_console || have_boot_console) {
if (!printk_legacy_kthread &&
force_legacy_kthread() &&
!legacy_kthread_create()) {
/*
* All legacy consoles must be unregistered. If there
* are any nbcon consoles, they will set up their own
* kthread.
*/
hlist_for_each_entry_safe(con, tmp, &console_list, node) {
if (con->flags & CON_NBCON)
continue;
unregister_console_locked(con);
}
}
} else if (printk_legacy_kthread) {
kthread_stop(printk_legacy_kthread);
printk_legacy_kthread = NULL;
}
/*
* Printer threads cannot be started as long as any boot console is
* registered because there is no way to synchronize the hardware
* registers between boot console code and regular console code.
* It can only be known that there will be no new boot consoles when
* an nbcon console is registered.
*/
if (have_boot_console || !have_nbcon_console) {
/* Clear flag in case all nbcon consoles unregistered. */
printk_kthreads_running = false;
return;
}
if (printk_kthreads_running)
return;
hlist_for_each_entry_safe(con, tmp, &console_list, node) {
if (!(con->flags & CON_NBCON))
continue;
if (!nbcon_kthread_create(con))
unregister_console_locked(con);
}
printk_kthreads_running = true;
}
static int __init printk_set_kthreads_ready(void)
{
register_syscore_ops(&printk_syscore_ops);
console_list_lock();
printk_kthreads_ready = true;
printk_kthreads_check_locked();
console_list_unlock();
return 0;
}
early_initcall(printk_set_kthreads_ready);
#endif /* CONFIG_PRINTK */
static int __read_mostly keep_bootcon;
static int __init keep_bootcon_setup(char *str)
......@@ -3447,19 +3820,21 @@ static void try_enable_default_console(struct console *newcon)
newcon->flags |= CON_CONSDEV;
}
/* Return the starting sequence number for a newly registered console. */
static u64 get_init_console_seq(struct console *newcon, bool bootcon_registered)
{
struct console *con;
bool handover;
u64 init_seq;
if (newcon->flags & (CON_PRINTBUFFER | CON_BOOT)) {
/* Get a consistent copy of @syslog_seq. */
mutex_lock(&syslog_lock);
init_seq = syslog_seq;
mutex_unlock(&syslog_lock);
} else {
/* Begin with next message added to ringbuffer. */
init_seq = prb_next_seq(prb);
/*
* If any enabled boot consoles are due to be unregistered
......@@ -3480,7 +3855,7 @@ static void console_init_seq(struct console *newcon, bool bootcon_registered)
* Flush all consoles and set the console to start at
* the next unprinted sequence number.
*/
if (!console_flush_all(true, &init_seq, &handover)) {
/*
* Flushing failed. Just choose the lowest
* sequence of the enabled boot consoles.
......@@ -3493,19 +3868,30 @@ static void console_init_seq(struct console *newcon, bool bootcon_registered)
if (handover)
console_lock();
init_seq = prb_next_seq(prb);
for_each_console(con) {
u64 seq;
if (!(con->flags & CON_BOOT) ||
!(con->flags & CON_ENABLED)) {
continue;
}
if (con->flags & CON_NBCON)
seq = nbcon_seq_read(con);
else
seq = con->seq;
if (seq < init_seq)
init_seq = seq;
}
}
console_unlock();
}
}
return init_seq;
}
#define console_first() \
......@@ -3534,9 +3920,12 @@ static int unregister_console_locked(struct console *console);
*/
void register_console(struct console *newcon)
{
bool use_device_lock = (newcon->flags & CON_NBCON) && newcon->write_atomic;
bool bootcon_registered = false;
bool realcon_registered = false;
struct console *con;
unsigned long flags;
u64 init_seq;
int err;
console_list_lock();
......@@ -3614,10 +4003,31 @@ void register_console(struct console *newcon)
}
newcon->dropped = 0;
init_seq = get_init_console_seq(newcon, bootcon_registered);
if (newcon->flags & CON_NBCON) {
have_nbcon_console = true;
nbcon_seq_force(newcon, init_seq);
} else {
have_legacy_console = true;
newcon->seq = init_seq;
}
if (newcon->flags & CON_BOOT)
have_boot_console = true;
/*
* If another context is actively using the hardware of this new
* console, it will not be aware of the nbcon synchronization. This
* is a risk that two contexts could access the hardware
* simultaneously if this new console is used for atomic printing
* and the other context is still using the hardware.
*
* Use the driver synchronization to ensure that the hardware is not
* in use while this new console transitions to being registered.
*/
if (use_device_lock)
newcon->device_lock(newcon, &flags);
/*
* Put this console in the list - keep the
......@@ -3643,6 +4053,10 @@ void register_console(struct console *newcon)
* register_console() completes.
*/
/* This new console is now registered. */
if (use_device_lock)
newcon->device_unlock(newcon, flags);
console_sysfs_notify();
/*
......@@ -3663,6 +4077,9 @@ void register_console(struct console *newcon)
unregister_console_locked(con);
}
}
/* Changed console list, may require printer threads to start/stop. */
printk_kthreads_check_locked();
unlock:
console_list_unlock();
}
......@@ -3671,6 +4088,12 @@ EXPORT_SYMBOL(register_console);
/* Must be called under console_list_lock(). */
static int unregister_console_locked(struct console *console)
{
bool use_device_lock = (console->flags & CON_NBCON) && console->write_atomic;
bool found_legacy_con = false;
bool found_nbcon_con = false;
bool found_boot_con = false;
unsigned long flags;
struct console *c;
int res;
lockdep_assert_console_list_lock_held();
......@@ -3683,14 +4106,29 @@ static int unregister_console_locked(struct console *console)
if (res > 0)
return 0;
if (!console_is_registered_locked(console))
res = -ENODEV;
else if (console_is_usable(console, console->flags, true))
__pr_flush(console, 1000, true);
/* Disable it unconditionally */
console_srcu_write_flags(console, console->flags & ~CON_ENABLED);
if (res < 0)
return res;
/*
* Use the driver synchronization to ensure that the hardware is not
* in use while this console transitions to being unregistered.
*/
if (use_device_lock)
console->device_lock(console, &flags);
hlist_del_init_rcu(&console->node);
if (use_device_lock)
console->device_unlock(console, flags);
/*
* <HISTORICAL>
* If this isn't the last console and it has CON_CONSDEV set, we
......@@ -3718,6 +4156,29 @@ static int unregister_console_locked(struct console *console)
if (console->exit)
res = console->exit(console);
/*
* With this console gone, the global flags tracking registered
* console types may have changed. Update them.
*/
for_each_console(c) {
if (c->flags & CON_BOOT)
found_boot_con = true;
if (c->flags & CON_NBCON)
found_nbcon_con = true;
else
found_legacy_con = true;
}
if (!found_boot_con)
have_boot_console = found_boot_con;
if (!found_legacy_con)
have_legacy_console = found_legacy_con;
if (!found_nbcon_con)
have_nbcon_console = found_nbcon_con;
/* Changed console list, may require printer threads to start/stop. */
printk_kthreads_check_locked();
return res;
}
......@@ -3864,6 +4325,7 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
{
unsigned long timeout_jiffies = msecs_to_jiffies(timeout_ms);
unsigned long remaining_jiffies = timeout_jiffies;
struct console_flush_type ft;
struct console *c;
u64 last_diff = 0;
u64 printk_seq;
......@@ -3872,13 +4334,22 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
u64 diff;
u64 seq;
/* Sorry, pr_flush() will not work this early. */
if (system_state < SYSTEM_SCHEDULING)
return false;
might_sleep();
seq = prb_next_reserve_seq(prb);
/* Flush the consoles so that records up to @seq are printed. */
console_lock();
console_unlock();
printk_get_console_flush_type(&ft);
if (ft.nbcon_atomic)
nbcon_atomic_flush_pending();
if (ft.legacy_direct) {
console_lock();
console_unlock();
}
for (;;) {
unsigned long begin_jiffies;
......@@ -3891,6 +4362,12 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
* console->seq. Releasing console_lock flushes more
* records in case @seq is still not printed on all
* usable consoles.
*
* Holding the console_lock is not necessary if there
* are no legacy or boot consoles. However, such a
* console could register at any time. Always hold the
* console_lock as a precaution rather than
* synchronizing against register_console().
*/
console_lock();
......@@ -3906,8 +4383,10 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
* that they make forward progress, so only increment
* @diff for usable consoles.
*/
if (!console_is_usable(c, flags, true) &&
!console_is_usable(c, flags, false)) {
continue;
}
if (flags & CON_NBCON) {
printk_seq = nbcon_seq_read(c);
......@@ -3975,9 +4454,13 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
int pending = this_cpu_xchg(printk_pending, 0);
if (pending & PRINTK_PENDING_OUTPUT) {
if (force_legacy_kthread()) {
if (printk_legacy_kthread)
wake_up_interruptible(&legacy_wait);
} else {
/* If trylock fails, someone else is doing the printing */
if (console_trylock())
console_unlock();
}
}
if (pending & PRINTK_PENDING_WAKEUP)
......@@ -4383,8 +4866,17 @@ EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
*/
void console_try_replay_all(void)
{
struct console_flush_type ft;
printk_get_console_flush_type(&ft);
if (console_trylock()) {
__console_rewind_all();
if (ft.nbcon_atomic)
nbcon_atomic_flush_pending();
if (ft.nbcon_offload)
nbcon_kthreads_wake();
if (ft.legacy_offload)
defer_console_output();
/* Consoles are flushed as part of console_unlock(). */
console_unlock();
}
......
......@@ -4,7 +4,10 @@
#define _KERNEL_PRINTK_RINGBUFFER_H
#include <linux/atomic.h>
#include <linux/bits.h>
#include <linux/dev_printk.h>
#include <linux/stddef.h>
#include <linux/types.h>
/*
* Meta information about each stored message.
......@@ -120,7 +123,7 @@ enum desc_state {
#define _DATA_SIZE(sz_bits) (1UL << (sz_bits))
#define _DESCS_COUNT(ct_bits) (1U << (ct_bits))
#define DESC_SV_BITS BITS_PER_LONG
#define DESC_FLAGS_SHIFT (DESC_SV_BITS - 2)
#define DESC_FLAGS_MASK (3UL << DESC_FLAGS_SHIFT)
#define DESC_STATE(sv) (3UL & (sv >> DESC_FLAGS_SHIFT))
......@@ -401,10 +404,12 @@ u64 prb_next_reserve_seq(struct printk_ringbuffer *rb);
#define __u64seq_to_ulseq(u64seq) (u64seq)
#define __ulseq_to_u64seq(rb, ulseq) (ulseq)
#define ULSEQ_MAX(rb) (-1)
#else /* CONFIG_64BIT */
#define __u64seq_to_ulseq(u64seq) ((u32)u64seq)
#define ULSEQ_MAX(rb) __u64seq_to_ulseq(prb_first_seq(rb) + 0x80000000UL)
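/*
 * A worked example of the 32-bit case: sequence numbers are stored truncated
 * to 32 bits and recovered relative to prb_first_seq(), so a value 2^31
 * beyond prb_first_seq() is the most distant sequence that can still be
 * disambiguated. ULSEQ_MAX() therefore serves as an effectively-unreachable
 * "nothing to print yet" marker, as used by nbcon_alloc() above.
 */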
static inline u64 __ulseq_to_u64seq(struct printk_ringbuffer *rb, u32 ulseq)
{
......
......@@ -26,6 +26,29 @@ void __printk_safe_exit(void)
this_cpu_dec(printk_context);
}
void __printk_deferred_enter(void)
{
cant_migrate();
__printk_safe_enter();
}
void __printk_deferred_exit(void)
{
cant_migrate();
__printk_safe_exit();
}
bool is_printk_legacy_deferred(void)
{
/*
* The per-CPU variable @printk_context can be read safely in any
* context. CPU migration is always disabled when set.
*/
return (force_legacy_kthread() ||
this_cpu_read(printk_context) ||
in_nmi());
}
asmlinkage int vprintk(const char *fmt, va_list args)
{
#ifdef CONFIG_KGDB_KDB
......@@ -38,7 +61,7 @@ asmlinkage int vprintk(const char *fmt, va_list args)
* Use the main logbuf even in NMI. But avoid calling console
* drivers that might have their own locks.
*/
if (is_printk_legacy_deferred())
return vprintk_deferred(fmt, args);
/* No obstacles. */
......
......@@ -7,6 +7,7 @@
* Authors: Paul E. McKenney <paulmck@linux.ibm.com>
*/
#include <linux/console.h>
#include <linux/lockdep.h>
static void rcu_exp_handler(void *unused);
......@@ -590,6 +591,9 @@ static void synchronize_rcu_expedited_wait(void)
return;
if (rcu_stall_is_suppressed())
continue;
nbcon_cpu_emergency_enter();
j = jiffies;
rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_EXP, (void *)(j - jiffies_start));
trace_rcu_stall_warning(rcu_state.name, TPS("ExpeditedStall"));
......@@ -643,6 +647,9 @@ static void synchronize_rcu_expedited_wait(void)
rcu_exp_print_detail_task_stall_rnp(rnp);
}
jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3;
nbcon_cpu_emergency_exit();
panic_on_rcu_stall();
}
}
......
......@@ -7,6 +7,7 @@
* Author: Paul E. McKenney <paulmck@linux.ibm.com>
*/
#include <linux/console.h>
#include <linux/kvm_para.h>
#include <linux/rcu_notifier.h>
......@@ -605,6 +606,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
if (rcu_stall_is_suppressed())
return;
nbcon_cpu_emergency_enter();
/*
* OK, time to rat on our buddy...
* See Documentation/RCU/stallwarn.rst for info on how to debug
......@@ -657,6 +660,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
rcu_check_gp_kthread_expired_fqs_timer();
rcu_check_gp_kthread_starvation();
nbcon_cpu_emergency_exit();
panic_on_rcu_stall();
rcu_force_quiescent_state(); /* Kick them all. */
......@@ -677,6 +682,8 @@ static void print_cpu_stall(unsigned long gps)
if (rcu_stall_is_suppressed())
return;
nbcon_cpu_emergency_enter();
/*
* OK, time to rat on ourselves...
* See Documentation/RCU/stallwarn.rst for info on how to debug
......@@ -706,6 +713,8 @@ static void print_cpu_stall(unsigned long gps)
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
nbcon_cpu_emergency_exit();
panic_on_rcu_stall();
/*
......