Commit 42b956fd authored by David S. Miller

Merge branch 'kernel-add-support-to-collect-hardware-logs-in-crash-recovery-kernel'

Rahul Lakkireddy says:

====================
kernel: add support to collect hardware logs in crash recovery kernel

On production servers running a variety of workloads, a kernel panic
can occur sporadically after days or even months of uptime. It is
important to collect as many debug logs as possible to root-cause
and fix a problem that may not be easy to reproduce. A snapshot of the
underlying hardware/firmware state (register dump, firmware logs,
adapter memory, etc.) at the time of the kernel panic is very helpful
when debugging the culprit device driver.

This patch series adds a new generic framework that enables device
drivers to collect a device-specific snapshot of the hardware/firmware
state of the underlying device in the crash recovery kernel. In the
crash recovery kernel, the collected logs are added as ELF notes to
/proc/vmcore, which user-space scripts copy for post-analysis.

The sequence of actions performed by a device driver to append its
device-specific hardware/firmware logs to /proc/vmcore is as follows:

1. During probe (before hardware is initialized), the device driver
registers with the vmcore module (via vmcore_add_device_dump()),
passing a callback function along with the buffer size and log name
needed for firmware/hardware log collection.

2. The vmcore module allocates a buffer of the requested size. It adds
an ELF note and invokes the device driver's registered callback
function.

3. The device driver collects all hardware/firmware logs into the
buffer and returns control to the vmcore module.
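
The three steps above can be sketched in plain userspace C. This is
only an illustration, not the code in fs/proc/vmcore.c:
mock_vmcore_add_device_dump(), struct mock_dump, demo_collect() and
demo_probe_add_dump() are hypothetical names, and the argument checks
are an assumption of the sketch. Only struct vmcoredd_data mirrors the
structure added by this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define VMCOREDD_MAX_NAME_BYTES 44

/* Mirrors struct vmcoredd_data from this series. */
struct vmcoredd_data {
	char dump_name[VMCOREDD_MAX_NAME_BYTES];
	unsigned int size;
	int (*vmcoredd_callback)(struct vmcoredd_data *data, void *buf);
};

/* Hypothetical record of the most recently collected dump. */
struct mock_dump {
	char name[VMCOREDD_MAX_NAME_BYTES];
	void *buf;
	unsigned int size;
};

static struct mock_dump last_dump;

/* Step 2 (mock): allocate the buffer and invoke the driver callback. */
static int mock_vmcore_add_device_dump(struct vmcoredd_data *data)
{
	void *buf;
	int ret;

	/* Assumed sanity checks; the real checks are not shown here. */
	if (!data || !data->vmcoredd_callback || !data->size ||
	    !data->dump_name[0])
		return -EINVAL;

	buf = calloc(1, data->size);
	if (!buf)
		return -ENOMEM;

	ret = data->vmcoredd_callback(data, buf);
	if (ret) {
		free(buf);
		return ret;
	}

	free(last_dump.buf);
	memcpy(last_dump.name, data->dump_name, sizeof(last_dump.name));
	last_dump.buf = buf;
	last_dump.size = data->size;
	return 0;
}

/* Step 3 (hypothetical driver callback): fill the buffer with "logs". */
static int demo_collect(struct vmcoredd_data *data, void *buf)
{
	assert(data && buf);
	memset(buf, 0xAB, data->size);
	return 0;
}

/* Step 1 (hypothetical probe helper): describe the dump and register. */
static int demo_probe_add_dump(struct vmcoredd_data *data)
{
	assert(data);
	data->size = 64;
	snprintf(data->dump_name, sizeof(data->dump_name), "%s_%s",
		 "demo", "0000:01:00.0");
	data->vmcoredd_callback = demo_collect;
	return mock_vmcore_add_device_dump(data);
}
```

A caller would zero a struct vmcoredd_data, pass it to
demo_probe_add_dump(), and treat a non-zero return as "no dump
collected" rather than as a fatal probe error.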

The device-specific hardware/firmware logs appear as ELF notes with
note type 0x700, as shown below:

Displaying notes found at file offset 0x00001000 with length 0x040032c0:
  Owner                 Data size	Description
  LINUX                0x02000fec	Unknown note type: (0x00000700)
  LINUX                0x02000fec	Unknown note type: (0x00000700)
  CORE                 0x00000150	NT_PRSTATUS (prstatus structure)
  CORE                 0x00000150	NT_PRSTATUS (prstatus structure)
  CORE                 0x00000150	NT_PRSTATUS (prstatus structure)
  CORE                 0x00000150	NT_PRSTATUS (prstatus structure)
  CORE                 0x00000150	NT_PRSTATUS (prstatus structure)
  CORE                 0x00000150	NT_PRSTATUS (prstatus structure)
  CORE                 0x00000150	NT_PRSTATUS (prstatus structure)
  CORE                 0x00000150	NT_PRSTATUS (prstatus structure)
  VMCOREINFO           0x00000785	Unknown note type: (0x00000000)
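
For post-analysis, a tool could pick these notes out of a copied note
segment by walking the standard ELF note layout (three 32-bit words,
then the name and the descriptor, each padded to 4 bytes). The sketch
below is a minimal userspace illustration; count_notes() and
put_note() are hypothetical helpers, and the buffer is assumed to use
the host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NT_VMCOREDD 0x700

/* Generic ELF note header; name and descriptor follow it. */
struct note_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/* Round up to the 4-byte alignment used between note fields. */
static size_t align4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Count notes with the given owner and type in a note buffer. */
static unsigned int count_notes(const unsigned char *buf, size_t len,
				const char *owner, uint32_t type)
{
	size_t owner_len, off = 0;
	unsigned int found = 0;

	assert(buf && owner);
	owner_len = strlen(owner) + 1;	/* include the NUL terminator */

	while (len - off >= sizeof(struct note_hdr)) {
		struct note_hdr h;
		size_t name_off, next;

		memcpy(&h, buf + off, sizeof(h));
		name_off = off + sizeof(h);
		next = name_off + align4(h.n_namesz) + align4(h.n_descsz);
		if (next > len || next < off)
			break;	/* truncated or malformed note */

		/* Tolerates NUL-padded names such as "LINUX\0\0\0". */
		if (h.n_type == type && h.n_namesz >= owner_len &&
		    memcmp(buf + name_off, owner, owner_len) == 0)
			found++;
		off = next;
	}
	return found;
}

/* Append one note to a zero-initialized buffer (test input helper). */
static void put_note(unsigned char *buf, size_t *off, const char *owner,
		     uint32_t type, const void *desc, uint32_t descsz)
{
	struct note_hdr h = { (uint32_t)strlen(owner) + 1, descsz, type };

	memcpy(buf + *off, &h, sizeof(h));
	*off += sizeof(h);
	memcpy(buf + *off, owner, h.n_namesz);
	*off += align4(h.n_namesz);
	memcpy(buf + *off, desc, descsz);
	*off += align4(descsz);
}
```

Matching on both the owner name and the type avoids confusing a
"LINUX" device dump with another vendor's note that happens to reuse
the value 0x700.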

Patch 1 adds an API to the vmcore module that lets drivers register a
callback to collect device-specific hardware/firmware logs. The logs
are added to /proc/vmcore as ELF notes.

Patch 2 updates the read and mmap logic to append the device-specific
hardware/firmware logs as ELF notes.

Patch 3 shows a cxgb4 driver example that uses the API to collect
hardware/firmware logs in the crash recovery kernel, before hardware
is initialized.

Thanks,
Rahul

---
v8:
- Added missing linux/types.h header include.
- Removed __vmcore_add_device_dump().

v7:
- Removed "CHELSIO" vendor identifier in Elf Note name. Instead,
  writing "LINUX".
- Moved vmcoredd_header to new file include/uapi/linux/vmcore.h
- Reworked vmcoredd_header to include Elf Note as part of the header
  itself.
- Removed vmcoredd_get_note_size().
- Renamed vmcoredd_write_note() to vmcoredd_write_header().
- Replaced all "unsigned long" with "unsigned int" for device dump
  size since max size of Elf Word is u32.

v6:
- Reworked device dump elf note name to contain vendor identifier.
- Added vmcoredd_header that precedes actual dump in the Elf Note.
- Device dump's name is moved inside vmcoredd_header.
- Added "CHELSIO" string as vendor identifier in the Elf Note name
  for cxgb4 device dumps.

v5:
- Removed enabling CONFIG_PROC_VMCORE_DEVICE_DUMP by default and
  updated help message.

v4:
- Made __vmcore_add_device_dump() static.
- Moved compile check to define vmcore_add_device_dump() to
  crash_dump.h to fix compilation when vmcore.c is not compiled in.
- Convert ---help--- to help in Kconfig as indicated by checkpatch.
- Rebased to tip.

v3:
- Dropped sysfs crashdd module.
- Exported dumps as elf notes. Suggested by Eric Biederman
  <ebiederm@xmission.com>.  Added as patch 2 in this version.
- Added CONFIG_PROC_VMCORE_DEVICE_DUMP to allow configuring device
  dump support.
- Moved logic related to adding dumps from crashdd to vmcore module.
- Rename all crashdd* to vmcoredd*.
- Updated comments.

v2:
- Added ABI Documentation for crashdd.
- Directly use octal permission instead of macro.

Changes since rfc v2:
- Moved exporting crashdd from procfs to sysfs. Suggested by
  Stephen Hemminger <stephen@networkplumber.org>
- Moved code from fs/proc/crashdd.c to fs/crashdd/ directory.
- Replaced all proc API with sysfs API and updated comments.
- Calling driver callback before creating the binary file under
  crashdd sysfs.
- Changed binary dump file permission from S_IRUSR to S_IRUGO.
- Changed module name from CRASH_DRIVER_DUMP to CRASH_DEVICE_DUMP.

rfc v2:
- Collecting logs in 2nd kernel instead of during kernel panic.
  Suggested by Eric Biederman <ebiederm@xmission.com>.
- Added new crashdd module that exports /proc/crashdd/ containing
  driver's registered hardware/firmware logs in patch 1.
- Replaced the API to allow drivers to register their hardware/firmware
  log collect routine in crash recovery kernel in patch 1.
- Updated patch 2 to use the new API in patch 1.
====================
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parents 289e1f4e 1dde532d
@@ -50,6 +50,7 @@
 #include <linux/net_tstamp.h>
 #include <linux/ptp_clock_kernel.h>
 #include <linux/ptp_classify.h>
+#include <linux/crash_dump.h>
 #include <asm/io.h>
 #include "t4_chip_type.h"
 #include "cxgb4_uld.h"
@@ -964,6 +965,9 @@ struct adapter {
 	struct hma_data hma;
 	struct srq_data *srq;
+	/* Dump buffer for collecting logs in kdump kernel */
+	struct vmcoredd_data vmcoredd;
 };
 /* Support for "sched-class" command to allow a TX Scheduling Class to be
...
@@ -488,3 +488,28 @@ void cxgb4_init_ethtool_dump(struct adapter *adapter)
 	adapter->eth_dump.version = adapter->params.fw_vers;
 	adapter->eth_dump.len = 0;
 }
+
+static int cxgb4_cudbg_vmcoredd_collect(struct vmcoredd_data *data, void *buf)
+{
+	struct adapter *adap = container_of(data, struct adapter, vmcoredd);
+	u32 len = data->size;
+
+	return cxgb4_cudbg_collect(adap, buf, &len, CXGB4_ETH_DUMP_ALL);
+}
+
+int cxgb4_cudbg_vmcore_add_dump(struct adapter *adap)
+{
+	struct vmcoredd_data *data = &adap->vmcoredd;
+	u32 len;
+
+	len = sizeof(struct cudbg_hdr) +
+	      sizeof(struct cudbg_entity_hdr) * CUDBG_MAX_ENTITY;
+	len += CUDBG_DUMP_BUFF_SIZE;
+
+	data->size = len;
+	snprintf(data->dump_name, sizeof(data->dump_name), "%s_%s",
+		 cxgb4_driver_name, adap->name);
+	data->vmcoredd_callback = cxgb4_cudbg_vmcoredd_collect;
+
+	return vmcore_add_device_dump(data);
+}
@@ -41,8 +41,11 @@ enum CXGB4_ETHTOOL_DUMP_FLAGS {
 	CXGB4_ETH_DUMP_HW = (1 << 1), /* various FW and HW dumps */
 };
+#define CXGB4_ETH_DUMP_ALL (CXGB4_ETH_DUMP_MEM | CXGB4_ETH_DUMP_HW)
 u32 cxgb4_get_dump_length(struct adapter *adap, u32 flag);
 int cxgb4_cudbg_collect(struct adapter *adap, void *buf, u32 *buf_size,
 			u32 flag);
 void cxgb4_init_ethtool_dump(struct adapter *adapter);
+int cxgb4_cudbg_vmcore_add_dump(struct adapter *adap);
 #endif /* __CXGB4_CUDBG_H__ */
@@ -5558,6 +5558,16 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (err)
 		goto out_free_adapter;
+	if (is_kdump_kernel()) {
+		/* Collect hardware state and append to /proc/vmcore */
+		err = cxgb4_cudbg_vmcore_add_dump(adapter);
+		if (err) {
+			dev_warn(adapter->pdev_dev,
+				 "Fail collecting vmcore device dump, err: %d. Continuing\n",
+				 err);
+			err = 0;
+		}
+	}
 	if (!is_t4(adapter->params.chip)) {
 		s_qpp = (QUEUESPERPAGEPF0_S +
...
@@ -43,6 +43,21 @@ config PROC_VMCORE
 	help
 	  Exports the dump image of crashed kernel in ELF format.
+config PROC_VMCORE_DEVICE_DUMP
+	bool "Device Hardware/Firmware Log Collection"
+	depends on PROC_VMCORE
+	default n
+	help
+	  After kernel panic, device drivers can collect the device
+	  specific snapshot of their hardware or firmware before the
+	  underlying devices are initialized in crash recovery kernel.
+	  Note that the device driver must be present in the crash
+	  recovery kernel's initramfs to collect its underlying device
+	  snapshot.
+	  If you say Y here, the collected device dumps will be added
+	  as ELF notes to /proc/vmcore.
 config PROC_SYSCTL
 	bool "Sysctl support (/proc/sys)" if EXPERT
 	depends on PROC_FS
...
This diff is collapsed.
@@ -5,6 +5,7 @@
 #include <linux/kexec.h>
 #include <linux/proc_fs.h>
 #include <linux/elf.h>
+#include <uapi/linux/vmcore.h>
 #include <asm/pgtable.h> /* for pgprot_t */
@@ -93,4 +94,21 @@ static inline bool is_kdump_kernel(void) { return 0; }
 #endif /* CONFIG_CRASH_DUMP */
 extern unsigned long saved_max_pfn;
+/* Device Dump information to be filled by drivers */
+struct vmcoredd_data {
+	char dump_name[VMCOREDD_MAX_NAME_BYTES]; /* Unique name of the dump */
+	unsigned int size; /* Size of the dump */
+	/* Driver's registered callback to be invoked to collect dump */
+	int (*vmcoredd_callback)(struct vmcoredd_data *data, void *buf);
+};
+#ifdef CONFIG_PROC_VMCORE_DEVICE_DUMP
+int vmcore_add_device_dump(struct vmcoredd_data *data);
+#else
+static inline int vmcore_add_device_dump(struct vmcoredd_data *data)
+{
+	return -EOPNOTSUPP;
+}
+#endif /* CONFIG_PROC_VMCORE_DEVICE_DUMP */
 #endif /* LINUX_CRASHDUMP_H */
@@ -28,6 +28,12 @@ struct vmcore {
 	loff_t offset;
 };
+struct vmcoredd_node {
+	struct list_head list; /* List of dumps */
+	void *buf; /* Buffer containing device's dump */
+	unsigned int size; /* Size of the buffer */
+};
 #ifdef CONFIG_PROC_KCORE
 extern void kclist_add(struct kcore_list *, void *, size_t, int type);
 #else
...
@@ -421,6 +421,7 @@ typedef struct elf64_shdr {
 #define NT_ARM_SYSTEM_CALL 0x404 /* ARM system call number */
 #define NT_ARM_SVE 0x405 /* ARM Scalable Vector Extension registers */
 #define NT_ARC_V2 0x600 /* ARCv2 accumulator/extra registers */
+#define NT_VMCOREDD 0x700 /* Vmcore Device Dump Note */
 /* Note header in a PT_NOTE section */
 typedef struct elf32_note {
...
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _UAPI_VMCORE_H
+#define _UAPI_VMCORE_H
+#include <linux/types.h>
+#define VMCOREDD_NOTE_NAME "LINUX"
+#define VMCOREDD_MAX_NAME_BYTES 44
+struct vmcoredd_header {
+	__u32 n_namesz; /* Name size */
+	__u32 n_descsz; /* Content size */
+	__u32 n_type; /* NT_VMCOREDD */
+	__u8 name[8]; /* LINUX\0\0\0 */
+	__u8 dump_name[VMCOREDD_MAX_NAME_BYTES]; /* Device dump's name */
+};
+#endif /* _UAPI_VMCORE_H */
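
The layout of struct vmcoredd_header can be checked from userspace
with a small sketch. The copy below substitutes <stdint.h> types for
__u32/__u8. fill_header() is a hypothetical helper, and the way it
sets n_namesz and n_descsz (name field size, and dump_name plus dump
data) is an assumption inferred from the field order, since the
fs/proc/vmcore.c diff is collapsed on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NT_VMCOREDD 0x700
#define VMCOREDD_NOTE_NAME "LINUX"
#define VMCOREDD_MAX_NAME_BYTES 44

/* Userspace copy of struct vmcoredd_header with fixed-width types. */
struct vmcoredd_header {
	uint32_t n_namesz;	/* Name size */
	uint32_t n_descsz;	/* Content size */
	uint32_t n_type;	/* NT_VMCOREDD */
	uint8_t name[8];	/* LINUX\0\0\0 */
	uint8_t dump_name[VMCOREDD_MAX_NAME_BYTES]; /* Device dump's name */
};

/* 3 * 4 bytes of note words + 8 bytes of name + 44 bytes of dump name. */
_Static_assert(sizeof(struct vmcoredd_header) == 64,
	       "header is expected to have no padding");
_Static_assert(offsetof(struct vmcoredd_header, dump_name) == 20,
	       "dump_name starts right after the ELF note name");

/* Hypothetical helper: describe a device dump of data_size bytes. */
static void fill_header(struct vmcoredd_header *h, const char *dump_name,
			uint32_t data_size)
{
	assert(h && dump_name);
	memset(h, 0, sizeof(*h));

	/* Assumption: the note name occupies the whole 8-byte field. */
	h->n_namesz = sizeof(h->name);
	/* Assumption: the descriptor is dump_name followed by the data. */
	h->n_descsz = (uint32_t)sizeof(h->dump_name) + data_size;
	h->n_type = NT_VMCOREDD;

	memcpy(h->name, VMCOREDD_NOTE_NAME, sizeof(VMCOREDD_NOTE_NAME));
	strncpy((char *)h->dump_name, dump_name, sizeof(h->dump_name) - 1);
}
```

Because the first three fields match the generic ELF note header,
tools such as readelf can list the note even though they report the
type as unknown.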