Commit f8f59847 authored by David S. Miller

Merge branch 'implement-DEVLINK_CMD_REGION_NEW'

Jacob Keller says:

====================
implement DEVLINK_CMD_REGION_NEW

This series adds support for the DEVLINK_CMD_REGION_NEW operation, which lets
userspace request a snapshot of a region on demand.

This is useful for adding regions to a driver that has no internal trigger for
creating snapshots. By making this a core part of devlink, drivers do not need
to expose a separate channel such as debugfs.

The primary intent for this kind of region is to expose device information
that might be useful for diagnostics and information gathering.

The first few patches refactor regions to use a new ops structure, making it
possible to extend the operations a region can support. This includes
converting the snapshot destructor from a function argument into a member of
the ops structure.

Next, patches refactor snapshot id allocation to use an xarray that tracks how
many snapshots currently use a given id. This makes it possible to determine
an id's lifetime and to release it once it is no longer in use.

Without this change, snapshot ids remain in use forever, until the snapshot_id
counter rolls over at UINT_MAX.
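
To make the use-count idea concrete, here is a minimal sketch, assuming the
devlink->snapshot_ids xarray added in the devlink.h hunk below and helper
names that are purely illustrative (callers are assumed to hold the devlink
instance lock); it is not the exact core implementation:

static int example_snapshot_id_get(struct devlink *devlink, u32 *id)
{
	/* hand out a fresh id with an initial use count of one */
	return xa_alloc(&devlink->snapshot_ids, id, xa_mk_value(1),
			xa_limit_32b, GFP_KERNEL);
}

static void example_snapshot_id_put(struct devlink *devlink, u32 id)
{
	unsigned long count = xa_to_value(xa_load(&devlink->snapshot_ids, id));

	/* drop one reference; release the id once the last user is gone */
	if (--count)
		xa_store(&devlink->snapshot_ids, id, xa_mk_value(count),
			 GFP_KERNEL);
	else
		xa_erase(&devlink->snapshot_ids, id);
}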

Finally, code to enable the previously unused DEVLINK_CMD_REGION_NEW is
added. This code enforces that the snapshot id is always provided, unlike
previous revisions of this series.
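
Conceptually, the new command handler refuses to auto-generate an id and then
hands control to the region's snapshot op. A condensed, illustrative sketch
follows; example_region_new_doit and example_region_lookup are placeholders,
and id bookkeeping, locking and some error paths are omitted:

static int example_region_new_doit(struct sk_buff *skb, struct genl_info *info)
{
	struct devlink *devlink = info->user_ptr[0];
	struct devlink_region *region;
	u32 snapshot_id;
	u8 *data;
	int err;

	/* the snapshot id must always be supplied by userspace */
	if (!info->attrs[DEVLINK_ATTR_REGION_SNAPSHOT_ID]) {
		NL_SET_ERR_MSG_MOD(info->extack, "snapshot id is required");
		return -EINVAL;
	}
	snapshot_id = nla_get_u32(info->attrs[DEVLINK_ATTR_REGION_SNAPSHOT_ID]);

	region = example_region_lookup(devlink, info);	/* placeholder lookup */
	if (!region || !region->ops->snapshot)
		return -EOPNOTSUPP;

	/* ask the driver for the data, then attach it under the given id */
	err = region->ops->snapshot(devlink, info->extack, &data);
	if (err)
		return err;

	err = devlink_region_snapshot_create(region, data, snapshot_id);
	if (err)
		region->ops->destructor(data);
	return err;
}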

A final patch enables use of this new command via the .snapshot callback in
both netdevsim and the ice driver.

For the ice driver, a new "nvm-flash" region is added, which enables read
access to the NVM flash contents. The intent is to allow diagnostic tools to
gather information about the device. Because a snapshot gathers the NVM
contents all at once, the captured data is internally consistent.

Links to previous discussions:
1st RFC - https://lore.kernel.org/netdev/20200130225913.1671982-1-jacob.e.keller@intel.com/
2nd RFC - https://lore.kernel.org/netdev/20200214232223.3442651-1-jacob.e.keller@intel.com/
v1 - https://lore.kernel.org/netdev/20200324223445.2077900-1-jacob.e.keller@intel.com/
v2 - https://lore.kernel.org/netdev/20200326035157.2211090-1-jacob.e.keller@intel.com/

Major changes since RFC:
* use an xarray for tracking snapshot ids, rather than an IDR
* remove support for auto-generated snapshot ids in DEVLINK_CMD_REGION_NEW

See each patch for its individual changelog.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
parents 6739ce85 dce730f1
......@@ -20,6 +20,11 @@ address regions that are otherwise inaccessible to the user.
Regions may also be used to provide an additional way to debug complex error
states, but see also :doc:`devlink-health`
Regions may optionally support capturing a snapshot on demand via the
``DEVLINK_CMD_REGION_NEW`` netlink message. A driver wishing to allow
requested snapshots must implement the ``.snapshot`` callback for the region
in its ``devlink_region_ops`` structure.
example usage
-------------
......@@ -40,6 +45,9 @@ example usage
# Delete a snapshot using:
$ devlink region del pci/0000:00:05.0/cr-space snapshot 1
# Request an immediate snapshot, if supported by the region
$ devlink region new pci/0000:00:05.0/cr-space snapshot 5
# Dump a snapshot:
$ devlink region dump pci/0000:00:05.0/fw-health snapshot 1
0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
......
......@@ -69,3 +69,29 @@ The ``ice`` driver reports the following versions
- The version of the DDP package that is active in the device. Note
that both the name (as reported by ``fw.app.name``) and version are
required to uniquely identify the package.
Regions
=======
The ``ice`` driver enables access to the contents of the Non Volatile Memory
flash chip via the ``nvm-flash`` region.
Users can request an immediate capture of a snapshot via the
``DEVLINK_CMD_REGION_NEW`` command:
.. code:: shell
$ devlink region new pci/0000:01:00.0/nvm-flash snapshot 1
$ devlink region dump pci/0000:01:00.0/nvm-flash snapshot 1
0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
0000000000000010 0000 0000 ffff ff04 0029 8c00 0028 8cc8
0000000000000020 0016 0bb8 0016 1720 0000 0000 c00f 3ffc
0000000000000030 bada cce5 bada cce5 bada cce5 bada cce5
$ devlink region read pci/0000:01:00.0/nvm-flash snapshot 1 address 0
length 16
0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
$ devlink region delete pci/0000:01:00.0/nvm-flash snapshot 1
......@@ -351,6 +351,8 @@ struct ice_pf {
/* devlink port data */
struct devlink_port devlink_port;
struct devlink_region *nvm_region;
/* OS reserved IRQ details */
struct msix_entry *msix_entries;
struct ice_res_tracker *irq_tracker;
......
......@@ -318,3 +318,99 @@ void ice_devlink_destroy_port(struct ice_pf *pf)
devlink_port_type_clear(&pf->devlink_port);
devlink_port_unregister(&pf->devlink_port);
}
/**
* ice_devlink_nvm_snapshot - Capture a snapshot of the NVM flash contents
* @devlink: the devlink instance
* @extack: extended ACK response structure
* @data: on exit points to snapshot data buffer
*
* This function is called in response to a DEVLINK_CMD_REGION_NEW request for
* the nvm-flash devlink region. It captures a snapshot of the NVM flash
* contents. This snapshot can later be viewed via the devlink-region
* interface.
*
* Returns: zero on success, and updates the data pointer. Returns a non-zero
* error code on failure.
*/
static int ice_devlink_nvm_snapshot(struct devlink *devlink,
struct netlink_ext_ack *extack, u8 **data)
{
struct ice_pf *pf = devlink_priv(devlink);
struct device *dev = ice_pf_to_dev(pf);
struct ice_hw *hw = &pf->hw;
enum ice_status status;
void *nvm_data;
u32 nvm_size;
nvm_size = hw->nvm.flash_size;
nvm_data = vzalloc(nvm_size);
if (!nvm_data)
return -ENOMEM;
status = ice_acquire_nvm(hw, ICE_RES_READ);
if (status) {
dev_dbg(dev, "ice_acquire_nvm failed, err %d aq_err %d\n",
status, hw->adminq.sq_last_status);
NL_SET_ERR_MSG_MOD(extack, "Failed to acquire NVM semaphore");
vfree(nvm_data);
return -EIO;
}
status = ice_read_flat_nvm(hw, 0, &nvm_size, nvm_data, false);
if (status) {
dev_dbg(dev, "ice_read_flat_nvm failed after reading %u bytes, err %d aq_err %d\n",
nvm_size, status, hw->adminq.sq_last_status);
NL_SET_ERR_MSG_MOD(extack, "Failed to read NVM contents");
ice_release_nvm(hw);
vfree(nvm_data);
return -EIO;
}
ice_release_nvm(hw);
*data = nvm_data;
return 0;
}
static const struct devlink_region_ops ice_nvm_region_ops = {
.name = "nvm-flash",
.destructor = vfree,
.snapshot = ice_devlink_nvm_snapshot,
};
/**
* ice_devlink_init_regions - Initialize devlink regions
* @pf: the PF device structure
*
* Create devlink regions used to enable access to dump the contents of the
* flash memory on the device.
*/
void ice_devlink_init_regions(struct ice_pf *pf)
{
struct devlink *devlink = priv_to_devlink(pf);
struct device *dev = ice_pf_to_dev(pf);
u64 nvm_size;
nvm_size = pf->hw.nvm.flash_size;
pf->nvm_region = devlink_region_create(devlink, &ice_nvm_region_ops, 1,
nvm_size);
if (IS_ERR(pf->nvm_region)) {
dev_err(dev, "failed to create NVM devlink region, err %ld\n",
PTR_ERR(pf->nvm_region));
pf->nvm_region = NULL;
}
}
/**
* ice_devlink_destroy_regions - Destroy devlink regions
* @pf: the PF device structure
*
* Remove previously created regions for this PF.
*/
void ice_devlink_destroy_regions(struct ice_pf *pf)
{
if (pf->nvm_region)
devlink_region_destroy(pf->nvm_region);
}
......@@ -11,4 +11,7 @@ void ice_devlink_unregister(struct ice_pf *pf);
int ice_devlink_create_port(struct ice_pf *pf);
void ice_devlink_destroy_port(struct ice_pf *pf);
void ice_devlink_init_regions(struct ice_pf *pf);
void ice_devlink_destroy_regions(struct ice_pf *pf);
#endif /* _ICE_DEVLINK_H_ */
......@@ -3276,6 +3276,8 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
goto err_init_pf_unroll;
}
ice_devlink_init_regions(pf);
pf->num_alloc_vsi = hw->func_caps.guar_num_vsi;
if (!pf->num_alloc_vsi) {
err = -EIO;
......@@ -3385,6 +3387,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
devm_kfree(dev, pf->vsi);
err_init_pf_unroll:
ice_deinit_pf(pf);
ice_devlink_destroy_regions(pf);
ice_deinit_hw(hw);
err_exit_unroll:
ice_devlink_unregister(pf);
......@@ -3427,6 +3430,7 @@ static void ice_remove(struct pci_dev *pdev)
ice_vsi_free_q_vectors(pf->vsi[i]);
}
ice_deinit_pf(pf);
ice_devlink_destroy_regions(pf);
ice_deinit_hw(&pf->hw);
ice_devlink_unregister(pf);
......
......@@ -38,8 +38,18 @@
#define CR_ENABLE_BIT_OFFSET 0xF3F04
#define MAX_NUM_OF_DUMPS_TO_STORE (8)
static const char *region_cr_space_str = "cr-space";
static const char *region_fw_health_str = "fw-health";
static const char * const region_cr_space_str = "cr-space";
static const char * const region_fw_health_str = "fw-health";
static const struct devlink_region_ops region_cr_space_ops = {
.name = region_cr_space_str,
.destructor = &kvfree,
};
static const struct devlink_region_ops region_fw_health_ops = {
.name = region_fw_health_str,
.destructor = &kvfree,
};
/* Set to true in case cr enable bit was set to true before crdump */
static bool crdump_enbale_bit_set;
......@@ -99,7 +109,7 @@ static void mlx4_crdump_collect_crspace(struct mlx4_dev *dev,
readl(cr_space + offset);
err = devlink_region_snapshot_create(crdump->region_crspace,
crspace_data, id, &kvfree);
crspace_data, id);
if (err) {
kvfree(crspace_data);
mlx4_warn(dev, "crdump: devlink create %s snapshot id %d err %d\n",
......@@ -138,7 +148,7 @@ static void mlx4_crdump_collect_fw_health(struct mlx4_dev *dev,
readl(health_buf_start + offset);
err = devlink_region_snapshot_create(crdump->region_fw_health,
health_data, id, &kvfree);
health_data, id);
if (err) {
kvfree(health_data);
mlx4_warn(dev, "crdump: devlink create %s snapshot id %d err %d\n",
......@@ -159,6 +169,7 @@ int mlx4_crdump_collect(struct mlx4_dev *dev)
struct pci_dev *pdev = dev->persist->pdev;
unsigned long cr_res_size;
u8 __iomem *cr_space;
int err;
u32 id;
if (!dev->caps.health_buffer_addrs) {
......@@ -179,15 +190,22 @@ int mlx4_crdump_collect(struct mlx4_dev *dev)
return -ENODEV;
}
crdump_enable_crspace_access(dev, cr_space);
/* Get the available snapshot ID for the dumps */
id = devlink_region_snapshot_id_get(devlink);
err = devlink_region_snapshot_id_get(devlink, &id);
if (err) {
mlx4_err(dev, "crdump: devlink get snapshot id err %d\n", err);
return err;
}
crdump_enable_crspace_access(dev, cr_space);
/* Try to capture dumps */
mlx4_crdump_collect_crspace(dev, cr_space, id);
mlx4_crdump_collect_fw_health(dev, cr_space, id);
/* Release reference on the snapshot id */
devlink_region_snapshot_id_put(devlink, id);
crdump_disable_crspace_access(dev, cr_space);
iounmap(cr_space);
......@@ -205,7 +223,7 @@ int mlx4_crdump_init(struct mlx4_dev *dev)
/* Create cr-space region */
crdump->region_crspace =
devlink_region_create(devlink,
region_cr_space_str,
&region_cr_space_ops,
MAX_NUM_OF_DUMPS_TO_STORE,
pci_resource_len(pdev, 0));
if (IS_ERR(crdump->region_crspace))
......@@ -216,7 +234,7 @@ int mlx4_crdump_init(struct mlx4_dev *dev)
/* Create fw-health region */
crdump->region_fw_health =
devlink_region_create(devlink,
region_fw_health_str,
&region_fw_health_ops,
MAX_NUM_OF_DUMPS_TO_STORE,
HEALTH_BUFFER_SIZE);
if (IS_ERR(crdump->region_fw_health))
......
......@@ -39,24 +39,47 @@ static struct dentry *nsim_dev_ddir;
#define NSIM_DEV_DUMMY_REGION_SIZE (1024 * 32)
static int
nsim_dev_take_snapshot(struct devlink *devlink, struct netlink_ext_ack *extack,
u8 **data)
{
void *dummy_data;
dummy_data = kmalloc(NSIM_DEV_DUMMY_REGION_SIZE, GFP_KERNEL);
if (!dummy_data)
return -ENOMEM;
get_random_bytes(dummy_data, NSIM_DEV_DUMMY_REGION_SIZE);
*data = dummy_data;
return 0;
}
static ssize_t nsim_dev_take_snapshot_write(struct file *file,
const char __user *data,
size_t count, loff_t *ppos)
{
struct nsim_dev *nsim_dev = file->private_data;
void *dummy_data;
struct devlink *devlink;
u8 *dummy_data;
int err;
u32 id;
dummy_data = kmalloc(NSIM_DEV_DUMMY_REGION_SIZE, GFP_KERNEL);
if (!dummy_data)
return -ENOMEM;
devlink = priv_to_devlink(nsim_dev);
get_random_bytes(dummy_data, NSIM_DEV_DUMMY_REGION_SIZE);
err = nsim_dev_take_snapshot(devlink, NULL, &dummy_data);
if (err)
return err;
id = devlink_region_snapshot_id_get(priv_to_devlink(nsim_dev));
err = devlink_region_snapshot_id_get(devlink, &id);
if (err) {
	pr_err("Failed to get snapshot id\n");
	kfree(dummy_data);
	return err;
}
err = devlink_region_snapshot_create(nsim_dev->dummy_region,
dummy_data, id, kfree);
dummy_data, id);
devlink_region_snapshot_id_put(devlink, id);
if (err) {
pr_err("Failed to create region snapshot\n");
kfree(dummy_data);
......@@ -340,11 +363,17 @@ static void nsim_devlink_param_load_driverinit_values(struct devlink *devlink)
#define NSIM_DEV_DUMMY_REGION_SNAPSHOT_MAX 16
static const struct devlink_region_ops dummy_region_ops = {
.name = "dummy",
.destructor = &kfree,
.snapshot = nsim_dev_take_snapshot,
};
static int nsim_dev_dummy_region_init(struct nsim_dev *nsim_dev,
struct devlink *devlink)
{
nsim_dev->dummy_region =
devlink_region_create(devlink, "dummy",
devlink_region_create(devlink, &dummy_region_ops,
NSIM_DEV_DUMMY_REGION_SNAPSHOT_MAX,
NSIM_DEV_DUMMY_REGION_SIZE);
return PTR_ERR_OR_ZERO(nsim_dev->dummy_region);
......
......@@ -18,6 +18,7 @@
#include <net/net_namespace.h>
#include <net/flow_offload.h>
#include <uapi/linux/devlink.h>
#include <linux/xarray.h>
struct devlink_ops;
......@@ -29,13 +30,13 @@ struct devlink {
struct list_head resource_list;
struct list_head param_list;
struct list_head region_list;
u32 snapshot_id;
struct list_head reporter_list;
struct mutex reporters_lock; /* protects reporter_list */
struct devlink_dpipe_headers *dpipe_headers;
struct list_head trap_list;
struct list_head trap_group_list;
const struct devlink_ops *ops;
struct xarray snapshot_ids;
struct device *dev;
possible_net_t _net;
struct mutex lock;
......@@ -496,7 +497,21 @@ enum devlink_param_generic_id {
struct devlink_region;
struct devlink_info_req;
typedef void devlink_snapshot_data_dest_t(const void *data);
/**
* struct devlink_region_ops - Region operations
* @name: region name
* @destructor: callback used to free snapshot memory when deleting
* @snapshot: callback to request an immediate snapshot. On success,
* the data variable must be updated to point to the snapshot data.
* The function will be called while the devlink instance lock is
* held.
*/
struct devlink_region_ops {
const char *name;
void (*destructor)(const void *data);
int (*snapshot)(struct devlink *devlink, struct netlink_ext_ack *extack,
u8 **data);
};
struct devlink_fmsg;
struct devlink_health_reporter;
......@@ -963,15 +978,15 @@ void devlink_port_param_value_changed(struct devlink_port *devlink_port,
u32 param_id);
void devlink_param_value_str_fill(union devlink_param_value *dst_val,
const char *src);
struct devlink_region *devlink_region_create(struct devlink *devlink,
const char *region_name,
u32 region_max_snapshots,
u64 region_size);
struct devlink_region *
devlink_region_create(struct devlink *devlink,
const struct devlink_region_ops *ops,
u32 region_max_snapshots, u64 region_size);
void devlink_region_destroy(struct devlink_region *region);
u32 devlink_region_snapshot_id_get(struct devlink *devlink);
int devlink_region_snapshot_id_get(struct devlink *devlink, u32 *id);
void devlink_region_snapshot_id_put(struct devlink *devlink, u32 id);
int devlink_region_snapshot_create(struct devlink_region *region,
u8 *data, u32 snapshot_id,
devlink_snapshot_data_dest_t *data_destructor);
u8 *data, u32 snapshot_id);
int devlink_info_serial_number_put(struct devlink_info_req *req,
const char *sn);
int devlink_info_driver_name_put(struct devlink_info_req *req,
......
This diff is collapsed.
......@@ -141,6 +141,16 @@ regions_test()
check_region_snapshot_count dummy post-first-delete 2
devlink region new $DL_HANDLE/dummy snapshot 25
check_err $? "Failed to create a new snapshot with id 25"
check_region_snapshot_count dummy post-first-request 3
devlink region del $DL_HANDLE/dummy snapshot 25
check_err $? "Failed to delete snapshot with id 25"
check_region_snapshot_count dummy post-second-delete 2
log_test "regions test"
}
......