Commit d3cf4051 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'vfio-v6.1-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Prune private items from vfio_pci_core.h to a new internal header,
   fix missed function rename, and refactor vfio-pci interrupt defines
   (Jason Gunthorpe)

 - Create consistent naming and handling of ioctls with a function per
   ioctl for vfio-pci and vfio group handling, use proper type args
   where available (Jason Gunthorpe)

 - Implement a set of low power device feature ioctls allowing userspace
   to make use of power states such as D3cold where supported (Abhishek
   Sahu)

 - Remove device counter on vfio groups, which had restricted the page
   pinning interface to singleton groups to account for limitations in
   the type1 IOMMU backend. Document usage as limited to emulated IOMMU
   devices, ie. traditional mdev devices where this restriction is
   consistent (Jason Gunthorpe)

 - Correct function prefix in hisi_acc driver incurred during previous
   refactoring (Shameer Kolothum)

 - Correct typo and remove redundant warning triggers in vfio-fsl driver
   (Christophe JAILLET)

 - Introduce device level DMA dirty tracking uAPI and implementation in
   the mlx5 variant driver (Yishai Hadas & Joao Martins)

 - Move much of the vfio_device life cycle management into vfio core,
   simplifying and avoiding duplication across drivers. This also
   facilitates adding a struct device to vfio_device which begins the
   introduction of device rather than group level user support and fills
   a gap allowing userspace identify devices as vfio capable without
   implicit knowledge of the driver (Kevin Tian & Yi Liu)

 - Split vfio container handling to a separate file, creating a more
   well defined API between the core and container code, masking IOMMU
   backend implementation from the core, allowing for an easier future
   transition to an iommufd based implementation of the same (Jason
   Gunthorpe)

 - Attempt to resolve race accessing the iommu_group for a device
   between vfio releasing DMA ownership and removal of the device from
   the IOMMU driver. Follow-up with support to allow vfio_group to exist
   with NULL iommu_group pointer to support existing userspace use cases
   of holding the group file open (Jason Gunthorpe)

 - Fix error code and hi/lo register manipulation issues in the hisi_acc
   variant driver, along with various code cleanups (Longfang Liu)

 - Fix a prior regression in GVT-g group teardown, resulting in
   unreleased resources (Jason Gunthorpe)

 - A significant cleanup and simplification of the mdev interface,
   consolidating much of the open coded per driver sysfs interface
   support into the mdev core (Christoph Hellwig)

 - Simplification of tracking and locking around vfio_groups that fall
   out from previous refactoring (Jason Gunthorpe)

 - Replace trivial open coded f_ops tests with new helper (Alex
   Williamson)

* tag 'vfio-v6.1-rc1' of https://github.com/awilliam/linux-vfio: (77 commits)
  vfio: More vfio_file_is_group() use cases
  vfio: Make the group FD disassociate from the iommu_group
  vfio: Hold a reference to the iommu_group in kvm for SPAPR
  vfio: Add vfio_file_is_group()
  vfio: Change vfio_group->group_rwsem to a mutex
  vfio: Remove the vfio_group->users and users_comp
  vfio/mdev: add mdev available instance checking to the core
  vfio/mdev: consolidate all the description sysfs into the core code
  vfio/mdev: consolidate all the available_instance sysfs into the core code
  vfio/mdev: consolidate all the name sysfs into the core code
  vfio/mdev: consolidate all the device_api sysfs into the core code
  vfio/mdev: remove mtype_get_parent_dev
  vfio/mdev: remove mdev_parent_dev
  vfio/mdev: unexport mdev_bus_type
  vfio/mdev: remove mdev_from_dev
  vfio/mdev: simplify mdev_type handling
  vfio/mdev: embedd struct mdev_parent in the parent data structure
  vfio/mdev: make mdev.h standalone includable
  drm/i915/gvt: simplify vgpu configuration management
  drm/i915/gvt: fix a memory leak in intel_gvt_init_vgpu_types
  ...
parents 778ce723 b1b8132a
What: /sys/.../<device>/vfio-dev/vfioX/
Date: September 2022
Contact: Yi Liu <yi.l.liu@intel.com>
Description:
This directory is created when the device is bound to a
vfio driver. The layout under this directory matches what
exists for a standard 'struct device'. 'X' is a unique
index marking this device in vfio.
......@@ -58,19 +58,19 @@ devices as examples, as these devices are the first devices to use this module::
| MDEV CORE |
| MODULE |
| mdev.ko |
| +-----------+ | mdev_register_device() +--------------+
| +-----------+ | mdev_register_parent() +--------------+
| | | +<------------------------+ |
| | | | | nvidia.ko |<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
| | Physical | |
| | device | | mdev_register_device() +--------------+
| | device | | mdev_register_parent() +--------------+
| | interface | |<------------------------+ |
| | | | | i915.ko |<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
| | | |
| | | | mdev_register_device() +--------------+
| | | | mdev_register_parent() +--------------+
| | | +<------------------------+ |
| | | | | ccw_device.ko|<-> physical
| | | +------------------------>+ | device
......@@ -103,7 +103,8 @@ structure to represent a mediated device's driver::
struct mdev_driver {
int (*probe) (struct mdev_device *dev);
void (*remove) (struct mdev_device *dev);
struct attribute_group **supported_type_groups;
unsigned int (*get_available)(struct mdev_type *mtype);
ssize_t (*show_description)(struct mdev_type *mtype, char *buf);
struct device_driver driver;
};
......@@ -125,8 +126,8 @@ vfio_device_ops.
When a driver wants to add the GUID creation sysfs to an existing device it has
probe'd to then it should call::
int mdev_register_device(struct device *dev,
struct mdev_driver *mdev_driver);
int mdev_register_parent(struct mdev_parent *parent, struct device *dev,
struct mdev_driver *mdev_driver);
This will provide the 'mdev_supported_types/XX/create' files which can then be
used to trigger the creation of a mdev_device. The created mdev_device will be
......@@ -134,7 +135,7 @@ attached to the specified driver.
When the driver needs to remove itself it calls::
void mdev_unregister_device(struct device *dev);
void mdev_unregister_parent(struct mdev_parent *parent);
Which will unbind and destroy all the created mdevs and remove the sysfs files.
......@@ -200,17 +201,14 @@ Directories and files under the sysfs for Each Physical Device
sprintf(buf, "%s-%s", dev_driver_string(parent->dev), group->name);
(or using mdev_parent_dev(mdev) to arrive at the parent device outside
of the core mdev code)
* device_api
This attribute should show which device API is being created, for example,
This attribute shows which device API is being created, for example,
"vfio-pci" for a PCI device.
* available_instances
This attribute should show the number of devices of type <type-id> that can be
This attribute shows the number of devices of type <type-id> that can be
created.
* [device]
......@@ -220,11 +218,11 @@ Directories and files under the sysfs for Each Physical Device
* name
This attribute should show human readable name. This is optional attribute.
This attribute shows a human readable name.
* description
This attribute should show brief features/description of the type. This is
This attribute can show brief features/description of the type. This is an
optional attribute.
Directories and Files Under the sysfs for Each mdev Device
......
......@@ -297,7 +297,7 @@ of the VFIO AP mediated device driver::
| MDEV CORE |
| MODULE |
| mdev.ko |
| +---------+ | mdev_register_device() +--------------+
| +---------+ | mdev_register_parent() +--------------+
| |Physical | +<-----------------------+ |
| | device | | | vfio_ap.ko |<-> matrix
| |interface| +----------------------->+ | device
......
......@@ -156,7 +156,7 @@ Below is a high Level block diagram::
| MDEV CORE |
| MODULE |
| mdev.ko |
| +---------+ | mdev_register_device() +--------------+
| +---------+ | mdev_register_parent() +--------------+
| |Physical | +<-----------------------+ |
| | device | | | vfio_ccw.ko |<-> subchannel
| |interface| +----------------------->+ | device
......
......@@ -21558,6 +21558,7 @@ R: Cornelia Huck <cohuck@redhat.com>
L: kvm@vger.kernel.org
S: Maintained
T: git git://github.com/awilliam/linux-vfio.git
F: Documentation/ABI/testing/sysfs-devices-vfio-dev
F: Documentation/driver-api/vfio.rst
F: drivers/vfio/
F: include/linux/vfio.h
......
......@@ -240,13 +240,13 @@ static void free_resource(struct intel_vgpu *vgpu)
}
static int alloc_resource(struct intel_vgpu *vgpu,
struct intel_vgpu_creation_params *param)
const struct intel_vgpu_config *conf)
{
struct intel_gvt *gvt = vgpu->gvt;
unsigned long request, avail, max, taken;
const char *item;
if (!param->low_gm_sz || !param->high_gm_sz || !param->fence_sz) {
if (!conf->low_mm || !conf->high_mm || !conf->fence) {
gvt_vgpu_err("Invalid vGPU creation params\n");
return -EINVAL;
}
......@@ -255,7 +255,7 @@ static int alloc_resource(struct intel_vgpu *vgpu,
max = gvt_aperture_sz(gvt) - HOST_LOW_GM_SIZE;
taken = gvt->gm.vgpu_allocated_low_gm_size;
avail = max - taken;
request = MB_TO_BYTES(param->low_gm_sz);
request = conf->low_mm;
if (request > avail)
goto no_enough_resource;
......@@ -266,7 +266,7 @@ static int alloc_resource(struct intel_vgpu *vgpu,
max = gvt_hidden_sz(gvt) - HOST_HIGH_GM_SIZE;
taken = gvt->gm.vgpu_allocated_high_gm_size;
avail = max - taken;
request = MB_TO_BYTES(param->high_gm_sz);
request = conf->high_mm;
if (request > avail)
goto no_enough_resource;
......@@ -277,16 +277,16 @@ static int alloc_resource(struct intel_vgpu *vgpu,
max = gvt_fence_sz(gvt) - HOST_FENCE;
taken = gvt->fence.vgpu_allocated_fence_num;
avail = max - taken;
request = param->fence_sz;
request = conf->fence;
if (request > avail)
goto no_enough_resource;
vgpu_fence_sz(vgpu) = request;
gvt->gm.vgpu_allocated_low_gm_size += MB_TO_BYTES(param->low_gm_sz);
gvt->gm.vgpu_allocated_high_gm_size += MB_TO_BYTES(param->high_gm_sz);
gvt->fence.vgpu_allocated_fence_num += param->fence_sz;
gvt->gm.vgpu_allocated_low_gm_size += conf->low_mm;
gvt->gm.vgpu_allocated_high_gm_size += conf->high_mm;
gvt->fence.vgpu_allocated_fence_num += conf->fence;
return 0;
no_enough_resource:
......@@ -340,11 +340,11 @@ void intel_vgpu_reset_resource(struct intel_vgpu *vgpu)
*
*/
int intel_vgpu_alloc_resource(struct intel_vgpu *vgpu,
struct intel_vgpu_creation_params *param)
const struct intel_vgpu_config *conf)
{
int ret;
ret = alloc_resource(vgpu, param);
ret = alloc_resource(vgpu, conf);
if (ret)
return ret;
......
......@@ -36,6 +36,7 @@
#include <uapi/linux/pci_regs.h>
#include <linux/kvm_host.h>
#include <linux/vfio.h>
#include <linux/mdev.h>
#include "i915_drv.h"
#include "intel_gvt.h"
......@@ -172,6 +173,7 @@ struct intel_vgpu_submission {
#define KVMGT_DEBUGFS_FILENAME "kvmgt_nr_cache_entries"
struct intel_vgpu {
struct vfio_device vfio_device;
struct intel_gvt *gvt;
struct mutex vgpu_lock;
int id;
......@@ -211,7 +213,6 @@ struct intel_vgpu {
u32 scan_nonprivbb;
struct vfio_device vfio_device;
struct vfio_region *region;
int num_regions;
struct eventfd_ctx *intx_trigger;
......@@ -294,15 +295,25 @@ struct intel_gvt_firmware {
bool firmware_loaded;
};
#define NR_MAX_INTEL_VGPU_TYPES 20
struct intel_vgpu_type {
char name[16];
unsigned int avail_instance;
unsigned int low_gm_size;
unsigned int high_gm_size;
struct intel_vgpu_config {
unsigned int low_mm;
unsigned int high_mm;
unsigned int fence;
/*
* A vGPU with a weight of 8 will get twice as much GPU as a vGPU with
* a weight of 4 on a contended host, different vGPU type has different
* weight set. Legal weights range from 1 to 16.
*/
unsigned int weight;
enum intel_vgpu_edid resolution;
enum intel_vgpu_edid edid;
const char *name;
};
struct intel_vgpu_type {
struct mdev_type type;
char name[16];
const struct intel_vgpu_config *conf;
};
struct intel_gvt {
......@@ -326,6 +337,8 @@ struct intel_gvt {
struct intel_gvt_workload_scheduler scheduler;
struct notifier_block shadow_ctx_notifier_block[I915_NUM_ENGINES];
DECLARE_HASHTABLE(cmd_table, GVT_CMD_HASH_BITS);
struct mdev_parent parent;
struct mdev_type **mdev_types;
struct intel_vgpu_type *types;
unsigned int num_types;
struct intel_vgpu *idle_vgpu;
......@@ -436,19 +449,8 @@ int intel_gvt_load_firmware(struct intel_gvt *gvt);
/* ring context size i.e. the first 0x50 dwords*/
#define RING_CTX_SIZE 320
struct intel_vgpu_creation_params {
__u64 low_gm_sz; /* in MB */
__u64 high_gm_sz; /* in MB */
__u64 fence_sz;
__u64 resolution;
__s32 primary;
__u64 vgpu_id;
__u32 weight;
};
int intel_vgpu_alloc_resource(struct intel_vgpu *vgpu,
struct intel_vgpu_creation_params *param);
const struct intel_vgpu_config *conf);
void intel_vgpu_reset_resource(struct intel_vgpu *vgpu);
void intel_vgpu_free_resource(struct intel_vgpu *vgpu);
void intel_vgpu_write_fence(struct intel_vgpu *vgpu,
......@@ -494,8 +496,8 @@ void intel_gvt_clean_vgpu_types(struct intel_gvt *gvt);
struct intel_vgpu *intel_gvt_create_idle_vgpu(struct intel_gvt *gvt);
void intel_gvt_destroy_idle_vgpu(struct intel_vgpu *vgpu);
struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
struct intel_vgpu_type *type);
int intel_gvt_create_vgpu(struct intel_vgpu *vgpu,
const struct intel_vgpu_config *conf);
void intel_gvt_destroy_vgpu(struct intel_vgpu *vgpu);
void intel_gvt_release_vgpu(struct intel_vgpu *vgpu);
void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
......
......@@ -34,7 +34,6 @@
*/
#include <linux/init.h>
#include <linux/device.h>
#include <linux/mm.h>
#include <linux/kthread.h>
#include <linux/sched/mm.h>
......@@ -43,7 +42,6 @@
#include <linux/rbtree.h>
#include <linux/spinlock.h>
#include <linux/eventfd.h>
#include <linux/uuid.h>
#include <linux/mdev.h>
#include <linux/debugfs.h>
......@@ -115,117 +113,18 @@ static void kvmgt_page_track_flush_slot(struct kvm *kvm,
struct kvm_memory_slot *slot,
struct kvm_page_track_notifier_node *node);
static ssize_t available_instances_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr,
char *buf)
static ssize_t intel_vgpu_show_description(struct mdev_type *mtype, char *buf)
{
struct intel_vgpu_type *type;
unsigned int num = 0;
struct intel_gvt *gvt = kdev_to_i915(mtype_get_parent_dev(mtype))->gvt;
type = &gvt->types[mtype_get_type_group_id(mtype)];
if (!type)
num = 0;
else
num = type->avail_instance;
return sprintf(buf, "%u\n", num);
}
static ssize_t device_api_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
return sprintf(buf, "%s\n", VFIO_DEVICE_API_PCI_STRING);
}
static ssize_t description_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
struct intel_vgpu_type *type;
struct intel_gvt *gvt = kdev_to_i915(mtype_get_parent_dev(mtype))->gvt;
type = &gvt->types[mtype_get_type_group_id(mtype)];
if (!type)
return 0;
struct intel_vgpu_type *type =
container_of(mtype, struct intel_vgpu_type, type);
return sprintf(buf, "low_gm_size: %dMB\nhigh_gm_size: %dMB\n"
"fence: %d\nresolution: %s\n"
"weight: %d\n",
BYTES_TO_MB(type->low_gm_size),
BYTES_TO_MB(type->high_gm_size),
type->fence, vgpu_edid_str(type->resolution),
type->weight);
}
static ssize_t name_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
struct intel_vgpu_type *type;
struct intel_gvt *gvt = kdev_to_i915(mtype_get_parent_dev(mtype))->gvt;
type = &gvt->types[mtype_get_type_group_id(mtype)];
if (!type)
return 0;
return sprintf(buf, "%s\n", type->name);
}
static MDEV_TYPE_ATTR_RO(available_instances);
static MDEV_TYPE_ATTR_RO(device_api);
static MDEV_TYPE_ATTR_RO(description);
static MDEV_TYPE_ATTR_RO(name);
static struct attribute *gvt_type_attrs[] = {
&mdev_type_attr_available_instances.attr,
&mdev_type_attr_device_api.attr,
&mdev_type_attr_description.attr,
&mdev_type_attr_name.attr,
NULL,
};
static struct attribute_group *gvt_vgpu_type_groups[] = {
[0 ... NR_MAX_INTEL_VGPU_TYPES - 1] = NULL,
};
static int intel_gvt_init_vgpu_type_groups(struct intel_gvt *gvt)
{
int i, j;
struct intel_vgpu_type *type;
struct attribute_group *group;
for (i = 0; i < gvt->num_types; i++) {
type = &gvt->types[i];
group = kzalloc(sizeof(struct attribute_group), GFP_KERNEL);
if (!group)
goto unwind;
group->name = type->name;
group->attrs = gvt_type_attrs;
gvt_vgpu_type_groups[i] = group;
}
return 0;
unwind:
for (j = 0; j < i; j++) {
group = gvt_vgpu_type_groups[j];
kfree(group);
}
return -ENOMEM;
}
static void intel_gvt_cleanup_vgpu_type_groups(struct intel_gvt *gvt)
{
int i;
struct attribute_group *group;
for (i = 0; i < gvt->num_types; i++) {
group = gvt_vgpu_type_groups[i];
gvt_vgpu_type_groups[i] = NULL;
kfree(group);
}
BYTES_TO_MB(type->conf->low_mm),
BYTES_TO_MB(type->conf->high_mm),
type->conf->fence, vgpu_edid_str(type->conf->edid),
type->conf->weight);
}
static void gvt_unpin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
......@@ -1546,7 +1445,28 @@ static const struct attribute_group *intel_vgpu_groups[] = {
NULL,
};
static int intel_vgpu_init_dev(struct vfio_device *vfio_dev)
{
struct mdev_device *mdev = to_mdev_device(vfio_dev->dev);
struct intel_vgpu *vgpu = vfio_dev_to_vgpu(vfio_dev);
struct intel_vgpu_type *type =
container_of(mdev->type, struct intel_vgpu_type, type);
vgpu->gvt = kdev_to_i915(mdev->type->parent->dev)->gvt;
return intel_gvt_create_vgpu(vgpu, type->conf);
}
static void intel_vgpu_release_dev(struct vfio_device *vfio_dev)
{
struct intel_vgpu *vgpu = vfio_dev_to_vgpu(vfio_dev);
intel_gvt_destroy_vgpu(vgpu);
vfio_free_device(vfio_dev);
}
static const struct vfio_device_ops intel_vgpu_dev_ops = {
.init = intel_vgpu_init_dev,
.release = intel_vgpu_release_dev,
.open_device = intel_vgpu_open_device,
.close_device = intel_vgpu_close_device,
.read = intel_vgpu_read,
......@@ -1558,35 +1478,28 @@ static const struct vfio_device_ops intel_vgpu_dev_ops = {
static int intel_vgpu_probe(struct mdev_device *mdev)
{
struct device *pdev = mdev_parent_dev(mdev);
struct intel_gvt *gvt = kdev_to_i915(pdev)->gvt;
struct intel_vgpu_type *type;
struct intel_vgpu *vgpu;
int ret;
type = &gvt->types[mdev_get_type_group_id(mdev)];
if (!type)
return -EINVAL;
vgpu = intel_gvt_create_vgpu(gvt, type);
vgpu = vfio_alloc_device(intel_vgpu, vfio_device, &mdev->dev,
&intel_vgpu_dev_ops);
if (IS_ERR(vgpu)) {
gvt_err("failed to create intel vgpu: %ld\n", PTR_ERR(vgpu));
return PTR_ERR(vgpu);
}
vfio_init_group_dev(&vgpu->vfio_device, &mdev->dev,
&intel_vgpu_dev_ops);
dev_set_drvdata(&mdev->dev, vgpu);
ret = vfio_register_emulated_iommu_dev(&vgpu->vfio_device);
if (ret) {
intel_gvt_destroy_vgpu(vgpu);
return ret;
}
if (ret)
goto out_put_vdev;
gvt_dbg_core("intel_vgpu_create succeeded for mdev: %s\n",
dev_name(mdev_dev(mdev)));
return 0;
out_put_vdev:
vfio_put_device(&vgpu->vfio_device);
return ret;
}
static void intel_vgpu_remove(struct mdev_device *mdev)
......@@ -1595,18 +1508,43 @@ static void intel_vgpu_remove(struct mdev_device *mdev)
if (WARN_ON_ONCE(vgpu->attached))
return;
intel_gvt_destroy_vgpu(vgpu);
vfio_unregister_group_dev(&vgpu->vfio_device);
vfio_put_device(&vgpu->vfio_device);
}
static unsigned int intel_vgpu_get_available(struct mdev_type *mtype)
{
struct intel_vgpu_type *type =
container_of(mtype, struct intel_vgpu_type, type);
struct intel_gvt *gvt = kdev_to_i915(mtype->parent->dev)->gvt;
unsigned int low_gm_avail, high_gm_avail, fence_avail;
mutex_lock(&gvt->lock);
low_gm_avail = gvt_aperture_sz(gvt) - HOST_LOW_GM_SIZE -
gvt->gm.vgpu_allocated_low_gm_size;
high_gm_avail = gvt_hidden_sz(gvt) - HOST_HIGH_GM_SIZE -
gvt->gm.vgpu_allocated_high_gm_size;
fence_avail = gvt_fence_sz(gvt) - HOST_FENCE -
gvt->fence.vgpu_allocated_fence_num;
mutex_unlock(&gvt->lock);
return min3(low_gm_avail / type->conf->low_mm,
high_gm_avail / type->conf->high_mm,
fence_avail / type->conf->fence);
}
static struct mdev_driver intel_vgpu_mdev_driver = {
.device_api = VFIO_DEVICE_API_PCI_STRING,
.driver = {
.name = "intel_vgpu_mdev",
.owner = THIS_MODULE,
.dev_groups = intel_vgpu_groups,
},
.probe = intel_vgpu_probe,
.remove = intel_vgpu_remove,
.supported_type_groups = gvt_vgpu_type_groups,
.probe = intel_vgpu_probe,
.remove = intel_vgpu_remove,
.get_available = intel_vgpu_get_available,
.show_description = intel_vgpu_show_description,
};
int intel_gvt_page_track_add(struct intel_vgpu *info, u64 gfn)
......@@ -1904,8 +1842,7 @@ static void intel_gvt_clean_device(struct drm_i915_private *i915)
if (drm_WARN_ON(&i915->drm, !gvt))
return;
mdev_unregister_device(i915->drm.dev);
intel_gvt_cleanup_vgpu_type_groups(gvt);
mdev_unregister_parent(&gvt->parent);
intel_gvt_destroy_idle_vgpu(gvt->idle_vgpu);
intel_gvt_clean_vgpu_types(gvt);
......@@ -2005,19 +1942,15 @@ static int intel_gvt_init_device(struct drm_i915_private *i915)
intel_gvt_debugfs_init(gvt);
ret = intel_gvt_init_vgpu_type_groups(gvt);
ret = mdev_register_parent(&gvt->parent, i915->drm.dev,
&intel_vgpu_mdev_driver,
gvt->mdev_types, gvt->num_types);
if (ret)
goto out_destroy_idle_vgpu;
ret = mdev_register_device(i915->drm.dev, &intel_vgpu_mdev_driver);
if (ret)
goto out_cleanup_vgpu_type_groups;
gvt_dbg_core("gvt device initialization is done\n");
return 0;
out_cleanup_vgpu_type_groups:
intel_gvt_cleanup_vgpu_type_groups(gvt);
out_destroy_idle_vgpu:
intel_gvt_destroy_idle_vgpu(gvt->idle_vgpu);
intel_gvt_debugfs_clean(gvt);
......
......@@ -73,24 +73,21 @@ void populate_pvinfo_page(struct intel_vgpu *vgpu)
drm_WARN_ON(&i915->drm, sizeof(struct vgt_if) != VGT_PVINFO_SIZE);
}
/*
* vGPU type name is defined as GVTg_Vx_y which contains the physical GPU
* generation type (e.g V4 as BDW server, V5 as SKL server).
*
* Depening on the physical SKU resource, we might see vGPU types like
* GVTg_V4_8, GVTg_V4_4, GVTg_V4_2, etc. We can create different types of
* vGPU on same physical GPU depending on available resource. Each vGPU
* type will have a different number of avail_instance to indicate how
* many vGPU instance can be created for this type.
*/
#define VGPU_MAX_WEIGHT 16
#define VGPU_WEIGHT(vgpu_num) \
(VGPU_MAX_WEIGHT / (vgpu_num))
static const struct {
unsigned int low_mm;
unsigned int high_mm;
unsigned int fence;
/* A vGPU with a weight of 8 will get twice as much GPU as a vGPU
* with a weight of 4 on a contended host, different vGPU type has
* different weight set. Legal weights range from 1 to 16.
*/
unsigned int weight;
enum intel_vgpu_edid edid;
const char *name;
} vgpu_types[] = {
/* Fixed vGPU type table */
static const struct intel_vgpu_config intel_vgpu_configs[] = {
{ MB_TO_BYTES(64), MB_TO_BYTES(384), 4, VGPU_WEIGHT(8), GVT_EDID_1024_768, "8" },
{ MB_TO_BYTES(128), MB_TO_BYTES(512), 4, VGPU_WEIGHT(4), GVT_EDID_1920_1200, "4" },
{ MB_TO_BYTES(256), MB_TO_BYTES(1024), 4, VGPU_WEIGHT(2), GVT_EDID_1920_1200, "2" },
......@@ -106,102 +103,58 @@ static const struct {
*/
int intel_gvt_init_vgpu_types(struct intel_gvt *gvt)
{
unsigned int num_types;
unsigned int i, low_avail, high_avail;
unsigned int min_low;
/* vGPU type name is defined as GVTg_Vx_y which contains
* physical GPU generation type (e.g V4 as BDW server, V5 as
* SKL server).
*
* Depend on physical SKU resource, might see vGPU types like
* GVTg_V4_8, GVTg_V4_4, GVTg_V4_2, etc. We can create
* different types of vGPU on same physical GPU depending on
* available resource. Each vGPU type will have "avail_instance"
* to indicate how many vGPU instance can be created for this
* type.
*
*/
low_avail = gvt_aperture_sz(gvt) - HOST_LOW_GM_SIZE;
high_avail = gvt_hidden_sz(gvt) - HOST_HIGH_GM_SIZE;
num_types = ARRAY_SIZE(vgpu_types);
unsigned int low_avail = gvt_aperture_sz(gvt) - HOST_LOW_GM_SIZE;
unsigned int high_avail = gvt_hidden_sz(gvt) - HOST_HIGH_GM_SIZE;
unsigned int num_types = ARRAY_SIZE(intel_vgpu_configs);
unsigned int i;
gvt->types = kcalloc(num_types, sizeof(struct intel_vgpu_type),
GFP_KERNEL);
if (!gvt->types)
return -ENOMEM;
min_low = MB_TO_BYTES(32);
for (i = 0; i < num_types; ++i) {
if (low_avail / vgpu_types[i].low_mm == 0)
break;
gvt->types[i].low_gm_size = vgpu_types[i].low_mm;
gvt->types[i].high_gm_size = vgpu_types[i].high_mm;
gvt->types[i].fence = vgpu_types[i].fence;
gvt->mdev_types = kcalloc(num_types, sizeof(*gvt->mdev_types),
GFP_KERNEL);
if (!gvt->mdev_types)
goto out_free_types;
if (vgpu_types[i].weight < 1 ||
vgpu_types[i].weight > VGPU_MAX_WEIGHT)
return -EINVAL;
for (i = 0; i < num_types; ++i) {
const struct intel_vgpu_config *conf = &intel_vgpu_configs[i];
gvt->types[i].weight = vgpu_types[i].weight;
gvt->types[i].resolution = vgpu_types[i].edid;
gvt->types[i].avail_instance = min(low_avail / vgpu_types[i].low_mm,
high_avail / vgpu_types[i].high_mm);
if (low_avail / conf->low_mm == 0)
break;
if (conf->weight < 1 || conf->weight > VGPU_MAX_WEIGHT)
goto out_free_mdev_types;
if (GRAPHICS_VER(gvt->gt->i915) == 8)
sprintf(gvt->types[i].name, "GVTg_V4_%s",
vgpu_types[i].name);
else if (GRAPHICS_VER(gvt->gt->i915) == 9)
sprintf(gvt->types[i].name, "GVTg_V5_%s",
vgpu_types[i].name);
sprintf(gvt->types[i].name, "GVTg_V%u_%s",
GRAPHICS_VER(gvt->gt->i915) == 8 ? 4 : 5, conf->name);
gvt->types[i].conf = conf;
gvt_dbg_core("type[%d]: %s avail %u low %u high %u fence %u weight %u res %s\n",
i, gvt->types[i].name,
gvt->types[i].avail_instance,
gvt->types[i].low_gm_size,
gvt->types[i].high_gm_size, gvt->types[i].fence,
gvt->types[i].weight,
vgpu_edid_str(gvt->types[i].resolution));
min(low_avail / conf->low_mm,
high_avail / conf->high_mm),
conf->low_mm, conf->high_mm, conf->fence,
conf->weight, vgpu_edid_str(conf->edid));
gvt->mdev_types[i] = &gvt->types[i].type;
gvt->mdev_types[i]->sysfs_name = gvt->types[i].name;
}
gvt->num_types = i;
return 0;
}
void intel_gvt_clean_vgpu_types(struct intel_gvt *gvt)
{
out_free_mdev_types:
kfree(gvt->mdev_types);
out_free_types:
kfree(gvt->types);
return -EINVAL;
}
static void intel_gvt_update_vgpu_types(struct intel_gvt *gvt)
void intel_gvt_clean_vgpu_types(struct intel_gvt *gvt)
{
int i;
unsigned int low_gm_avail, high_gm_avail, fence_avail;
unsigned int low_gm_min, high_gm_min, fence_min;
/* Need to depend on maxium hw resource size but keep on
* static config for now.
*/
low_gm_avail = gvt_aperture_sz(gvt) - HOST_LOW_GM_SIZE -
gvt->gm.vgpu_allocated_low_gm_size;
high_gm_avail = gvt_hidden_sz(gvt) - HOST_HIGH_GM_SIZE -
gvt->gm.vgpu_allocated_high_gm_size;
fence_avail = gvt_fence_sz(gvt) - HOST_FENCE -
gvt->fence.vgpu_allocated_fence_num;
for (i = 0; i < gvt->num_types; i++) {
low_gm_min = low_gm_avail / gvt->types[i].low_gm_size;
high_gm_min = high_gm_avail / gvt->types[i].high_gm_size;
fence_min = fence_avail / gvt->types[i].fence;
gvt->types[i].avail_instance = min(min(low_gm_min, high_gm_min),
fence_min);
gvt_dbg_core("update type[%d]: %s avail %u low %u high %u fence %u\n",
i, gvt->types[i].name,
gvt->types[i].avail_instance, gvt->types[i].low_gm_size,
gvt->types[i].high_gm_size, gvt->types[i].fence);
}
kfree(gvt->mdev_types);
kfree(gvt->types);
}
/**
......@@ -298,12 +251,6 @@ void intel_gvt_destroy_vgpu(struct intel_vgpu *vgpu)
intel_vgpu_clean_mmio(vgpu);
intel_vgpu_dmabuf_cleanup(vgpu);
mutex_unlock(&vgpu->vgpu_lock);
mutex_lock(&gvt->lock);
intel_gvt_update_vgpu_types(gvt);
mutex_unlock(&gvt->lock);
vfree(vgpu);
}
#define IDLE_VGPU_IDR 0
......@@ -363,42 +310,38 @@ void intel_gvt_destroy_idle_vgpu(struct intel_vgpu *vgpu)
vfree(vgpu);
}
static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
struct intel_vgpu_creation_params *param)
int intel_gvt_create_vgpu(struct intel_vgpu *vgpu,
const struct intel_vgpu_config *conf)
{
struct intel_gvt *gvt = vgpu->gvt;
struct drm_i915_private *dev_priv = gvt->gt->i915;
struct intel_vgpu *vgpu;
int ret;
gvt_dbg_core("low %llu MB high %llu MB fence %llu\n",
param->low_gm_sz, param->high_gm_sz,
param->fence_sz);
vgpu = vzalloc(sizeof(*vgpu));
if (!vgpu)
return ERR_PTR(-ENOMEM);
gvt_dbg_core("low %u MB high %u MB fence %u\n",
BYTES_TO_MB(conf->low_mm), BYTES_TO_MB(conf->high_mm),
conf->fence);
mutex_lock(&gvt->lock);
ret = idr_alloc(&gvt->vgpu_idr, vgpu, IDLE_VGPU_IDR + 1, GVT_MAX_VGPU,
GFP_KERNEL);
if (ret < 0)
goto out_free_vgpu;
goto out_unlock;;
vgpu->id = ret;
vgpu->gvt = gvt;
vgpu->sched_ctl.weight = param->weight;
vgpu->sched_ctl.weight = conf->weight;
mutex_init(&vgpu->vgpu_lock);
mutex_init(&vgpu->dmabuf_lock);
INIT_LIST_HEAD(&vgpu->dmabuf_obj_list_head);
INIT_RADIX_TREE(&vgpu->page_track_tree, GFP_KERNEL);
idr_init_base(&vgpu->object_idr, 1);
intel_vgpu_init_cfg_space(vgpu, param->primary);
intel_vgpu_init_cfg_space(vgpu, 1);
vgpu->d3_entered = false;
ret = intel_vgpu_init_mmio(vgpu);
if (ret)
goto out_clean_idr;
ret = intel_vgpu_alloc_resource(vgpu, param);
ret = intel_vgpu_alloc_resource(vgpu, conf);
if (ret)
goto out_clean_vgpu_mmio;
......@@ -412,7 +355,7 @@ static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
if (ret)
goto out_clean_gtt;
ret = intel_vgpu_init_display(vgpu, param->resolution);
ret = intel_vgpu_init_display(vgpu, conf->edid);
if (ret)
goto out_clean_opregion;
......@@ -437,7 +380,9 @@ static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
if (ret)
goto out_clean_sched_policy;
return vgpu;
intel_gvt_update_reg_whitelist(vgpu);
mutex_unlock(&gvt->lock);
return 0;
out_clean_sched_policy:
intel_vgpu_clean_sched_policy(vgpu);
......@@ -455,48 +400,9 @@ static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
intel_vgpu_clean_mmio(vgpu);
out_clean_idr:
idr_remove(&gvt->vgpu_idr, vgpu->id);
out_free_vgpu:
vfree(vgpu);
return ERR_PTR(ret);
}
/**
* intel_gvt_create_vgpu - create a virtual GPU
* @gvt: GVT device
* @type: type of the vGPU to create
*
* This function is called when user wants to create a virtual GPU.
*
* Returns:
* pointer to intel_vgpu, error pointer if failed.
*/
struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
struct intel_vgpu_type *type)
{
struct intel_vgpu_creation_params param;
struct intel_vgpu *vgpu;
param.primary = 1;
param.low_gm_sz = type->low_gm_size;
param.high_gm_sz = type->high_gm_size;
param.fence_sz = type->fence;
param.weight = type->weight;
param.resolution = type->resolution;
/* XXX current param based on MB */
param.low_gm_sz = BYTES_TO_MB(param.low_gm_sz);
param.high_gm_sz = BYTES_TO_MB(param.high_gm_sz);
mutex_lock(&gvt->lock);
vgpu = __intel_gvt_create_vgpu(gvt, &param);
if (!IS_ERR(vgpu)) {
/* calculate left instance change for types */
intel_gvt_update_vgpu_types(gvt);
intel_gvt_update_reg_whitelist(vgpu);
}
out_unlock:
mutex_unlock(&gvt->lock);
return vgpu;
return ret;
}
/**
......
......@@ -12,7 +12,6 @@
#include <linux/module.h>
#include <linux/init.h>
#include <linux/device.h>
#include <linux/slab.h>
#include <linux/mdev.h>
......@@ -142,7 +141,6 @@ static struct vfio_ccw_private *vfio_ccw_alloc_private(struct subchannel *sch)
INIT_LIST_HEAD(&private->crw);
INIT_WORK(&private->io_work, vfio_ccw_sch_io_todo);
INIT_WORK(&private->crw_work, vfio_ccw_crw_todo);
atomic_set(&private->avail, 1);
private->cp.guest_cp = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1),
GFP_KERNEL);
......@@ -203,7 +201,6 @@ static void vfio_ccw_free_private(struct vfio_ccw_private *private)
mutex_destroy(&private->io_mutex);
kfree(private);
}
static int vfio_ccw_sch_probe(struct subchannel *sch)
{
struct pmcw *pmcw = &sch->schib.pmcw;
......@@ -222,7 +219,12 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
dev_set_drvdata(&sch->dev, private);
ret = mdev_register_device(&sch->dev, &vfio_ccw_mdev_driver);
private->mdev_type.sysfs_name = "io";
private->mdev_type.pretty_name = "I/O subchannel (Non-QDIO)";
private->mdev_types[0] = &private->mdev_type;
ret = mdev_register_parent(&private->parent, &sch->dev,
&vfio_ccw_mdev_driver,
private->mdev_types, 1);
if (ret)
goto out_free;
......@@ -241,7 +243,7 @@ static void vfio_ccw_sch_remove(struct subchannel *sch)
{
struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
mdev_unregister_device(&sch->dev);
mdev_unregister_parent(&private->parent);
dev_set_drvdata(&sch->dev, NULL);
......
......@@ -11,7 +11,6 @@
*/
#include <linux/vfio.h>
#include <linux/mdev.h>
#include <linux/nospec.h>
#include <linux/slab.h>
......@@ -45,47 +44,14 @@ static void vfio_ccw_dma_unmap(struct vfio_device *vdev, u64 iova, u64 length)
vfio_ccw_mdev_reset(private);
}
static ssize_t name_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
return sprintf(buf, "I/O subchannel (Non-QDIO)\n");
}
static MDEV_TYPE_ATTR_RO(name);
static ssize_t device_api_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
return sprintf(buf, "%s\n", VFIO_DEVICE_API_CCW_STRING);
}
static MDEV_TYPE_ATTR_RO(device_api);
static ssize_t available_instances_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr,
char *buf)
static int vfio_ccw_mdev_init_dev(struct vfio_device *vdev)
{
struct vfio_ccw_private *private =
dev_get_drvdata(mtype_get_parent_dev(mtype));
container_of(vdev, struct vfio_ccw_private, vdev);
return sprintf(buf, "%d\n", atomic_read(&private->avail));
init_completion(&private->release_comp);
return 0;
}
static MDEV_TYPE_ATTR_RO(available_instances);
static struct attribute *mdev_types_attrs[] = {
&mdev_type_attr_name.attr,
&mdev_type_attr_device_api.attr,
&mdev_type_attr_available_instances.attr,
NULL,
};
static struct attribute_group mdev_type_group = {
.name = "io",
.attrs = mdev_types_attrs,
};
static struct attribute_group *mdev_type_groups[] = {
&mdev_type_group,
NULL,
};
static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
{
......@@ -95,12 +61,9 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
if (private->state == VFIO_CCW_STATE_NOT_OPER)
return -ENODEV;
if (atomic_dec_if_positive(&private->avail) < 0)
return -EPERM;
memset(&private->vdev, 0, sizeof(private->vdev));
vfio_init_group_dev(&private->vdev, &mdev->dev,
&vfio_ccw_dev_ops);
ret = vfio_init_device(&private->vdev, &mdev->dev, &vfio_ccw_dev_ops);
if (ret)
return ret;
VFIO_CCW_MSG_EVENT(2, "sch %x.%x.%04x: create\n",
private->sch->schid.cssid,
......@@ -109,16 +72,32 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
ret = vfio_register_emulated_iommu_dev(&private->vdev);
if (ret)
goto err_atomic;
goto err_put_vdev;
dev_set_drvdata(&mdev->dev, private);
return 0;
err_atomic:
vfio_uninit_group_dev(&private->vdev);
atomic_inc(&private->avail);
err_put_vdev:
vfio_put_device(&private->vdev);
return ret;
}
static void vfio_ccw_mdev_release_dev(struct vfio_device *vdev)
{
struct vfio_ccw_private *private =
container_of(vdev, struct vfio_ccw_private, vdev);
/*
* We cannot free vfio_ccw_private here because it includes
* parent info which must be free'ed by css driver.
*
* Use a workaround by memset'ing the core device part and
* then notifying the remove path that all active references
* to this device have been released.
*/
memset(vdev, 0, sizeof(*vdev));
complete(&private->release_comp);
}
static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
{
struct vfio_ccw_private *private = dev_get_drvdata(mdev->dev.parent);
......@@ -130,8 +109,16 @@ static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
vfio_unregister_group_dev(&private->vdev);
vfio_uninit_group_dev(&private->vdev);
atomic_inc(&private->avail);
vfio_put_device(&private->vdev);
/*
* Wait for all active references on mdev are released so it
* is safe to defer kfree() to a later point.
*
* TODO: the clean fix is to split parent/mdev info from ccw
* private structure so each can be managed in its own life
* cycle.
*/
wait_for_completion(&private->release_comp);
}
static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
......@@ -592,6 +579,8 @@ static void vfio_ccw_mdev_request(struct vfio_device *vdev, unsigned int count)
}
static const struct vfio_device_ops vfio_ccw_dev_ops = {
.init = vfio_ccw_mdev_init_dev,
.release = vfio_ccw_mdev_release_dev,
.open_device = vfio_ccw_mdev_open_device,
.close_device = vfio_ccw_mdev_close_device,
.read = vfio_ccw_mdev_read,
......@@ -602,6 +591,8 @@ static const struct vfio_device_ops vfio_ccw_dev_ops = {
};
struct mdev_driver vfio_ccw_mdev_driver = {
.device_api = VFIO_DEVICE_API_CCW_STRING,
.max_instances = 1,
.driver = {
.name = "vfio_ccw_mdev",
.owner = THIS_MODULE,
......@@ -609,5 +600,4 @@ struct mdev_driver vfio_ccw_mdev_driver = {
},
.probe = vfio_ccw_mdev_probe,
.remove = vfio_ccw_mdev_remove,
.supported_type_groups = mdev_type_groups,
};
......@@ -18,6 +18,7 @@
#include <linux/workqueue.h>
#include <linux/vfio_ccw.h>
#include <linux/vfio.h>
#include <linux/mdev.h>
#include <asm/crw.h>
#include <asm/debug.h>
......@@ -72,7 +73,6 @@ struct vfio_ccw_crw {
* @sch: pointer to the subchannel
* @state: internal state of the device
* @completion: synchronization helper of the I/O completion
* @avail: available for creating a mediated device
* @io_region: MMIO region to input/output I/O arguments/results
* @io_mutex: protect against concurrent update of I/O regions
* @region: additional regions for other subchannel operations
......@@ -88,13 +88,14 @@ struct vfio_ccw_crw {
* @req_trigger: eventfd ctx for signaling userspace to return device
* @io_work: work for deferral process of I/O handling
* @crw_work: work for deferral process of CRW handling
* @release_comp: synchronization helper for vfio device release
* @parent: parent data structures for mdevs created
*/
struct vfio_ccw_private {
struct vfio_device vdev;
struct subchannel *sch;
int state;
struct completion *completion;
atomic_t avail;
struct ccw_io_region *io_region;
struct mutex io_mutex;
struct vfio_ccw_region *region;
......@@ -113,6 +114,12 @@ struct vfio_ccw_private {
struct eventfd_ctx *req_trigger;
struct work_struct io_work;
struct work_struct crw_work;
struct completion release_comp;
struct mdev_parent parent;
struct mdev_type mdev_type;
struct mdev_type *mdev_types[1];
} __aligned(8);
int vfio_ccw_sch_quiesce(struct subchannel *sch);
......
......@@ -684,42 +684,41 @@ static bool vfio_ap_mdev_filter_matrix(unsigned long *apm, unsigned long *aqm,
AP_DOMAINS);
}
static int vfio_ap_mdev_probe(struct mdev_device *mdev)
static int vfio_ap_mdev_init_dev(struct vfio_device *vdev)
{
struct ap_matrix_mdev *matrix_mdev;
int ret;
if ((atomic_dec_if_positive(&matrix_dev->available_instances) < 0))
return -EPERM;
matrix_mdev = kzalloc(sizeof(*matrix_mdev), GFP_KERNEL);
if (!matrix_mdev) {
ret = -ENOMEM;
goto err_dec_available;
}
vfio_init_group_dev(&matrix_mdev->vdev, &mdev->dev,
&vfio_ap_matrix_dev_ops);
struct ap_matrix_mdev *matrix_mdev =
container_of(vdev, struct ap_matrix_mdev, vdev);
matrix_mdev->mdev = mdev;
matrix_mdev->mdev = to_mdev_device(vdev->dev);
vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
matrix_mdev->pqap_hook = handle_pqap;
vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_apcb);
hash_init(matrix_mdev->qtable.queues);
return 0;
}
static int vfio_ap_mdev_probe(struct mdev_device *mdev)
{
struct ap_matrix_mdev *matrix_mdev;
int ret;
matrix_mdev = vfio_alloc_device(ap_matrix_mdev, vdev, &mdev->dev,
&vfio_ap_matrix_dev_ops);
if (IS_ERR(matrix_mdev))
return PTR_ERR(matrix_mdev);
ret = vfio_register_emulated_iommu_dev(&matrix_mdev->vdev);
if (ret)
goto err_list;
goto err_put_vdev;
dev_set_drvdata(&mdev->dev, matrix_mdev);
mutex_lock(&matrix_dev->mdevs_lock);
list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
mutex_unlock(&matrix_dev->mdevs_lock);
return 0;
err_list:
vfio_uninit_group_dev(&matrix_mdev->vdev);
kfree(matrix_mdev);
err_dec_available:
atomic_inc(&matrix_dev->available_instances);
err_put_vdev:
vfio_put_device(&matrix_mdev->vdev);
return ret;
}
......@@ -766,6 +765,11 @@ static void vfio_ap_mdev_unlink_fr_queues(struct ap_matrix_mdev *matrix_mdev)
}
}
static void vfio_ap_mdev_release_dev(struct vfio_device *vdev)
{
vfio_free_device(vdev);
}
static void vfio_ap_mdev_remove(struct mdev_device *mdev)
{
struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(&mdev->dev);
......@@ -779,54 +783,9 @@ static void vfio_ap_mdev_remove(struct mdev_device *mdev)
list_del(&matrix_mdev->node);
mutex_unlock(&matrix_dev->mdevs_lock);
mutex_unlock(&matrix_dev->guests_lock);
vfio_uninit_group_dev(&matrix_mdev->vdev);
kfree(matrix_mdev);
atomic_inc(&matrix_dev->available_instances);
vfio_put_device(&matrix_mdev->vdev);
}
static ssize_t name_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
return sprintf(buf, "%s\n", VFIO_AP_MDEV_NAME_HWVIRT);
}
static MDEV_TYPE_ATTR_RO(name);
static ssize_t available_instances_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr,
char *buf)
{
return sprintf(buf, "%d\n",
atomic_read(&matrix_dev->available_instances));
}
static MDEV_TYPE_ATTR_RO(available_instances);
static ssize_t device_api_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
return sprintf(buf, "%s\n", VFIO_DEVICE_API_AP_STRING);
}
static MDEV_TYPE_ATTR_RO(device_api);
static struct attribute *vfio_ap_mdev_type_attrs[] = {
&mdev_type_attr_name.attr,
&mdev_type_attr_device_api.attr,
&mdev_type_attr_available_instances.attr,
NULL,
};
static struct attribute_group vfio_ap_mdev_hwvirt_type_group = {
.name = VFIO_AP_MDEV_TYPE_HWVIRT,
.attrs = vfio_ap_mdev_type_attrs,
};
static struct attribute_group *vfio_ap_mdev_type_groups[] = {
&vfio_ap_mdev_hwvirt_type_group,
NULL,
};
#define MDEV_SHARING_ERR "Userspace may not re-assign queue %02lx.%04lx " \
"already assigned to %s"
......@@ -1824,6 +1783,8 @@ static const struct attribute_group vfio_queue_attr_group = {
};
static const struct vfio_device_ops vfio_ap_matrix_dev_ops = {
.init = vfio_ap_mdev_init_dev,
.release = vfio_ap_mdev_release_dev,
.open_device = vfio_ap_mdev_open_device,
.close_device = vfio_ap_mdev_close_device,
.ioctl = vfio_ap_mdev_ioctl,
......@@ -1831,6 +1792,8 @@ static const struct vfio_device_ops vfio_ap_matrix_dev_ops = {
};
static struct mdev_driver vfio_ap_matrix_driver = {
.device_api = VFIO_DEVICE_API_AP_STRING,
.max_instances = MAX_ZDEV_ENTRIES_EXT,
.driver = {
.name = "vfio_ap_mdev",
.owner = THIS_MODULE,
......@@ -1839,20 +1802,22 @@ static struct mdev_driver vfio_ap_matrix_driver = {
},
.probe = vfio_ap_mdev_probe,
.remove = vfio_ap_mdev_remove,
.supported_type_groups = vfio_ap_mdev_type_groups,
};
int vfio_ap_mdev_register(void)
{
int ret;
atomic_set(&matrix_dev->available_instances, MAX_ZDEV_ENTRIES_EXT);
ret = mdev_register_driver(&vfio_ap_matrix_driver);
if (ret)
return ret;
ret = mdev_register_device(&matrix_dev->device, &vfio_ap_matrix_driver);
matrix_dev->mdev_type.sysfs_name = VFIO_AP_MDEV_TYPE_HWVIRT;
matrix_dev->mdev_type.pretty_name = VFIO_AP_MDEV_NAME_HWVIRT;
matrix_dev->mdev_types[0] = &matrix_dev->mdev_type;
ret = mdev_register_parent(&matrix_dev->parent, &matrix_dev->device,
&vfio_ap_matrix_driver,
matrix_dev->mdev_types, 1);
if (ret)
goto err_driver;
return 0;
......@@ -1864,7 +1829,7 @@ int vfio_ap_mdev_register(void)
void vfio_ap_mdev_unregister(void)
{
mdev_unregister_device(&matrix_dev->device);
mdev_unregister_parent(&matrix_dev->parent);
mdev_unregister_driver(&vfio_ap_matrix_driver);
}
......
......@@ -13,7 +13,6 @@
#define _VFIO_AP_PRIVATE_H_
#include <linux/types.h>
#include <linux/device.h>
#include <linux/mdev.h>
#include <linux/delay.h>
#include <linux/mutex.h>
......@@ -30,7 +29,6 @@
* struct ap_matrix_dev - Contains the data for the matrix device.
*
* @device: generic device structure associated with the AP matrix device
* @available_instances: number of mediated matrix devices that can be created
* @info: the struct containing the output from the PQAP(QCI) instruction
* @mdev_list: the list of mediated matrix devices created
* @mdevs_lock: mutex for locking the AP matrix device. This lock will be
......@@ -47,12 +45,14 @@
*/
struct ap_matrix_dev {
struct device device;
atomic_t available_instances;
struct ap_config_info info;
struct list_head mdev_list;
struct mutex mdevs_lock; /* serializes access to each ap_matrix_mdev */
struct ap_driver *vfio_ap_drv;
struct mutex guests_lock; /* serializes access to each KVM guest */
struct mdev_parent parent;
struct mdev_type mdev_type;
struct mdev_type *mdev_types[];
};
extern struct ap_matrix_dev *matrix_dev;
......
......@@ -3,6 +3,7 @@ menuconfig VFIO
tristate "VFIO Non-Privileged userspace driver framework"
select IOMMU_API
select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
select INTERVAL_TREE
help
VFIO provides a framework for secure userspace device drivers.
See Documentation/driver-api/vfio.rst for more details.
......
# SPDX-License-Identifier: GPL-2.0
vfio_virqfd-y := virqfd.o
vfio-y += vfio_main.o
obj-$(CONFIG_VFIO) += vfio.o
vfio-y += vfio_main.o \
iova_bitmap.o \
container.o
obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
......
This diff is collapsed.
......@@ -108,9 +108,9 @@ static void vfio_fsl_mc_close_device(struct vfio_device *core_vdev)
/* reset the device before cleaning up the interrupts */
ret = vfio_fsl_mc_reset_device(vdev);
if (WARN_ON(ret))
if (ret)
dev_warn(&mc_cont->dev,
"VFIO_FLS_MC: reset device has failed (%d)\n", ret);
"VFIO_FSL_MC: reset device has failed (%d)\n", ret);
vfio_fsl_mc_irqs_cleanup(vdev);
......@@ -418,16 +418,7 @@ static int vfio_fsl_mc_mmap(struct vfio_device *core_vdev,
return vfio_fsl_mc_mmap_mmio(vdev->regions[index], vma);
}
static const struct vfio_device_ops vfio_fsl_mc_ops = {
.name = "vfio-fsl-mc",
.open_device = vfio_fsl_mc_open_device,
.close_device = vfio_fsl_mc_close_device,
.ioctl = vfio_fsl_mc_ioctl,
.read = vfio_fsl_mc_read,
.write = vfio_fsl_mc_write,
.mmap = vfio_fsl_mc_mmap,
};
static const struct vfio_device_ops vfio_fsl_mc_ops;
static int vfio_fsl_mc_bus_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
......@@ -518,35 +509,43 @@ static void vfio_fsl_uninit_device(struct vfio_fsl_mc_device *vdev)
bus_unregister_notifier(&fsl_mc_bus_type, &vdev->nb);
}
static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
static int vfio_fsl_mc_init_dev(struct vfio_device *core_vdev)
{
struct vfio_fsl_mc_device *vdev;
struct device *dev = &mc_dev->dev;
struct vfio_fsl_mc_device *vdev =
container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
struct fsl_mc_device *mc_dev = to_fsl_mc_device(core_vdev->dev);
int ret;
vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
if (!vdev)
return -ENOMEM;
vfio_init_group_dev(&vdev->vdev, dev, &vfio_fsl_mc_ops);
vdev->mc_dev = mc_dev;
mutex_init(&vdev->igate);
if (is_fsl_mc_bus_dprc(mc_dev))
ret = vfio_assign_device_set(&vdev->vdev, &mc_dev->dev);
ret = vfio_assign_device_set(core_vdev, &mc_dev->dev);
else
ret = vfio_assign_device_set(&vdev->vdev, mc_dev->dev.parent);
if (ret)
goto out_uninit;
ret = vfio_assign_device_set(core_vdev, mc_dev->dev.parent);
ret = vfio_fsl_mc_init_device(vdev);
if (ret)
goto out_uninit;
return ret;
/* device_set is released by vfio core if @init fails */
return vfio_fsl_mc_init_device(vdev);
}
static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
{
struct vfio_fsl_mc_device *vdev;
struct device *dev = &mc_dev->dev;
int ret;
vdev = vfio_alloc_device(vfio_fsl_mc_device, vdev, dev,
&vfio_fsl_mc_ops);
if (IS_ERR(vdev))
return PTR_ERR(vdev);
ret = vfio_register_group_dev(&vdev->vdev);
if (ret) {
dev_err(dev, "VFIO_FSL_MC: Failed to add to vfio group\n");
goto out_device;
goto out_put_vdev;
}
ret = vfio_fsl_mc_scan_container(mc_dev);
......@@ -557,30 +556,44 @@ static int vfio_fsl_mc_probe(struct fsl_mc_device *mc_dev)
out_group_dev:
vfio_unregister_group_dev(&vdev->vdev);
out_device:
vfio_fsl_uninit_device(vdev);
out_uninit:
vfio_uninit_group_dev(&vdev->vdev);
kfree(vdev);
out_put_vdev:
vfio_put_device(&vdev->vdev);
return ret;
}
static void vfio_fsl_mc_release_dev(struct vfio_device *core_vdev)
{
struct vfio_fsl_mc_device *vdev =
container_of(core_vdev, struct vfio_fsl_mc_device, vdev);
vfio_fsl_uninit_device(vdev);
mutex_destroy(&vdev->igate);
vfio_free_device(core_vdev);
}
static int vfio_fsl_mc_remove(struct fsl_mc_device *mc_dev)
{
struct device *dev = &mc_dev->dev;
struct vfio_fsl_mc_device *vdev = dev_get_drvdata(dev);
vfio_unregister_group_dev(&vdev->vdev);
mutex_destroy(&vdev->igate);
dprc_remove_devices(mc_dev, NULL, 0);
vfio_fsl_uninit_device(vdev);
vfio_uninit_group_dev(&vdev->vdev);
kfree(vdev);
vfio_put_device(&vdev->vdev);
return 0;
}
static const struct vfio_device_ops vfio_fsl_mc_ops = {
.name = "vfio-fsl-mc",
.init = vfio_fsl_mc_init_dev,
.release = vfio_fsl_mc_release_dev,
.open_device = vfio_fsl_mc_open_device,
.close_device = vfio_fsl_mc_close_device,
.ioctl = vfio_fsl_mc_ioctl,
.read = vfio_fsl_mc_read,
.write = vfio_fsl_mc_write,
.mmap = vfio_fsl_mc_mmap,
};
static struct fsl_mc_driver vfio_fsl_mc_driver = {
.probe = vfio_fsl_mc_probe,
.remove = vfio_fsl_mc_remove,
......
This diff is collapsed.
......@@ -8,9 +8,7 @@
*/
#include <linux/module.h>
#include <linux/device.h>
#include <linux/slab.h>
#include <linux/uuid.h>
#include <linux/sysfs.h>
#include <linux/mdev.h>
......@@ -20,71 +18,11 @@
#define DRIVER_AUTHOR "NVIDIA Corporation"
#define DRIVER_DESC "Mediated device Core Driver"
static LIST_HEAD(parent_list);
static DEFINE_MUTEX(parent_list_lock);
static struct class_compat *mdev_bus_compat_class;
static LIST_HEAD(mdev_list);
static DEFINE_MUTEX(mdev_list_lock);
struct device *mdev_parent_dev(struct mdev_device *mdev)
{
return mdev->type->parent->dev;
}
EXPORT_SYMBOL(mdev_parent_dev);
/*
* Return the index in supported_type_groups that this mdev_device was created
* from.
*/
unsigned int mdev_get_type_group_id(struct mdev_device *mdev)
{
return mdev->type->type_group_id;
}
EXPORT_SYMBOL(mdev_get_type_group_id);
/*
* Used in mdev_type_attribute sysfs functions to return the index in the
* supported_type_groups that the sysfs is called from.
*/
unsigned int mtype_get_type_group_id(struct mdev_type *mtype)
{
return mtype->type_group_id;
}
EXPORT_SYMBOL(mtype_get_type_group_id);
/*
* Used in mdev_type_attribute sysfs functions to return the parent struct
* device
*/
struct device *mtype_get_parent_dev(struct mdev_type *mtype)
{
return mtype->parent->dev;
}
EXPORT_SYMBOL(mtype_get_parent_dev);
/* Should be called holding parent_list_lock */
static struct mdev_parent *__find_parent_device(struct device *dev)
{
struct mdev_parent *parent;
list_for_each_entry(parent, &parent_list, next) {
if (parent->dev == dev)
return parent;
}
return NULL;
}
void mdev_release_parent(struct kref *kref)
{
struct mdev_parent *parent = container_of(kref, struct mdev_parent,
ref);
struct device *dev = parent->dev;
kfree(parent);
put_device(dev);
}
/* Caller must hold parent unreg_sem read or write lock */
static void mdev_device_remove_common(struct mdev_device *mdev)
{
......@@ -99,145 +37,96 @@ static void mdev_device_remove_common(struct mdev_device *mdev)
static int mdev_device_remove_cb(struct device *dev, void *data)
{
struct mdev_device *mdev = mdev_from_dev(dev);
if (mdev)
mdev_device_remove_common(mdev);
if (dev->bus == &mdev_bus_type)
mdev_device_remove_common(to_mdev_device(dev));
return 0;
}
/*
* mdev_register_device : Register a device
* mdev_register_parent: Register a device as parent for mdevs
* @parent: parent structure registered
* @dev: device structure representing parent device.
* @mdev_driver: Device driver to bind to the newly created mdev
* @types: Array of supported mdev types
* @nr_types: Number of entries in @types
*
* Registers the @parent stucture as a parent for mdev types and thus mdev
* devices. The caller needs to hold a reference on @dev that must not be
* released until after the call to mdev_unregister_parent().
*
* Add device to list of registered parent devices.
* Returns a negative value on error, otherwise 0.
*/
int mdev_register_device(struct device *dev, struct mdev_driver *mdev_driver)
int mdev_register_parent(struct mdev_parent *parent, struct device *dev,
struct mdev_driver *mdev_driver, struct mdev_type **types,
unsigned int nr_types)
{
int ret;
struct mdev_parent *parent;
char *env_string = "MDEV_STATE=registered";
char *envp[] = { env_string, NULL };
int ret;
/* check for mandatory ops */
if (!mdev_driver->supported_type_groups)
return -EINVAL;
dev = get_device(dev);
if (!dev)
return -EINVAL;
mutex_lock(&parent_list_lock);
/* Check for duplicate */
parent = __find_parent_device(dev);
if (parent) {
parent = NULL;
ret = -EEXIST;
goto add_dev_err;
}
parent = kzalloc(sizeof(*parent), GFP_KERNEL);
if (!parent) {
ret = -ENOMEM;
goto add_dev_err;
}
kref_init(&parent->ref);
memset(parent, 0, sizeof(*parent));
init_rwsem(&parent->unreg_sem);
parent->dev = dev;
parent->mdev_driver = mdev_driver;
parent->types = types;
parent->nr_types = nr_types;
atomic_set(&parent->available_instances, mdev_driver->max_instances);
if (!mdev_bus_compat_class) {
mdev_bus_compat_class = class_compat_register("mdev_bus");
if (!mdev_bus_compat_class) {
ret = -ENOMEM;
goto add_dev_err;
}
if (!mdev_bus_compat_class)
return -ENOMEM;
}
ret = parent_create_sysfs_files(parent);
if (ret)
goto add_dev_err;
return ret;
ret = class_compat_create_link(mdev_bus_compat_class, dev, NULL);
if (ret)
dev_warn(dev, "Failed to create compatibility class link\n");
list_add(&parent->next, &parent_list);
mutex_unlock(&parent_list_lock);
dev_info(dev, "MDEV: Registered\n");
kobject_uevent_env(&dev->kobj, KOBJ_CHANGE, envp);
return 0;
add_dev_err:
mutex_unlock(&parent_list_lock);
if (parent)
mdev_put_parent(parent);
else
put_device(dev);
return ret;
}
EXPORT_SYMBOL(mdev_register_device);
EXPORT_SYMBOL(mdev_register_parent);
/*
* mdev_unregister_device : Unregister a parent device
* @dev: device structure representing parent device.
*
* Remove device from list of registered parent devices. Give a chance to free
* existing mediated devices for given device.
* mdev_unregister_parent : Unregister a parent device
* @parent: parent structure to unregister
*/
void mdev_unregister_device(struct device *dev)
void mdev_unregister_parent(struct mdev_parent *parent)
{
struct mdev_parent *parent;
char *env_string = "MDEV_STATE=unregistered";
char *envp[] = { env_string, NULL };
mutex_lock(&parent_list_lock);
parent = __find_parent_device(dev);
if (!parent) {
mutex_unlock(&parent_list_lock);
return;
}
dev_info(dev, "MDEV: Unregistering\n");
list_del(&parent->next);
mutex_unlock(&parent_list_lock);
dev_info(parent->dev, "MDEV: Unregistering\n");
down_write(&parent->unreg_sem);
class_compat_remove_link(mdev_bus_compat_class, dev, NULL);
device_for_each_child(dev, NULL, mdev_device_remove_cb);
class_compat_remove_link(mdev_bus_compat_class, parent->dev, NULL);
device_for_each_child(parent->dev, NULL, mdev_device_remove_cb);
parent_remove_sysfs_files(parent);
up_write(&parent->unreg_sem);
mdev_put_parent(parent);
/* We still have the caller's reference to use for the uevent */
kobject_uevent_env(&dev->kobj, KOBJ_CHANGE, envp);
kobject_uevent_env(&parent->dev->kobj, KOBJ_CHANGE, envp);
}
EXPORT_SYMBOL(mdev_unregister_device);
EXPORT_SYMBOL(mdev_unregister_parent);
static void mdev_device_release(struct device *dev)
{
struct mdev_device *mdev = to_mdev_device(dev);
/* Pairs with the get in mdev_device_create() */
kobject_put(&mdev->type->kobj);
struct mdev_parent *parent = mdev->type->parent;
mutex_lock(&mdev_list_lock);
list_del(&mdev->next);
if (!parent->mdev_driver->get_available)
atomic_inc(&parent->available_instances);
mutex_unlock(&mdev_list_lock);
/* Pairs with the get in mdev_device_create() */
kobject_put(&mdev->type->kobj);
dev_dbg(&mdev->dev, "MDEV: destroying\n");
kfree(mdev);
}
......@@ -259,6 +148,18 @@ int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
}
}
if (!drv->get_available) {
/*
* Note: that non-atomic read and dec is fine here because
* all modifications are under mdev_list_lock.
*/
if (!atomic_read(&parent->available_instances)) {
mutex_unlock(&mdev_list_lock);
return -EUSERS;
}
atomic_dec(&parent->available_instances);
}
mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
if (!mdev) {
mutex_unlock(&mdev_list_lock);
......
......@@ -7,7 +7,6 @@
* Kirti Wankhede <kwankhede@nvidia.com>
*/
#include <linux/device.h>
#include <linux/iommu.h>
#include <linux/mdev.h>
......@@ -47,7 +46,6 @@ struct bus_type mdev_bus_type = {
.remove = mdev_remove,
.match = mdev_match,
};
EXPORT_SYMBOL_GPL(mdev_bus_type);
/**
* mdev_register_driver - register a new MDEV driver
......@@ -57,10 +55,11 @@ EXPORT_SYMBOL_GPL(mdev_bus_type);
**/
int mdev_register_driver(struct mdev_driver *drv)
{
if (!drv->device_api)
return -EINVAL;
/* initialize common driver fields */
drv->driver.bus = &mdev_bus_type;
/* register with core */
return driver_register(&drv->driver);
}
EXPORT_SYMBOL(mdev_register_driver);
......
......@@ -13,25 +13,7 @@
int mdev_bus_register(void);
void mdev_bus_unregister(void);
struct mdev_parent {
struct device *dev;
struct mdev_driver *mdev_driver;
struct kref ref;
struct list_head next;
struct kset *mdev_types_kset;
struct list_head type_list;
/* Synchronize device creation/removal with parent unregistration */
struct rw_semaphore unreg_sem;
};
struct mdev_type {
struct kobject kobj;
struct kobject *devices_kobj;
struct mdev_parent *parent;
struct list_head next;
unsigned int type_group_id;
};
extern struct bus_type mdev_bus_type;
extern const struct attribute_group *mdev_device_groups[];
#define to_mdev_type_attr(_attr) \
......@@ -48,16 +30,4 @@ void mdev_remove_sysfs_files(struct mdev_device *mdev);
int mdev_device_create(struct mdev_type *kobj, const guid_t *uuid);
int mdev_device_remove(struct mdev_device *dev);
void mdev_release_parent(struct kref *kref);
static inline void mdev_get_parent(struct mdev_parent *parent)
{
kref_get(&parent->ref);
}
static inline void mdev_put_parent(struct mdev_parent *parent)
{
kref_put(&parent->ref, mdev_release_parent);
}
#endif /* MDEV_PRIVATE_H */
......@@ -9,14 +9,24 @@
#include <linux/sysfs.h>
#include <linux/ctype.h>
#include <linux/device.h>
#include <linux/slab.h>
#include <linux/uuid.h>
#include <linux/mdev.h>
#include "mdev_private.h"
/* Static functions */
struct mdev_type_attribute {
struct attribute attr;
ssize_t (*show)(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf);
ssize_t (*store)(struct mdev_type *mtype,
struct mdev_type_attribute *attr, const char *buf,
size_t count);
};
#define MDEV_TYPE_ATTR_RO(_name) \
struct mdev_type_attribute mdev_type_attr_##_name = __ATTR_RO(_name)
#define MDEV_TYPE_ATTR_WO(_name) \
struct mdev_type_attribute mdev_type_attr_##_name = __ATTR_WO(_name)
static ssize_t mdev_type_attr_show(struct kobject *kobj,
struct attribute *__attr, char *buf)
......@@ -74,152 +84,156 @@ static ssize_t create_store(struct mdev_type *mtype,
return count;
}
static MDEV_TYPE_ATTR_WO(create);
static ssize_t device_api_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
return sysfs_emit(buf, "%s\n", mtype->parent->mdev_driver->device_api);
}
static MDEV_TYPE_ATTR_RO(device_api);
static ssize_t name_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr, char *buf)
{
return sprintf(buf, "%s\n",
mtype->pretty_name ? mtype->pretty_name : mtype->sysfs_name);
}
static MDEV_TYPE_ATTR_RO(name);
static ssize_t available_instances_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr,
char *buf)
{
struct mdev_driver *drv = mtype->parent->mdev_driver;
if (drv->get_available)
return sysfs_emit(buf, "%u\n", drv->get_available(mtype));
return sysfs_emit(buf, "%u\n",
atomic_read(&mtype->parent->available_instances));
}
static MDEV_TYPE_ATTR_RO(available_instances);
static ssize_t description_show(struct mdev_type *mtype,
struct mdev_type_attribute *attr,
char *buf)
{
return mtype->parent->mdev_driver->show_description(mtype, buf);
}
static MDEV_TYPE_ATTR_RO(description);
static struct attribute *mdev_types_core_attrs[] = {
&mdev_type_attr_create.attr,
&mdev_type_attr_device_api.attr,
&mdev_type_attr_name.attr,
&mdev_type_attr_available_instances.attr,
&mdev_type_attr_description.attr,
NULL,
};
static umode_t mdev_types_core_is_visible(struct kobject *kobj,
struct attribute *attr, int n)
{
if (attr == &mdev_type_attr_description.attr &&
!to_mdev_type(kobj)->parent->mdev_driver->show_description)
return 0;
return attr->mode;
}
static struct attribute_group mdev_type_core_group = {
.attrs = mdev_types_core_attrs,
.is_visible = mdev_types_core_is_visible,
};
static const struct attribute_group *mdev_type_groups[] = {
&mdev_type_core_group,
NULL,
};
static void mdev_type_release(struct kobject *kobj)
{
struct mdev_type *type = to_mdev_type(kobj);
pr_debug("Releasing group %s\n", kobj->name);
/* Pairs with the get in add_mdev_supported_type() */
mdev_put_parent(type->parent);
kfree(type);
put_device(type->parent->dev);
}
static struct kobj_type mdev_type_ktype = {
.sysfs_ops = &mdev_type_sysfs_ops,
.release = mdev_type_release,
.sysfs_ops = &mdev_type_sysfs_ops,
.release = mdev_type_release,
.default_groups = mdev_type_groups,
};
static struct mdev_type *add_mdev_supported_type(struct mdev_parent *parent,
unsigned int type_group_id)
static int mdev_type_add(struct mdev_parent *parent, struct mdev_type *type)
{
struct mdev_type *type;
struct attribute_group *group =
parent->mdev_driver->supported_type_groups[type_group_id];
int ret;
if (!group->name) {
pr_err("%s: Type name empty!\n", __func__);
return ERR_PTR(-EINVAL);
}
type = kzalloc(sizeof(*type), GFP_KERNEL);
if (!type)
return ERR_PTR(-ENOMEM);
type->kobj.kset = parent->mdev_types_kset;
type->parent = parent;
/* Pairs with the put in mdev_type_release() */
mdev_get_parent(parent);
type->type_group_id = type_group_id;
get_device(parent->dev);
ret = kobject_init_and_add(&type->kobj, &mdev_type_ktype, NULL,
"%s-%s", dev_driver_string(parent->dev),
group->name);
type->sysfs_name);
if (ret) {
kobject_put(&type->kobj);
return ERR_PTR(ret);
return ret;
}
ret = sysfs_create_file(&type->kobj, &mdev_type_attr_create.attr);
if (ret)
goto attr_create_failed;
type->devices_kobj = kobject_create_and_add("devices", &type->kobj);
if (!type->devices_kobj) {
ret = -ENOMEM;
goto attr_devices_failed;
}
ret = sysfs_create_files(&type->kobj,
(const struct attribute **)group->attrs);
if (ret) {
ret = -ENOMEM;
goto attrs_failed;
}
return type;
return 0;
attrs_failed:
kobject_put(type->devices_kobj);
attr_devices_failed:
sysfs_remove_file(&type->kobj, &mdev_type_attr_create.attr);
attr_create_failed:
kobject_del(&type->kobj);
kobject_put(&type->kobj);
return ERR_PTR(ret);
return ret;
}
static void remove_mdev_supported_type(struct mdev_type *type)
static void mdev_type_remove(struct mdev_type *type)
{
struct attribute_group *group =
type->parent->mdev_driver->supported_type_groups[type->type_group_id];
sysfs_remove_files(&type->kobj,
(const struct attribute **)group->attrs);
kobject_put(type->devices_kobj);
sysfs_remove_file(&type->kobj, &mdev_type_attr_create.attr);
kobject_del(&type->kobj);
kobject_put(&type->kobj);
}
static int add_mdev_supported_type_groups(struct mdev_parent *parent)
{
int i;
for (i = 0; parent->mdev_driver->supported_type_groups[i]; i++) {
struct mdev_type *type;
type = add_mdev_supported_type(parent, i);
if (IS_ERR(type)) {
struct mdev_type *ltype, *tmp;
list_for_each_entry_safe(ltype, tmp, &parent->type_list,
next) {
list_del(&ltype->next);
remove_mdev_supported_type(ltype);
}
return PTR_ERR(type);
}
list_add(&type->next, &parent->type_list);
}
return 0;
}
/* mdev sysfs functions */
void parent_remove_sysfs_files(struct mdev_parent *parent)
{
struct mdev_type *type, *tmp;
list_for_each_entry_safe(type, tmp, &parent->type_list, next) {
list_del(&type->next);
remove_mdev_supported_type(type);
}
int i;
for (i = 0; i < parent->nr_types; i++)
mdev_type_remove(parent->types[i]);
kset_unregister(parent->mdev_types_kset);
}
int parent_create_sysfs_files(struct mdev_parent *parent)
{
int ret;
int ret, i;
parent->mdev_types_kset = kset_create_and_add("mdev_supported_types",
NULL, &parent->dev->kobj);
if (!parent->mdev_types_kset)
return -ENOMEM;
INIT_LIST_HEAD(&parent->type_list);
ret = add_mdev_supported_type_groups(parent);
if (ret)
goto create_err;
for (i = 0; i < parent->nr_types; i++) {
ret = mdev_type_add(parent, parent->types[i]);
if (ret)
goto out_err;
}
return 0;
create_err:
kset_unregister(parent->mdev_types_kset);
return ret;
out_err:
while (--i >= 0)
mdev_type_remove(parent->types[i]);
return 0;
}
static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
......
......@@ -16,7 +16,7 @@
#include "hisi_acc_vfio_pci.h"
/* return 0 on VM acc device ready, -ETIMEDOUT hardware timeout */
/* Return 0 on VM acc device ready, -ETIMEDOUT hardware timeout */
static int qm_wait_dev_not_ready(struct hisi_qm *qm)
{
u32 val;
......@@ -189,7 +189,7 @@ static int qm_set_regs(struct hisi_qm *qm, struct acc_vf_data *vf_data)
struct device *dev = &qm->pdev->dev;
int ret;
/* check VF state */
/* Check VF state */
if (unlikely(hisi_qm_wait_mb_ready(qm))) {
dev_err(&qm->pdev->dev, "QM device is not ready to write\n");
return -EBUSY;
......@@ -337,16 +337,7 @@ static int vf_qm_cache_wb(struct hisi_qm *qm)
return 0;
}
static struct hisi_acc_vf_core_device *hssi_acc_drvdata(struct pci_dev *pdev)
{
struct vfio_pci_core_device *core_device = dev_get_drvdata(&pdev->dev);
return container_of(core_device, struct hisi_acc_vf_core_device,
core_device);
}
static void vf_qm_fun_reset(struct hisi_acc_vf_core_device *hisi_acc_vdev,
struct hisi_qm *qm)
static void vf_qm_fun_reset(struct hisi_qm *qm)
{
int i;
......@@ -382,7 +373,7 @@ static int vf_qm_check_match(struct hisi_acc_vf_core_device *hisi_acc_vdev,
return -EINVAL;
}
/* vf qp num check */
/* VF qp num check */
ret = qm_get_vft(vf_qm, &vf_qm->qp_base);
if (ret <= 0) {
dev_err(dev, "failed to get vft qp nums\n");
......@@ -396,7 +387,7 @@ static int vf_qm_check_match(struct hisi_acc_vf_core_device *hisi_acc_vdev,
vf_qm->qp_num = ret;
/* vf isolation state check */
/* VF isolation state check */
ret = qm_read_regs(pf_qm, QM_QUE_ISO_CFG_V, &que_iso_state, 1);
if (ret) {
dev_err(dev, "failed to read QM_QUE_ISO_CFG_V\n");
......@@ -405,7 +396,7 @@ static int vf_qm_check_match(struct hisi_acc_vf_core_device *hisi_acc_vdev,
if (vf_data->que_iso_cfg != que_iso_state) {
dev_err(dev, "failed to match isolation state\n");
return ret;
return -EINVAL;
}
ret = qm_write_regs(vf_qm, QM_VF_STATE, &vf_data->vf_qm_state, 1);
......@@ -427,10 +418,10 @@ static int vf_qm_get_match_data(struct hisi_acc_vf_core_device *hisi_acc_vdev,
int ret;
vf_data->acc_magic = ACC_DEV_MAGIC;
/* save device id */
/* Save device id */
vf_data->dev_id = hisi_acc_vdev->vf_dev->device;
/* vf qp num save from PF */
/* VF qp num save from PF */
ret = pf_qm_get_qp_num(pf_qm, vf_id, &vf_data->qp_base);
if (ret <= 0) {
dev_err(dev, "failed to get vft qp nums!\n");
......@@ -474,19 +465,19 @@ static int vf_qm_load_data(struct hisi_acc_vf_core_device *hisi_acc_vdev,
ret = qm_set_regs(qm, vf_data);
if (ret) {
dev_err(dev, "Set VF regs failed\n");
dev_err(dev, "set VF regs failed\n");
return ret;
}
ret = hisi_qm_mb(qm, QM_MB_CMD_SQC_BT, qm->sqc_dma, 0, 0);
if (ret) {
dev_err(dev, "Set sqc failed\n");
dev_err(dev, "set sqc failed\n");
return ret;
}
ret = hisi_qm_mb(qm, QM_MB_CMD_CQC_BT, qm->cqc_dma, 0, 0);
if (ret) {
dev_err(dev, "Set cqc failed\n");
dev_err(dev, "set cqc failed\n");
return ret;
}
......@@ -528,12 +519,12 @@ static int vf_qm_state_save(struct hisi_acc_vf_core_device *hisi_acc_vdev,
return -EINVAL;
/* Every reg is 32 bit, the dma address is 64 bit. */
vf_data->eqe_dma = vf_data->qm_eqc_dw[2];
vf_data->eqe_dma = vf_data->qm_eqc_dw[1];
vf_data->eqe_dma <<= QM_XQC_ADDR_OFFSET;
vf_data->eqe_dma |= vf_data->qm_eqc_dw[1];
vf_data->aeqe_dma = vf_data->qm_aeqc_dw[2];
vf_data->eqe_dma |= vf_data->qm_eqc_dw[0];
vf_data->aeqe_dma = vf_data->qm_aeqc_dw[1];
vf_data->aeqe_dma <<= QM_XQC_ADDR_OFFSET;
vf_data->aeqe_dma |= vf_data->qm_aeqc_dw[1];
vf_data->aeqe_dma |= vf_data->qm_aeqc_dw[0];
/* Through SQC_BT/CQC_BT to get sqc and cqc address */
ret = qm_get_sqc(vf_qm, &vf_data->sqc_dma);
......@@ -552,6 +543,14 @@ static int vf_qm_state_save(struct hisi_acc_vf_core_device *hisi_acc_vdev,
return 0;
}
static struct hisi_acc_vf_core_device *hisi_acc_drvdata(struct pci_dev *pdev)
{
struct vfio_pci_core_device *core_device = dev_get_drvdata(&pdev->dev);
return container_of(core_device, struct hisi_acc_vf_core_device,
core_device);
}
/* Check the PF's RAS state and Function INT state */
static int
hisi_acc_check_int_state(struct hisi_acc_vf_core_device *hisi_acc_vdev)
......@@ -662,7 +661,10 @@ static void hisi_acc_vf_start_device(struct hisi_acc_vf_core_device *hisi_acc_vd
if (hisi_acc_vdev->vf_qm_state != QM_READY)
return;
vf_qm_fun_reset(hisi_acc_vdev, vf_qm);
/* Make sure the device is enabled */
qm_dev_cmd_init(vf_qm);
vf_qm_fun_reset(vf_qm);
}
static int hisi_acc_vf_load_state(struct hisi_acc_vf_core_device *hisi_acc_vdev)
......@@ -970,7 +972,7 @@ hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
static void hisi_acc_vf_pci_aer_reset_done(struct pci_dev *pdev)
{
struct hisi_acc_vf_core_device *hisi_acc_vdev = hssi_acc_drvdata(pdev);
struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_drvdata(pdev);
if (hisi_acc_vdev->core_device.vdev.migration_flags !=
VFIO_MIGRATION_STOP_COPY)
......@@ -1213,8 +1215,28 @@ static const struct vfio_migration_ops hisi_acc_vfio_pci_migrn_state_ops = {
.migration_get_state = hisi_acc_vfio_pci_get_device_state,
};
static int hisi_acc_vfio_pci_migrn_init_dev(struct vfio_device *core_vdev)
{
struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(core_vdev,
struct hisi_acc_vf_core_device, core_device.vdev);
struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
struct hisi_qm *pf_qm = hisi_acc_get_pf_qm(pdev);
hisi_acc_vdev->vf_id = pci_iov_vf_id(pdev) + 1;
hisi_acc_vdev->pf_qm = pf_qm;
hisi_acc_vdev->vf_dev = pdev;
mutex_init(&hisi_acc_vdev->state_mutex);
core_vdev->migration_flags = VFIO_MIGRATION_STOP_COPY;
core_vdev->mig_ops = &hisi_acc_vfio_pci_migrn_state_ops;
return vfio_pci_core_init_dev(core_vdev);
}
static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
.name = "hisi-acc-vfio-pci-migration",
.init = hisi_acc_vfio_pci_migrn_init_dev,
.release = vfio_pci_core_release_dev,
.open_device = hisi_acc_vfio_pci_open_device,
.close_device = hisi_acc_vfio_pci_close_device,
.ioctl = hisi_acc_vfio_pci_ioctl,
......@@ -1228,6 +1250,8 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = {
static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
.name = "hisi-acc-vfio-pci",
.init = vfio_pci_core_init_dev,
.release = vfio_pci_core_release_dev,
.open_device = hisi_acc_vfio_pci_open_device,
.close_device = vfio_pci_core_close_device,
.ioctl = vfio_pci_core_ioctl,
......@@ -1239,73 +1263,45 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = {
.match = vfio_pci_core_match,
};
static int
hisi_acc_vfio_pci_migrn_init(struct hisi_acc_vf_core_device *hisi_acc_vdev,
struct pci_dev *pdev, struct hisi_qm *pf_qm)
{
int vf_id;
vf_id = pci_iov_vf_id(pdev);
if (vf_id < 0)
return vf_id;
hisi_acc_vdev->vf_id = vf_id + 1;
hisi_acc_vdev->core_device.vdev.migration_flags =
VFIO_MIGRATION_STOP_COPY;
hisi_acc_vdev->pf_qm = pf_qm;
hisi_acc_vdev->vf_dev = pdev;
mutex_init(&hisi_acc_vdev->state_mutex);
return 0;
}
static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct hisi_acc_vf_core_device *hisi_acc_vdev;
const struct vfio_device_ops *ops = &hisi_acc_vfio_pci_ops;
struct hisi_qm *pf_qm;
int vf_id;
int ret;
hisi_acc_vdev = kzalloc(sizeof(*hisi_acc_vdev), GFP_KERNEL);
if (!hisi_acc_vdev)
return -ENOMEM;
pf_qm = hisi_acc_get_pf_qm(pdev);
if (pf_qm && pf_qm->ver >= QM_HW_V3) {
ret = hisi_acc_vfio_pci_migrn_init(hisi_acc_vdev, pdev, pf_qm);
if (!ret) {
vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
&hisi_acc_vfio_pci_migrn_ops);
hisi_acc_vdev->core_device.vdev.mig_ops =
&hisi_acc_vfio_pci_migrn_state_ops;
} else {
vf_id = pci_iov_vf_id(pdev);
if (vf_id >= 0)
ops = &hisi_acc_vfio_pci_migrn_ops;
else
pci_warn(pdev, "migration support failed, continue with generic interface\n");
vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
&hisi_acc_vfio_pci_ops);
}
} else {
vfio_pci_core_init_device(&hisi_acc_vdev->core_device, pdev,
&hisi_acc_vfio_pci_ops);
}
hisi_acc_vdev = vfio_alloc_device(hisi_acc_vf_core_device,
core_device.vdev, &pdev->dev, ops);
if (IS_ERR(hisi_acc_vdev))
return PTR_ERR(hisi_acc_vdev);
dev_set_drvdata(&pdev->dev, &hisi_acc_vdev->core_device);
ret = vfio_pci_core_register_device(&hisi_acc_vdev->core_device);
if (ret)
goto out_free;
goto out_put_vdev;
return 0;
out_free:
vfio_pci_core_uninit_device(&hisi_acc_vdev->core_device);
kfree(hisi_acc_vdev);
out_put_vdev:
vfio_put_device(&hisi_acc_vdev->core_device.vdev);
return ret;
}
static void hisi_acc_vfio_pci_remove(struct pci_dev *pdev)
{
struct hisi_acc_vf_core_device *hisi_acc_vdev = hssi_acc_drvdata(pdev);
struct hisi_acc_vf_core_device *hisi_acc_vdev = hisi_acc_drvdata(pdev);
vfio_pci_core_unregister_device(&hisi_acc_vdev->core_device);
vfio_pci_core_uninit_device(&hisi_acc_vdev->core_device);
kfree(hisi_acc_vdev);
vfio_put_device(&hisi_acc_vdev->core_device.vdev);
}
static const struct pci_device_id hisi_acc_vfio_pci_table[] = {
......
......@@ -16,7 +16,6 @@
#define SEC_CORE_INT_STATUS 0x301008
#define HPRE_HAC_INT_STATUS 0x301800
#define HZIP_CORE_INT_STATUS 0x3010AC
#define QM_QUE_ISO_CFG 0x301154
#define QM_VFT_CFG_RDY 0x10006c
#define QM_VFT_CFG_OP_WR 0x100058
......@@ -80,7 +79,7 @@ struct acc_vf_data {
/* QM reserved 5 regs */
u32 qm_rsv_regs[5];
u32 padding;
/* qm memory init information */
/* QM memory init information */
u64 eqe_dma;
u64 aeqe_dma;
u64 sqc_dma;
......@@ -99,7 +98,7 @@ struct hisi_acc_vf_migration_file {
struct hisi_acc_vf_core_device {
struct vfio_pci_core_device core_device;
u8 deferred_reset:1;
/* for migration state */
/* For migration state */
struct mutex state_mutex;
enum vfio_device_mig_state mig_state;
struct pci_dev *pf_dev;
......@@ -108,7 +107,7 @@ struct hisi_acc_vf_core_device {
struct hisi_qm vf_qm;
u32 vf_qm_state;
int vf_id;
/* for reset handler */
/* For reset handler */
spinlock_t reset_lock;
struct hisi_acc_vf_migration_file *resuming_migf;
struct hisi_acc_vf_migration_file *saving_migf;
......
This diff is collapsed.
......@@ -9,6 +9,8 @@
#include <linux/kernel.h>
#include <linux/vfio_pci_core.h>
#include <linux/mlx5/driver.h>
#include <linux/mlx5/cq.h>
#include <linux/mlx5/qp.h>
struct mlx5vf_async_data {
struct mlx5_async_work cb_work;
......@@ -39,6 +41,56 @@ struct mlx5_vf_migration_file {
struct mlx5vf_async_data async_data;
};
struct mlx5_vhca_cq_buf {
struct mlx5_frag_buf_ctrl fbc;
struct mlx5_frag_buf frag_buf;
int cqe_size;
int nent;
};
struct mlx5_vhca_cq {
struct mlx5_vhca_cq_buf buf;
struct mlx5_db db;
struct mlx5_core_cq mcq;
size_t ncqe;
};
struct mlx5_vhca_recv_buf {
u32 npages;
struct page **page_list;
dma_addr_t *dma_addrs;
u32 next_rq_offset;
u32 mkey;
};
struct mlx5_vhca_qp {
struct mlx5_frag_buf buf;
struct mlx5_db db;
struct mlx5_vhca_recv_buf recv_buf;
u32 tracked_page_size;
u32 max_msg_size;
u32 qpn;
struct {
unsigned int pc;
unsigned int cc;
unsigned int wqe_cnt;
__be32 *db;
struct mlx5_frag_buf_ctrl fbc;
} rq;
};
struct mlx5_vhca_page_tracker {
u32 id;
u32 pdn;
u8 is_err:1;
struct mlx5_uars_page *uar;
struct mlx5_vhca_cq cq;
struct mlx5_vhca_qp *host_qp;
struct mlx5_vhca_qp *fw_qp;
struct mlx5_nb nb;
int status;
};
struct mlx5vf_pci_core_device {
struct vfio_pci_core_device core_device;
int vf_id;
......@@ -46,6 +98,8 @@ struct mlx5vf_pci_core_device {
u8 migrate_cap:1;
u8 deferred_reset:1;
u8 mdev_detach:1;
u8 log_active:1;
struct completion tracker_comp;
/* protect migration state */
struct mutex state_mutex;
enum vfio_device_mig_state mig_state;
......@@ -53,6 +107,7 @@ struct mlx5vf_pci_core_device {
spinlock_t reset_lock;
struct mlx5_vf_migration_file *resuming_migf;
struct mlx5_vf_migration_file *saving_migf;
struct mlx5_vhca_page_tracker tracker;
struct workqueue_struct *cb_wq;
struct notifier_block nb;
struct mlx5_core_dev *mdev;
......@@ -63,7 +118,8 @@ int mlx5vf_cmd_resume_vhca(struct mlx5vf_pci_core_device *mvdev, u16 op_mod);
int mlx5vf_cmd_query_vhca_migration_state(struct mlx5vf_pci_core_device *mvdev,
size_t *state_size);
void mlx5vf_cmd_set_migratable(struct mlx5vf_pci_core_device *mvdev,
const struct vfio_migration_ops *mig_ops);
const struct vfio_migration_ops *mig_ops,
const struct vfio_log_ops *log_ops);
void mlx5vf_cmd_remove_migratable(struct mlx5vf_pci_core_device *mvdev);
void mlx5vf_cmd_close_migratable(struct mlx5vf_pci_core_device *mvdev);
int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev,
......@@ -73,4 +129,9 @@ int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev,
void mlx5vf_state_mutex_unlock(struct mlx5vf_pci_core_device *mvdev);
void mlx5vf_disable_fds(struct mlx5vf_pci_core_device *mvdev);
void mlx5vf_mig_file_cleanup_cb(struct work_struct *_work);
int mlx5vf_start_page_tracker(struct vfio_device *vdev,
struct rb_root_cached *ranges, u32 nnodes, u64 *page_size);
int mlx5vf_stop_page_tracker(struct vfio_device *vdev);
int mlx5vf_tracker_read_and_clear(struct vfio_device *vdev, unsigned long iova,
unsigned long length, struct iova_bitmap *dirty);
#endif /* MLX5_VFIO_CMD_H */
......@@ -579,8 +579,41 @@ static const struct vfio_migration_ops mlx5vf_pci_mig_ops = {
.migration_get_state = mlx5vf_pci_get_device_state,
};
static const struct vfio_log_ops mlx5vf_pci_log_ops = {
.log_start = mlx5vf_start_page_tracker,
.log_stop = mlx5vf_stop_page_tracker,
.log_read_and_clear = mlx5vf_tracker_read_and_clear,
};
static int mlx5vf_pci_init_dev(struct vfio_device *core_vdev)
{
struct mlx5vf_pci_core_device *mvdev = container_of(core_vdev,
struct mlx5vf_pci_core_device, core_device.vdev);
int ret;
ret = vfio_pci_core_init_dev(core_vdev);
if (ret)
return ret;
mlx5vf_cmd_set_migratable(mvdev, &mlx5vf_pci_mig_ops,
&mlx5vf_pci_log_ops);
return 0;
}
static void mlx5vf_pci_release_dev(struct vfio_device *core_vdev)
{
struct mlx5vf_pci_core_device *mvdev = container_of(core_vdev,
struct mlx5vf_pci_core_device, core_device.vdev);
mlx5vf_cmd_remove_migratable(mvdev);
vfio_pci_core_release_dev(core_vdev);
}
static const struct vfio_device_ops mlx5vf_pci_ops = {
.name = "mlx5-vfio-pci",
.init = mlx5vf_pci_init_dev,
.release = mlx5vf_pci_release_dev,
.open_device = mlx5vf_pci_open_device,
.close_device = mlx5vf_pci_close_device,
.ioctl = vfio_pci_core_ioctl,
......@@ -598,21 +631,19 @@ static int mlx5vf_pci_probe(struct pci_dev *pdev,
struct mlx5vf_pci_core_device *mvdev;
int ret;
mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
if (!mvdev)
return -ENOMEM;
vfio_pci_core_init_device(&mvdev->core_device, pdev, &mlx5vf_pci_ops);
mlx5vf_cmd_set_migratable(mvdev, &mlx5vf_pci_mig_ops);
mvdev = vfio_alloc_device(mlx5vf_pci_core_device, core_device.vdev,
&pdev->dev, &mlx5vf_pci_ops);
if (IS_ERR(mvdev))
return PTR_ERR(mvdev);
dev_set_drvdata(&pdev->dev, &mvdev->core_device);
ret = vfio_pci_core_register_device(&mvdev->core_device);
if (ret)
goto out_free;
goto out_put_vdev;
return 0;
out_free:
mlx5vf_cmd_remove_migratable(mvdev);
vfio_pci_core_uninit_device(&mvdev->core_device);
kfree(mvdev);
out_put_vdev:
vfio_put_device(&mvdev->core_device.vdev);
return ret;
}
......@@ -621,9 +652,7 @@ static void mlx5vf_pci_remove(struct pci_dev *pdev)
struct mlx5vf_pci_core_device *mvdev = mlx5vf_drvdata(pdev);
vfio_pci_core_unregister_device(&mvdev->core_device);
mlx5vf_cmd_remove_migratable(mvdev);
vfio_pci_core_uninit_device(&mvdev->core_device);
kfree(mvdev);
vfio_put_device(&mvdev->core_device.vdev);
}
static const struct pci_device_id mlx5vf_pci_table[] = {
......
......@@ -25,7 +25,7 @@
#include <linux/types.h>
#include <linux/uaccess.h>
#include <linux/vfio_pci_core.h>
#include "vfio_pci_priv.h"
#define DRIVER_AUTHOR "Alex Williamson <alex.williamson@redhat.com>"
#define DRIVER_DESC "VFIO PCI - User Level meta-driver"
......@@ -127,6 +127,8 @@ static int vfio_pci_open_device(struct vfio_device *core_vdev)
static const struct vfio_device_ops vfio_pci_ops = {
.name = "vfio-pci",
.init = vfio_pci_core_init_dev,
.release = vfio_pci_core_release_dev,
.open_device = vfio_pci_open_device,
.close_device = vfio_pci_core_close_device,
.ioctl = vfio_pci_core_ioctl,
......@@ -146,20 +148,19 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (vfio_pci_is_denylisted(pdev))
return -EINVAL;
vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
if (!vdev)
return -ENOMEM;
vfio_pci_core_init_device(vdev, pdev, &vfio_pci_ops);
vdev = vfio_alloc_device(vfio_pci_core_device, vdev, &pdev->dev,
&vfio_pci_ops);
if (IS_ERR(vdev))
return PTR_ERR(vdev);
dev_set_drvdata(&pdev->dev, vdev);
ret = vfio_pci_core_register_device(vdev);
if (ret)
goto out_free;
goto out_put_vdev;
return 0;
out_free:
vfio_pci_core_uninit_device(vdev);
kfree(vdev);
out_put_vdev:
vfio_put_device(&vdev->vdev);
return ret;
}
......@@ -168,8 +169,7 @@ static void vfio_pci_remove(struct pci_dev *pdev)
struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);
vfio_pci_core_unregister_device(vdev);
vfio_pci_core_uninit_device(vdev);
kfree(vdev);
vfio_put_device(&vdev->vdev);
}
static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
......
......@@ -26,7 +26,7 @@
#include <linux/vfio.h>
#include <linux/slab.h>
#include <linux/vfio_pci_core.h>
#include "vfio_pci_priv.h"
/* Fake capability ID for standard config space */
#define PCI_CAP_ID_BASIC 0
......@@ -1166,7 +1166,7 @@ static int vfio_msi_config_write(struct vfio_pci_core_device *vdev, int pos,
flags = le16_to_cpu(*pflags);
/* MSI is enabled via ioctl */
if (!is_msi(vdev))
if (vdev->irq_type != VFIO_PCI_MSI_IRQ_INDEX)
flags &= ~PCI_MSI_FLAGS_ENABLE;
/* Check queue size */
......
This diff is collapsed.
......@@ -15,7 +15,7 @@
#include <linux/uaccess.h>
#include <linux/vfio.h>
#include <linux/vfio_pci_core.h>
#include "vfio_pci_priv.h"
#define OPREGION_SIGNATURE "IntelGraphicsMem"
#define OPREGION_SIZE (8 * 1024)
......@@ -257,7 +257,7 @@ static int vfio_pci_igd_opregion_init(struct vfio_pci_core_device *vdev)
}
}
ret = vfio_pci_register_dev_region(vdev,
ret = vfio_pci_core_register_dev_region(vdev,
PCI_VENDOR_ID_INTEL | VFIO_REGION_TYPE_PCI_VENDOR_TYPE,
VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION, &vfio_pci_igd_regops,
size, VFIO_REGION_INFO_FLAG_READ, opregionvbt);
......@@ -402,7 +402,7 @@ static int vfio_pci_igd_cfg_init(struct vfio_pci_core_device *vdev)
return -EINVAL;
}
ret = vfio_pci_register_dev_region(vdev,
ret = vfio_pci_core_register_dev_region(vdev,
PCI_VENDOR_ID_INTEL | VFIO_REGION_TYPE_PCI_VENDOR_TYPE,
VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG,
&vfio_pci_igd_cfg_regops, host_bridge->cfg_size,
......@@ -422,7 +422,7 @@ static int vfio_pci_igd_cfg_init(struct vfio_pci_core_device *vdev)
return -EINVAL;
}
ret = vfio_pci_register_dev_region(vdev,
ret = vfio_pci_core_register_dev_region(vdev,
PCI_VENDOR_ID_INTEL | VFIO_REGION_TYPE_PCI_VENDOR_TYPE,
VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG,
&vfio_pci_igd_cfg_regops, lpc_bridge->cfg_size,
......
......@@ -20,7 +20,33 @@
#include <linux/wait.h>
#include <linux/slab.h>
#include <linux/vfio_pci_core.h>
#include "vfio_pci_priv.h"
struct vfio_pci_irq_ctx {
struct eventfd_ctx *trigger;
struct virqfd *unmask;
struct virqfd *mask;
char *name;
bool masked;
struct irq_bypass_producer producer;
};
static bool irq_is(struct vfio_pci_core_device *vdev, int type)
{
return vdev->irq_type == type;
}
static bool is_intx(struct vfio_pci_core_device *vdev)
{
return vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX;
}
static bool is_irq_none(struct vfio_pci_core_device *vdev)
{
return !(vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX ||
vdev->irq_type == VFIO_PCI_MSI_IRQ_INDEX ||
vdev->irq_type == VFIO_PCI_MSIX_IRQ_INDEX);
}
/*
* INTx
......@@ -33,10 +59,12 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused)
eventfd_signal(vdev->ctx[0].trigger, 1);
}
void vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
/* Returns true if the INTx vfio_pci_irq_ctx.masked value is changed. */
bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
{
struct pci_dev *pdev = vdev->pdev;
unsigned long flags;
bool masked_changed = false;
spin_lock_irqsave(&vdev->irqlock, flags);
......@@ -60,9 +88,11 @@ void vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
disable_irq_nosync(pdev->irq);
vdev->ctx[0].masked = true;
masked_changed = true;
}
spin_unlock_irqrestore(&vdev->irqlock, flags);
return masked_changed;
}
/*
......
/* SPDX-License-Identifier: GPL-2.0-only */
#ifndef VFIO_PCI_PRIV_H
#define VFIO_PCI_PRIV_H
#include <linux/vfio_pci_core.h>
/* Special capability IDs predefined access */
#define PCI_CAP_ID_INVALID 0xFF /* default raw access */
#define PCI_CAP_ID_INVALID_VIRT 0xFE /* default virt access */
/* Cap maximum number of ioeventfds per device (arbitrary) */
#define VFIO_PCI_IOEVENTFD_MAX 1000
struct vfio_pci_ioeventfd {
struct list_head next;
struct vfio_pci_core_device *vdev;
struct virqfd *virqfd;
void __iomem *addr;
uint64_t data;
loff_t pos;
int bar;
int count;
bool test_mem;
};
bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev);
void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev);
int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
unsigned index, unsigned start, unsigned count,
void *data);
ssize_t vfio_pci_config_rw(struct vfio_pci_core_device *vdev, char __user *buf,
size_t count, loff_t *ppos, bool iswrite);
ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf,
size_t count, loff_t *ppos, bool iswrite);
#ifdef CONFIG_VFIO_PCI_VGA
ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf,
size_t count, loff_t *ppos, bool iswrite);
#else
static inline ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev,
char __user *buf, size_t count,
loff_t *ppos, bool iswrite)
{
return -EINVAL;
}
#endif
int vfio_pci_ioeventfd(struct vfio_pci_core_device *vdev, loff_t offset,
uint64_t data, int count, int fd);
int vfio_pci_init_perm_bits(void);
void vfio_pci_uninit_perm_bits(void);
int vfio_config_init(struct vfio_pci_core_device *vdev);
void vfio_config_free(struct vfio_pci_core_device *vdev);
int vfio_pci_set_power_state(struct vfio_pci_core_device *vdev,
pci_power_t state);
bool __vfio_pci_memory_enabled(struct vfio_pci_core_device *vdev);
void vfio_pci_zap_and_down_write_memory_lock(struct vfio_pci_core_device *vdev);
u16 vfio_pci_memory_lock_and_enable(struct vfio_pci_core_device *vdev);
void vfio_pci_memory_unlock_and_restore(struct vfio_pci_core_device *vdev,
u16 cmd);
#ifdef CONFIG_VFIO_PCI_IGD
int vfio_pci_igd_init(struct vfio_pci_core_device *vdev);
#else
static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
{
return -ENODEV;
}
#endif
#ifdef CONFIG_VFIO_PCI_ZDEV_KVM
int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
struct vfio_info_cap *caps);
int vfio_pci_zdev_open_device(struct vfio_pci_core_device *vdev);
void vfio_pci_zdev_close_device(struct vfio_pci_core_device *vdev);
#else
static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
struct vfio_info_cap *caps)
{
return -ENODEV;
}
static inline int vfio_pci_zdev_open_device(struct vfio_pci_core_device *vdev)
{
return 0;
}
static inline void vfio_pci_zdev_close_device(struct vfio_pci_core_device *vdev)
{}
#endif
static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
{
return (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
}
#endif
This diff is collapsed.
......@@ -15,7 +15,7 @@
#include <asm/pci_clp.h>
#include <asm/pci_io.h>
#include <linux/vfio_pci_core.h>
#include "vfio_pci_priv.h"
/*
* Add the Base PCI Function information to the device info region.
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment