Commit d16c0cd2 authored by Ricardo Cañuelo's avatar Ricardo Cañuelo Committed by Michael S. Tsirkin

docs: driver-api: virtio: virtio on Linux

Basic doc about Virtio on Linux and a short tutorial on Virtio drivers.

includes the following fixup:

virtio: fix virtio_config_ops kerneldocs

Fixes two warning messages when building htmldocs:

    warning: duplicate section name 'Note'
    warning: expecting prototype for virtio_config_ops().
             Prototype was for vq_callback_t() instead

Message-Id: <20221010064359.1324353-2-ricardo.canuelo@collabora.com>
Signed-off-by: default avatarRicardo Cañuelo <ricardo.canuelo@collabora.com>
Reviewed-by: default avatarCornelia Huck <cohuck@redhat.com>
Message-Id: <20221220100035.2712449-1-ricardo.canuelo@collabora.com>
Reported-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: default avatarBagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
parent d5ff73bb
...@@ -106,6 +106,7 @@ available subsections can be seen below. ...@@ -106,6 +106,7 @@ available subsections can be seen below.
vfio-mediated-device vfio-mediated-device
vfio vfio
vfio-pci-device-specific-driver-acceptance vfio-pci-device-specific-driver-acceptance
virtio/index
xilinx/index xilinx/index
xillybus xillybus
zorro zorro
......
.. SPDX-License-Identifier: GPL-2.0
======
Virtio
======
.. toctree::
:maxdepth: 1
virtio
writing_virtio_drivers
.. SPDX-License-Identifier: GPL-2.0
.. _virtio:
===============
Virtio on Linux
===============
Introduction
============
Virtio is an open standard that defines a protocol for communication
between drivers and devices of different types, see Chapter 5 ("Device
Types") of the virtio spec `[1]`_. Originally developed as a standard
for paravirtualized devices implemented by a hypervisor, it can be used
to interface any compliant device (real or emulated) with a driver.
For illustrative purposes, this document will focus on the common case
of a Linux kernel running in a virtual machine and using paravirtualized
devices provided by the hypervisor, which exposes them as virtio devices
via standard mechanisms such as PCI.
Device - Driver communication: virtqueues
=========================================
Although the virtio devices are really an abstraction layer in the
hypervisor, they're exposed to the guest as if they are physical devices
using a specific transport method -- PCI, MMIO or CCW -- that is
orthogonal to the device itself. The virtio spec defines these transport
methods in detail, including device discovery, capabilities and
interrupt handling.
The communication between the driver in the guest OS and the device in
the hypervisor is done through shared memory (that's what makes virtio
devices so efficient) using specialized data structures called
virtqueues, which are actually ring buffers [#f1]_ of buffer descriptors
similar to the ones used in a network device:
.. kernel-doc:: include/uapi/linux/virtio_ring.h
:identifiers: struct vring_desc
All the buffers the descriptors point to are allocated by the guest and
used by the host either for reading or for writing but not for both.
Refer to Chapter 2.5 ("Virtqueues") of the virtio spec `[1]`_ for the
reference definitions of virtqueues and to `[2]`_ for an illustrated
overview of how the host device and the guest driver communicate.
The :c:type:`vring_virtqueue` struct models a virtqueue, including the
ring buffers and management data. Embedded in this struct is the
:c:type:`virtqueue` struct, which is the data structure that's
ultimately used by virtio drivers:
.. kernel-doc:: include/linux/virtio.h
:identifiers: struct virtqueue
The callback function pointed by this struct is triggered when the
device has consumed the buffers provided by the driver. More
specifically, the trigger will be an interrupt issued by the hypervisor
(see vring_interrupt()). Interrupt request handlers are registered for
a virtqueue during the virtqueue setup process (transport-specific).
.. kernel-doc:: drivers/virtio/virtio_ring.c
:identifiers: vring_interrupt
Device discovery and probing
============================
In the kernel, the virtio core contains the virtio bus driver and
transport-specific drivers like `virtio-pci` and `virtio-mmio`. Then
there are individual virtio drivers for specific device types that are
registered to the virtio bus driver.
How a virtio device is found and configured by the kernel depends on how
the hypervisor defines it. Taking the `QEMU virtio-console
<https://gitlab.com/qemu-project/qemu/-/blob/master/hw/char/virtio-console.c>`__
device as an example. When using PCI as a transport method, the device
will present itself on the PCI bus with vendor 0x1af4 (Red Hat, Inc.)
and device id 0x1003 (virtio console), as defined in the spec, so the
kernel will detect it as it would do with any other PCI device.
During the PCI enumeration process, if a device is found to match the
virtio-pci driver (according to the virtio-pci device table, any PCI
device with vendor id = 0x1af4)::
/* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
static const struct pci_device_id virtio_pci_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
{ 0 }
};
then the virtio-pci driver is probed and, if the probing goes well, the
device is registered to the virtio bus::
static int virtio_pci_probe(struct pci_dev *pci_dev,
const struct pci_device_id *id)
{
...
if (force_legacy) {
rc = virtio_pci_legacy_probe(vp_dev);
/* Also try modern mode if we can't map BAR0 (no IO space). */
if (rc == -ENODEV || rc == -ENOMEM)
rc = virtio_pci_modern_probe(vp_dev);
if (rc)
goto err_probe;
} else {
rc = virtio_pci_modern_probe(vp_dev);
if (rc == -ENODEV)
rc = virtio_pci_legacy_probe(vp_dev);
if (rc)
goto err_probe;
}
...
rc = register_virtio_device(&vp_dev->vdev);
When the device is registered to the virtio bus the kernel will look
for a driver in the bus that can handle the device and call that
driver's ``probe`` method.
It's at this stage that the virtqueues will be allocated and configured
by calling the appropriate ``virtio_find`` helper function, such as
virtio_find_single_vq() or virtio_find_vqs(), which will end up
calling a transport-specific ``find_vqs`` method.
References
==========
_`[1]` Virtio Spec v1.2:
https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html
Check for later versions of the spec as well.
_`[2]` Virtqueues and virtio ring: How the data travels
https://www.redhat.com/en/blog/virtqueues-and-virtio-ring-how-data-travels
.. rubric:: Footnotes
.. [#f1] that's why they may be also referred to as virtrings.
.. SPDX-License-Identifier: GPL-2.0
.. _writing_virtio_drivers:
======================
Writing Virtio Drivers
======================
Introduction
============
This document serves as a basic guideline for driver programmers that
need to hack a new virtio driver or understand the essentials of the
existing ones. See :ref:`Virtio on Linux <virtio>` for a general
overview of virtio.
Driver boilerplate
==================
As a bare minimum, a virtio driver needs to register in the virtio bus
and configure the virtqueues for the device according to its spec, the
configuration of the virtqueues in the driver side must match the
virtqueue definitions in the device. A basic driver skeleton could look
like this::
#include <linux/virtio.h>
#include <linux/virtio_ids.h>
#include <linux/virtio_config.h>
#include <linux/module.h>
/* device private data (one per device) */
struct virtio_dummy_dev {
struct virtqueue *vq;
};
static void virtio_dummy_recv_cb(struct virtqueue *vq)
{
struct virtio_dummy_dev *dev = vq->vdev->priv;
char *buf;
unsigned int len;
while ((buf = virtqueue_get_buf(dev->vq, &len)) != NULL) {
/* process the received data */
}
}
static int virtio_dummy_probe(struct virtio_device *vdev)
{
struct virtio_dummy_dev *dev = NULL;
/* initialize device data */
dev = kzalloc(sizeof(struct virtio_dummy_dev), GFP_KERNEL);
if (!dev)
return -ENOMEM;
/* the device has a single virtqueue */
dev->vq = virtio_find_single_vq(vdev, virtio_dummy_recv_cb, "input");
if (IS_ERR(dev->vq)) {
kfree(dev);
return PTR_ERR(dev->vq);
}
vdev->priv = dev;
/* from this point on, the device can notify and get callbacks */
virtio_device_ready(vdev);
return 0;
}
static void virtio_dummy_remove(struct virtio_device *vdev)
{
struct virtio_dummy_dev *dev = vdev->priv;
/*
* disable vq interrupts: equivalent to
* vdev->config->reset(vdev)
*/
virtio_reset_device(vdev);
/* detach unused buffers */
while ((buf = virtqueue_detach_unused_buf(dev->vq)) != NULL) {
kfree(buf);
}
/* remove virtqueues */
vdev->config->del_vqs(vdev);
kfree(dev);
}
static const struct virtio_device_id id_table[] = {
{ VIRTIO_ID_DUMMY, VIRTIO_DEV_ANY_ID },
{ 0 },
};
static struct virtio_driver virtio_dummy_driver = {
.driver.name = KBUILD_MODNAME,
.driver.owner = THIS_MODULE,
.id_table = id_table,
.probe = virtio_dummy_probe,
.remove = virtio_dummy_remove,
};
module_virtio_driver(virtio_dummy_driver);
MODULE_DEVICE_TABLE(virtio, id_table);
MODULE_DESCRIPTION("Dummy virtio driver");
MODULE_LICENSE("GPL");
The device id ``VIRTIO_ID_DUMMY`` here is a placeholder, virtio drivers
should be added only for devices that are defined in the spec, see
include/uapi/linux/virtio_ids.h. Device ids need to be at least reserved
in the virtio spec before being added to that file.
If your driver doesn't have to do anything special in its ``init`` and
``exit`` methods, you can use the module_virtio_driver() helper to
reduce the amount of boilerplate code.
The ``probe`` method does the minimum driver setup in this case
(memory allocation for the device data) and initializes the
virtqueue. virtio_device_ready() is used to enable the virtqueue and to
notify the device that the driver is ready to manage the device
("DRIVER_OK"). The virtqueues are anyway enabled automatically by the
core after ``probe`` returns.
.. kernel-doc:: include/linux/virtio_config.h
:identifiers: virtio_device_ready
In any case, the virtqueues need to be enabled before adding buffers to
them.
Sending and receiving data
==========================
The virtio_dummy_recv_cb() callback in the code above will be triggered
when the device notifies the driver after it finishes processing a
descriptor or descriptor chain, either for reading or writing. However,
that's only the second half of the virtio device-driver communication
process, as the communication is always started by the driver regardless
of the direction of the data transfer.
To configure a buffer transfer from the driver to the device, first you
have to add the buffers -- packed as `scatterlists` -- to the
appropriate virtqueue using any of the virtqueue_add_inbuf(),
virtqueue_add_outbuf() or virtqueue_add_sgs(), depending on whether you
need to add one input `scatterlist` (for the device to fill in), one
output `scatterlist` (for the device to consume) or multiple
`scatterlists`, respectively. Then, once the virtqueue is set up, a call
to virtqueue_kick() sends a notification that will be serviced by the
hypervisor that implements the device::
struct scatterlist sg[1];
sg_init_one(sg, buffer, BUFLEN);
virtqueue_add_inbuf(dev->vq, sg, 1, buffer, GFP_ATOMIC);
virtqueue_kick(dev->vq);
.. kernel-doc:: drivers/virtio/virtio_ring.c
:identifiers: virtqueue_add_inbuf
.. kernel-doc:: drivers/virtio/virtio_ring.c
:identifiers: virtqueue_add_outbuf
.. kernel-doc:: drivers/virtio/virtio_ring.c
:identifiers: virtqueue_add_sgs
Then, after the device has read or written the buffers prepared by the
driver and notifies it back, the driver can call virtqueue_get_buf() to
read the data produced by the device (if the virtqueue was set up with
input buffers) or simply to reclaim the buffers if they were already
consumed by the device:
.. kernel-doc:: drivers/virtio/virtio_ring.c
:identifiers: virtqueue_get_buf_ctx
The virtqueue callbacks can be disabled and re-enabled using the
virtqueue_disable_cb() and the family of virtqueue_enable_cb() functions
respectively. See drivers/virtio/virtio_ring.c for more details:
.. kernel-doc:: drivers/virtio/virtio_ring.c
:identifiers: virtqueue_disable_cb
.. kernel-doc:: drivers/virtio/virtio_ring.c
:identifiers: virtqueue_enable_cb
But note that some spurious callbacks can still be triggered under
certain scenarios. The way to disable callbacks reliably is to reset the
device or the virtqueue (virtio_reset_device()).
References
==========
_`[1]` Virtio Spec v1.2:
https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html
Check for later versions of the spec as well.
...@@ -22040,6 +22040,7 @@ S: Maintained ...@@ -22040,6 +22040,7 @@ S: Maintained
F: Documentation/ABI/testing/sysfs-bus-vdpa F: Documentation/ABI/testing/sysfs-bus-vdpa
F: Documentation/ABI/testing/sysfs-class-vduse F: Documentation/ABI/testing/sysfs-class-vduse
F: Documentation/devicetree/bindings/virtio/ F: Documentation/devicetree/bindings/virtio/
F: Documentation/driver-api/virtio/
F: drivers/block/virtio_blk.c F: drivers/block/virtio_blk.c
F: drivers/crypto/virtio/ F: drivers/crypto/virtio/
F: drivers/net/virtio_net.c F: drivers/net/virtio_net.c
......
...@@ -16,8 +16,10 @@ struct virtio_shm_region { ...@@ -16,8 +16,10 @@ struct virtio_shm_region {
u64 len; u64 len;
}; };
typedef void vq_callback_t(struct virtqueue *);
/** /**
* virtio_config_ops - operations for configuring a virtio device * struct virtio_config_ops - operations for configuring a virtio device
* Note: Do not assume that a transport implements all of the operations * Note: Do not assume that a transport implements all of the operations
* getting/setting a value as a simple read/write! Generally speaking, * getting/setting a value as a simple read/write! Generally speaking,
* any of @get/@set, @get_status/@set_status, or @get_features/ * any of @get/@set, @get_status/@set_status, or @get_features/
...@@ -69,7 +71,8 @@ struct virtio_shm_region { ...@@ -69,7 +71,8 @@ struct virtio_shm_region {
* vdev: the virtio_device * vdev: the virtio_device
* This sends the driver feature bits to the device: it can change * This sends the driver feature bits to the device: it can change
* the dev->feature bits if it wants. * the dev->feature bits if it wants.
* Note: despite the name this can be called any number of times. * Note that despite the name this can be called any number of
* times.
* Returns 0 on success or error status * Returns 0 on success or error status
* @bus_name: return the bus name associated with the device (optional) * @bus_name: return the bus name associated with the device (optional)
* vdev: the virtio_device * vdev: the virtio_device
...@@ -91,7 +94,6 @@ struct virtio_shm_region { ...@@ -91,7 +94,6 @@ struct virtio_shm_region {
* If disable_vq_and_reset is set, then enable_vq_after_reset must also be * If disable_vq_and_reset is set, then enable_vq_after_reset must also be
* set. * set.
*/ */
typedef void vq_callback_t(struct virtqueue *);
struct virtio_config_ops { struct virtio_config_ops {
void (*get)(struct virtio_device *vdev, unsigned offset, void (*get)(struct virtio_device *vdev, unsigned offset,
void *buf, unsigned len); void *buf, unsigned len);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment