1. 03 Jul, 2023 25 commits
    • vhost: Make parameter name match of vhost_get_vq_desc() · 9e396a2f
      Xianting Tian authored
      The parameter name in the function declaration and definition
      should be the same.
      
      drivers/vhost/vhost.h,
      int vhost_get_vq_desc(..., unsigned int iov_count,...);
      
      drivers/vhost/vhost.c,
      int vhost_get_vq_desc(..., unsigned int iov_size,...)
      Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
      Message-Id: <20230621093835.36878-1-xianting.tian@linux.alibaba.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vduse: fix NULL pointer dereference · f06cf1e1
      Maxime Coquelin authored
      The vduse_vdpa_set_vq_affinity callback can be called with a NULL
      cpu_mask when the vduse device is deleted.

      This patch resets the virtqueue's IRQ affinity mask to all CPUs
      instead of dereferencing the NULL cpu_mask.
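      A minimal sketch of the guard described above, assuming the vduse
      struct/field names (vqs[], irq_affinity) from the commit context; it is
      not the verbatim upstream diff:

      static int vduse_vdpa_set_vq_affinity(struct vdpa_device *vdpa, u16 idx,
                                            const struct cpumask *cpu_mask)
      {
              struct vduse_dev *dev = vdpa_to_vduse(vdpa);

              if (cpu_mask)
                      cpumask_copy(&dev->vqs[idx].irq_affinity, cpu_mask);
              else
                      /* NULL cpu_mask means "allow all CPUs" */
                      cpumask_setall(&dev->vqs[idx].irq_affinity);

              return 0;
      }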
      
      [ 4760.952149] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [ 4760.959110] #PF: supervisor read access in kernel mode
      [ 4760.964247] #PF: error_code(0x0000) - not-present page
      [ 4760.969385] PGD 0 P4D 0
      [ 4760.971927] Oops: 0000 [#1] PREEMPT SMP PTI
      [ 4760.976112] CPU: 13 PID: 2346 Comm: vdpa Not tainted 6.4.0-rc6+ #4
      [ 4760.982291] Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.8.1 06/26/2020
      [ 4760.989769] RIP: 0010:memcpy_orig+0xc5/0x130
      [ 4760.994049] Code: 16 f8 4c 89 07 4c 89 4f 08 4c 89 54 17 f0 4c 89 5c 17 f8 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 fa 08 72 1b <4c> 8b 06 4c 8b 4c 16 f8 4c 89 07 4c 89 4c 17 f8 c3 cc cc cc cc 66
      [ 4761.012793] RSP: 0018:ffffb1d565abb830 EFLAGS: 00010246
      [ 4761.018020] RAX: ffff9f4bf6b27898 RBX: ffff9f4be23969c0 RCX: ffff9f4bcadf6400
      [ 4761.025152] RDX: 0000000000000008 RSI: 0000000000000000 RDI: ffff9f4bf6b27898
      [ 4761.032286] RBP: 0000000000000000 R08: 0000000000000008 R09: 0000000000000000
      [ 4761.039416] R10: 0000000000000000 R11: 0000000000000600 R12: 0000000000000000
      [ 4761.046549] R13: 0000000000000000 R14: 0000000000000080 R15: ffffb1d565abbb10
      [ 4761.053680] FS:  00007f64c2ec2740(0000) GS:ffff9f635f980000(0000) knlGS:0000000000000000
      [ 4761.061765] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 4761.067513] CR2: 0000000000000000 CR3: 0000001875270006 CR4: 00000000007706e0
      [ 4761.074645] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 4761.081775] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 4761.088909] PKRU: 55555554
      [ 4761.091620] Call Trace:
      [ 4761.094074]  <TASK>
      [ 4761.096180]  ? __die+0x1f/0x70
      [ 4761.099238]  ? page_fault_oops+0x171/0x4f0
      [ 4761.103340]  ? exc_page_fault+0x7b/0x180
      [ 4761.107265]  ? asm_exc_page_fault+0x22/0x30
      [ 4761.111460]  ? memcpy_orig+0xc5/0x130
      [ 4761.115126]  vduse_vdpa_set_vq_affinity+0x3e/0x50 [vduse]
      [ 4761.120533]  virtnet_clean_affinity.part.0+0x3d/0x90 [virtio_net]
      [ 4761.126635]  remove_vq_common+0x1a4/0x250 [virtio_net]
      [ 4761.131781]  virtnet_remove+0x5d/0x70 [virtio_net]
      [ 4761.136580]  virtio_dev_remove+0x3a/0x90
      [ 4761.140509]  device_release_driver_internal+0x19b/0x200
      [ 4761.145742]  bus_remove_device+0xc2/0x130
      [ 4761.149755]  device_del+0x158/0x3e0
      [ 4761.153245]  ? kernfs_find_ns+0x35/0xc0
      [ 4761.157086]  device_unregister+0x13/0x60
      [ 4761.161010]  unregister_virtio_device+0x11/0x20
      [ 4761.165543]  device_release_driver_internal+0x19b/0x200
      [ 4761.170770]  bus_remove_device+0xc2/0x130
      [ 4761.174782]  device_del+0x158/0x3e0
      [ 4761.178276]  ? __pfx_vdpa_name_match+0x10/0x10 [vdpa]
      [ 4761.183336]  device_unregister+0x13/0x60
      [ 4761.187260]  vdpa_nl_cmd_dev_del_set_doit+0x63/0xe0 [vdpa]
      
      Fixes: 28f6288e ("vduse: Support set_vq_affinity callback")
      Cc: xieyongji@bytedance.com
      Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
      Message-Id: <20230622204851.318125-1-maxime.coquelin@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
    • vhost: Allow worker switching while work is queueing · 228a27cf
      Mike Christie authored
      This patch drops the requirement that workers can only be switched if
      no work has been queued, by using RCU for the vq-based queueing paths
      and a mutex for the device-wide flush.
      
      We can also use this to support SIGKILL properly in the future where we
      should exit almost immediately after getting that signal. With this
      patch, when get_signal returns true, we can set the vq->worker to NULL
      and do a synchronize_rcu to prevent new work from being queued to the
      vhost_task that has been killed.
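      A minimal sketch of the RCU pattern being described; the helper names
      are illustrative and vq->worker is assumed to be __rcu annotated, so
      this is not the exact upstream code:

      /* Queueing side: look the worker up under RCU. */
      static bool example_vq_work_queue(struct vhost_virtqueue *vq,
                                        struct vhost_work *work)
      {
              struct vhost_worker *worker;
              bool queued = false;

              rcu_read_lock();
              worker = rcu_dereference(vq->worker);
              if (worker)
                      queued = example_worker_queue(worker, work); /* assumed helper */
              rcu_read_unlock();

              return queued;
      }

      /* Kill/switch side: clear the pointer, then wait out existing readers. */
      static void example_vq_detach_worker(struct vhost_virtqueue *vq)
      {
              rcu_assign_pointer(vq->worker, NULL);
              synchronize_rcu(); /* no new work can reach the killed vhost_task */
      }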
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-18-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost_scsi: add support for worker ioctls · d74b55e6
      Mike Christie authored
      This has vhost-scsi support the worker ioctls by calling the
      vhost_worker_ioctl helper.
      
      With a single worker, the single thread becomes a bottleneck when trying
      to use 3 or more virtqueues like:
      
      fio --filename=/dev/sdb  --direct=1 --rw=randrw --bs=4k \
      --ioengine=libaio --iodepth=128  --numjobs=3
      
      With the patches and doing a worker per vq, we can scale to at least
      16 vCPUs/vqs (that's my system limit) with the same fio command
      above with numjobs=16:
      
      fio --filename=/dev/sdb  --direct=1 --rw=randrw --bs=4k \
      --ioengine=libaio --iodepth=64  --numjobs=16
      
      which gives around 2002K IOPs.
      
      Note that for testing I dropped depth to 64 above because the vhost/virt
      layer supports only 1024 total commands per device. And the only tuning I
      did was set LIO's emulate_pr to 0 to avoid LIO's PR lock in the main IO
      path which becomes an issue at around 12 jobs/virtqueues.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-17-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: allow userspace to create workers · c1ecd8e9
      Mike Christie authored
      For vhost-scsi with 3 vqs or more and a workload that tries to use
      them in parallel like:
      
      fio --filename=/dev/sdb  --direct=1 --rw=randrw --bs=4k \
      --ioengine=libaio --iodepth=128  --numjobs=3
      
      the single vhost worker thread will become a bottleneck and we are stuck
      at around 500K IOPs no matter how many jobs, virtqueues, and CPUs are
      used.
      
      To better utilize virtqueues and available CPUs, this patch allows
      userspace to create workers and bind them to vqs. You can have N workers
      per dev and also share N workers with M vqs on that dev.
      
      This patch adds the interface related code and the next patch will hook
      vhost-scsi into it. The patches do not try to hook net and vsock into
      the interface because:
      
      1. multiple workers don't seem to help vsock. The problem is that with
      only 2 virtqueues we never fully use the existing worker when doing
      bidirectional tests. This seems to match vhost-scsi where we don't see
      the worker as a bottleneck until 3 virtqueues are used.
      
      2. net already has a way to use multiple workers.
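      A rough userspace sketch of the flow this interface enables. The ioctl
      and struct names below (VHOST_NEW_WORKER, VHOST_ATTACH_VRING_WORKER,
      struct vhost_worker_state, struct vhost_vring_worker) follow the series
      as merged, so treat them as assumptions and check your kernel's uapi
      vhost.h:

      #include <sys/ioctl.h>
      #include <linux/vhost.h>

      /* Create a new worker on the vhost dev fd and bind virtqueue 'idx' to it. */
      static int bind_vq_to_new_worker(int vhost_fd, unsigned int idx)
      {
              struct vhost_worker_state state = {};
              struct vhost_vring_worker vring_worker = {};

              if (ioctl(vhost_fd, VHOST_NEW_WORKER, &state) < 0)
                      return -1;

              vring_worker.index = idx;
              vring_worker.worker_id = state.worker_id;
              return ioctl(vhost_fd, VHOST_ATTACH_VRING_WORKER, &vring_worker);
      }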
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-16-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: replace single worker pointer with xarray · 1cdaafa1
      Mike Christie authored
      The next patch allows userspace to create multiple workers per device,
      so this patch replaces the vhost_worker pointer with an xarray so we
      can store multiple workers and look them up.
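      A minimal sketch of the xarray bookkeeping described here; the structure
      and helper names are illustrative, not the exact vhost ones:

      #include <linux/xarray.h>

      struct example_dev {
              struct xarray worker_xa; /* worker_id -> struct vhost_worker * */
      };

      static void example_dev_init(struct example_dev *dev)
      {
              /* xa_alloc() requires the ALLOC flag. */
              xa_init_flags(&dev->worker_xa, XA_FLAGS_ALLOC);
      }

      static int example_worker_store(struct example_dev *dev,
                                      struct vhost_worker *worker, u32 *id)
      {
              /* Picks a free id and stores the worker pointer at it. */
              return xa_alloc(&dev->worker_xa, id, worker, xa_limit_32b, GFP_KERNEL);
      }

      static struct vhost_worker *example_worker_lookup(struct example_dev *dev,
                                                        u32 id)
      {
              return xa_load(&dev->worker_xa, id);
      }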
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-15-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: add helper to parse userspace vring state/file · cef25866
      Mike Christie authored
      The next patches add new vhost worker ioctls which will need to get a
      vhost_virtqueue from a userspace struct which specifies the vq's index.
      This moves the vhost_vring_ioctl code that does this into a helper so
      it can be shared.
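      A minimal sketch of the kind of helper being factored out (hypothetical
      name and simplified error handling): copy the userspace vring argument
      and translate its index into a vhost_virtqueue:

      static struct vhost_virtqueue *example_get_vq(struct vhost_dev *dev,
                                                    void __user *argp)
      {
              struct vhost_vring_state s;

              if (copy_from_user(&s, argp, sizeof(s)))
                      return ERR_PTR(-EFAULT);
              if (s.index >= dev->nvqs)
                      return ERR_PTR(-ENOBUFS);

              return dev->vqs[s.index];
      }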
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-14-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: remove vhost_work_queue · 27eca189
      Mike Christie authored
      vhost_work_queue is no longer used. Each driver is using the poll or vq
      based queueing, so remove vhost_work_queue.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-13-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost_scsi: flush IO vqs then send TMF rsp · 0a3eac52
      Mike Christie authored
      With one worker we will always send the scsi cmd responses then send the
      TMF rsp, because LIO will always complete the scsi cmds first then call
      into us to send the TMF response.
      
      With multiple workers, the IO vq workers could be running while the
      TMF/ctl vq worker is running, so this has us do a flush before
      completing the TMF to make sure cmds are completed when its work is
      later queued and run.
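      A minimal sketch of the ordering being described; the per-vq flush
      helper and the IO vq enum follow vhost_scsi conventions, so treat the
      exact names as assumptions:

      static void example_flush_io_vqs(struct vhost_scsi *vs)
      {
              int i;

              /* Make sure SCSI cmd completions queued on the IO vq workers
               * have run before the TMF response is queued on the ctl vq. */
              for (i = VHOST_SCSI_VQ_IO; i < vs->dev.nvqs; i++)
                      vhost_vq_flush(&vs->vqs[i].vq);
      }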
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-12-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost_scsi: convert to vhost_vq_work_queue · 78af31cc
      Mike Christie authored
      Convert from vhost_work_queue to vhost_vq_work_queue so we can
      remove vhost_work_queue.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-11-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost_scsi: make SCSI cmd completion per vq · 48ae70dd
      Mike Christie authored
      This patch separates the scsi cmd completion code paths so we can complete
      cmds based on their vq instead of having all cmds complete on the same
      worker/CPU. This will be useful with the next patches that allow us to
      create multiple worker threads and bind them to different vqs, and we can
      have completions running on different threads/CPUs.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-Id: <20230626232307.97930-10-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost_sock: convert to vhost_vq_work_queue · 9e09d0ec
      Mike Christie authored
      Convert from vhost_work_queue to vhost_vq_work_queue, so we can drop
      vhost_work_queue.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-9-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: convert poll work to be vq based · 493b94bf
      Mike Christie authored
      This has the drivers pass in their poll to vq mapping and then converts
      the core poll code to use the vq based helpers. In the next patches we
      will allow vqs to be handled by different workers, so to allow drivers
      to execute operations like queue, stop, flush, etc on specific polls/vqs
      we need to know the mappings.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-8-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: take worker or vq for flushing · a6fc0473
      Mike Christie authored
      This patch has the core work flush function take a worker. When we
      support multiple workers we can then flush each worker during device
      removal, stoppage, etc. It also adds a helper to flush specific
      virtqueues, so vhost-scsi can flush IO vqs from its ctl vq.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-7-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: take worker or vq instead of dev for queueing · 0921dddc
      Mike Christie authored
      This patch has the core work queueing function take a worker for when we
      support multiple workers. It also adds a helper that takes a vq during
      queueing so modules can control which vq/worker to queue work on.
      
      This temporarily leaves vhost_work_queue in place. It will be removed
      when the drivers are converted in the next patches.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-6-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost, vhost_net: add helper to check if vq has work · 9784df15
      Mike Christie authored
      In the next patches each vq might have different workers so one could
      have work but others do not. For net, we only want to check specific vqs,
      so this adds a helper to check if a vq has work pending and converts
      vhost-net to use it.
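      A minimal sketch of such a check, assuming a per-vq worker pointer and
      an llist-based work list (illustrative names, not the exact vhost code):

      static bool example_vq_has_work(struct vhost_virtqueue *vq)
      {
              struct vhost_worker *worker;
              bool has_work = false;

              rcu_read_lock();
              worker = rcu_dereference(vq->worker);
              if (worker)
                      has_work = !llist_empty(&worker->work_list);
              rcu_read_unlock();

              return has_work;
      }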
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Message-Id: <20230626232307.97930-5-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: add vhost_worker pointer to vhost_virtqueue · 737bdb64
      Mike Christie authored
      This patchset allows userspace to map vqs to different workers. This
      patch adds a worker pointer to the vq so in later patches in this set
      we can queue/flush specific vqs and their workers.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-4-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: dynamically allocate vhost_worker · c011bb66
      Mike Christie authored
      This patchset allows us to allocate multiple workers, so this has us
      move from the vhost_worker that's embedded in the vhost_dev to
      dynamically allocating it.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-3-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: create worker at end of vhost_dev_set_owner · 3e11c6eb
      Mike Christie authored
      vsock can start queueing work after VHOST_VSOCK_SET_GUEST_CID, so
      once we have called vhost_worker_create it can call
      vhost_work_queue and try to access the vhost worker/task. If
      vhost_dev_alloc_iovecs fails, then vhost_worker_free could free
      the worker/task from under vsock.
      
      This moves vhost_worker_create to the end of vhost_dev_set_owner
      where we know we can no longer fail in that path. If it fails
      after the VHOST_SET_OWNER and userspace closes the device, then
      the normal vsock release handling will do the right thing.
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Message-Id: <20230626232307.97930-2-michael.christie@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • virtio_bt: call scheduler when we free unused buffs · 3845308f
      Xianting Tian authored
      For virtio-net we were getting CPU stall warnings, and fixed it by
      calling the scheduler: see f8bb5104 ("virtio_net: suppress cpu stall
      when free_unused_bufs").
      
      This driver is similar so theoretically the same logic applies.
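      A minimal sketch of the pattern borrowed from the cited virtio_net fix;
      the freeing step is simplified since each driver frees its own buffer
      type:

      static void example_free_unused_bufs(struct virtqueue *vq)
      {
              void *buf;

              while ((buf = virtqueue_detach_unused_buf(vq)) != NULL) {
                      kfree(buf); /* driver-specific free in the real code */
                      cond_resched(); /* yield so long queues don't trigger CPU stall warnings */
              }
      }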
      Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
      Message-Id: <20230609131817.712867-4-xianting.tian@linux.alibaba.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • virtio-console: call scheduler when we free unused buffs · 56b5e65e
      Xianting Tian authored
      For virtio-net we were getting CPU stall warnings, and fixed it by
      calling the scheduler: see f8bb5104 ("virtio_net: suppress cpu stall
      when free_unused_bufs").
      
      This driver is similar so theoretically the same logic applies.
      Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
      Message-Id: <20230609131817.712867-3-xianting.tian@linux.alibaba.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • virtio-crypto: call scheduler when we free unused buffs · 7a5103b8
      Xianting Tian authored
      For virtio-net we were getting CPU stall warnings, and fixed it by
      calling the scheduler: see f8bb5104 ("virtio_net: suppress cpu stall
      when free_unused_bufs").
      
      This driver is similar so theoretically the same logic applies.
      Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
      Message-Id: <20230609131817.712867-2-xianting.tian@linux.alibaba.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vDPA/ifcvf: implement new accessors for vq_state · 4cf8b6d0
      Zhu Lingshan authored
      This commit implements a better layout of the live migration bar, so
      the accessors for virtqueue state have been refactored accordingly.

      This commit also adds a comment to the probing-ids list, indicating
      that this driver drives the F2000X-PL virtio-net device.
      Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
      Message-Id: <20230612151420.1019504-4-lingshan.zhu@intel.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vDPA/ifcvf: detect and report max allowed vq size · ae904d9c
      Zhu Lingshan authored
      Rather than using a hardcoded value, this commit detects and reports
      the maximum allowed virtqueue size.
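      A minimal sketch of the detection being described, using the standard
      virtio-pci modern config accessors; the exact ifcvf code differs:

      static u16 example_get_max_vq_size(struct ifcvf_hw *hw)
      {
              struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;

              vp_iowrite16(0, &cfg->queue_select);
              return vp_ioread16(&cfg->queue_size); /* max size the device allows */
      }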
      Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
      Message-Id: <20230612151420.1019504-3-lingshan.zhu@intel.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vDPA/ifcvf: dynamic allocate vq data stores · 77128322
      Zhu Lingshan authored
      This commit dynamically allocates the data
      stores for the virtqueues based on
      virtio_pci_common_cfg.num_queues.
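      A minimal sketch of the allocation being described; field names such as
      hw->vring and hw->nr_vring are assumptions about the driver's
      bookkeeping:

      static int example_alloc_vq_stores(struct ifcvf_hw *hw)
      {
              u16 num_queues = vp_ioread16(&hw->common_cfg->num_queues);

              hw->vring = kcalloc(num_queues, sizeof(*hw->vring), GFP_KERNEL);
              if (!hw->vring)
                      return -ENOMEM;

              hw->nr_vring = num_queues;
              return 0;
      }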
      Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
      Message-Id: <20230612151420.1019504-2-lingshan.zhu@intel.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  2. 27 Jun, 2023 15 commits