qemu/vhost-implement-vhost_vdpa_device_suspend-resume.patch
Jiabo Feng c300b8e80b QEMU update to version 8.2.0-5
- vfio/migration: Add support for manual clear vfio dirty log
- vfio: Maintain DMA mapping range for the container
- linux-headers: update against 5.10 and manual clear vfio dirty log series
- arm/acpi: Fix bios_tables_test failure when building qemu-system-aarch64 on an x86_64 host. Reason: the __aarch64__ macro makes build_pptt compile to different functions on x86_64 and aarch64 hosts, which makes bios_tables_test fail.
- pl031: support rtc-timer property for pl031
- feature: Add logs for vm start and destroy
- feature: Add log for each modules
- log: Add log at boot & cpu init for aarch64
- bugfix: irq: Avoid covering object refcount of qemu_irq
- i386: cache passthrough: Update AMD 8000_001D.EAX[25:14] based on vCPU topo
- freeclock: set rtc_date_diff for X86
- freeclock: set rtc_date_diff for arm
- freeclock: add qmp command to get time offset of vm in seconds
- tests: Disable filemonitor testcase
- shadow_dev: introduce shadow dev for virtio-net device
- pl011: reset read FIFO when UARTTIMSC=0 & UARTICR=0xffff
- tests: virt: Update expected ACPI tables for virt test(update BinDir)
- arm64: Add the cpufreq device to show cpufreq info to guest
- hw/arm64: add vcpu cache info support
- tests: virt: Allow changes to PPTT test table
- cpu: add Cortex-A72 processor kvm target support
- cpu: add Kunpeng-920 cpu support
- net: eepro100: validate various address values (CVE-2021-20255)
- ide: ahci: add check to avoid null dereference (CVE-2019-12067)
- vdpa: set vring enable only if the vring address has already been set
- docs: Add generic vhost-vdpa device documentation
- vdpa: don't suspend/resume device when vdpa device not started
- vdpa: correct param passed in when unregister save
- vdpa: suspend function return 0 when the vdpa device is stopped
- vdpa: support vdpa device suspend/resume
- vdpa: move memory listener to the realize stage
- vdpa: implement vdpa device migration
- vhost: implement migration state notifier for vdpa device
- vhost: implement post resume bh
- vhost: implement savevm_handler for vdpa device
- vhost: implement vhost_vdpa_device_suspend/resume
- vhost: implement vhost-vdpa suspend/resume
- vhost: add vhost_dev_suspend/resume_op
- vhost: introduce bytemap for vhost backend logging
- vhost-vdpa: add migration log ops for VhostOps
- vhost-vdpa: add VHOST_BACKEND_F_BYTEMAPLOG
- hw/usb: reduce the vcpu cost of UHCI when VNC disconnect
- virtio-net: update the default and max of rx/tx_queue_size
- virtio-net: set the max of queue size to 4096
- virtio-net: fix max vring buf size when set ring num
- virtio-net: bugfix: do not delete netdev before virtio net
- monitor: Discard BLOCK_IO_ERROR event when VM rebooted
- vhost-user: add unregister_savevm when vhost-user cleanup
- vhost-user: add vhost_set_mem_table when vm load_setup at destination
- vhost-user: quit infinite loop while used memslots is more than the backend limit
- fix qemu-core when vhost-user-net config with server mode
- vhost-user: Add support reconnect vhost-user socket
- vhost-user: Set the acked_features to vm's feature
- i6300esb watchdog: bugfix: Add a runstate transition
- hw/net/rocker_of_dpa: fix double free bug of rocker device
- net/dump.c: Suppress spurious compiler warning
- pcie: Add pcie-root-port fast plug/unplug feature
- pcie: Compat with devices which do not support Link Width, such as ioh3420
- qdev/monitors: Fix redundant error_setg of qdev_add_device
- qemu-nbd: set timeout to qemu-nbd socket
- qemu-nbd: make native as the default aio mode
- nbd/server.c: fix invalid read after client was already free
- virtio-scsi: bugfix: fix qemu crash for hotplug scsi disk with dataplane
- virtio: bugfix: check the value of caches before accessing it
- virtio: print the guest virtio_net features that host does not support
- virtio: bugfix: add rcu_read_lock when vring_avail_idx is called
- virtio: check descriptor numbers
- migration: report multiFd related thread pid to libvirt
- migration: report migration related thread pid to libvirt
- cpu/features: fix bug for memory leakage
- doc: Update multi-thread compression doc
- migration: Add compress_level sanity check
- migration: Add zstd support in multi-thread compression
- migration: Add multi-thread compress ops
- migration: Refactoring multi-thread compress migration
- migration: Add multi-thread compress method
- migration: skip cache_drop for bios bootloader and nvram template
- oslib-posix: optimise vm startup time for 1G hugepage
- monitor/qmp: drop inflight rsp if qmp client broken
- ps2: fix oob in ps2 kbd
- Currently, when KVM and QEMU cannot handle a KVM exit, QEMU calls vm_stop, which leaves the VM in the paused state. This makes the VM unrecoverable, so send a guest panic to libvirt instead.
- vhost: cancel migration when vhost-user restarted during migration

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
2024-04-10 20:19:06 +08:00

From 4c5a9a0703e227186639124f09cdf7214e40ea7d Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:27:34 +0800
Subject: [PATCH] vhost: implement vhost_vdpa_device_suspend/resume

Implement the vhost device suspend and resume interfaces.

Signed-off-by: jiangdongxu <jiangdongxu1@huawei.com>
Signed-off-by: fangyi <eric.fangyi@huawei.com>
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/meson.build | 2 +-
hw/virtio/vdpa-dev-mig.c | 178 +++++++++++++++++++++++++++++++
hw/virtio/vhost.c | 138 ++++++++++++++++++++++++
include/hw/virtio/vdpa-dev-mig.h | 16 +++
include/hw/virtio/vdpa-dev.h | 1 +
include/hw/virtio/vhost.h | 3 +
migration/migration.c | 3 +-
migration/migration.h | 2 +
8 files changed, 340 insertions(+), 3 deletions(-)
create mode 100644 hw/virtio/vdpa-dev-mig.c
create mode 100644 include/hw/virtio/vdpa-dev-mig.h
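
Reader's note (not part of the change itself): the resume path in vdpa-dev-mig.c enables host notifiers, then guest notifiers, then starts the backend, and unwinds in reverse order through goto labels when a step fails. The suspend path tears down in the opposite order. The standalone C sketch below models only that ordering. All names in it (toy_dev, toy_step, toy_resume, toy_suspend, fail_step) are hypothetical stand-ins and are not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not a QEMU type. */
struct toy_dev {
    bool host_notifiers;
    bool guest_notifiers;
    bool started;
    int fail_step;          /* 0 = no injected failure, 1..3 = failing step */
};

/* Run one setup step, or fail if this step was chosen to fail. */
static int toy_step(struct toy_dev *d, int step, bool *flag)
{
    if (d->fail_step == step) {
        return -1;
    }
    *flag = true;
    return 0;
}

/* Acquire in order: host notifiers, guest notifiers, backend start. */
int toy_resume(struct toy_dev *d)
{
    int ret;

    assert(d != NULL);

    ret = toy_step(d, 1, &d->host_notifiers);
    if (ret < 0) {
        return ret;
    }
    ret = toy_step(d, 2, &d->guest_notifiers);
    if (ret < 0) {
        goto err_host_notifiers;
    }
    ret = toy_step(d, 3, &d->started);
    if (ret < 0) {
        goto err_guest_notifiers;
    }
    return 0;

    /* Unwind only what was set up, in reverse order. */
err_guest_notifiers:
    d->guest_notifiers = false;
err_host_notifiers:
    d->host_notifiers = false;
    return ret;
}

/* Release in the reverse order of toy_resume(). */
int toy_suspend(struct toy_dev *d)
{
    assert(d != NULL);

    if (!d->started) {
        return -1;
    }
    d->started = false;
    d->guest_notifiers = false;
    d->host_notifiers = false;
    return 0;
}
```

Calling toy_resume() on a toy_dev whose fail_step is 3 returns -1 and leaves all three flags false, which mirrors how a failed vhost_dev_resume() in the real code is meant to leave both notifier sets disabled.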
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index c0055a7832..596651d113 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -5,7 +5,7 @@ system_virtio_ss.add(when: 'CONFIG_VIRTIO_MMIO', if_true: files('virtio-mmio.c')
system_virtio_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('virtio-crypto.c'))
system_virtio_ss.add(when: 'CONFIG_VHOST_VSOCK_COMMON', if_true: files('vhost-vsock-common.c'))
system_virtio_ss.add(when: 'CONFIG_VIRTIO_IOMMU', if_true: files('virtio-iommu.c'))
-system_virtio_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: files('vdpa-dev.c'))
+system_virtio_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: files('vdpa-dev.c', 'vdpa-dev-mig.c'))
specific_virtio_ss = ss.source_set()
specific_virtio_ss.add(files('virtio.c'))
diff --git a/hw/virtio/vdpa-dev-mig.c b/hw/virtio/vdpa-dev-mig.c
new file mode 100644
index 0000000000..1d2bed2571
--- /dev/null
+++ b/hw/virtio/vdpa-dev-mig.c
@@ -0,0 +1,178 @@
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2023. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include <sys/ioctl.h>
+#include <linux/vhost.h>
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vdpa-dev.h"
+#include "hw/virtio/virtio-bus.h"
+#include "migration/migration.h"
+#include "qemu/error-report.h"
+#include "hw/virtio/vdpa-dev-mig.h"
+
+static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
+ void *arg)
+{
+ struct vhost_vdpa *v = dev->opaque;
+ int fd = v->device_fd;
+
+ if (dev->vhost_ops->backend_type != VHOST_BACKEND_TYPE_VDPA) {
+ error_report("backend type isn't VDPA. Operation not permitted!");
+ return -EPERM;
+ }
+
+ return ioctl(fd, request, arg);
+}
+
+static int vhost_vdpa_device_suspend(VhostVdpaDevice *vdpa)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(vdpa);
+ BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+ VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+ int ret;
+
+ if (!vdpa->started) {
+ return -EFAULT;
+ }
+
+ if (!k->set_guest_notifiers) {
+ return -EFAULT;
+ }
+
+ vdpa->started = false;
+
+ ret = vhost_dev_suspend(&vdpa->dev, vdev, false);
+ if (ret) {
+ goto suspend_fail;
+ }
+
+ ret = k->set_guest_notifiers(qbus->parent, vdpa->dev.nvqs, false);
+ if (ret < 0) {
+ error_report("vhost guest notifier cleanup failed: %d", ret);
+ goto set_guest_notifiers_fail;
+ }
+
+ vhost_dev_disable_notifiers(&vdpa->dev, vdev);
+ return ret;
+
+set_guest_notifiers_fail:
+ /* Keep the original error in ret; only report a failed restore. */
+ if (k->set_guest_notifiers(qbus->parent, vdpa->dev.nvqs, true) < 0) {
+ error_report("vhost guest notifier restore failed");
+ }
+
+suspend_fail:
+ vdpa->started = true;
+ return ret;
+}
+
+static int vhost_vdpa_device_resume(VhostVdpaDevice *vdpa)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(vdpa);
+ BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+ VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+ int i, ret;
+
+ if (!k->set_guest_notifiers) {
+ error_report("binding does not support guest notifiers");
+ return -ENOSYS;
+ }
+
+ ret = vhost_dev_enable_notifiers(&vdpa->dev, vdev);
+ if (ret < 0) {
+ error_report("Error enabling host notifiers: %d", ret);
+ return ret;
+ }
+
+ ret = k->set_guest_notifiers(qbus->parent, vdpa->dev.nvqs, true);
+ if (ret < 0) {
+ error_report("Error binding guest notifier: %d", ret);
+ goto err_host_notifiers;
+ }
+
+ vdpa->dev.acked_features = vdev->guest_features;
+
+ ret = vhost_dev_resume(&vdpa->dev, vdev, false);
+ if (ret < 0) {
+ error_report("Error starting vhost: %d", ret);
+ goto err_guest_notifiers;
+ }
+ vdpa->started = true;
+
+ /*
+ * guest_notifier_mask/pending not used yet, so just unmask
+ * everything here. virtio-pci will do the right thing by
+ * enabling/disabling irqfd.
+ */
+ for (i = 0; i < vdpa->dev.nvqs; i++) {
+ vhost_virtqueue_mask(&vdpa->dev, vdev, i, false);
+ }
+
+ return ret;
+
+err_guest_notifiers:
+ k->set_guest_notifiers(qbus->parent, vdpa->dev.nvqs, false);
+err_host_notifiers:
+ vhost_dev_disable_notifiers(&vdpa->dev, vdev);
+ return ret;
+}
+
+static void vdpa_dev_vmstate_change(void *opaque, bool running, RunState state)
+{
+ VhostVdpaDevice *vdpa = VHOST_VDPA_DEVICE(opaque);
+ struct vhost_dev *hdev = &vdpa->dev;
+ int ret;
+ MigrationState *ms = migrate_get_current();
+ MigrationIncomingState *mis = migration_incoming_get_current();
+
+ if (!running) {
+ if (ms->state == RUN_STATE_PAUSED) {
+ ret = vhost_vdpa_device_suspend(vdpa);
+ if (ret) {
+ error_report("suspend vdpa device failed: %d", ret);
+ if (ms->migration_thread_running) {
+ migrate_fd_cancel(ms);
+ }
+ }
+ }
+ } else {
+ if (ms->state == RUN_STATE_RESTORE_VM) {
+ ret = vhost_vdpa_device_resume(vdpa);
+ if (ret) {
+ error_report("migration dest resume device failed, abort!");
+ exit(EXIT_FAILURE);
+ }
+ }
+
+ if (mis->state == RUN_STATE_RESTORE_VM) {
+ vhost_vdpa_call(hdev, VHOST_VDPA_RESUME, NULL);
+ }
+ }
+}
+
+void vdpa_migration_register(VhostVdpaDevice *vdev)
+{
+ vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
+ vdpa_dev_vmstate_change,
+ DEVICE(vdev));
+}
+
+void vdpa_migration_unregister(VhostVdpaDevice *vdev)
+{
+ qemu_del_vm_change_state_handler(vdev->vmstate);
+}
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 438182d850..d073a6d5a5 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2492,3 +2492,141 @@ bool used_memslots_is_exceeded(void)
{
return used_memslots_exceeded;
}
+
+int vhost_dev_resume(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
+{
+ int i, r;
+ EventNotifier *e = &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier;
+
+ /* should only be called after backend is connected */
+ if (!hdev->vhost_ops) {
+ error_report("Missing vhost_ops! Operation not permitted!");
+ return -EPERM;
+ }
+
+ vdev->vhost_started = true;
+ hdev->started = true;
+ hdev->vdev = vdev;
+
+ if (vhost_dev_has_iommu(hdev)) {
+ memory_listener_register(&hdev->iommu_listener, vdev->dma_as);
+ }
+
+ r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_mem_table failed");
+ goto fail_mem;
+ }
+ for (i = 0; i < hdev->nvqs; ++i) {
+ r = vhost_virtqueue_start(hdev,
+ vdev,
+ hdev->vqs + i,
+ hdev->vq_index + i);
+ if (r < 0) {
+ goto fail_vq;
+ }
+ }
+
+ r = event_notifier_init(e, 0);
+ if (r < 0) {
+ goto fail_vq;
+ }
+ event_notifier_test_and_clear(e);
+ if (!vdev->use_guest_notifier_mask) {
+ vhost_config_mask(hdev, vdev, true);
+ }
+ if (vrings) {
+ r = vhost_dev_set_vring_enable(hdev, true);
+ if (r) {
+ goto fail_vq;
+ }
+ }
+ if (hdev->vhost_ops->vhost_dev_resume) {
+ r = hdev->vhost_ops->vhost_dev_resume(hdev);
+ if (r) {
+ goto fail_start;
+ }
+ }
+ if (vhost_dev_has_iommu(hdev)) {
+ hdev->vhost_ops->vhost_set_iotlb_callback(hdev, true);
+
+ /*
+ * Update used ring information for IOTLB to work correctly,
+ * vhost-kernel code requires for this.
+ */
+ for (i = 0; i < hdev->nvqs; ++i) {
+ struct vhost_virtqueue *vq = hdev->vqs + i;
+ vhost_device_iotlb_miss(hdev, vq->used_phys, true);
+ }
+ }
+ vhost_start_config_intr(hdev);
+ return 0;
+fail_start:
+ if (vrings) {
+ vhost_dev_set_vring_enable(hdev, false);
+ }
+fail_vq:
+ while (--i >= 0) {
+ vhost_virtqueue_stop(hdev,
+ vdev,
+ hdev->vqs + i,
+ hdev->vq_index + i);
+ }
+
+fail_mem:
+ vdev->vhost_started = false;
+ hdev->started = false;
+ return r;
+}
+
+int vhost_dev_suspend(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
+{
+ int i;
+ int ret = 0;
+ EventNotifier *e = &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier;
+
+ /* should only be called after backend is connected */
+ if (!hdev->vhost_ops) {
+ error_report("Missing vhost_ops! Operation not permitted!");
+ return -EPERM;
+ }
+
+ event_notifier_test_and_clear(e);
+ event_notifier_test_and_clear(&vdev->config_notifier);
+
+ if (hdev->vhost_ops->vhost_dev_suspend) {
+ ret = hdev->vhost_ops->vhost_dev_suspend(hdev);
+ if (ret) {
+ goto fail_suspend;
+ }
+ }
+ if (vrings) {
+ ret = vhost_dev_set_vring_enable(hdev, false);
+ if (ret) {
+ goto fail_suspend;
+ }
+ }
+ for (i = 0; i < hdev->nvqs; ++i) {
+ vhost_virtqueue_stop(hdev,
+ vdev,
+ hdev->vqs + i,
+ hdev->vq_index + i);
+ }
+
+ if (vhost_dev_has_iommu(hdev)) {
+ hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
+ memory_listener_unregister(&hdev->iommu_listener);
+ }
+ vhost_stop_config_intr(hdev);
+ vhost_log_put(hdev, true);
+ hdev->started = false;
+ vdev->vhost_started = false;
+ hdev->vdev = NULL;
+
+ return ret;
+
+fail_suspend:
+ event_notifier_test_and_clear(e);
+
+ return ret;
+}
diff --git a/include/hw/virtio/vdpa-dev-mig.h b/include/hw/virtio/vdpa-dev-mig.h
new file mode 100644
index 0000000000..89665ca747
--- /dev/null
+++ b/include/hw/virtio/vdpa-dev-mig.h
@@ -0,0 +1,16 @@
+/*
+ * Vhost Vdpa Device Migration Header
+ *
+ * Copyright (c) Huawei Technologies Co., Ltd. 2023. All Rights Reserved.
+ */
+
+#ifndef _VHOST_VDPA_MIGRATION_H
+#define _VHOST_VDPA_MIGRATION_H
+
+#include "hw/virtio/vdpa-dev.h"
+
+void vdpa_migration_register(VhostVdpaDevice *vdev);
+
+void vdpa_migration_unregister(VhostVdpaDevice *vdev);
+
+#endif /* _VHOST_VDPA_MIGRATION_H */
diff --git a/include/hw/virtio/vdpa-dev.h b/include/hw/virtio/vdpa-dev.h
index 4dbf98195c..43cbcef81b 100644
--- a/include/hw/virtio/vdpa-dev.h
+++ b/include/hw/virtio/vdpa-dev.h
@@ -38,6 +38,7 @@ struct VhostVdpaDevice {
uint16_t queue_size;
bool started;
int (*post_init)(VhostVdpaDevice *v, Error **errp);
+ VMChangeStateEntry *vmstate;
};
#endif
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 6ae86833e3..9ca5819deb 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -466,4 +466,7 @@ int vhost_save_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp);
*/
int vhost_load_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp);
+int vhost_dev_resume(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
+int vhost_dev_suspend(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
+
#endif
diff --git a/migration/migration.c b/migration/migration.c
index 23d9233bbe..dce22c2da5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -99,7 +99,6 @@ static bool migration_object_check(MigrationState *ms, Error **errp);
static int migration_maybe_pause(MigrationState *s,
int *current_active_state,
int new_state);
-static void migrate_fd_cancel(MigrationState *s);
static bool close_return_path_on_source(MigrationState *s);
static void migration_downtime_start(MigrationState *s)
@@ -1386,7 +1385,7 @@ void migrate_fd_error(MigrationState *s, const Error *error)
migrate_set_error(s, error);
}
-static void migrate_fd_cancel(MigrationState *s)
+void migrate_fd_cancel(MigrationState *s)
{
int old_state ;
diff --git a/migration/migration.h b/migration/migration.h
index 6aafa04314..2f26c9509b 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -551,4 +551,6 @@ void migration_rp_kick(MigrationState *s);
int migration_stop_vm(RunState state);
+void migrate_fd_cancel(MigrationState *s);
+
#endif
--
2.27.0