- vfio/migration: Add support for manual clear vfio dirty log
- vfio: Maintain DMA mapping range for the container
- linux-headers: update against 5.10 and manual clear vfio dirty log series
- arm/acpi: Fix bios_tables_test failure when qemu-system-aarch64 is built on an x86_64 host (the __aarch64__ macro made build_pptt build different functions on x86_64 and aarch64 hosts, so bios_tables_test failed)
- pl031: support rtc-timer property for pl031
- feature: Add logs for vm start and destroy
- feature: Add log for each module
- log: Add log at boot & cpu init for aarch64
- bugfix: irq: Avoid covering object refcount of qemu_irq
- i386: cache passthrough: Update AMD 8000_001D.EAX[25:14] based on vCPU topo
- freeclock: set rtc_date_diff for X86
- freeclock: set rtc_date_diff for arm
- freeclock: add qmp command to get time offset of vm in seconds
- tests: Disable filemonitor testcase
- shadow_dev: introduce shadow dev for virtio-net device
- pl011: reset read FIFO when UARTTIMSC=0 & UARTICR=0xffff
- tests: virt: Update expected ACPI tables for virt test (update BinDir)
- arm64: Add the cpufreq device to show cpufreq info to guest
- hw/arm64: add vcpu cache info support
- tests: virt: Allow changes to PPTT test table
- cpu: add Cortex-A72 processor kvm target support
- cpu: add Kunpeng-920 cpu support
- net: eepro100: validate various address values (CVE-2021-20255)
- ide: ahci: add check to avoid null dereference (CVE-2019-12067)
- vdpa: set vring enable only if the vring address has already been set
- docs: Add generic vhost-vdpa device documentation
- vdpa: don't suspend/resume device when vdpa device not started
- vdpa: correct param passed in when unregister save
- vdpa: suspend function return 0 when the vdpa device is stopped
- vdpa: support vdpa device suspend/resume
- vdpa: move memory listener to the realize stage
- vdpa: implement vdpa device migration
- vhost: implement migration state notifier for vdpa device
- vhost: implement post resume bh
- vhost: implement savevm_handler for vdpa device
- vhost: implement vhost_vdpa_device_suspend/resume
- vhost: implement vhost-vdpa suspend/resume
- vhost: add vhost_dev_suspend/resume_op
- vhost: introduce bytemap for vhost backend logging
- vhost-vdpa: add migration log ops for VhostOps
- vhost-vdpa: add VHOST_BACKEND_F_BYTEMAPLOG
- hw/usb: reduce the vcpu cost of UHCI when VNC disconnects
- virtio-net: update the default and max of rx/tx_queue_size
- virtio-net: set the max of queue size to 4096
- virtio-net: fix max vring buf size when set ring num
- virtio-net: bugfix: do not delete netdev before virtio net
- monitor: Discard BLOCK_IO_ERROR event when VM rebooted
- vhost-user: add unregister_savevm when vhost-user cleanup
- vhost-user: add vhost_set_mem_table when vm load_setup at destination
- vhost-user: quit infinite loop while used memslots is more than the backend limit
- fix qemu core dump when vhost-user-net is configured in server mode
- vhost-user: Add support to reconnect vhost-user socket
- vhost-user: Set the acked_features to vm's feature
- i6300esb watchdog: bugfix: Add a runstate transition
- hw/net/rocker_of_dpa: fix double free bug of rocker device
- net/dump.c: Suppress spurious compiler warning
- pcie: Add pcie-root-port fast plug/unplug feature
- pcie: Compat with devices which do not support Link Width, such as ioh3420
- qdev/monitors: Fix redundant error_setg of qdev_add_device
- qemu-nbd: set timeout to qemu-nbd socket
- qemu-nbd: make native the default aio mode
- nbd/server.c: fix invalid read after client was already freed
- virtio-scsi: bugfix: fix qemu crash for hotplug scsi disk with dataplane
- virtio: bugfix: check the value of caches before accessing it
- virtio: print the guest virtio_net features that host does not support
- virtio: bugfix: add rcu_read_lock when vring_avail_idx is called
- virtio: check descriptor numbers
- migration: report multiFd related thread pid to libvirt
- migration: report migration related thread pid to libvirt
- cpu/features: fix bug for memory leak
- doc: Update multi-thread compression doc
- migration: Add compress_level sanity check
- migration: Add zstd support in multi-thread compression
- migration: Add multi-thread compress ops
- migration: Refactoring multi-thread compress migration
- migration: Add multi-thread compress method
- migration: skip cache_drop for bios bootloader and nvram template
- oslib-posix: optimise vm startup time for 1G hugepage
- monitor/qmp: drop in-flight rsp if qmp client broken
- ps2: fix oob in ps2 kbd
- kvm: send a guest-panic event to libvirt instead of calling vm_stop when kvm and qemu cannot handle a kvm exit (vm_stop left the vm paused and unrecoverable)
- vhost: cancel migration when vhost-user restarted during migration

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
From 24c3ff779f35b40967d195e4764d4cb605c1a304 Mon Sep 17 00:00:00 2001
From: Zenghui Yu <yuzenghui@huawei.com>
Date: Sat, 8 May 2021 17:31:05 +0800
Subject: [PATCH] vfio/migration: Add support for manual clear vfio dirty log

The new capability VFIO_DIRTY_LOG_MANUAL_CLEAR and the new
VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR and
VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP flags of the
VFIO_IOMMU_DIRTY_PAGES ioctl have been introduced in the kernel;
tweak the userspace side to use them.

Check whether the kernel supports VFIO_DIRTY_LOG_MANUAL_CLEAR and
provide the log_clear() hook for vfio_memory_listener. If the
kernel supports it, deliver the clear message to the kernel.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 hw/vfio/common.c              | 136 ++++++++++++++++++++++++++++++++++
 hw/vfio/container.c           |  13 +++-
 include/hw/vfio/vfio-common.h |   1 +
 3 files changed, 148 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 564e933135..e08b147b3d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1344,6 +1344,141 @@ static void vfio_listener_log_sync(MemoryListener *listener,
     }
 }
 
+/*
+ * I'm not sure if there's any alignment requirement for the CLEAR_BITMAP
+ * ioctl. But copy from kvm side and align {start, size} with 64 pages.
+ *
+ * I think the code can be simplified a lot if no alignment requirement.
+ */
+#define VFIO_CLEAR_LOG_SHIFT  6
+#define VFIO_CLEAR_LOG_ALIGN  (qemu_real_host_page_size() << VFIO_CLEAR_LOG_SHIFT)
+#define VFIO_CLEAR_LOG_MASK   (-VFIO_CLEAR_LOG_ALIGN)
+
+static int vfio_log_clear_one_range(VFIOContainer *container, VFIODMARange *qrange,
+                                    uint64_t start, uint64_t size)
+{
+    struct vfio_iommu_type1_dirty_bitmap *dbitmap;
+    struct vfio_iommu_type1_dirty_bitmap_get *range;
+
+    dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
+
+    dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
+    dbitmap->flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP;
+    range = (struct vfio_iommu_type1_dirty_bitmap_get *)&dbitmap->data;
+
+    /*
+     * Now let's deal with the actual bitmap, which is almost the same
+     * as the kvm side.
+     */
+    uint64_t end, bmap_start, start_delta, bmap_npages;
+    unsigned long *bmap_clear = NULL, psize = qemu_real_host_page_size();
+    int ret;
+
+    bmap_start = start & VFIO_CLEAR_LOG_MASK;
+    start_delta = start - bmap_start;
+    bmap_start /= psize;
+
+    bmap_npages = DIV_ROUND_UP(size + start_delta, VFIO_CLEAR_LOG_ALIGN)
+        << VFIO_CLEAR_LOG_SHIFT;
+    end = qrange->size / psize;
+    if (bmap_npages > end - bmap_start) {
+        bmap_npages = end - bmap_start;
+    }
+    start_delta /= psize;
+
+    if (start_delta) {
+        bmap_clear = bitmap_new(bmap_npages);
+        bitmap_copy_with_src_offset(bmap_clear, qrange->bitmap,
+                                    bmap_start, start_delta + size / psize);
+        bitmap_clear(bmap_clear, 0, start_delta);
+        range->bitmap.data = (__u64 *)bmap_clear;
+    } else {
+        range->bitmap.data = (__u64 *)(qrange->bitmap + BIT_WORD(bmap_start));
+    }
+
+    range->iova = qrange->iova + bmap_start * psize;
+    range->size = bmap_npages * psize;
+    range->bitmap.size = ROUND_UP(bmap_npages, sizeof(__u64) * BITS_PER_BYTE) /
+                         BITS_PER_BYTE;
+    range->bitmap.pgsize = qemu_real_host_page_size();
+
+    ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
+    if (ret) {
+        error_report("Failed to clear dirty log for iova: 0x%"PRIx64
+                     " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
+                     (uint64_t)range->size, errno);
+        goto err_out;
+    }
+
+    bitmap_clear(qrange->bitmap, bmap_start + start_delta, size / psize);
+err_out:
+    g_free(bmap_clear);
+    g_free(dbitmap);
+    return 0;
+}
+
+static int vfio_physical_log_clear(VFIOContainer *container,
+                                   MemoryRegionSection *section)
+{
+    uint64_t start, size, offset, count;
+    VFIODMARange *qrange;
+    int ret = 0;
+
+    if (!container->dirty_log_manual_clear) {
+        /* No need to do explicit clear */
+        return ret;
+    }
+
+    start = section->offset_within_address_space;
+    size = int128_get64(section->size);
+
+    if (!size) {
+        return ret;
+    }
+
+    QLIST_FOREACH(qrange, &container->dma_list, next) {
+        /*
+         * Discard ranges that do not overlap the section (e.g., the
+         * Memory BAR regions of the device)
+         */
+        if (qrange->iova > start + size - 1 ||
+            start > qrange->iova + qrange->size - 1) {
+            continue;
+        }
+
+        if (start >= qrange->iova) {
+            /* The range starts before section or is aligned to it. */
+            offset = start - qrange->iova;
+            count = MIN(qrange->size - offset, size);
+        } else {
+            /* The range starts after section. */
+            offset = 0;
+            count = MIN(qrange->size, size - (qrange->iova - start));
+        }
+        ret = vfio_log_clear_one_range(container, qrange, offset, count);
+        if (ret < 0) {
+            break;
+        }
+    }
+
+    return ret;
+}
+
+static void vfio_listener_log_clear(MemoryListener *listener,
+                                    MemoryRegionSection *section)
+{
+    VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+
+    if (vfio_listener_skipped_section(section) ||
+        !container->dirty_pages_supported) {
+        return;
+    }
+
+    if (vfio_devices_all_dirty_tracking(container)) {
+        vfio_physical_log_clear(container, section);
+    }
+}
+
 const MemoryListener vfio_memory_listener = {
     .name = "vfio",
     .region_add = vfio_listener_region_add,
@@ -1351,6 +1486,7 @@ const MemoryListener vfio_memory_listener = {
     .log_global_start = vfio_listener_log_global_start,
     .log_global_stop = vfio_listener_log_global_stop,
     .log_sync = vfio_listener_log_sync,
+    .log_clear = vfio_listener_log_clear,
 };
 
 void vfio_reset_handler(void *opaque)
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 9a176a0d33..d8b9117f4f 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -285,7 +285,9 @@ int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap,
     dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
 
     dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
-    dbitmap->flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP;
+    dbitmap->flags = container->dirty_log_manual_clear ?
+                     VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR :
+                     VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP;
     range = (struct vfio_iommu_type1_dirty_bitmap_get *)&dbitmap->data;
     range->iova = iova;
     range->size = size;
@@ -409,7 +411,7 @@ static int vfio_get_iommu_type(VFIOContainer *container,
 static int vfio_init_container(VFIOContainer *container, int group_fd,
                                Error **errp)
 {
-    int iommu_type, ret;
+    int iommu_type, dirty_log_manual_clear, ret;
 
     iommu_type = vfio_get_iommu_type(container, errp);
     if (iommu_type < 0) {
@@ -438,6 +440,13 @@ static int vfio_init_container(VFIOContainer *container, int group_fd,
     }
 
     container->iommu_type = iommu_type;
+
+    dirty_log_manual_clear = ioctl(container->fd, VFIO_CHECK_EXTENSION,
+                                   VFIO_DIRTY_LOG_MANUAL_CLEAR);
+    if (dirty_log_manual_clear) {
+        container->dirty_log_manual_clear = dirty_log_manual_clear;
+    }
+
     return 0;
 }
 
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index b131d04c9c..fd9828d50b 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -97,6 +97,7 @@ typedef struct VFIOContainer {
     Error *error;
     bool initialized;
    bool dirty_pages_supported;
+    bool dirty_log_manual_clear;
     uint64_t dirty_pgsizes;
     uint64_t max_dirty_bitmap_size;
     unsigned long pgsizes;
--
2.27.0