qemu/vhost-introduce-bytemap-for-vhost-backend-logging.patch


QEMU update to version 8.2.0-5
- vfio/migration: Add support for manual clear vfio dirty log
- vfio: Maintain DMA mapping range for the container
- linux-headers: update against 5.10 and manual clear vfio dirty log series
- arm/acpi: Fix when make qemu-system-aarch64 at x86_64 host bios_tables_test fail reason: __aarch64__ macro let build_pptt at x86_64 and aarch64 host build different function that let bios_tables_test fail.
- pl031: support rtc-timer property for pl031
- feature: Add logs for vm start and destroy
- feature: Add log for each modules
- log: Add log at boot & cpu init for aarch64
- bugfix: irq: Avoid covering object refcount of qemu_irq
- i386: cache passthrough: Update AMD 8000_001D.EAX[25:14] based on vCPU topo
- freeclock: set rtc_date_diff for X86
- freeclock: set rtc_date_diff for arm
- freeclock: add qmp command to get time offset of vm in seconds
- tests: Disable filemonitor testcase
- shadow_dev: introduce shadow dev for virtio-net device
- pl011: reset read FIFO when UARTTIMSC=0 & UARTICR=0xffff
- tests: virt: Update expected ACPI tables for virt test(update BinDir)
- arm64: Add the cpufreq device to show cpufreq info to guest
- hw/arm64: add vcpu cache info support
- tests: virt: Allow changes to PPTT test table
- cpu: add Cortex-A72 processor kvm target support
- cpu: add Kunpeng-920 cpu support
- net: eepro100: validate various address valuesi(CVE-2021-20255)
- ide: ahci: add check to avoid null dereference (CVE-2019-12067)
- vdpa: set vring enable only if the vring address has already been set
- docs: Add generic vhost-vdpa device documentation
- vdpa: don't suspend/resume device when vdpa device not started
- vdpa: correct param passed in when unregister save
- vdpa: suspend function return 0 when the vdpa device is stopped
- vdpa: support vdpa device suspend/resume
- vdpa: move memory listener to the realize stage
- vdpa: implement vdpa device migration
- vhost: implement migration state notifier for vdpa device
- vhost: implement post resume bh
- vhost: implement savevm_handler for vdpa device
- vhost: implement vhost_vdpa_device_suspend/resume
- vhost: implement vhost-vdpa suspend/resume
- vhost: add vhost_dev_suspend/resume_op
- vhost: introduce bytemap for vhost backend logging
- vhost-vdpa: add migration log ops for VhostOps
- vhost-vdpa: add VHOST_BACKEND_F_BYTEMAPLOG
- hw/usb: reduce the vpcu cost of UHCI when VNC disconnect
- virtio-net: update the default and max of rx/tx_queue_size
- virtio-net: set the max of queue size to 4096
- virtio-net: fix max vring buf size when set ring num
- virtio-net: bugfix: do not delete netdev before virtio net
- monitor: Discard BLOCK_IO_ERROR event when VM rebooted
- vhost-user: add unregister_savevm when vhost-user cleanup
- vhost-user: add vhost_set_mem_table when vm load_setup at destination
- vhost-user: quit infinite loop while used memslots is more than the backend limit
- fix qemu-core when vhost-user-net config with server mode
- vhost-user: Add support reconnect vhost-user socket
- vhost-user: Set the acked_features to vm's featrue
- i6300esb watchdog: bugfix: Add a runstate transition
- hw/net/rocker_of_dpa: fix double free bug of rocker device
- net/dump.c: Suppress spurious compiler warning
- pcie: Add pcie-root-port fast plug/unplug feature
- pcie: Compat with devices which do not support Link Width, such as ioh3420
- qdev/monitors: Fix reundant error_setg of qdev_add_device
- qemu-nbd: set timeout to qemu-nbd socket
- qemu-nbd: make native as the default aio mode
- nbd/server.c: fix invalid read after client was already free
- virtio-scsi: bugfix: fix qemu crash for hotplug scsi disk with dataplane
- virtio: bugfix: check the value of caches before accessing it
- virtio: print the guest virtio_net features that host does not support
- virtio: bugfix: add rcu_read_lock when vring_avail_idx is called
- virtio: check descriptor numbers
- migration: report multiFd related thread pid to libvirt
- migration: report migration related thread pid to libvirt
- cpu/features: fix bug for memory leakage
- doc: Update multi-thread compression doc
- migration: Add compress_level sanity check
- migration: Add zstd support in multi-thread compression
- migration: Add multi-thread compress ops
- migration: Refactoring multi-thread compress migration
- migration: Add multi-thread compress method
- migration: skip cache_drop for bios bootloader and nvram template
- oslib-posix: optimise vm startup time for 1G hugepage
- monitor/qmp: drop inflight rsp if qmp client broken
- ps2: fix oob in ps2 kbd
- Currently, while kvm and qemu can not handle some kvm exit, qemu will do vm_stop, which will make vm in pause state. This action make vm unrecoverable, so send guest panic to libvirt instead.
- vhost: cancel migration when vhost-user restarted during migraiton
Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
2024-04-07 10:21:31 +08:00
From 962acd498b11ae5ccc040d76ec89990add119dec Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:09:26 +0800
Subject: [PATCH] vhost: introduce bytemap for vhost backend logging
A vhost backend may log dirty pages in a bytemap instead of a bitmap.
When computing the log size of a vhost device, check whether the device
supports VHOST_BACKEND_F_BYTEMAPLOG; if it does, use a bytemap for
logging.
While here, add a function pointer check in the log resize path and
check the return value of vhost_log_sync.
Signed-off-by: libai <libai12@huawei.com>
---
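Note on log sizing: the commit message says the log size depends on whether the backend logs with a bytemap. The following is a minimal standalone sketch of that arithmetic, not code from this patch. The macro values are copied from include/hw/virtio/vhost.h as modified below; `log_words` is a hypothetical helper that mirrors the per-region expression in vhost_get_log_size().

```c
#include <assert.h>
#include <stdint.h>

/* Copied from include/hw/virtio/vhost.h (with this patch applied). */
typedef unsigned long vhost_log_chunk_t;
#define VHOST_LOG_PAGE 0x1000
#define VHOST_LOG_BITS (8 * sizeof(vhost_log_chunk_t))
#define VHOST_LOG_CHUNK (VHOST_LOG_PAGE * VHOST_LOG_BITS)
#define VHOST_LOG_CHUNK_BYTES (VHOST_LOG_PAGE * sizeof(vhost_log_chunk_t))

/* One log word covers 8x less guest memory in bytemap mode, because each
 * page takes a whole byte instead of a single bit. */
static_assert(VHOST_LOG_CHUNK == 8 * VHOST_LOG_CHUNK_BYTES,
              "bytemap chunk must be one eighth of the bitmap chunk");

/* Hypothetical helper: number of vhost_log_chunk_t words needed to cover
 * guest physical addresses 0..last, as computed per region in
 * vhost_get_log_size(). */
static uint64_t log_words(uint64_t last, int bytemap)
{
    uint64_t chunk = bytemap ? VHOST_LOG_CHUNK_BYTES : VHOST_LOG_CHUNK;

    return last / chunk + 1;
}
```

For a region whose last address is aligned to the bitmap chunk size, `log_words(last, 1)` is eight times `log_words(last, 0)`, so a bytemap log needs roughly eight times the memory of a bitmap log for the same guest.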
hw/virtio/vhost.c | 89 ++++++++++++++++++++++++++++++++++++---
include/exec/memory.h | 9 ++++
include/exec/ram_addr.h | 44 +++++++++++++++++++
include/hw/virtio/vhost.h | 1 +
system/physmem.c | 11 +++++
5 files changed, 148 insertions(+), 6 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 038ac37dd0..438182d850 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -29,6 +29,7 @@
#include "migration/migration.h"
#include "sysemu/dma.h"
#include "trace.h"
+#include "qapi/qapi-commands-migration.h"
/* enabled until disconnected backend stabilizes */
#define _VHOST_DEBUG 1
@@ -44,6 +45,11 @@
do { } while (0)
#endif
+static inline bool vhost_bytemap_log_support(struct vhost_dev *dev)
+{
+ return (dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_BYTEMAPLOG));
+}
+
static struct vhost_log *vhost_log;
static struct vhost_log *vhost_log_shm;
@@ -232,12 +238,40 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
return 0;
}
+static int vhost_sync_dirty_bytemap(struct vhost_dev *dev,
+ MemoryRegionSection *section)
+{
+ unsigned long *bytemap = dev->log->log;
+ return memory_section_set_dirty_bytemap(section, bytemap);
+}
+
static void vhost_log_sync(MemoryListener *listener,
MemoryRegionSection *section)
{
struct vhost_dev *dev = container_of(listener, struct vhost_dev,
memory_listener);
- vhost_sync_dirty_bitmap(dev, section, 0x0, ~0x0ULL);
+ MigrationState *ms = migrate_get_current();
+
+ if (!dev->log_enabled || !dev->started) {
+ return;
+ }
+
+ if (dev->vhost_ops->vhost_log_sync) {
+ int r = dev->vhost_ops->vhost_log_sync(dev);
+ if (r < 0) {
+ error_report("Failed to sync dirty log: %d", r);
+ if (migration_is_running(ms->state)) {
+ qmp_migrate_cancel(NULL);
+ }
+ return;
+ }
+ }
+
+ if (vhost_bytemap_log_support(dev)) {
+ vhost_sync_dirty_bytemap(dev, section);
+ } else {
+ vhost_sync_dirty_bitmap(dev, section, 0x0, ~0x0ULL);
+ }
}
static void vhost_log_sync_range(struct vhost_dev *dev,
@@ -247,7 +281,11 @@ static void vhost_log_sync_range(struct vhost_dev *dev,
/* FIXME: this is N^2 in number of sections */
for (i = 0; i < dev->n_mem_sections; ++i) {
MemoryRegionSection *section = &dev->mem_sections[i];
- vhost_sync_dirty_bitmap(dev, section, first, last);
+ if (vhost_bytemap_log_support(dev)) {
+ vhost_sync_dirty_bytemap(dev, section);
+ } else {
+ vhost_sync_dirty_bitmap(dev, section, first, last);
+ }
}
}
@@ -255,11 +293,19 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
{
uint64_t log_size = 0;
int i;
+ uint64_t vhost_log_chunk_size;
+
+ if (vhost_bytemap_log_support(dev)) {
+ vhost_log_chunk_size = VHOST_LOG_CHUNK_BYTES;
+ } else {
+ vhost_log_chunk_size = VHOST_LOG_CHUNK;
+ }
+
for (i = 0; i < dev->mem->nregions; ++i) {
struct vhost_memory_region *reg = dev->mem->regions + i;
uint64_t last = range_get_last(reg->guest_phys_addr,
reg->memory_size);
- log_size = MAX(log_size, last / VHOST_LOG_CHUNK + 1);
+ log_size = MAX(log_size, last / vhost_log_chunk_size + 1);
}
return log_size;
}
@@ -377,12 +423,21 @@ static bool vhost_dev_log_is_shared(struct vhost_dev *dev)
dev->vhost_ops->vhost_requires_shm_log(dev);
}
-static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
+static inline int vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
{
struct vhost_log *log = vhost_log_get(size, vhost_dev_log_is_shared(dev));
- uint64_t log_base = (uintptr_t)log->log;
+ uint64_t log_base;
+ int log_fd;
int r;
+ if (!log) {
+ r = -ENOMEM;
+ goto out;
+ }
+
+ log_base = (uintptr_t)log->log;
+ log_fd = log_fd;
+
/* inform backend of log switching, this must be done before
releasing the current log, to ensure no logging is lost */
r = dev->vhost_ops->vhost_set_log_base(dev, log_base, log);
@@ -390,9 +445,19 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
VHOST_OPS_DEBUG(r, "vhost_set_log_base failed");
}
+ if (dev->vhost_ops->vhost_set_log_size) {
+ r = dev->vhost_ops->vhost_set_log_size(dev, size, dev->log);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_log_size failed");
+ }
+ }
+
vhost_log_put(dev, true);
dev->log = log;
dev->log_size = size;
+
+out:
+ return r;
}
static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
@@ -1018,7 +1083,11 @@ static int vhost_migration_log(MemoryListener *listener, bool enable)
}
vhost_log_put(dev, false);
} else {
- vhost_dev_log_resize(dev, vhost_get_log_size(dev));
+ r = vhost_dev_log_resize(dev, vhost_get_log_size(dev));
+ if (r < 0) {
+ return r;
+ }
+
r = vhost_dev_set_log(dev, true);
if (r < 0) {
goto check_dev_state;
@@ -2057,6 +2126,14 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
VHOST_OPS_DEBUG(r, "vhost_set_log_base failed");
goto fail_log;
}
+
+ if (hdev->vhost_ops->vhost_set_log_size) {
+ r = hdev->vhost_ops->vhost_set_log_size(hdev, hdev->log_size, hdev->log);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_log_size failed");
+ goto fail_log;
+ }
+ }
}
if (vrings) {
r = vhost_dev_set_vring_enable(hdev, true);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 831f7c996d..e131c2682c 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2594,6 +2594,15 @@ MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
MemOp op,
MemTxAttrs attrs);
+/**
+ * memory_section_set_dirty_bytemap: Mark the pages of a memory section dirty
+ * according to a bytemap; returns the number of pages marked dirty
+ *
+ * @section: the memory section being dirtied.
+ * @bytemap: bytemap in which each non-zero byte marks one dirty page.
+ */
+int64_t memory_section_set_dirty_bytemap(MemoryRegionSection *section, unsigned long *bytemap);
+
/**
* address_space_init: initializes an address space
*
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 90676093f5..ef6988b445 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -535,5 +535,49 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
return num_dirty;
}
+
+#define BYTES_PER_LONG (sizeof(unsigned long))
+#define BYTE_WORD(nr) ((nr) / BYTES_PER_LONG)
+#define BYTES_TO_LONGS(nr) DIV_ROUND_UP(nr, BYTES_PER_LONG)
+
+static inline int64_t _set_dirty_bytemap_atomic(unsigned long *bytemap, unsigned long cur_pfn)
+{
+ char *byte_of_long = (char *)bytemap;
+ int i;
+ int64_t dirty_num = 0;
+
+ for (i = 0; i < BYTES_PER_LONG; i++) {
+ if (byte_of_long[i]) {
+ cpu_physical_memory_set_dirty_range((cur_pfn + i) << TARGET_PAGE_BITS,
+ TARGET_PAGE_SIZE,
+ 1 << DIRTY_MEMORY_MIGRATION);
+ /* Per byte ops, no need to atomic_xchg */
+ byte_of_long[i] = 0;
+ dirty_num++;
+ }
+ }
+
+ return dirty_num;
+}
+
+static inline int64_t cpu_physical_memory_set_dirty_bytemap(unsigned long *bytemap,
+ ram_addr_t start,
+ ram_addr_t pages)
+{
+ unsigned long i;
+ unsigned long len = BYTES_TO_LONGS(pages);
+ unsigned long pfn = (start >> TARGET_PAGE_BITS) /
+ BYTES_PER_LONG * BYTES_PER_LONG;
+ int64_t dirty_mig_bits = 0;
+
+ for (i = 0; i < len; i++) {
+ if (bytemap[i]) {
+ dirty_mig_bits += _set_dirty_bytemap_atomic(&bytemap[i],
+ pfn + BYTES_PER_LONG * i);
+ }
+ }
+
+ return dirty_mig_bits;
+}
#endif
#endif
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 444ca0ad42..6ae86833e3 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -43,6 +43,7 @@ typedef unsigned long vhost_log_chunk_t;
#define VHOST_LOG_PAGE 0x1000
#define VHOST_LOG_BITS (8 * sizeof(vhost_log_chunk_t))
#define VHOST_LOG_CHUNK (VHOST_LOG_PAGE * VHOST_LOG_BITS)
+#define VHOST_LOG_CHUNK_BYTES (VHOST_LOG_PAGE * sizeof(vhost_log_chunk_t))
#define VHOST_INVALID_FEATURE_BIT (0xff)
#define VHOST_QUEUE_NUM_CONFIG_INR 0
diff --git a/system/physmem.c b/system/physmem.c
index f14d64819b..247c252e53 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2602,6 +2602,17 @@ static void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
cpu_physical_memory_set_dirty_range(addr, length, dirty_log_mask);
}
+int64_t memory_section_set_dirty_bytemap(MemoryRegionSection *section, unsigned long *bytemap)
+{
+ ram_addr_t start = section->offset_within_region +
+ memory_region_get_ram_addr(section->mr);
+ ram_addr_t pages = int128_get64(section->size) >> TARGET_PAGE_BITS;
+
+ hwaddr idx = BYTE_WORD(
+ section->offset_within_address_space >> TARGET_PAGE_BITS);
+ return cpu_physical_memory_set_dirty_bytemap(bytemap + idx, start, pages);
+}
+
void memory_region_flush_rom_device(MemoryRegion *mr, hwaddr addr, hwaddr size)
{
/*
--
2.27.0
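
Note on the bytemap walk: the include/exec/ram_addr.h hunk above scans the log one word at a time and then one byte at a time, with each non-zero byte standing for one dirty page. The following is a minimal sketch of that walk outside QEMU, under stated assumptions. `mark_page_dirty`, `dirty_pfns` and `scan_bytemap` are hypothetical names; the real code calls cpu_physical_memory_set_dirty_range() instead of recording page frame numbers, and it rounds the starting pfn down to a multiple of the word size, which this sketch assumes the caller has already done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cpu_physical_memory_set_dirty_range():
 * records the page frame number instead of touching QEMU state. */
#define MAX_DIRTY 64
static unsigned long dirty_pfns[MAX_DIRTY];
static size_t n_dirty;

static void mark_page_dirty(unsigned long pfn)
{
    assert(n_dirty < MAX_DIRTY);
    dirty_pfns[n_dirty++] = pfn;
}

/* Walk 'pages' bytes of the bytemap, one byte per page. Words that are
 * entirely zero are skipped; each non-zero byte is reported and cleared.
 * Assumes first_pfn is aligned to sizeof(unsigned long). */
static int64_t scan_bytemap(unsigned long *bytemap, unsigned long first_pfn,
                            size_t pages)
{
    const size_t per_word = sizeof(unsigned long);
    size_t words = (pages + per_word - 1) / per_word;
    int64_t dirty = 0;

    assert(bytemap != NULL);
    for (size_t w = 0; w < words; w++) {
        unsigned char *bytes = (unsigned char *)&bytemap[w];

        if (!bytemap[w]) {
            continue; /* fast path: no dirty page in this word */
        }
        for (size_t b = 0; b < per_word; b++) {
            if (bytes[b]) {
                mark_page_dirty(first_pfn + w * per_word + b);
                bytes[b] = 0;
                dirty++;
            }
        }
    }
    return dirty;
}
```

The word-level test keeps the common case of a mostly clean log cheap. Clearing is done per byte, which matches the comment in the patch that no atomic exchange is needed.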