qemu/vhost-introduce-bytemap-for-vhost-backend-logging.patch
Jiabo Feng c300b8e80b QEMU update to version 8.2.0-5
- vfio/migration: Add support for manual clear vfio dirty log
- vfio: Maintain DMA mapping range for the container
- linux-headers: update against 5.10 and manual clear vfio dirty log series
- arm/acpi: Fix bios_tables_test failure when building qemu-system-aarch64 on an x86_64 host. Reason: the __aarch64__ macro makes build_pptt compile to different functions on x86_64 and aarch64 hosts, which causes bios_tables_test to fail.
- pl031: support rtc-timer property for pl031
- feature: Add logs for vm start and destroy
- feature: Add log for each modules
- log: Add log at boot & cpu init for aarch64
- bugfix: irq: Avoid covering object refcount of qemu_irq
- i386: cache passthrough: Update AMD 8000_001D.EAX[25:14] based on vCPU topo
- freeclock: set rtc_date_diff for X86
- freeclock: set rtc_date_diff for arm
- freeclock: add qmp command to get time offset of vm in seconds
- tests: Disable filemonitor testcase
- shadow_dev: introduce shadow dev for virtio-net device
- pl011: reset read FIFO when UARTTIMSC=0 & UARTICR=0xffff
- tests: virt: Update expected ACPI tables for virt test(update BinDir)
- arm64: Add the cpufreq device to show cpufreq info to guest
- hw/arm64: add vcpu cache info support
- tests: virt: Allow changes to PPTT test table
- cpu: add Cortex-A72 processor kvm target support
- cpu: add Kunpeng-920 cpu support
- net: eepro100: validate various address values (CVE-2021-20255)
- ide: ahci: add check to avoid null dereference (CVE-2019-12067)
- vdpa: set vring enable only if the vring address has already been set
- docs: Add generic vhost-vdpa device documentation
- vdpa: don't suspend/resume device when vdpa device not started
- vdpa: correct param passed in when unregister save
- vdpa: suspend function return 0 when the vdpa device is stopped
- vdpa: support vdpa device suspend/resume
- vdpa: move memory listener to the realize stage
- vdpa: implement vdpa device migration
- vhost: implement migration state notifier for vdpa device
- vhost: implement post resume bh
- vhost: implement savevm_handler for vdpa device
- vhost: implement vhost_vdpa_device_suspend/resume
- vhost: implement vhost-vdpa suspend/resume
- vhost: add vhost_dev_suspend/resume_op
- vhost: introduce bytemap for vhost backend logging
- vhost-vdpa: add migration log ops for VhostOps
- vhost-vdpa: add VHOST_BACKEND_F_BYTEMAPLOG
- hw/usb: reduce the vcpu cost of UHCI when VNC disconnect
- virtio-net: update the default and max of rx/tx_queue_size
- virtio-net: set the max of queue size to 4096
- virtio-net: fix max vring buf size when set ring num
- virtio-net: bugfix: do not delete netdev before virtio net
- monitor: Discard BLOCK_IO_ERROR event when VM rebooted
- vhost-user: add unregister_savevm when vhost-user cleanup
- vhost-user: add vhost_set_mem_table when vm load_setup at destination
- vhost-user: quit infinite loop while used memslots is more than the backend limit
- fix qemu-core when vhost-user-net config with server mode
- vhost-user: Add support reconnect vhost-user socket
- vhost-user: Set the acked_features to vm's feature
- i6300esb watchdog: bugfix: Add a runstate transition
- hw/net/rocker_of_dpa: fix double free bug of rocker device
- net/dump.c: Suppress spurious compiler warning
- pcie: Add pcie-root-port fast plug/unplug feature
- pcie: Compat with devices which do not support Link Width, such as ioh3420
- qdev/monitors: Fix redundant error_setg of qdev_add_device
- qemu-nbd: set timeout to qemu-nbd socket
- qemu-nbd: make native as the default aio mode
- nbd/server.c: fix invalid read after client was already free
- virtio-scsi: bugfix: fix qemu crash for hotplug scsi disk with dataplane
- virtio: bugfix: check the value of caches before accessing it
- virtio: print the guest virtio_net features that host does not support
- virtio: bugfix: add rcu_read_lock when vring_avail_idx is called
- virtio: check descriptor numbers
- migration: report multiFd related thread pid to libvirt
- migration: report migration related thread pid to libvirt
- cpu/features: fix bug for memory leakage
- doc: Update multi-thread compression doc
- migration: Add compress_level sanity check
- migration: Add zstd support in multi-thread compression
- migration: Add multi-thread compress ops
- migration: Refactoring multi-thread compress migration
- migration: Add multi-thread compress method
- migration: skip cache_drop for bios bootloader and nvram template
- oslib-posix: optimise vm startup time for 1G hugepage
- monitor/qmp: drop inflight rsp if qmp client broken
- ps2: fix oob in ps2 kbd
- Currently, when kvm and qemu cannot handle a kvm exit, qemu calls vm_stop, which leaves the vm in the paused state. This makes the vm unrecoverable, so send a guest panic to libvirt instead.
- vhost: cancel migration when vhost-user restarted during migration

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
2024-04-10 20:19:06 +08:00


From 962acd498b11ae5ccc040d76ec89990add119dec Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:09:26 +0800
Subject: [PATCH] vhost: introduce bytemap for vhost backend logging
The vhost backend may use a bytemap instead of a bitmap for dirty logging.
When computing the log_size of a vhost device, check whether the device
supports VHOST_BACKEND_F_BYTEMAPLOG. If it does, use a bytemap for logging.
In addition, add a pointer check to the log_resize function and check the
return value of vhost_log_sync.
Signed-off-by: libai <libai12@huawei.com>
---
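Note on log sizing: the sketch below illustrates how the chunk size passed to the log-size calculation changes between bitmap and bytemap logging. It mirrors the VHOST_LOG_* macros from include/hw/virtio/vhost.h, but the names LOG_PAGE, LOG_CHUNK, LOG_CHUNK_BYTES and the helper log_words() are hypothetical stand-ins written for this note, not QEMU symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for vhost_log_chunk_t; one log word. */
typedef unsigned long log_chunk_t;

#define LOG_PAGE        0x1000ULL
#define LOG_BITS        (8 * sizeof(log_chunk_t))
/* Bitmap: one bit per page, so one word covers LOG_BITS pages. */
#define LOG_CHUNK       (LOG_PAGE * LOG_BITS)
/* Bytemap: one byte per page, so one word covers sizeof(word) pages. */
#define LOG_CHUNK_BYTES (LOG_PAGE * sizeof(log_chunk_t))

/*
 * Hypothetical helper: number of log words needed to cover guest
 * physical addresses [0, last_gpa] when each word covers `chunk` bytes.
 */
static uint64_t log_words(uint64_t last_gpa, uint64_t chunk)
{
    assert(chunk != 0);
    return last_gpa / chunk + 1;
}
```

Because a byte holds eight bits, a word of bytemap covers one eighth of the guest memory that a word of bitmap covers, so for the same guest address range the bytemap log needs roughly eight times as many words.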
hw/virtio/vhost.c | 89 ++++++++++++++++++++++++++++++++++++---
include/exec/memory.h | 9 ++++
include/exec/ram_addr.h | 44 +++++++++++++++++++
include/hw/virtio/vhost.h | 1 +
system/physmem.c | 11 +++++
5 files changed, 148 insertions(+), 6 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 038ac37dd0..438182d850 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -29,6 +29,7 @@
#include "migration/migration.h"
#include "sysemu/dma.h"
#include "trace.h"
+#include "qapi/qapi-commands-migration.h"
/* enabled until disconnected backend stabilizes */
#define _VHOST_DEBUG 1
@@ -44,6 +45,11 @@
do { } while (0)
#endif
+static inline bool vhost_bytemap_log_support(struct vhost_dev *dev)
+{
+ return (dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_BYTEMAPLOG));
+}
+
static struct vhost_log *vhost_log;
static struct vhost_log *vhost_log_shm;
@@ -232,12 +238,40 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
return 0;
}
+static int vhost_sync_dirty_bytemap(struct vhost_dev *dev,
+ MemoryRegionSection *section)
+{
+ unsigned long *bytemap = dev->log->log;
+ return memory_section_set_dirty_bytemap(section, bytemap);
+}
+
static void vhost_log_sync(MemoryListener *listener,
MemoryRegionSection *section)
{
struct vhost_dev *dev = container_of(listener, struct vhost_dev,
memory_listener);
- vhost_sync_dirty_bitmap(dev, section, 0x0, ~0x0ULL);
+ MigrationState *ms = migrate_get_current();
+
+ if (!dev->log_enabled || !dev->started) {
+ return;
+ }
+
+ if (dev->vhost_ops->vhost_log_sync) {
+ int r = dev->vhost_ops->vhost_log_sync(dev);
+ if (r < 0) {
+ error_report("Failed to sync dirty log: %d", r);
+ if (migration_is_running(ms->state)) {
+ qmp_migrate_cancel(NULL);
+ }
+ return;
+ }
+ }
+
+ if (vhost_bytemap_log_support(dev)) {
+ vhost_sync_dirty_bytemap(dev, section);
+ } else {
+ vhost_sync_dirty_bitmap(dev, section, 0x0, ~0x0ULL);
+ }
}
static void vhost_log_sync_range(struct vhost_dev *dev,
@@ -247,7 +281,11 @@ static void vhost_log_sync_range(struct vhost_dev *dev,
/* FIXME: this is N^2 in number of sections */
for (i = 0; i < dev->n_mem_sections; ++i) {
MemoryRegionSection *section = &dev->mem_sections[i];
- vhost_sync_dirty_bitmap(dev, section, first, last);
+ if (vhost_bytemap_log_support(dev)) {
+ vhost_sync_dirty_bytemap(dev, section);
+ } else {
+ vhost_sync_dirty_bitmap(dev, section, first, last);
+ }
}
}
@@ -255,11 +293,19 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
{
uint64_t log_size = 0;
int i;
+ uint64_t vhost_log_chunk_size;
+
+ if (vhost_bytemap_log_support(dev)) {
+ vhost_log_chunk_size = VHOST_LOG_CHUNK_BYTES;
+ } else {
+ vhost_log_chunk_size = VHOST_LOG_CHUNK;
+ }
+
for (i = 0; i < dev->mem->nregions; ++i) {
struct vhost_memory_region *reg = dev->mem->regions + i;
uint64_t last = range_get_last(reg->guest_phys_addr,
reg->memory_size);
- log_size = MAX(log_size, last / VHOST_LOG_CHUNK + 1);
+ log_size = MAX(log_size, last / vhost_log_chunk_size + 1);
}
return log_size;
}
@@ -377,12 +423,21 @@ static bool vhost_dev_log_is_shared(struct vhost_dev *dev)
dev->vhost_ops->vhost_requires_shm_log(dev);
}
-static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
+static inline int vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
{
struct vhost_log *log = vhost_log_get(size, vhost_dev_log_is_shared(dev));
- uint64_t log_base = (uintptr_t)log->log;
+ uint64_t log_base;
+ int log_fd;
int r;
+ if (!log) {
+ r = -ENOMEM;
+ goto out;
+ }
+
+ log_base = (uintptr_t)log->log;
+ log_fd = log_fd;
+
/* inform backend of log switching, this must be done before
releasing the current log, to ensure no logging is lost */
r = dev->vhost_ops->vhost_set_log_base(dev, log_base, log);
@@ -390,9 +445,19 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
VHOST_OPS_DEBUG(r, "vhost_set_log_base failed");
}
+ if (dev->vhost_ops->vhost_set_log_size) {
+ r = dev->vhost_ops->vhost_set_log_size(dev, size, dev->log);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_log_size failed");
+ }
+ }
+
vhost_log_put(dev, true);
dev->log = log;
dev->log_size = size;
+
+out:
+ return r;
}
static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
@@ -1018,7 +1083,11 @@ static int vhost_migration_log(MemoryListener *listener, bool enable)
}
vhost_log_put(dev, false);
} else {
- vhost_dev_log_resize(dev, vhost_get_log_size(dev));
+ r = vhost_dev_log_resize(dev, vhost_get_log_size(dev));
+ if (r < 0) {
+ return r;
+ }
+
r = vhost_dev_set_log(dev, true);
if (r < 0) {
goto check_dev_state;
@@ -2057,6 +2126,14 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
VHOST_OPS_DEBUG(r, "vhost_set_log_base failed");
goto fail_log;
}
+
+ if (hdev->vhost_ops->vhost_set_log_size) {
+ r = hdev->vhost_ops->vhost_set_log_size(hdev, hdev->log_size, hdev->log);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_log_size failed");
+ goto fail_log;
+ }
+ }
}
if (vrings) {
r = vhost_dev_set_vring_enable(hdev, true);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 831f7c996d..e131c2682c 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2594,6 +2594,15 @@ MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
MemOp op,
MemTxAttrs attrs);
+/**
+ * memory_section_set_dirty_bytemap: Mark a range of bytes as dirty for a memory section
+ * using a bytemap
+ *
+ * @section: the memory section being dirtied.
+ * @bytemap: bytemap that stores dirty page range information.
+ */
+int64_t memory_section_set_dirty_bytemap(MemoryRegionSection *section, unsigned long *bytemap);
+
/**
* address_space_init: initializes an address space
*
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 90676093f5..ef6988b445 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -535,5 +535,49 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
return num_dirty;
}
+
+#define BYTES_PER_LONG (sizeof(unsigned long))
+#define BYTE_WORD(nr) ((nr) / BYTES_PER_LONG)
+#define BYTES_TO_LONGS(nr) DIV_ROUND_UP(nr, BYTES_PER_LONG)
+
+static inline int64_t _set_dirty_bytemap_atomic(unsigned long *bytemap, unsigned long cur_pfn)
+{
+ char *byte_of_long = (char *)bytemap;
+ int i;
+ int64_t dirty_num = 0;
+
+ for (i = 0; i < BYTES_PER_LONG; i++) {
+ if (byte_of_long[i]) {
+ cpu_physical_memory_set_dirty_range((cur_pfn + i) << TARGET_PAGE_BITS,
+ TARGET_PAGE_SIZE,
+ 1 << DIRTY_MEMORY_MIGRATION);
+ /* Per byte ops, no need to atomic_xchg */
+ byte_of_long[i] = 0;
+ dirty_num++;
+ }
+ }
+
+ return dirty_num;
+}
+
+static inline int64_t cpu_physical_memory_set_dirty_bytemap(unsigned long *bytemap,
+ ram_addr_t start,
+ ram_addr_t pages)
+{
+ unsigned long i;
+ unsigned long len = BYTES_TO_LONGS(pages);
+ unsigned long pfn = (start >> TARGET_PAGE_BITS) /
+ BYTES_PER_LONG * BYTES_PER_LONG;
+ int64_t dirty_mig_bits = 0;
+
+ for (i = 0; i < len; i++) {
+ if (bytemap[i]) {
+ dirty_mig_bits += _set_dirty_bytemap_atomic(&bytemap[i],
+ pfn + BYTES_PER_LONG * i);
+ }
+ }
+
+ return dirty_mig_bits;
+}
#endif
#endif
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 444ca0ad42..6ae86833e3 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -43,6 +43,7 @@ typedef unsigned long vhost_log_chunk_t;
#define VHOST_LOG_PAGE 0x1000
#define VHOST_LOG_BITS (8 * sizeof(vhost_log_chunk_t))
#define VHOST_LOG_CHUNK (VHOST_LOG_PAGE * VHOST_LOG_BITS)
+#define VHOST_LOG_CHUNK_BYTES (VHOST_LOG_PAGE * sizeof(vhost_log_chunk_t))
#define VHOST_INVALID_FEATURE_BIT (0xff)
#define VHOST_QUEUE_NUM_CONFIG_INR 0
diff --git a/system/physmem.c b/system/physmem.c
index f14d64819b..247c252e53 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2602,6 +2602,17 @@ static void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
cpu_physical_memory_set_dirty_range(addr, length, dirty_log_mask);
}
+int64_t memory_section_set_dirty_bytemap(MemoryRegionSection *section, unsigned long *bytemap)
+{
+ ram_addr_t start = section->offset_within_region +
+ memory_region_get_ram_addr(section->mr);
+ ram_addr_t pages = int128_get64(section->size) >> TARGET_PAGE_BITS;
+
+ hwaddr idx = BYTE_WORD(
+ section->offset_within_address_space >> TARGET_PAGE_BITS);
+ return cpu_physical_memory_set_dirty_bytemap(bytemap + idx, start, pages);
+}
+
void memory_region_flush_rom_device(MemoryRegion *mr, hwaddr addr, hwaddr size)
{
/*
--
2.27.0
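
Note on the bytemap scan: the patch walks the bytemap one word at a time, skips words that are entirely zero, and treats each non-zero byte inside a dirty word as one dirty page, clearing the byte once it has been handled. The following is a minimal standalone sketch of that idea. It does not call any QEMU API: the function scan_bytemap() and its out/out_cap parameters are hypothetical, and collecting page frame numbers into an array stands in for the call to cpu_physical_memory_set_dirty_range() in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: scan `pages` bytes of a bytemap whose first byte
 * describes page frame `first_pfn`. The buffer must hold a whole number
 * of words, that is at least ceil(pages / sizeof(unsigned long)) words.
 *
 * Every non-zero byte is counted as one dirty page and then cleared.
 * The first `out_cap` dirty page frame numbers are stored in `out`.
 * Returns the total number of dirty pages found.
 */
static int64_t scan_bytemap(unsigned long *bytemap, uint64_t first_pfn,
                            size_t pages, uint64_t *out, size_t out_cap)
{
    const size_t bpl = sizeof(unsigned long);   /* bytes (pages) per word */
    const size_t words = (pages + bpl - 1) / bpl;
    int64_t dirty = 0;

    assert(bytemap != NULL);
    assert(out != NULL || out_cap == 0);

    for (size_t w = 0; w < words; w++) {
        if (!bytemap[w]) {
            continue;                           /* whole word is clean */
        }
        unsigned char *bytes = (unsigned char *)&bytemap[w];
        for (size_t b = 0; b < bpl; b++) {
            size_t idx = w * bpl + b;
            if (idx >= pages) {
                break;                          /* padding past the range */
            }
            if (bytes[b]) {
                if ((size_t)dirty < out_cap) {
                    out[dirty] = first_pfn + idx;
                }
                bytes[b] = 0;                   /* consume the dirty mark */
                dirty++;
            }
        }
    }
    return dirty;
}
```

Testing a whole word before looking at its bytes keeps the scan cheap when most of guest memory is clean, which is the common case between migration iterations. Unlike the patch, this sketch stops reporting at `pages`, so padding bytes in the final word are ignored.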