QEMU update to version 8.2.0-5

- vfio/migration: Add support for manual clear vfio dirty log
- vfio: Maintain DMA mapping range for the container
- linux-headers: update against 5.10 and manual clear vfio dirty log series
- arm/acpi: Fix bios_tables_test failure when building qemu-system-aarch64 on an x86_64 host. Reason: the __aarch64__ macro made build_pptt compile to different functions on x86_64 and aarch64 hosts, which made bios_tables_test fail.
- pl031: support rtc-timer property for pl031
- feature: Add logs for vm start and destroy
- feature: Add log for each module
- log: Add log at boot & cpu init for aarch64
- bugfix: irq: Avoid covering object refcount of qemu_irq
- i386: cache passthrough: Update AMD 8000_001D.EAX[25:14] based on vCPU topo
- freeclock: set rtc_date_diff for X86
- freeclock: set rtc_date_diff for arm
- freeclock: add qmp command to get time offset of vm in seconds
- tests: Disable filemonitor testcase
- shadow_dev: introduce shadow dev for virtio-net device
- pl011: reset read FIFO when UARTTIMSC=0 & UARTICR=0xffff
- tests: virt: Update expected ACPI tables for virt test (update BinDir)
- arm64: Add the cpufreq device to show cpufreq info to guest
- hw/arm64: add vcpu cache info support
- tests: virt: Allow changes to PPTT test table
- cpu: add Cortex-A72 processor kvm target support
- cpu: add Kunpeng-920 cpu support
- net: eepro100: validate various address values (CVE-2021-20255)
- ide: ahci: add check to avoid null dereference (CVE-2019-12067)
- vdpa: set vring enable only if the vring address has already been set
- docs: Add generic vhost-vdpa device documentation
- vdpa: don't suspend/resume device when vdpa device not started
- vdpa: correct param passed in when unregister save
- vdpa: suspend function return 0 when the vdpa device is stopped
- vdpa: support vdpa device suspend/resume
- vdpa: move memory listener to the realize stage
- vdpa: implement vdpa device migration
- vhost: implement migration state notifier for vdpa device
- vhost: implement post resume bh
- vhost: implement savevm_handler for vdpa device
- vhost: implement vhost_vdpa_device_suspend/resume
- vhost: implement vhost-vdpa suspend/resume
- vhost: add vhost_dev_suspend/resume_op
- vhost: introduce bytemap for vhost backend logging
- vhost-vdpa: add migration log ops for VhostOps
- vhost-vdpa: add VHOST_BACKEND_F_BYTEMAPLOG
- hw/usb: reduce the vcpu cost of UHCI when VNC disconnects
- virtio-net: update the default and max of rx/tx_queue_size
- virtio-net: set the max of queue size to 4096
- virtio-net: fix max vring buf size when set ring num
- virtio-net: bugfix: do not delete netdev before virtio net
- monitor: Discard BLOCK_IO_ERROR event when VM rebooted
- vhost-user: add unregister_savevm when vhost-user cleanup
- vhost-user: add vhost_set_mem_table when vm load_setup at destination
- vhost-user: quit infinite loop while used memslots is more than the backend limit
- fix qemu core dump when vhost-user-net is configured in server mode
- vhost-user: Add support for reconnecting the vhost-user socket
- vhost-user: Set the acked_features to vm's feature
- i6300esb watchdog: bugfix: Add a runstate transition
- hw/net/rocker_of_dpa: fix double free bug of rocker device
- net/dump.c: Suppress spurious compiler warning
- pcie: Add pcie-root-port fast plug/unplug feature
- pcie: Compat with devices which do not support Link Width, such as ioh3420
- qdev/monitors: Fix redundant error_setg of qdev_add_device
- qemu-nbd: set timeout to qemu-nbd socket
- qemu-nbd: make native as the default aio mode
- nbd/server.c: fix invalid read after client was already free
- virtio-scsi: bugfix: fix qemu crash for hotplug scsi disk with dataplane
- virtio: bugfix: check the value of caches before accessing it
- virtio: print the guest virtio_net features that host does not support
- virtio: bugfix: add rcu_read_lock when vring_avail_idx is called
- virtio: check descriptor numbers
- migration: report multifd related thread pid to libvirt
- migration: report migration related thread pid to libvirt
- cpu/features: fix bug for memory leakage
- doc: Update multi-thread compression doc
- migration: Add compress_level sanity check
- migration: Add zstd support in multi-thread compression
- migration: Add multi-thread compress ops
- migration: Refactoring multi-thread compress migration
- migration: Add multi-thread compress method
- migration: skip cache_drop for bios bootloader and nvram template
- oslib-posix: optimise vm startup time for 1G hugepage
- monitor/qmp: drop inflight rsp if qmp client broken
- ps2: fix oob in ps2 kbd
- Currently, when KVM and QEMU cannot handle a KVM exit, QEMU calls vm_stop, which leaves the VM paused. This makes the VM unrecoverable, so send a guest panic to libvirt instead.
- vhost: cancel migration when vhost-user restarted during migration

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
Jiabo Feng 2024-04-07 10:21:31 +08:00
parent 1bf6609652
commit c300b8e80b
84 changed files with 9278 additions and 2 deletions

@ -0,0 +1,27 @@
From 59f038d21c1901245ba0be417f6285cec465d6c1 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Wed, 9 Feb 2022 11:24:32 +0800
Subject: [PATCH] Currently, when KVM and QEMU cannot handle a KVM exit,
QEMU calls vm_stop, which leaves the VM paused. This makes the VM
unrecoverable, so send a guest panic to libvirt instead.
---
accel/kvm/kvm-all.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index e39a810a4e..33f4c6d547 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2993,7 +2993,7 @@ int kvm_cpu_exec(CPUState *cpu)
if (ret < 0) {
cpu_dump_state(cpu, stderr, CPU_DUMP_CODE);
- vm_stop(RUN_STATE_INTERNAL_ERROR);
+ qemu_system_guest_panicked(cpu_get_crash_info(cpu));
}
qatomic_set(&cpu->exit_request, 0);
--
2.27.0


@ -0,0 +1,98 @@
From d269fb9a41abf5888a9bfeec2f8d1684b2d4dfb0 Mon Sep 17 00:00:00 2001
From: saarloos <9090-90-90-9090@163.com>
Date: Sat, 30 Mar 2024 21:32:27 +0800
Subject: [PATCH] arm/acpi: Fix bios_tables_test failure when building
qemu-system-aarch64 on an x86_64 host. Reason: the __aarch64__ macro made
build_pptt compile to different functions on x86_64 and aarch64 hosts.
Signed-off-by: Yangzi Zhang <zhangziyang1@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/acpi/aml-build.c | 5 +----
hw/arm/virt-acpi-build.c | 2 +-
include/hw/acpi/aml-build.h | 5 +++--
3 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 714498165a..bf9c59f544 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2016,7 +2016,6 @@ static void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
}
}
-#ifdef __aarch64__
/*
* ACPI spec, Revision 6.3
* 5.2.29.2 Cache Type Structure (Type 1)
@@ -2072,7 +2071,7 @@ static void build_cache_hierarchy_node(GArray *tbl, uint32_t next_level,
* ACPI spec, Revision 6.3
* 5.2.29 Processor Properties Topology Table (PPTT)
*/
-void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
+void build_pptt_arm(GArray *table_data, BIOSLinker *linker, MachineState *ms,
const char *oem_id, const char *oem_table_id)
{
MachineClass *mc = MACHINE_GET_CLASS(ms);
@@ -2172,7 +2171,6 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
acpi_table_end(linker, &table);
}
-#else
/*
* ACPI spec, Revision 6.3
* 5.2.29 Processor Properties Topology Table (PPTT)
@@ -2248,7 +2246,6 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
acpi_table_end(linker, &table);
}
-#endif
/* build rev1/rev3/rev5.1/rev6.0 FADT */
void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 3cb50bdc65..48fc77fb83 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -1024,7 +1024,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
if (!vmc->no_cpu_topology) {
acpi_add_table(table_offsets, tables_blob);
- build_pptt(tables_blob, tables->linker, ms,
+ build_pptt_arm(tables_blob, tables->linker, ms,
vms->oem_id, vms->oem_table_id);
}
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 200cb113de..7281c281f6 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -221,7 +221,6 @@ struct AcpiBuildTables {
BIOSLinker *linker;
} AcpiBuildTables;
-#ifdef __aarch64__
/* Definitions of the hardcoded cache info*/
typedef enum {
@@ -266,7 +265,6 @@ struct offset_status {
uint32_t l1i_offset;
};
-#endif
typedef
struct CrsRangeEntry {
@@ -542,6 +540,9 @@ void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
const char *oem_id, const char *oem_table_id);
+void build_pptt_arm(GArray *table_data, BIOSLinker *linker, MachineState *ms,
+ const char *oem_id, const char *oem_table_id);
+
void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
const char *oem_id, const char *oem_table_id);
--
2.27.0

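Note on the arm/acpi fix above: __aarch64__ is predefined by the compiler that builds the QEMU binary, so it describes the build host, not the guest being emulated. The following is a minimal sketch of that difference, not QEMU code. All names in it (toy_pptt, toy_guest, toy_pptt_by_host_macro, toy_pptt_by_guest) are hypothetical and only stand in for the two build_pptt variants.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two PPTT builders; not QEMU code. */
enum toy_pptt { TOY_PPTT_ARM, TOY_PPTT_GENERIC };
enum toy_guest { TOY_GUEST_ARM, TOY_GUEST_OTHER };

/*
 * Host-dependent: __aarch64__ is predefined only when the binary itself is
 * compiled for AArch64, whatever guest architecture it emulates.
 */
static enum toy_pptt toy_pptt_by_host_macro(void)
{
#ifdef __aarch64__
    return TOY_PPTT_ARM;
#else
    return TOY_PPTT_GENERIC;
#endif
}

/*
 * Host-independent: the choice follows the emulated machine, which is what
 * calling build_pptt_arm() from the ARM virt code achieves in the patch.
 */
static enum toy_pptt toy_pptt_by_guest(enum toy_guest guest)
{
    assert(guest == TOY_GUEST_ARM || guest == TOY_GUEST_OTHER);
    return guest == TOY_GUEST_ARM ? TOY_PPTT_ARM : TOY_PPTT_GENERIC;
}
```

With the first style, the same source yields different tables depending on where it was compiled, which is why the expected-table comparison in bios_tables_test could only pass on one kind of host.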

@ -0,0 +1,615 @@
From ebe05c34a66969e4cacc4d6c030dfe93ace89cb2 Mon Sep 17 00:00:00 2001
From: Ying Fang <fangying1@huawei.com>
Date: Tue, 19 Mar 2024 14:35:55 +0800
Subject: [PATCH] arm64: Add the cpufreq device to show cpufreq info to guest
On ARM64 platform, cpu frequency is retrieved via ACPI CPPC.
A virtual cpufreq device based on ACPI CPPC is created to
present cpu frequency info to the guest.
The default frequency is set to host cpu nominal frequency,
which is obtained from the host CPPC sysfs. Other performance
data are set to the same value, since we don't support guest
performance scaling here.
Performance counters are also not emulated; they simply return 1
when read, and the guest should fall back to using the desired
performance value as the current performance.
A guest kernel version above 4.18 is required to make it work.
This series is backported from:
https://patchwork.kernel.org/cover/11379943/
Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
configs/devices/aarch64-softmmu/default.mak | 1 +
hw/acpi/aml-build.c | 22 ++
hw/acpi/cpufreq.c | 283 ++++++++++++++++++++
hw/acpi/meson.build | 1 +
hw/arm/virt-acpi-build.c | 79 +++++-
hw/arm/virt.c | 13 +
hw/char/Kconfig | 4 +
include/hw/acpi/acpi-defs.h | 40 +++
include/hw/acpi/aml-build.h | 3 +
include/hw/arm/virt.h | 1 +
10 files changed, 444 insertions(+), 3 deletions(-)
create mode 100644 hw/acpi/cpufreq.c
diff --git a/configs/devices/aarch64-softmmu/default.mak b/configs/devices/aarch64-softmmu/default.mak
index f82a04c27d..8d66d0f1af 100644
--- a/configs/devices/aarch64-softmmu/default.mak
+++ b/configs/devices/aarch64-softmmu/default.mak
@@ -8,3 +8,4 @@ include ../arm-softmmu/default.mak
# CONFIG_XLNX_ZYNQMP_ARM=n
# CONFIG_XLNX_VERSAL=n
# CONFIG_SBSA_REF=n
+# CONFIG_CPUFREQ=n
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 2968df5562..714498165a 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1554,6 +1554,28 @@ Aml *aml_sleep(uint64_t msec)
return var;
}
+/* ACPI 5.0b: 6.4.3.7 Generic Register Descriptor */
+Aml *aml_generic_register(AmlRegionSpace rs, uint8_t reg_width,
+ uint8_t reg_offset, AmlAccessType type, uint64_t addr)
+{
+ int i;
+ Aml *var = aml_alloc();
+ build_append_byte(var->buf, 0x82); /* Generic Register Descriptor */
+ build_append_byte(var->buf, 0x0C); /* Length, bits[7:0] value = 0x0C */
+ build_append_byte(var->buf, 0); /* Length, bits[15:8] value = 0 */
+ build_append_byte(var->buf, rs); /* Address Space ID */
+ build_append_byte(var->buf, reg_width); /* Register Bit Width */
+ build_append_byte(var->buf, reg_offset); /* Register Bit Offset */
+ build_append_byte(var->buf, type); /* Access Size */
+
+ /* Register address */
+ for (i = 0; i < 8; i++) {
+ build_append_byte(var->buf, extract64(addr, i * 8, 8));
+ }
+
+ return var;
+}
+
static uint8_t Hex2Byte(const char *src)
{
int hi, lo;
diff --git a/hw/acpi/cpufreq.c b/hw/acpi/cpufreq.c
new file mode 100644
index 0000000000..a84db490b3
--- /dev/null
+++ b/hw/acpi/cpufreq.c
@@ -0,0 +1,283 @@
+/*
+ * ACPI CPPC register device
+ *
+ * Support for showing CPU frequency in guest OS.
+ *
+ * Copyright (c) 2019 HUAWEI TECHNOLOGIES CO.,LTD.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "chardev/char.h"
+#include "qemu/log.h"
+#include "trace.h"
+#include "qemu/option.h"
+#include "sysemu/sysemu.h"
+#include "hw/acpi/acpi-defs.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "hw/boards.h"
+
+#define TYPE_CPUFREQ "cpufreq"
+#define CPUFREQ(obj) OBJECT_CHECK(CpuhzState, (obj), TYPE_CPUFREQ)
+#define NOMINAL_FREQ_FILE "/sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq"
+#define CPU_MAX_FREQ_FILE "/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq"
+#define HZ_MAX_LENGTH 1024
+#define MAX_SUPPORT_SPACE 0x10000
+
+/*
+ * Since Hi1616 will not support CPPC, we simply use its nominal frequency as
+ * the default.
+ */
+#define DEFAULT_HZ 2400
+
+int cppc_regs_offset[CPPC_REG_COUNT] = {
+ [HIGHEST_PERF] = 0,
+ [NOMINAL_PERF] = 4,
+ [LOW_NON_LINEAR_PERF] = 8,
+ [LOWEST_PERF] = 12,
+ [GUARANTEED_PERF] = 16,
+ [DESIRED_PERF] = 20,
+ [MIN_PERF] = -1,
+ [MAX_PERF] = -1,
+ [PERF_REDUC_TOLERANCE] = -1,
+ [TIME_WINDOW] = -1,
+ [CTR_WRAP_TIME] = -1,
+ [REFERENCE_CTR] = 24,
+ [DELIVERED_CTR] = 32,
+ [PERF_LIMITED] = 40,
+ [ENABLE] = -1,
+ [AUTO_SEL_ENABLE] = -1,
+ [AUTO_ACT_WINDOW] = -1,
+ [ENERGY_PERF] = -1,
+ [REFERENCE_PERF] = -1,
+ [LOWEST_FREQ] = 44,
+ [NOMINAL_FREQ] = 48,
+};
+
+typedef struct CpuhzState {
+ SysBusDevice parent_obj;
+
+ MemoryRegion iomem;
+ uint32_t HighestPerformance;
+ uint32_t NominalPerformance;
+ uint32_t LowestNonlinearPerformance;
+ uint32_t LowestPerformance;
+ uint32_t GuaranteedPerformance;
+ uint32_t DesiredPerformance;
+ uint64_t ReferencePerformanceCounter;
+ uint64_t DeliveredPerformanceCounter;
+ uint32_t PerformanceLimited;
+ uint32_t LowestFreq;
+ uint32_t NominalFreq;
+ uint32_t reg_size;
+} CpuhzState;
+
+
+static uint64_t cpufreq_read(void *opaque, hwaddr offset, unsigned size)
+{
+ CpuhzState *s = (CpuhzState *)opaque;
+ uint64_t r;
+ uint64_t n;
+
+ MachineState *ms = MACHINE(qdev_get_machine());
+ unsigned int smp_cpus = ms->smp.cpus;
+
+ if (offset >= smp_cpus * CPPC_REG_PER_CPU_STRIDE) {
+ warn_report("cpufreq_read: offset 0x%lx out of range", offset);
+ return 0;
+ }
+
+ n = offset % CPPC_REG_PER_CPU_STRIDE;
+ switch (n) {
+ case 0:
+ r = s->HighestPerformance;
+ break;
+ case 4:
+ r = s->NominalPerformance;
+ break;
+ case 8:
+ r = s->LowestNonlinearPerformance;
+ break;
+ case 12:
+ r = s->LowestPerformance;
+ break;
+ case 16:
+ r = s->GuaranteedPerformance;
+ break;
+ case 20:
+ r = s->DesiredPerformance;
+ break;
+ /*
+ * We don't have real counters and it is hard to emulate, so always set the
+ * counter value to 1 to rely on Linux to use the DesiredPerformance value
+ * directly.
+ */
+ case 24:
+ r = s->ReferencePerformanceCounter;
+ break;
+ /*
+ * Guest may still access the register by 32bit; add the process to
+ * eliminate unnecessary warnings.
+ */
+ case 28:
+ r = s->ReferencePerformanceCounter >> 32;
+ break;
+ case 32:
+ r = s->DeliveredPerformanceCounter;
+ break;
+ case 36:
+ r = s->DeliveredPerformanceCounter >> 32;
+ break;
+
+ case 40:
+ r = s->PerformanceLimited;
+ break;
+ case 44:
+ r = s->LowestFreq;
+ break;
+ case 48:
+ r = s->NominalFreq;
+ break;
+ default:
+ error_printf("cpufreq_read: Bad offset 0x%lx\n", offset);
+ r = 0;
+ break;
+ }
+ return r;
+}
+
+static void cpufreq_write(void *opaque, hwaddr offset,
+ uint64_t value, unsigned size)
+{
+ uint64_t n;
+ MachineState *ms = MACHINE(qdev_get_machine());
+ unsigned int smp_cpus = ms->smp.cpus;
+
+ if (offset >= smp_cpus * CPPC_REG_PER_CPU_STRIDE) {
+ error_printf("cpufreq_write: offset 0x%lx out of range\n", offset);
+ return;
+ }
+
+ n = offset % CPPC_REG_PER_CPU_STRIDE;
+
+ switch (n) {
+ case 20:
+ break;
+ default:
+ error_printf("cpufreq_write: Bad offset 0x%lx\n", offset);
+ }
+}
+
+static uint32_t CPPC_Read(const char *hostpath)
+{
+ int fd;
+ char buffer[HZ_MAX_LENGTH] = { 0 };
+ uint64_t hz;
+ int len;
+ const char *endptr = NULL;
+ int ret;
+
+ fd = qemu_open_old(hostpath, O_RDONLY);
+ if (fd < 0) {
+ return 0;
+ }
+
+ len = read(fd, buffer, HZ_MAX_LENGTH);
+ qemu_close(fd);
+ if (len <= 0) {
+ return 0;
+ }
+ ret = qemu_strtoul(buffer, &endptr, 0, &hz);
+ if (ret < 0) {
+ return 0;
+ }
+ return (uint32_t)hz;
+}
+
+static const MemoryRegionOps cpufreq_ops = {
+ .read = cpufreq_read,
+ .write = cpufreq_write,
+ .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void hz_init(CpuhzState *s)
+{
+ uint32_t hz;
+
+ hz = CPPC_Read(NOMINAL_FREQ_FILE);
+ if (hz == 0) {
+ hz = CPPC_Read(CPU_MAX_FREQ_FILE);
+ if (hz == 0) {
+ hz = DEFAULT_HZ;
+ } else {
+ /* Value in CpuMaxFrequency is in KHz unit; convert to MHz */
+ hz = hz / 1000;
+ }
+ }
+
+ s->HighestPerformance = hz;
+ s->NominalPerformance = hz;
+ s->LowestNonlinearPerformance = hz;
+ s->LowestPerformance = hz;
+ s->GuaranteedPerformance = hz;
+ s->DesiredPerformance = hz;
+ s->ReferencePerformanceCounter = 1;
+ s->DeliveredPerformanceCounter = 1;
+ s->PerformanceLimited = 0;
+ s->LowestFreq = hz;
+ s->NominalFreq = hz;
+}
+
+static void cpufreq_init(Object *obj)
+{
+ SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
+ CpuhzState *s = CPUFREQ(obj);
+
+ MachineState *ms = MACHINE(qdev_get_machine());
+ unsigned int smp_cpus = ms->smp.cpus;
+
+ s->reg_size = smp_cpus * CPPC_REG_PER_CPU_STRIDE;
+ if (s->reg_size > MAX_SUPPORT_SPACE) {
+ error_report("Required space 0x%x exceeds the max supported 0x%x",
+ s->reg_size, MAX_SUPPORT_SPACE);
+ goto err_end;
+ }
+
+ memory_region_init_io(&s->iomem, OBJECT(s), &cpufreq_ops, s, "cpufreq",
+ s->reg_size);
+ sysbus_init_mmio(sbd, &s->iomem);
+ hz_init(s);
+ return;
+
+err_end:
+ /* Set desired perf register offset to -1 to indicate no support for CPPC */
+ cppc_regs_offset[DESIRED_PERF] = -1;
+}
+
+static const TypeInfo cpufreq_arm_info = {
+ .name = TYPE_CPUFREQ,
+ .parent = TYPE_SYS_BUS_DEVICE,
+ .instance_size = sizeof(CpuhzState),
+ .instance_init = cpufreq_init,
+};
+
+static void cpufreq_register_types(void)
+{
+ type_register_static(&cpufreq_arm_info);
+}
+
+type_init(cpufreq_register_types)
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index fc1b952379..d36b10ea3c 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -27,6 +27,7 @@ acpi_ss.add(when: 'CONFIG_ACPI_ICH9', if_true: files('ich9.c', 'ich9_tco.c'))
acpi_ss.add(when: 'CONFIG_ACPI_ERST', if_true: files('erst.c'))
acpi_ss.add(when: 'CONFIG_IPMI', if_true: files('ipmi.c'), if_false: files('ipmi-stub.c'))
acpi_ss.add(when: 'CONFIG_PC', if_false: files('acpi-x86-stub.c'))
+acpi_ss.add(when: 'CONFIG_CPUFREQ', if_true: files('cpufreq.c'))
if have_tpm
acpi_ss.add(files('tpm.c'))
endif
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 8bc35a483c..3cb50bdc65 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -63,7 +63,68 @@
#define ACPI_BUILD_TABLE_SIZE 0x20000
-static void acpi_dsdt_add_cpus(Aml *scope, VirtMachineState *vms)
+static void acpi_dsdt_add_psd(Aml *dev, int cpus)
+{
+ Aml *pkg;
+ Aml *sub;
+
+ sub = aml_package(5);
+ aml_append(sub, aml_int(5));
+ aml_append(sub, aml_int(0));
+ /* Assume all vCPUs belong to the same domain */
+ aml_append(sub, aml_int(0));
+ /* SW_ANY: OSPM coordinate, initiate on any processor */
+ aml_append(sub, aml_int(0xFD));
+ aml_append(sub, aml_int(cpus));
+
+ pkg = aml_package(1);
+ aml_append(pkg, sub);
+
+ aml_append(dev, aml_name_decl("_PSD", pkg));
+}
+
+static void acpi_dsdt_add_cppc(Aml *dev, uint64_t cpu_base, int *regs_offset)
+{
+ Aml *cpc;
+ int i;
+
+ /* Use version 3 of CPPC table from ACPI 6.3 */
+ cpc = aml_package(23);
+ aml_append(cpc, aml_int(23));
+ aml_append(cpc, aml_int(3));
+
+ for (i = 0; i < CPPC_REG_COUNT; i++) {
+ Aml *res;
+ uint8_t reg_width;
+ uint8_t acc_type;
+ uint64_t addr;
+
+ if (regs_offset[i] == -1) {
+ reg_width = 0;
+ acc_type = AML_ANY_ACC;
+ addr = 0;
+ } else {
+ addr = cpu_base + regs_offset[i];
+ if (i == REFERENCE_CTR || i == DELIVERED_CTR) {
+ reg_width = 64;
+ acc_type = AML_QWORD_ACC;
+ } else {
+ reg_width = 32;
+ acc_type = AML_DWORD_ACC;
+ }
+ }
+
+ res = aml_resource_template();
+ aml_append(res, aml_generic_register(AML_SYSTEM_MEMORY, reg_width, 0,
+ acc_type, addr));
+ aml_append(cpc, res);
+ }
+
+ aml_append(dev, aml_name_decl("_CPC", cpc));
+}
+
+static void acpi_dsdt_add_cpus(Aml *scope, VirtMachineState *vms,
+ const MemMapEntry *cppc_memmap)
{
MachineState *ms = MACHINE(vms);
uint16_t i;
@@ -72,7 +133,19 @@ static void acpi_dsdt_add_cpus(Aml *scope, VirtMachineState *vms)
Aml *dev = aml_device("C%.03X", i);
aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
aml_append(dev, aml_name_decl("_UID", aml_int(i)));
- aml_append(scope, dev);
+
+ /*
+ * Append _CPC and _PSD to support showing the CPU frequency
+ * Check CPPC available by DESIRED_PERF register
+ */
+ if (cppc_regs_offset[DESIRED_PERF] != -1) {
+ acpi_dsdt_add_cppc(dev,
+ cppc_memmap->base + i * CPPC_REG_PER_CPU_STRIDE,
+ cppc_regs_offset);
+ acpi_dsdt_add_psd(dev, ms->smp.cpus);
+ }
+
+ aml_append(scope, dev);
}
}
@@ -858,7 +931,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
* the RTC ACPI device at all when using UEFI.
*/
scope = aml_scope("\\_SB");
- acpi_dsdt_add_cpus(scope, vms);
+ acpi_dsdt_add_cpus(scope, vms, &memmap[VIRT_CPUFREQ]);
acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
(irqmap[VIRT_UART] + ARM_SPI_BASE));
if (vmc->acpi_expose_flash) {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b82bd1b8c8..c19cacec8b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -157,6 +157,7 @@ static const MemMapEntry base_memmap[] = {
[VIRT_PVTIME] = { 0x090a0000, 0x00010000 },
[VIRT_SECURE_GPIO] = { 0x090b0000, 0x00001000 },
[VIRT_MMIO] = { 0x0a000000, 0x00000200 },
+ [VIRT_CPUFREQ] = { 0x0b000000, 0x00010000 },
/* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
[VIRT_PLATFORM_BUS] = { 0x0c000000, 0x02000000 },
[VIRT_SECURE_MEM] = { 0x0e000000, 0x01000000 },
@@ -980,6 +981,16 @@ static void create_uart(const VirtMachineState *vms, int uart,
g_free(nodename);
}
+static void create_cpufreq(const VirtMachineState *vms, MemoryRegion *mem)
+{
+ hwaddr base = vms->memmap[VIRT_CPUFREQ].base;
+ DeviceState *dev = qdev_new("cpufreq");
+ SysBusDevice *s = SYS_BUS_DEVICE(dev);
+
+ sysbus_realize_and_unref(s, &error_fatal);
+ memory_region_add_subregion(mem, base, sysbus_mmio_get_region(s, 0));
+}
+
static void create_rtc(const VirtMachineState *vms)
{
char *nodename;
@@ -2346,6 +2357,8 @@ static void machvirt_init(MachineState *machine)
create_uart(vms, VIRT_UART, sysmem, serial_hd(0));
+ create_cpufreq(vms, sysmem);
+
if (vms->secure) {
create_secure_ram(vms, secure_sysmem, secure_tag_sysmem);
create_uart(vms, VIRT_SECURE_UART, secure_sysmem, serial_hd(1));
diff --git a/hw/char/Kconfig b/hw/char/Kconfig
index 6b6cf2fc1d..335a60c2c1 100644
--- a/hw/char/Kconfig
+++ b/hw/char/Kconfig
@@ -71,3 +71,7 @@ config GOLDFISH_TTY
config SHAKTI_UART
bool
+
+config CPUFREQ
+ bool
+ default y
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 2b42e4192b..b1f389fb4b 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -93,4 +93,44 @@ typedef struct AcpiFadtData {
#define ACPI_FADT_ARM_PSCI_COMPLIANT (1 << 0)
#define ACPI_FADT_ARM_PSCI_USE_HVC (1 << 1)
+/*
+ * CPPC register definition from kernel header
+ * include/acpi/cppc_acpi.h
+ * The last element is newly added for easy use
+ */
+enum cppc_regs {
+ HIGHEST_PERF,
+ NOMINAL_PERF,
+ LOW_NON_LINEAR_PERF,
+ LOWEST_PERF,
+ GUARANTEED_PERF,
+ DESIRED_PERF,
+ MIN_PERF,
+ MAX_PERF,
+ PERF_REDUC_TOLERANCE,
+ TIME_WINDOW,
+ CTR_WRAP_TIME,
+ REFERENCE_CTR,
+ DELIVERED_CTR,
+ PERF_LIMITED,
+ ENABLE,
+ AUTO_SEL_ENABLE,
+ AUTO_ACT_WINDOW,
+ ENERGY_PERF,
+ REFERENCE_PERF,
+ LOWEST_FREQ,
+ NOMINAL_FREQ,
+ CPPC_REG_COUNT,
+};
+
+#define CPPC_REG_PER_CPU_STRIDE 0x40
+
+/*
+ * Offset for each CPPC register; -1 for unavailable
+ *
+ * Offset for each CPPC register; -1 for unavailable
+ * The whole register space is unavailable if desired perf offset is -1.
+ */
+extern int cppc_regs_offset[CPPC_REG_COUNT];
+
#endif
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 84ded2ecd3..200cb113de 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -429,6 +429,9 @@ Aml *aml_dma(AmlDmaType typ, AmlDmaBusMaster bm, AmlTransferSize sz,
uint8_t channel);
Aml *aml_sleep(uint64_t msec);
Aml *aml_i2c_serial_bus_device(uint16_t address, const char *resource_source);
+Aml *aml_generic_register(AmlRegionSpace rs, uint8_t reg_width,
+ uint8_t reg_offset, AmlAccessType type,
+ uint64_t addr);
/* Block AML object primitives */
Aml *aml_scope(const char *name_format, ...) G_GNUC_PRINTF(1, 2);
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index f69239850e..e944d434c4 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -60,6 +60,7 @@ enum {
VIRT_GIC_REDIST,
VIRT_SMMU,
VIRT_UART,
+ VIRT_CPUFREQ,
VIRT_MMIO,
VIRT_RTC,
VIRT_FW_CFG,
--
2.27.0

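Note on the cpufreq patch above: each vCPU gets a 0x40-byte slice of the CPPC MMIO region, and every register is described to the guest through a 15-byte ACPI Generic Register Descriptor. The sketch below restates that arithmetic and byte layout in standalone C under stated assumptions. The sketch_* names are hypothetical and not QEMU API; only the stride, the descriptor bytes, and the 0x0b000000 base in the usage note are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_STRIDE  0x40u  /* mirrors CPPC_REG_PER_CPU_STRIDE in the patch */
#define SKETCH_GRD_LEN 15     /* 3 header bytes + 0x0C payload bytes */

/* Guest-physical address of one CPPC register of one vCPU. */
static uint64_t sketch_cppc_reg_addr(uint64_t region_base, unsigned cpu,
                                     unsigned reg_off)
{
    assert(reg_off < SKETCH_STRIDE);
    return region_base + (uint64_t)cpu * SKETCH_STRIDE + reg_off;
}

/* Inverse step used on MMIO access: split a region offset into cpu/register. */
static void sketch_cppc_decode(uint64_t offset, unsigned *cpu,
                               unsigned *reg_off)
{
    *cpu = (unsigned)(offset / SKETCH_STRIDE);
    *reg_off = (unsigned)(offset % SKETCH_STRIDE);
}

/* Fill out[] with a Generic Register Descriptor, same byte order as the patch. */
static void sketch_encode_generic_register(uint8_t out[SKETCH_GRD_LEN],
                                           uint8_t space_id, uint8_t bit_width,
                                           uint8_t bit_offset,
                                           uint8_t access_size, uint64_t addr)
{
    int i;

    out[0] = 0x82;        /* Generic Register Descriptor tag */
    out[1] = 0x0C;        /* length, bits [7:0] */
    out[2] = 0x00;        /* length, bits [15:8] */
    out[3] = space_id;    /* address space ID (0 = system memory) */
    out[4] = bit_width;   /* register bit width */
    out[5] = bit_offset;  /* register bit offset */
    out[6] = access_size; /* access size (3 = dword, 4 = qword) */
    for (i = 0; i < 8; i++) {
        out[7 + i] = (uint8_t)(addr >> (i * 8)); /* little-endian address */
    }
}
```

For example, with the region at 0x0b000000, the DesiredPerformance register (offset 20) of vCPU 2 lands at 0x0b000094, and an access at region offset 0x94 decodes back to vCPU 2, register offset 20.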

@ -0,0 +1,32 @@
From 48a328ee1a5a71b7048e4591310471c759fc5af6 Mon Sep 17 00:00:00 2001
From: Keqian Zhu <zhukeqian1@huawei.com>
Date: Mon, 27 Jul 2020 20:39:07 +0800
Subject: [PATCH] bugfix: irq: Avoid covering object refcount of qemu_irq
Avoid overwriting the object refcount of qemu_irq; otherwise it may
cause a memory leak.
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
hw/core/irq.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/hw/core/irq.c b/hw/core/irq.c
index 3f14e2dda7..df9b5dac9b 100644
--- a/hw/core/irq.c
+++ b/hw/core/irq.c
@@ -110,7 +110,10 @@ void qemu_irq_intercept_in(qemu_irq *gpio_in, qemu_irq_handler handler, int n)
int i;
qemu_irq *old_irqs = qemu_allocate_irqs(NULL, NULL, n);
for (i = 0; i < n; i++) {
- *old_irqs[i] = *gpio_in[i];
+ old_irqs[i]->handler = gpio_in[i]->handler;
+ old_irqs[i]->opaque = gpio_in[i]->opaque;
+ old_irqs[i]->n = gpio_in[i]->n;
+
gpio_in[i]->handler = handler;
gpio_in[i]->opaque = &old_irqs[i];
}
--
2.27.0

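Note on the irq fix above: a whole-struct assignment copies the embedded object header along with the payload, so the destination's reference count is replaced by the source's. The following is a minimal sketch with toy types, assuming a layout in which the object header comes first. toy_object and toy_irq are hypothetical stand-ins, not QEMU's real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an object header that carries a reference count. */
struct toy_object {
    unsigned ref;
};

/* Toy stand-in for an IRQ object: header first, then the payload fields. */
struct toy_irq {
    struct toy_object parent;
    void (*handler)(void *opaque, int n, int level);
    void *opaque;
    int n;
};

/* Whole-struct copy: also overwrites dst->parent.ref with src's count. */
static void toy_irq_copy_all(struct toy_irq *dst, const struct toy_irq *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}

/* Field-wise copy, as in the patch: dst keeps its own reference count. */
static void toy_irq_copy_payload(struct toy_irq *dst,
                                 const struct toy_irq *src)
{
    assert(dst != NULL && src != NULL);
    dst->handler = src->handler;
    dst->opaque = src->opaque;
    dst->n = src->n;
}
```

If the source holds three references and the freshly allocated destination holds one, the whole-struct copy leaves the destination claiming three, so a single unref no longer frees it.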

@ -0,0 +1,60 @@
From 5853333c9513caea541701c95a4ac691bb97452f Mon Sep 17 00:00:00 2001
From: Xu Yandong <xuyandong2@huawei.com>
Date: Tue, 19 Mar 2024 10:45:56 +0800
Subject: [PATCH] cpu: add Cortex-A72 processor kvm target support
The ARM Cortex-A72 is an ARMv8-A micro-architecture;
add a kvm target to the ARM Cortex-A72 processor definition.
Signed-off-by: Xu Yandong <xuyandong2@huawei.com>
Signed-off-by: Mingwang Li <limingwang@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
target/arm/cpu64.c | 2 +-
target/arm/kvm-consts.h | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 922eac3b61..471014b5a9 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -710,6 +710,7 @@ static void aarch64_a72_initfn(Object *obj)
ARMCPU *cpu = ARM_CPU(obj);
cpu->dtb_compatible = "arm,cortex-a72";
+ cpu->kvm_target = QEMU_KVM_ARM_TARGET_GENERIC_V8;
set_feature(&cpu->env, ARM_FEATURE_V8);
set_feature(&cpu->env, ARM_FEATURE_NEON);
set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
@@ -773,7 +774,6 @@ static void aarch64_kunpeng_920_initfn(Object *obj)
cpu->isar.id_aa64dfr0 = 0x110305408;
cpu->isar.id_aa64isar0 = 0x10211120;
cpu->isar.id_aa64mmfr0 = 0x101125;
- cpu->kvm_target = KVM_ARM_TARGET_GENERIC_V8;
}
static void aarch64_host_initfn(Object *obj)
diff --git a/target/arm/kvm-consts.h b/target/arm/kvm-consts.h
index 7c6adc14f6..c034823170 100644
--- a/target/arm/kvm-consts.h
+++ b/target/arm/kvm-consts.h
@@ -133,6 +133,8 @@ MISMATCH_CHECK(QEMU_PSCI_RET_DISABLED, PSCI_RET_DISABLED);
#define QEMU_KVM_ARM_TARGET_CORTEX_A57 2
#define QEMU_KVM_ARM_TARGET_XGENE_POTENZA 3
#define QEMU_KVM_ARM_TARGET_CORTEX_A53 4
+/* Generic ARM v8 target */
+#define QEMU_KVM_ARM_TARGET_GENERIC_V8 5
/* There's no kernel define for this: sentinel value which
* matches no KVM target value for either 64 or 32 bit
@@ -144,6 +146,7 @@ MISMATCH_CHECK(QEMU_KVM_ARM_TARGET_FOUNDATION_V8, KVM_ARM_TARGET_FOUNDATION_V8);
MISMATCH_CHECK(QEMU_KVM_ARM_TARGET_CORTEX_A57, KVM_ARM_TARGET_CORTEX_A57);
MISMATCH_CHECK(QEMU_KVM_ARM_TARGET_XGENE_POTENZA, KVM_ARM_TARGET_XGENE_POTENZA);
MISMATCH_CHECK(QEMU_KVM_ARM_TARGET_CORTEX_A53, KVM_ARM_TARGET_CORTEX_A53);
+MISMATCH_CHECK(QEMU_KVM_ARM_TARGET_GENERIC_V8, KVM_ARM_TARGET_GENERIC_V8);
#define CP_REG_ARM64 0x6000000000000000ULL
#define CP_REG_ARM_COPROC_MASK 0x000000000FFF0000
--
2.27.0

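Note on the MISMATCH_CHECK line added above: the point of such a check is to fail the build if QEMU's private copy of a constant ever drifts from the kernel header value. The following is a minimal sketch of that idea using C11 static_assert. The TOY_* names and TOY_MISMATCH_CHECK are hypothetical, and QEMU's actual macro definition may differ.

```c
#include <assert.h>

/* Hypothetical value standing in for the kernel header constant. */
#define TOY_KERNEL_TARGET_GENERIC_V8 5
/* Hypothetical value standing in for QEMU's private copy. */
#define TOY_QEMU_TARGET_GENERIC_V8   5

/* Compile-time check: the build stops if the two constants differ. */
#define TOY_MISMATCH_CHECK(x, y) \
    static_assert((x) == (y), "constant mismatch: " #x " vs " #y)

TOY_MISMATCH_CHECK(TOY_QEMU_TARGET_GENERIC_V8, TOY_KERNEL_TARGET_GENERIC_V8);

/* Accessor so the checked value can be used at run time. */
static int toy_generic_v8_target(void)
{
    return TOY_QEMU_TARGET_GENERIC_V8;
}
```

Because the comparison happens at compile time, a mismatch costs nothing at run time and cannot reach a shipped binary.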

@ -0,0 +1,120 @@
From e4ae54316651bf6af12de263da158c5ec4ed0401 Mon Sep 17 00:00:00 2001
From: Xu Yandong <xuyandong2@huawei.com>
Date: Mon, 18 Mar 2024 17:31:31 +0800
Subject: [PATCH] cpu: add Kunpeng-920 cpu support
Add the Kunpeng-920 CPU model
Signed-off-by: Xu Yandong <xuyandong2@huawei.com>
Signed-off-by: Mingwang Li <limingwang@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/arm/virt.c | 1 +
target/arm/cpu64.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 73 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index be2856c018..500a15aa5b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -220,6 +220,7 @@ static const char *valid_cpus[] = {
#endif
ARM_CPU_TYPE_NAME("cortex-a53"),
ARM_CPU_TYPE_NAME("cortex-a57"),
+ ARM_CPU_TYPE_NAME("Kunpeng-920"),
ARM_CPU_TYPE_NAME("host"),
ARM_CPU_TYPE_NAME("max"),
};
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 1e9c6c85ae..922eac3b61 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -705,6 +705,77 @@ static void aarch64_a53_initfn(Object *obj)
define_cortex_a72_a57_a53_cp_reginfo(cpu);
}
+static void aarch64_a72_initfn(Object *obj)
+{
+ ARMCPU *cpu = ARM_CPU(obj);
+
+ cpu->dtb_compatible = "arm,cortex-a72";
+ set_feature(&cpu->env, ARM_FEATURE_V8);
+ set_feature(&cpu->env, ARM_FEATURE_NEON);
+ set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+ set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+ set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+ set_feature(&cpu->env, ARM_FEATURE_EL2);
+ set_feature(&cpu->env, ARM_FEATURE_EL3);
+ set_feature(&cpu->env, ARM_FEATURE_PMU);
+ cpu->midr = 0x410fd083;
+ cpu->revidr = 0x00000000;
+ cpu->reset_fpsid = 0x41034080;
+ cpu->isar.mvfr0 = 0x10110222;
+ cpu->isar.mvfr1 = 0x12111111;
+ cpu->isar.mvfr2 = 0x00000043;
+ cpu->ctr = 0x8444c004;
+ cpu->reset_sctlr = 0x00c50838;
+ cpu->isar.id_pfr0 = 0x00000131;
+ cpu->isar.id_pfr1 = 0x00011011;
+ cpu->isar.id_dfr0 = 0x03010066;
+ cpu->id_afr0 = 0x00000000;
+ cpu->isar.id_mmfr0 = 0x10201105;
+ cpu->isar.id_mmfr1 = 0x40000000;
+ cpu->isar.id_mmfr2 = 0x01260000;
+ cpu->isar.id_mmfr3 = 0x02102211;
+ cpu->isar.id_isar0 = 0x02101110;
+ cpu->isar.id_isar1 = 0x13112111;
+ cpu->isar.id_isar2 = 0x21232042;
+ cpu->isar.id_isar3 = 0x01112131;
+ cpu->isar.id_isar4 = 0x00011142;
+ cpu->isar.id_isar5 = 0x00011121;
+ cpu->isar.id_aa64pfr0 = 0x00002222;
+ cpu->isar.id_aa64dfr0 = 0x10305106;
+ cpu->isar.id_aa64isar0 = 0x00011120;
+ cpu->isar.id_aa64mmfr0 = 0x00001124;
+ cpu->isar.dbgdidr = 0x3516d000;
+ cpu->clidr = 0x0a200023;
+ cpu->ccsidr[0] = 0x701fe00a; /* 32KB L1 dcache */
+ cpu->ccsidr[1] = 0x201fe012; /* 48KB L1 icache */
+ cpu->ccsidr[2] = 0x707fe07a; /* 1MB L2 cache */
+ cpu->dcz_blocksize = 4; /* 64 bytes */
+ cpu->gic_num_lrs = 4;
+ cpu->gic_vpribits = 5;
+ cpu->gic_vprebits = 5;
+ define_cortex_a72_a57_a53_cp_reginfo(cpu);
+}
+
+static void aarch64_kunpeng_920_initfn(Object *obj)
+{
+ ARMCPU *cpu = ARM_CPU(obj);
+
+ /*
+ * Hisilicon Kunpeng-920 CPU is similar to cortex-a72,
+ * so first initialize cpu data as cortex-a72,
+ * and then update the special register.
+ */
+ aarch64_a72_initfn(obj);
+
+ cpu->midr = 0x480fd010;
+ cpu->ctr = 0x84448004;
+ cpu->isar.id_aa64pfr0 = 0x11001111;
+ cpu->isar.id_aa64dfr0 = 0x110305408;
+ cpu->isar.id_aa64isar0 = 0x10211120;
+ cpu->isar.id_aa64mmfr0 = 0x101125;
+ cpu->kvm_target = KVM_ARM_TARGET_GENERIC_V8;
+}
+
static void aarch64_host_initfn(Object *obj)
{
#if defined(CONFIG_KVM)
@@ -744,6 +815,7 @@ static void aarch64_max_initfn(Object *obj)
static const ARMCPUInfo aarch64_cpus[] = {
{ .name = "cortex-a57", .initfn = aarch64_a57_initfn },
{ .name = "cortex-a53", .initfn = aarch64_a53_initfn },
+ { .name = "Kunpeng-920", .initfn = aarch64_kunpeng_920_initfn},
{ .name = "max", .initfn = aarch64_max_initfn },
#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
{ .name = "host", .initfn = aarch64_host_initfn },
--
2.27.0
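
Note on the `midr` values in the patch above: they follow the Arm MIDR_EL1 layout (Implementer in bits [31:24], Variant in [23:20], Architecture in [19:16], PartNum in [15:4], Revision in [3:0]). The stand-alone sketch below is not part of the patch and uses no QEMU API. `MidrFields` and `midr_decode` are hypothetical names introduced only to show how the Cortex-A72 and Kunpeng-920 values differ.

```c
#include <stdint.h>

/* Hypothetical helper type for illustration; not a QEMU structure. */
typedef struct {
    uint8_t  implementer;   /* bits [31:24] */
    uint8_t  variant;       /* bits [23:20] */
    uint8_t  architecture;  /* bits [19:16] */
    uint16_t partnum;       /* bits [15:4]  */
    uint8_t  revision;      /* bits [3:0]   */
} MidrFields;

/* Split a 32-bit MIDR value into its architectural fields. */
static MidrFields midr_decode(uint32_t midr)
{
    MidrFields f;

    f.implementer  = (midr >> 24) & 0xff;
    f.variant      = (midr >> 20) & 0xf;
    f.architecture = (midr >> 16) & 0xf;
    f.partnum      = (midr >> 4) & 0xfff;
    f.revision     = midr & 0xf;
    return f;
}
```

Applied to the values in the patch, `midr_decode(0x410fd083)` gives implementer 0x41 and part number 0xd08, while `midr_decode(0x480fd010)` gives implementer 0x48 and part number 0xd01.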

@ -0,0 +1,25 @@
From 9ebad9c3020625df0a178e6a2d06eaae15ef767c Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Wed, 9 Feb 2022 12:51:19 +0800
Subject: [PATCH] cpu/features: fix bug for memory leakage
The strList result is not freed after use. Free it.
---
target/i386/cpu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index fc61a84b1e..f94405c02b 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5475,6 +5475,7 @@ static void x86_cpu_get_unavailable_features(Object *obj, Visitor *v,
x86_cpu_list_feature_names(xc->filtered_features, &result);
visit_type_strList(v, "unavailable-features", &result, errp);
+ qapi_free_strList(result);
}
/* Print all cpuid feature names in featureset
--
2.27.0
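
The fix above follows a build, consume, free pattern: the list is created, handed to the visitor, and then released. The sketch below shows the same pattern in plain C without any QEMU types. `StrNode`, `str_list_prepend`, `str_list_length`, `str_list_free` and `report_features` are hypothetical stand-ins, not QEMU's strList API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical singly linked string list; stands in for QEMU's strList. */
typedef struct StrNode {
    char *value;
    struct StrNode *next;
} StrNode;

/* Prepend a copy of s; returns the old head unchanged on allocation failure. */
static StrNode *str_list_prepend(StrNode *head, const char *s)
{
    size_t len = strlen(s) + 1;
    StrNode *n = malloc(sizeof(*n));

    if (!n) {
        return head;
    }
    n->value = malloc(len);
    if (!n->value) {
        free(n);
        return head;
    }
    memcpy(n->value, s, len);
    n->next = head;
    return n;
}

/* Count the nodes; stands in for the consumer (the visitor call). */
static size_t str_list_length(const StrNode *head)
{
    size_t n = 0;

    for (; head; head = head->next) {
        n++;
    }
    return n;
}

/* Free every node and its string. */
static void str_list_free(StrNode *head)
{
    while (head) {
        StrNode *next = head->next;

        free(head->value);
        free(head);
        head = next;
    }
}

/* Same shape as the fixed getter: build the list, use it, then free it. */
static size_t report_features(void)
{
    StrNode *result = NULL;
    size_t count;

    result = str_list_prepend(result, "feature-a");
    result = str_list_prepend(result, "feature-b");
    count = str_list_length(result);
    str_list_free(result);  /* the step the patch adds */
    return count;
}
```

Without the final free, every call to the getter would leak one list, which is the leak the one-line patch closes.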

@ -0,0 +1,86 @@
From 55e5f8cafda3c7d4a91e9d58c7b3259476e0dab9 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Sat, 30 Jan 2021 16:36:47 +0800
Subject: [PATCH] doc: Update multi-thread compression doc
Modify the doc to fit the previous changes.
Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
Signed-off-by: Zeyu Jin <jinzeyu@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
docs/multi-thread-compression.txt | 31 ++++++++++++++++++-------------
1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/docs/multi-thread-compression.txt b/docs/multi-thread-compression.txt
index 95b1556f67..450e5de469 100644
--- a/docs/multi-thread-compression.txt
+++ b/docs/multi-thread-compression.txt
@@ -33,14 +33,15 @@ thread compression can be used to accelerate the compression process.
The decompression speed of Zlib is at least 4 times as quick as
compression, if the source and destination CPU have equal speed,
-keeping the compression thread count 4 times the decompression
-thread count can avoid resource waste.
+and you choose Zlib as compression method, keeping the compression
+thread count 4 times the decompression thread count can avoid resource waste.
Compression level can be used to control the compression speed and the
-compression ratio. High compression ratio will take more time, level 0
-stands for no compression, level 1 stands for the best compression
-speed, and level 9 stands for the best compression ratio. Users can
-select a level number between 0 and 9.
+compression ratio. High compression ratio will take more time,
+level 1 stands for the best compression speed, and higher level means higher
+compression ratio. For Zlib, users can select a level number between 0 and 9,
+where level 0 stands for no compression. For Zstd, users can select a
+level number between 1 and 22.
When to use the multiple thread compression in live migration
@@ -116,16 +117,19 @@ to support the multiple thread compression migration:
2. Activate compression on the source:
{qemu} migrate_set_capability compress on
-3. Set the compression thread count on source:
+3. Set the compression method:
+ {qemu} migrate_set_parameter compress_method zstd
+
+4. Set the compression thread count on source:
{qemu} migrate_set_parameter compress-threads 12
-4. Set the compression level on the source:
+5. Set the compression level on the source:
{qemu} migrate_set_parameter compress-level 1
-5. Set the decompression thread count on destination:
+6. Set the decompression thread count on destination:
{qemu} migrate_set_parameter decompress-threads 3
-6. Start outgoing migration:
+7. Start outgoing migration:
{qemu} migrate -d tcp:destination.host:4444
{qemu} info migrate
Capabilities: ... compress: on
@@ -136,6 +140,7 @@ The following are the default settings:
compress-threads: 8
decompress-threads: 2
compress-level: 1 (which means best speed)
+ compress_method: zlib
So, only the first two steps are required to use the multiple
thread compression in migration. You can do more if the default
@@ -143,7 +148,7 @@ settings are not appropriate.
TODO
====
-Some faster (de)compression method such as LZ4 and Quicklz can help
-to reduce the CPU consumption when doing (de)compression. If using
-these faster (de)compression method, less (de)compression threads
+Compared to Zlib, some faster (de)compression methods such as LZ4
+and Quicklz can help to reduce the CPU consumption when doing (de)compression.
+If using these faster (de)compression methods, fewer (de)compression threads
are needed when doing the migration.
--
2.27.0
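
The documentation change above keeps the rule of thumb that, for Zlib on hosts of equal speed, the compression thread count should be about four times the decompression thread count. The arithmetic can be sketched as follows. `suggested_decompress_threads` is a hypothetical helper, not a QEMU function, and the ratio is only the Zlib estimate quoted in the text.

```c
/*
 * Hypothetical helper: given the compression thread count on the source,
 * suggest a decompression thread count for the destination using the
 * 4:1 Zlib rule of thumb. Always returns at least one thread.
 */
static int suggested_decompress_threads(int compress_threads)
{
    int n = compress_threads / 4;

    return n < 1 ? 1 : n;
}
```

This matches the numbers used in the document: 12 compression threads pair with 3 decompression threads, and the defaults of 8 and 2 follow the same ratio.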

@ -0,0 +1,78 @@
From 28ed79b98f08b5701dcaab7c6ad1015602b28e02 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Sat, 12 Nov 2022 22:40:13 +0800
Subject: [PATCH] docs: Add generic vhost-vdpa device documentation
Add the description of the generic vhost-vdpa device
Signed-off-by: libai <libai12@huawei.com>
---
docs/system/device-emulation.rst | 1 +
.../devices/vhost-vdpa-generic-device.rst | 46 +++++++++++++++++++
2 files changed, 47 insertions(+)
create mode 100644 docs/system/devices/vhost-vdpa-generic-device.rst
diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulation.rst
index d1f3277cb0..e1b2d18fb1 100644
--- a/docs/system/device-emulation.rst
+++ b/docs/system/device-emulation.rst
@@ -98,3 +98,4 @@ Emulated Devices
devices/canokey.rst
devices/usb-u2f.rst
devices/igb.rst
+ devices/vhost-vdpa-generic-device.rst
diff --git a/docs/system/devices/vhost-vdpa-generic-device.rst b/docs/system/devices/vhost-vdpa-generic-device.rst
new file mode 100644
index 0000000000..25fbcac60e
--- /dev/null
+++ b/docs/system/devices/vhost-vdpa-generic-device.rst
@@ -0,0 +1,46 @@
+
+=========================
+vhost-vDPA generic device
+=========================
+
+This document explains the usage of the vhost-vDPA generic device.
+
+Description
+-----------
+
+A vDPA (virtio data path acceleration) device is a device that uses a datapath
+which complies with the virtio specifications with a vendor-specific control
+path.
+
+QEMU provides two types of vhost-vDPA devices to enable the vDPA device: one
+is type sensitive, which means QEMU needs to know the actual device type
+(e.g. net, blk, scsi), and the other is called "vhost-vDPA generic device", which
+is type insensitive.
+
+The vhost-vDPA generic device builds on the vhost-vdpa subsystem and virtio
+subsystem. It is quite small, but it can support any type of virtio device.
+
+Examples
+--------
+
+Prepare the vhost-vDPA backends first:
+
+::
+ host# ls -l /dev/vhost-vdpa-*
+ crw------- 1 root root 236, 0 Nov 2 00:49 /dev/vhost-vdpa-0
+
+Start QEMU with virtio-mmio bus:
+
+::
+ host# qemu-system \
+ -M microvm -m 512 -smp 2 -kernel ... -initrd ... \
+ -device vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-0 \
+ ...
+
+Start QEMU with virtio-pci bus:
+
+::
+ host# qemu-system \
+ -M pc -m 512 -smp 2 \
+ -device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-0 \
+ ...\
--
2.27.0

@ -0,0 +1,250 @@
From 30cc47b6dd3e9ff4842eb1c2a918bbabfd8c593b Mon Sep 17 00:00:00 2001
From: "wangxinxin.wang@huawei.com" <wangxinxin.wang@huawei.com>
Date: Sun, 17 Mar 2024 15:44:28 +0800
Subject: [PATCH] feature: Add log for each modules
Add logs for each module.
Signed-off-by: miaoyubo <miaoyubo@huawei.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
accel/kvm/kvm-all.c | 5 ++++-
hw/char/virtio-serial-bus.c | 5 +++++
hw/pci/pci.c | 1 +
hw/usb/bus.c | 6 ++++++
hw/usb/host-libusb.c | 5 +++++
hw/virtio/virtio-scsi-pci.c | 3 +++
monitor/qmp-cmds.c | 3 +++
os-posix.c | 1 +
qapi/qmp-dispatch.c | 15 +++++++++++++++
system/qdev-monitor.c | 5 +++++
10 files changed, 48 insertions(+), 1 deletion(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 33f4c6d547..d900df93a4 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1834,7 +1834,10 @@ void kvm_irqchip_commit_routes(KVMState *s)
s->irq_routes->flags = 0;
trace_kvm_irqchip_commit_routes();
ret = kvm_vm_ioctl(s, KVM_SET_GSI_ROUTING, s->irq_routes);
- assert(ret == 0);
+ if (ret < 0) {
+ error_report("Set GSI routing failed: %m");
+ abort();
+ }
}
static void kvm_add_routing_entry(KVMState *s,
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index dd619f0731..44906057be 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -257,6 +257,8 @@ static size_t send_control_event(VirtIOSerial *vser, uint32_t port_id,
virtio_stw_p(vdev, &cpkt.value, value);
trace_virtio_serial_send_control_event(port_id, event, value);
+ qemu_log("virtio serial port %d send control message"
+ " event = %d, value = %d\n", port_id, event, value);
return send_control_msg(vser, &cpkt, sizeof(cpkt));
}
@@ -364,6 +366,9 @@ static void handle_control_message(VirtIOSerial *vser, void *buf, size_t len)
cpkt.value = virtio_lduw_p(vdev, &gcpkt->value);
trace_virtio_serial_handle_control_message(cpkt.event, cpkt.value);
+ qemu_log("virtio serial port '%u' handle control message"
+ " event = %d, value = %d\n",
+ virtio_ldl_p(vdev, &gcpkt->id), cpkt.event, cpkt.value);
if (cpkt.event == VIRTIO_CONSOLE_DEVICE_READY) {
if (!cpkt.value) {
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index c49417abb2..9da41088df 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2411,6 +2411,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom,
snprintf(name, sizeof(name), "%s.rom",
vmsd ? vmsd->name : object_get_typename(OBJECT(pdev)));
+ qemu_log("add rom file: %s\n", name);
pdev->has_rom = true;
memory_region_init_rom(&pdev->rom, OBJECT(pdev), name, pdev->romsize,
&error_fatal);
diff --git a/hw/usb/bus.c b/hw/usb/bus.c
index 92d6ed5626..20cd9b6e6f 100644
--- a/hw/usb/bus.c
+++ b/hw/usb/bus.c
@@ -536,6 +536,10 @@ void usb_check_attach(USBDevice *dev, Error **errp)
bus->qbus.name, port->path, portspeed);
return;
}
+
+ qemu_log("attach usb device \"%s\" (%s speed) to VM bus \"%s\", "
+ "port \"%s\" (%s speed)\n", dev->product_desc, devspeed,
+ bus->qbus.name, port->path, portspeed);
}
void usb_device_attach(USBDevice *dev, Error **errp)
@@ -564,6 +568,8 @@ int usb_device_detach(USBDevice *dev)
usb_detach(port);
dev->attached = false;
+ qemu_log("detach usb device \"%s\" from VM bus \"%s\", port \"%s\"\n",
+ dev->product_desc, bus->qbus.name, port->path);
return 0;
}
diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index dba469c1ef..11a246ac72 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -992,6 +992,8 @@ static int usb_host_open(USBHostDevice *s, libusb_device *dev, int hostfd)
rc = libusb_open(dev, &s->dh);
if (rc != 0) {
+ qemu_log("libusb open usb device bus %d, device %d failed\n",
+ bus_num, addr);
goto fail;
}
} else {
@@ -1019,6 +1021,7 @@ static int usb_host_open(USBHostDevice *s, libusb_device *dev, int hostfd)
libusb_get_device_descriptor(dev, &s->ddesc);
usb_host_get_port(s->dev, s->port, sizeof(s->port));
+ qemu_log("open a host usb device on bus %d, device %d\n", bus_num, addr);
usb_ep_init(udev);
usb_host_ep_update(s);
@@ -1146,6 +1149,8 @@ static int usb_host_close(USBHostDevice *s)
usb_device_detach(udev);
}
+ qemu_log("begin to reset the usb device, bus : %d, device : %d\n",
+ s->bus_num, s->addr);
usb_host_release_interfaces(s);
libusb_reset_device(s->dh);
usb_host_attach_kernel(s);
diff --git a/hw/virtio/virtio-scsi-pci.c b/hw/virtio/virtio-scsi-pci.c
index e8e3442f38..e542d47162 100644
--- a/hw/virtio/virtio-scsi-pci.c
+++ b/hw/virtio/virtio-scsi-pci.c
@@ -20,6 +20,7 @@
#include "qemu/module.h"
#include "hw/virtio/virtio-pci.h"
#include "qom/object.h"
+#include "qemu/log.h"
typedef struct VirtIOSCSIPCI VirtIOSCSIPCI;
@@ -51,6 +52,8 @@ static void virtio_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
VirtIOSCSIConf *conf = &dev->vdev.parent_obj.conf;
char *bus_name;
+ qemu_log("virtio scsi HBA %s begin to initialize.\n",
+ !proxy->id ? "NULL" : proxy->id);
if (conf->num_queues == VIRTIO_SCSI_AUTO_NUM_QUEUES) {
conf->num_queues =
virtio_pci_optimal_num_queues(VIRTIO_SCSI_VQ_NUM_FIXED);
diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
index b0f948d337..e78462b857 100644
--- a/monitor/qmp-cmds.c
+++ b/monitor/qmp-cmds.c
@@ -32,6 +32,7 @@
#include "hw/mem/memory-device.h"
#include "hw/intc/intc.h"
#include "hw/rdma/rdma.h"
+#include "qemu/log.h"
NameInfo *qmp_query_name(Error **errp)
{
@@ -110,8 +111,10 @@ void qmp_cont(Error **errp)
}
if (runstate_check(RUN_STATE_INMIGRATE)) {
+ qemu_log("qmp cont is received in migration\n");
autostart = 1;
} else {
+ qemu_log("qmp cont is received and vm is started\n");
vm_start();
}
}
diff --git a/os-posix.c b/os-posix.c
index 52ef6990ff..8f70ee0534 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -306,6 +306,7 @@ int os_mlock(void)
#ifdef HAVE_MLOCKALL
int ret = 0;
+ qemu_log("do mlockall\n");
ret = mlockall(MCL_CURRENT | MCL_FUTURE);
if (ret < 0) {
error_report("mlockall: %s", strerror(errno));
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 7a215cbfd7..e33efd3740 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -25,6 +25,7 @@
#include "qemu/coroutine.h"
#include "qemu/main-loop.h"
#include "qemu/log.h"
+#include "qapi/qmp/qstring.h"
Visitor *qobject_input_visitor_new_qmp(QObject *obj)
{
@@ -220,6 +221,20 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList *cmds, QObject *requ
assert(!(oob && qemu_in_coroutine()));
assert(monitor_cur() == NULL);
+
+ json = qobject_to_json(QOBJECT(args));
+ if (json) {
+ if ((strcmp(command, "query-block-jobs") != 0)
+ && (strcmp(command, "query-migrate") != 0)
+ && (strcmp(command, "query-blockstats") != 0)
+ && (strcmp(command, "query-balloon") != 0)
+ && (strcmp(command, "set_password") != 0)) {
+ qemu_log("qmp_cmd_name: %s, arguments: %s\n",
+ command, json->str);
+ }
+ g_string_free(json, true);
+ }
+
if (!!(cmd->options & QCO_COROUTINE) == qemu_in_coroutine()) {
monitor_set_cur(qemu_coroutine_self(), cur_mon);
cmd->fn(args, &ret, &err);
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index b10e483a9a..5b35704b5e 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -644,6 +644,7 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
if (path != NULL) {
bus = qbus_find(path, errp);
if (!bus) {
+ qemu_log("can not find bus for %s\n", driver);
return NULL;
}
if (!object_dynamic_cast(OBJECT(bus), dc->bus_type)) {
@@ -714,6 +715,8 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
object_set_properties_from_keyval(&dev->parent_obj, dev->opts, from_json,
errp);
if (*errp) {
+ qemu_log("the bus %s -driver %s set property failed\n",
+ bus ? bus->name : "None", driver);
goto err_del_dev;
}
qemu_log("add qdev %s:%s success\n", driver, dev->id ? dev->id : "none");
@@ -738,6 +741,8 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp)
ret = qdev_device_add_from_qdict(qdict, false, errp);
if (ret) {
+ qemu_log("add qdev %s:%s success\n", qemu_opt_get(opts, "driver"),
+ qemu_opts_id(opts) ? qemu_opts_id(opts) : "none");
qemu_opts_del(opts);
}
qobject_unref(qdict);
--
2.27.0

@ -0,0 +1,158 @@
From 9a47271fb6c855ec92e087d59d65f3cc0c684725 Mon Sep 17 00:00:00 2001
From: "wangxinxin.wang@huawei.com" <wangxinxin.wang@huawei.com>
Date: Sun, 17 Mar 2024 15:04:09 +0800
Subject: [PATCH] feature: Add logs for vm start and destroy
Add QEMU_LOG for vm start and destroy
Signed-off-by: miaoyubo <miaoyubo@huawei.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/acpi/core.c | 4 ++++
hw/core/reset.c | 2 ++
system/main.c | 2 ++
system/runstate.c | 2 ++
system/vl.c | 6 ++++++
5 files changed, 16 insertions(+)
diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index ec5e127d17..b6241f70e9 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -24,6 +24,7 @@
#include "hw/acpi/acpi.h"
#include "hw/nvram/fw_cfg.h"
#include "qemu/config-file.h"
+#include "qemu/log.h"
#include "qapi/error.h"
#include "qapi/opts-visitor.h"
#include "qapi/qapi-events-run-state.h"
@@ -588,13 +589,16 @@ static void acpi_pm_cnt_write(void *opaque, hwaddr addr, uint64_t val,
uint16_t sus_typ = (val >> 10) & 7;
switch (sus_typ) {
case 0: /* soft power off */
+ qemu_log("VM will be soft power off\n");
qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
break;
case 1:
+ qemu_log("VM will be suspend state\n");
qemu_system_suspend_request();
break;
default:
if (sus_typ == ar->pm1.cnt.s4_val) { /* S4 request */
+ qemu_log("VM will be S4 state\n");
qapi_event_send_suspend_disk();
qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
}
diff --git a/hw/core/reset.c b/hw/core/reset.c
index d3263b613e..fa63bfedb7 100644
--- a/hw/core/reset.c
+++ b/hw/core/reset.c
@@ -25,6 +25,7 @@
#include "qemu/osdep.h"
#include "qemu/queue.h"
+#include "qemu/log.h"
#include "sysemu/reset.h"
/* reset/shutdown handler */
@@ -75,6 +76,7 @@ void qemu_devices_reset(ShutdownCause reason)
{
QEMUResetEntry *re, *nre;
+ qemu_log("reset all devices\n");
/* reset all devices */
QTAILQ_FOREACH_SAFE(re, &reset_handlers, entry, nre) {
if (reason == SHUTDOWN_CAUSE_SNAPSHOT_LOAD &&
diff --git a/system/main.c b/system/main.c
index 9b91d21ea8..28bb283ebf 100644
--- a/system/main.c
+++ b/system/main.c
@@ -23,6 +23,7 @@
*/
#include "qemu/osdep.h"
+#include "qemu/log.h"
#include "qemu-main.h"
#include "sysemu/sysemu.h"
@@ -34,6 +35,7 @@ int qemu_default_main(void)
{
int status;
+ qemu_log("qemu enter main_loop\n");
status = qemu_main_loop();
qemu_cleanup(status);
diff --git a/system/runstate.c b/system/runstate.c
index 62e6db8d42..538c645326 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -769,9 +769,11 @@ static bool main_loop_should_exit(int *status)
}
if (qemu_powerdown_requested()) {
qemu_system_powerdown();
+ qemu_log("domain is power down by outside operation\n");
}
if (qemu_vmstop_requested(&r)) {
vm_stop(r);
+ qemu_log("domain is stopped by outside operation\n");
}
return false;
}
diff --git a/system/vl.c b/system/vl.c
index 2bcd9efb9a..165c3cae8a 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -26,6 +26,7 @@
#include "qemu/help-texts.h"
#include "qemu/datadir.h"
#include "qemu/units.h"
+#include "qemu/log.h"
#include "exec/cpu-common.h"
#include "exec/page-vary.h"
#include "hw/qdev-properties.h"
@@ -2633,6 +2634,7 @@ static void qemu_create_cli_devices(void)
}
/* init generic devices */
+ qemu_log("device init start\n");
rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
qemu_opts_foreach(qemu_find_opts("device"),
device_init_func, NULL, &error_fatal);
@@ -2778,6 +2780,7 @@ void qemu_init(int argc, char **argv)
qemu_init_subsystems();
+ qemu_log("qemu pid is %d, options parsing start\n", getpid());
/* first pass of option parsing */
optind = 1;
while (optind < argc) {
@@ -2997,6 +3000,7 @@ void qemu_init(int argc, char **argv)
exit(0);
break;
case QEMU_OPTION_m:
+ qemu_log("memory options parse start\n");
opts = qemu_opts_parse_noisily(qemu_find_opts("memory"), optarg, true);
if (opts == NULL) {
exit(1);
@@ -3714,6 +3718,7 @@ void qemu_init(int argc, char **argv)
*/
machine_class = MACHINE_GET_CLASS(current_machine);
+ qemu_log("configure accelerator %s start\n", machine_class->name);
if (!qtest_enabled() && machine_class->deprecation_reason) {
warn_report("Machine type '%s' is deprecated: %s",
machine_class->name, machine_class->deprecation_reason);
@@ -3732,6 +3737,7 @@ void qemu_init(int argc, char **argv)
*/
migration_object_init();
+ qemu_log("machine init start\n");
/* parse features once if machine provides default cpu_type */
current_machine->cpu_type = machine_class->default_cpu_type;
if (cpu_option) {
--
2.27.0

@ -0,0 +1,46 @@
From 97335ac382e36db18a61d3891f1fafd15475822e Mon Sep 17 00:00:00 2001
From: caojinhuahw <caojinhua1@huawei.com>
Date: Mon, 19 Dec 2022 12:35:50 +0000
Subject: [PATCH] fix qemu-core when vhost-user-net config with server mode
commit 3a223111d7 set a default reconnect time for the vhost-user-net
device. If vhost-user-net is configured in server mode, this causes
QEMU to dump core when the OVS client stops:
tcp_chr_disconnect ---> set tcp_char state disconnect
tcp_chr start reconnect ---> set tcp_char state connecting
tcp_char is listen ---> call tcp_chr_accept()
tcp_chr_accept() tries to set the tcp_char state to connecting, but
the state is already connecting, so the assertion in
tcp_chr_change_state() fails and QEMU dumps core:
assert(s->state == TCP_CHARDEV_STATE_DISCONNECTED)
This commit checks the tcp_char mode: if tcp_char is configured in server
mode, do not set the reconnect time for tcp_chr.
fix: 3a223111d7 vhost-user: Add support reconnect vhost-user socket
Signed-off-by: caojinhuahw <caojinhua1@huawei.com>
---
chardev/char-socket.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 9c60e15c8e..0c9ab069ae 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -347,6 +347,12 @@ static void tcp_chr_set_reconnect_time(Chardev *chr,
void qemu_chr_set_reconnect_time(Chardev *chr, int64_t reconnect_time)
{
ChardevClass *cc = CHARDEV_GET_CLASS(chr);
+ SocketChardev *s = SOCKET_CHARDEV(chr);
+
+ /* if sock dev is listen, dont set reconnect time */
+ if (s->is_listen) {
+ return;
+ }
if (cc->chr_set_reconnect_time) {
cc->chr_set_reconnect_time(chr, reconnect_time);
--
2.27.0
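
The failure sequence described in the commit message above can be modelled with a small state machine. This is a simplified sketch, not QEMU code: `ToyChardev`, `toy_change_to_connecting`, `toy_set_reconnect_time` and `toy_disconnect_then_accept` are hypothetical names, and the function returns false at the point where the real code would hit the assertion.

```c
#include <stdbool.h>

typedef enum { CHR_DISCONNECTED, CHR_CONNECTING, CHR_CONNECTED } ChrState;

/* Hypothetical, much-reduced model of a socket chardev. */
typedef struct {
    ChrState state;
    bool is_listen;       /* true when configured in server mode */
    long reconnect_time;  /* 0 means no reconnect timer */
} ToyChardev;

/* Only DISCONNECTED -> CONNECTING is legal; false models the failed assert. */
static bool toy_change_to_connecting(ToyChardev *c)
{
    if (c->state != CHR_DISCONNECTED) {
        return false;
    }
    c->state = CHR_CONNECTING;
    return true;
}

/* With guarded == true this mirrors the fix: listeners get no reconnect time. */
static void toy_set_reconnect_time(ToyChardev *c, long t, bool guarded)
{
    if (guarded && c->is_listen) {
        return;
    }
    c->reconnect_time = t;
}

/* Peer goes away, the reconnect timer (if any) fires, then a peer is accepted. */
static bool toy_disconnect_then_accept(ToyChardev *c)
{
    c->state = CHR_DISCONNECTED;
    if (c->reconnect_time > 0) {
        toy_change_to_connecting(c);  /* reconnect path moves to CONNECTING */
    }
    return toy_change_to_connecting(c);  /* accept path; fails if already CONNECTING */
}
```

In this model a listening chardev with a reconnect time reaches the accept path already in the connecting state, which is the double transition the patch avoids by returning early when `is_listen` is set.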

@ -0,0 +1,129 @@
From 0a6baf4799dd6e70d7959002ea6ddb998eddbc6d Mon Sep 17 00:00:00 2001
From: "shenghualong@huawei.com" <shenghualong@huawei.com>
Date: Mon, 18 Mar 2024 15:53:43 +0800
Subject: [PATCH] freeclock: add qmp command to get time offset of vm in
seconds
When the system time is set in a VM, an RTC_CHANGE event is reported.
However, if libvirt is restarted while the event is being reported, the
event is lost and we get the old time (not the time we set in the
VM) after rebooting the VM.
We save the delta time in QEMU and add a query-rtc-date-diff QMP command to
get the delta time, so that libvirt can obtain the latest time in the VM
from this command after libvirt is restarted.
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: zhangxinhao <zhangxinhao1@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/core/machine-qmp-cmds.c | 6 ++++++
include/sysemu/rtc.h | 4 +++-
qapi/misc.json | 9 +++++++++
qapi/pragma.json | 3 ++-
system/rtc.c | 11 +++++++++++
5 files changed, 31 insertions(+), 2 deletions(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 3860a50c3b..f1389ef644 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -8,6 +8,7 @@
*/
#include "qemu/osdep.h"
+#include "sysemu/rtc.h"
#include "hw/acpi/vmgenid.h"
#include "hw/boards.h"
#include "hw/intc/intc.h"
@@ -373,6 +374,11 @@ HumanReadableText *qmp_x_query_irq(Error **errp)
return human_readable_text_from_str(buf);
}
+int64_t qmp_query_rtc_date_diff(Error **errp)
+{
+ return get_rtc_date_diff();
+}
+
GuidInfo *qmp_query_vm_generation_id(Error **errp)
{
GuidInfo *info;
diff --git a/include/sysemu/rtc.h b/include/sysemu/rtc.h
index 0fc8ad6fdf..3edae762d4 100644
--- a/include/sysemu/rtc.h
+++ b/include/sysemu/rtc.h
@@ -54,5 +54,7 @@ void qemu_get_timedate(struct tm *tm, time_t offset);
* then this function will return 3600.
*/
time_t qemu_timedate_diff(struct tm *tm);
-
+time_t get_rtc_date_diff(void);
+void set_rtc_date_diff(time_t diff);
+int64_t qmp_query_rtc_date_diff(Error **errp);
#endif
diff --git a/qapi/misc.json b/qapi/misc.json
index cda2effa81..1832d5f460 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -550,6 +550,15 @@
'returns': ['CommandLineOptionInfo'],
'allow-preconfig': true}
+##
+# @query-rtc-date-diff:
+#
+# get vm's time offset
+#
+# Since: 2.8
+##
+{ 'command': 'query-rtc-date-diff', 'returns': 'int64' }
+
##
# @RTC_CHANGE:
#
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 0aa4eeddd3..7a07b44bb1 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -30,7 +30,8 @@
'qom-get',
'query-tpm-models',
'query-tpm-types',
- 'ringbuf-read' ],
+ 'ringbuf-read',
+ 'query-rtc-date-diff'],
# Externally visible types whose member names may use uppercase
'member-name-exceptions': [ # visible in:
'ACPISlotType', # query-acpi-ospm-status
diff --git a/system/rtc.c b/system/rtc.c
index 4904581abe..e16b5fffc5 100644
--- a/system/rtc.c
+++ b/system/rtc.c
@@ -44,6 +44,7 @@ static time_t rtc_ref_start_datetime;
static int rtc_realtime_clock_offset; /* used only with QEMU_CLOCK_REALTIME */
static int rtc_host_datetime_offset = -1; /* valid & used only with
RTC_BASE_DATETIME */
+static time_t rtc_date_diff = 0;
QEMUClockType rtc_clock;
/***********************************************************/
/* RTC reference time/date access */
@@ -108,6 +109,16 @@ time_t qemu_timedate_diff(struct tm *tm)
return seconds - qemu_ref_timedate(QEMU_CLOCK_HOST);
}
+time_t get_rtc_date_diff(void)
+{
+ return rtc_date_diff;
+}
+
+void set_rtc_date_diff(time_t diff)
+{
+ rtc_date_diff = diff;
+}
+
static void configure_rtc_base_datetime(const char *startdate)
{
time_t rtc_start_datetime;
--
2.27.0
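
The idea behind the patch above is that an event can be missed, while a stored value can be queried at any later time. The toy model below illustrates that difference. It is a sketch under stated assumptions: `toy_set_diff`, `toy_get_diff`, `toy_guest_sets_time` and the `toy_last_event_seen` variable are hypothetical and only imitate the roles of the setter, the getter and a management-side event listener.

```c
#include <stdbool.h>
#include <time.h>

static time_t toy_rtc_date_diff;    /* value kept inside the emulator */
static time_t toy_last_event_seen;  /* what the listener received, 0 if nothing */

static void toy_set_diff(time_t diff)
{
    toy_rtc_date_diff = diff;
}

static time_t toy_get_diff(void)
{
    return toy_rtc_date_diff;
}

/*
 * The guest changes its clock. The offset is always stored; the event only
 * reaches the listener if it is running at that moment.
 */
static void toy_guest_sets_time(time_t diff, bool listener_up)
{
    toy_set_diff(diff);
    if (listener_up) {
        toy_last_event_seen = toy_get_diff();
    }
}
```

If the listener is down when the guest changes its clock, `toy_last_event_seen` stays stale, but a later call to `toy_get_diff()` still returns the current offset, which is the role the new query command plays for libvirt.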

@ -0,0 +1,31 @@
From 0a0010fe0656a63e82aea495ab0a59145d3b5750 Mon Sep 17 00:00:00 2001
From: "shenghualong@huawei.com" <shenghualong@huawei.com>
Date: Thu, 21 Mar 2024 12:26:38 +0800
Subject: [PATCH] freeclock: set rtc_date_diff for X86
Set rtc_date_diff in mc146818rtc.
Signed-off-by: l00500761 <liuxiangdong5@huawei.com>
Signed-off-by: zhangxinhao <zhangxinhao1@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/rtc/mc146818rtc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/rtc/mc146818rtc.c b/hw/rtc/mc146818rtc.c
index 2d391a8396..e61c76d060 100644
--- a/hw/rtc/mc146818rtc.c
+++ b/hw/rtc/mc146818rtc.c
@@ -606,7 +606,8 @@ static void rtc_set_time(MC146818RtcState *s)
s->base_rtc = mktimegm(&tm);
s->last_update = qemu_clock_get_ns(rtc_clock);
- qapi_event_send_rtc_change(qemu_timedate_diff(&tm), qom_path);
+ set_rtc_date_diff(qemu_timedate_diff(&tm));
+ qapi_event_send_rtc_change(get_rtc_date_diff(), qom_path);
}
static void rtc_set_cmos(MC146818RtcState *s, const struct tm *tm)
--
2.27.0

@ -0,0 +1,31 @@
From 156be254a48d1d9b7aadcbfa4423485c592bc75d Mon Sep 17 00:00:00 2001
From: "shenghualong@huawei.com" <shenghualong@huawei.com>
Date: Thu, 21 Mar 2024 11:21:14 +0800
Subject: [PATCH] freeclock: set rtc_date_diff for arm
Set rtc_date_diff in pl031.
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: zhangxinhao <zhangxinhao1@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/rtc/pl031.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/rtc/pl031.c b/hw/rtc/pl031.c
index b01d0e75d1..f2e6baebba 100644
--- a/hw/rtc/pl031.c
+++ b/hw/rtc/pl031.c
@@ -144,7 +144,8 @@ static void pl031_write(void * opaque, hwaddr offset,
s->tick_offset += value - pl031_get_count(s);
qemu_get_timedate(&tm, s->tick_offset);
- qapi_event_send_rtc_change(qemu_timedate_diff(&tm), qom_path);
+ set_rtc_date_diff(qemu_timedate_diff(&tm));
+ qapi_event_send_rtc_change(get_rtc_date_diff(), qom_path);
pl031_set_alarm(s);
break;
--
2.27.0

@ -0,0 +1,352 @@
From 7d3d37d3af4278aee627952d6a81b63dec6ac62b Mon Sep 17 00:00:00 2001
From: Ying Fang <fangying1@huawei.com>
Date: Sun, 17 Mar 2024 18:56:09 +0800
Subject: [PATCH] hw/arm64: add vcpu cache info support
Support VCPU Cache info by dtb and PPTT table, including L1, L2 and L3 Cache.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Honghao <honghao5@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/acpi/aml-build.c | 158 ++++++++++++++++++++++++++++++++++++
hw/arm/virt.c | 72 ++++++++++++++++
include/hw/acpi/aml-build.h | 47 +++++++++++
3 files changed, 277 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index af66bde0f5..2968df5562 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1994,6 +1994,163 @@ static void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
}
}
+#ifdef __aarch64__
+/*
+ * ACPI spec, Revision 6.3
+ * 5.2.29.2 Cache Type Structure (Type 1)
+ */
+static void build_cache_hierarchy_node(GArray *tbl, uint32_t next_level,
+ uint32_t cache_type)
+{
+ build_append_byte(tbl, 1);
+ build_append_byte(tbl, 24);
+ build_append_int_noprefix(tbl, 0, 2);
+ build_append_int_noprefix(tbl, 127, 4);
+ build_append_int_noprefix(tbl, next_level, 4);
+
+ switch (cache_type) {
+ case ARM_L1D_CACHE: /* L1 dcache info */
+ build_append_int_noprefix(tbl, ARM_L1DCACHE_SIZE, 4);
+ build_append_int_noprefix(tbl, ARM_L1DCACHE_SETS, 4);
+ build_append_byte(tbl, ARM_L1DCACHE_ASSOCIATIVITY);
+ build_append_byte(tbl, ARM_L1DCACHE_ATTRIBUTES);
+ build_append_int_noprefix(tbl, ARM_L1DCACHE_LINE_SIZE, 2);
+ break;
+ case ARM_L1I_CACHE: /* L1 icache info */
+ build_append_int_noprefix(tbl, ARM_L1ICACHE_SIZE, 4);
+ build_append_int_noprefix(tbl, ARM_L1ICACHE_SETS, 4);
+ build_append_byte(tbl, ARM_L1ICACHE_ASSOCIATIVITY);
+ build_append_byte(tbl, ARM_L1ICACHE_ATTRIBUTES);
+ build_append_int_noprefix(tbl, ARM_L1ICACHE_LINE_SIZE, 2);
+ break;
+ case ARM_L2_CACHE: /* L2 cache info */
+ build_append_int_noprefix(tbl, ARM_L2CACHE_SIZE, 4);
+ build_append_int_noprefix(tbl, ARM_L2CACHE_SETS, 4);
+ build_append_byte(tbl, ARM_L2CACHE_ASSOCIATIVITY);
+ build_append_byte(tbl, ARM_L2CACHE_ATTRIBUTES);
+ build_append_int_noprefix(tbl, ARM_L2CACHE_LINE_SIZE, 2);
+ break;
+ case ARM_L3_CACHE: /* L3 cache info */
+ build_append_int_noprefix(tbl, ARM_L3CACHE_SIZE, 4);
+ build_append_int_noprefix(tbl, ARM_L3CACHE_SETS, 4);
+ build_append_byte(tbl, ARM_L3CACHE_ASSOCIATIVITY);
+ build_append_byte(tbl, ARM_L3CACHE_ATTRIBUTES);
+ build_append_int_noprefix(tbl, ARM_L3CACHE_LINE_SIZE, 2);
+ break;
+ default:
+ build_append_int_noprefix(tbl, 0, 4);
+ build_append_int_noprefix(tbl, 0, 4);
+ build_append_byte(tbl, 0);
+ build_append_byte(tbl, 0);
+ build_append_int_noprefix(tbl, 0, 2);
+ }
+}
+
+/*
+ * ACPI spec, Revision 6.3
+ * 5.2.29 Processor Properties Topology Table (PPTT)
+ */
+void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
+ const char *oem_id, const char *oem_table_id)
+{
+ MachineClass *mc = MACHINE_GET_CLASS(ms);
+ GQueue *list = g_queue_new();
+ guint pptt_start = table_data->len;
+ guint parent_offset;
+ guint length, i;
+ int uid = 0;
+ int socket;
+ AcpiTable table = { .sig = "PPTT", .rev = 2,
+ .oem_id = oem_id, .oem_table_id = oem_table_id };
+
+ acpi_table_begin(&table, table_data);
+
+ for (socket = 0; socket < ms->smp.sockets; socket++) {
+ uint32_t l3_cache_offset = table_data->len - pptt_start;
+ build_cache_hierarchy_node(table_data, 0, ARM_L3_CACHE);
+
+ g_queue_push_tail(list,
+ GUINT_TO_POINTER(table_data->len - pptt_start));
+ build_processor_hierarchy_node(
+ table_data,
+ /*
+ * Physical package - represents the boundary
+ * of a physical package
+ */
+ (1 << 0),
+ 0, socket, &l3_cache_offset, 1);
+ }
+
+ if (mc->smp_props.clusters_supported) {
+ length = g_queue_get_length(list);
+ for (i = 0; i < length; i++) {
+ int cluster;
+
+ parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
+ for (cluster = 0; cluster < ms->smp.clusters; cluster++) {
+ g_queue_push_tail(list,
+ GUINT_TO_POINTER(table_data->len - pptt_start));
+ build_processor_hierarchy_node(
+ table_data,
+ (0 << 0), /* not a physical package */
+ parent_offset, cluster, NULL, 0);
+ }
+ }
+ }
+
+ length = g_queue_get_length(list);
+ for (i = 0; i < length; i++) {
+ int core;
+
+ parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
+ for (core = 0; core < ms->smp.cores; core++) {
+ uint32_t priv_rsrc[3] = {};
+ priv_rsrc[0] = table_data->len - pptt_start; /* L2 cache offset */
+ build_cache_hierarchy_node(table_data, 0, ARM_L2_CACHE);
+
+ priv_rsrc[1] = table_data->len - pptt_start; /* L1 dcache offset */
+ build_cache_hierarchy_node(table_data, priv_rsrc[0], ARM_L1D_CACHE);
+
+ priv_rsrc[2] = table_data->len - pptt_start; /* L1 icache offset */
+ build_cache_hierarchy_node(table_data, priv_rsrc[0], ARM_L1I_CACHE);
+
+ if (ms->smp.threads > 1) {
+ g_queue_push_tail(list,
+ GUINT_TO_POINTER(table_data->len - pptt_start));
+ build_processor_hierarchy_node(
+ table_data,
+ (0 << 0), /* not a physical package */
+ parent_offset, core, priv_rsrc, 3);
+ } else {
+ build_processor_hierarchy_node(
+ table_data,
+ (1 << 1) | /* ACPI Processor ID valid */
+ (1 << 3), /* Node is a Leaf */
+ parent_offset, uid++, priv_rsrc, 3);
+ }
+ }
+ }
+
+ length = g_queue_get_length(list);
+ for (i = 0; i < length; i++) {
+ int thread;
+
+ parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
+ for (thread = 0; thread < ms->smp.threads; thread++) {
+ build_processor_hierarchy_node(
+ table_data,
+ (1 << 1) | /* ACPI Processor ID valid */
+ (1 << 2) | /* Processor is a Thread */
+ (1 << 3), /* Node is a Leaf */
+ parent_offset, uid++, NULL, 0);
+ }
+ }
+
+ g_queue_free(list);
+ acpi_table_end(linker, &table);
+}
+
+#else
/*
* ACPI spec, Revision 6.3
* 5.2.29 Processor Properties Topology Table (PPTT)
@@ -2069,6 +2226,7 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
acpi_table_end(linker, &table);
}
+#endif
/* build rev1/rev3/rev5.1/rev6.0 FADT */
void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 500a15aa5b..b82bd1b8c8 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -379,6 +379,72 @@ static void fdt_add_timer_nodes(const VirtMachineState *vms)
INTID_TO_PPI(ARCH_TIMER_NS_EL2_IRQ), irqflags);
}
+static void fdt_add_l3cache_nodes(const VirtMachineState *vms)
+{
+ int i;
+ const MachineState *ms = MACHINE(vms);
+ int cpus_per_socket = ms->smp.clusters * ms->smp.cores * ms->smp.threads;
+ int sockets = (ms->smp.cpus + cpus_per_socket - 1) / cpus_per_socket;
+
+ for (i = 0; i < sockets; i++) {
+ char *nodename = g_strdup_printf("/cpus/l3-cache%d", i);
+
+ qemu_fdt_add_subnode(ms->fdt, nodename);
+ qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", "cache");
+ qemu_fdt_setprop_string(ms->fdt, nodename, "cache-unified", "true");
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "cache-level", 3);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "cache-size", 0x2000000);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "cache-line-size", 128);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "cache-sets", 2048);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
+ qemu_fdt_alloc_phandle(ms->fdt));
+ g_free(nodename);
+ }
+}
+
+static void fdt_add_l2cache_nodes(const VirtMachineState *vms)
+{
+ const MachineState *ms = MACHINE(vms);
+ int cpus_per_socket = ms->smp.clusters * ms->smp.cores * ms->smp.threads;
+ int cpu;
+
+ for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
+ char *next_path = g_strdup_printf("/cpus/l3-cache%d",
+ cpu / cpus_per_socket);
+ char *nodename = g_strdup_printf("/cpus/l2-cache%d", cpu);
+
+ qemu_fdt_add_subnode(ms->fdt, nodename);
+ qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", "cache");
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "cache-size", 0x80000);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "cache-line-size", 64);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "cache-sets", 1024);
+ qemu_fdt_setprop_phandle(ms->fdt, nodename, "next-level-cache",
+ next_path);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
+ qemu_fdt_alloc_phandle(ms->fdt));
+
+ g_free(next_path);
+ g_free(nodename);
+ }
+}
+
+static void fdt_add_l1cache_prop(const VirtMachineState *vms,
+ char *nodename, int cpu)
+{
+ const MachineState *ms = MACHINE(vms);
+ char *cachename = g_strdup_printf("/cpus/l2-cache%d", cpu);
+
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "d-cache-size", 0x10000);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "d-cache-line-size", 64);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "d-cache-sets", 256);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "i-cache-size", 0x10000);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "i-cache-line-size", 64);
+ qemu_fdt_setprop_cell(ms->fdt, nodename, "i-cache-sets", 256);
+ qemu_fdt_setprop_phandle(ms->fdt, nodename, "next-level-cache",
+ cachename);
+ g_free(cachename);
+}
+
static void fdt_add_cpu_nodes(const VirtMachineState *vms)
{
int cpu;
@@ -413,6 +479,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
qemu_fdt_setprop_cell(ms->fdt, "/cpus", "#address-cells", addr_cells);
qemu_fdt_setprop_cell(ms->fdt, "/cpus", "#size-cells", 0x0);
+ if (!vmc->no_cpu_topology) {
+ fdt_add_l3cache_nodes(vms);
+ fdt_add_l2cache_nodes(vms);
+ }
+
for (cpu = smp_cpus - 1; cpu >= 0; cpu--) {
char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
@@ -442,6 +513,7 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
}
if (!vmc->no_cpu_topology) {
+ fdt_add_l1cache_prop(vms, nodename, cpu);
qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
qemu_fdt_alloc_phandle(ms->fdt));
}
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index ff2a310270..84ded2ecd3 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -221,6 +221,53 @@ struct AcpiBuildTables {
BIOSLinker *linker;
} AcpiBuildTables;
+#ifdef __aarch64__
+/* Definitions of the hardcoded cache info*/
+
+typedef enum {
+ ARM_L1D_CACHE,
+ ARM_L1I_CACHE,
+ ARM_L2_CACHE,
+ ARM_L3_CACHE
+} ArmCacheType;
+
+/* L1 data cache: */
+#define ARM_L1DCACHE_SIZE 65536
+#define ARM_L1DCACHE_SETS 256
+#define ARM_L1DCACHE_ASSOCIATIVITY 4
+#define ARM_L1DCACHE_ATTRIBUTES 2
+#define ARM_L1DCACHE_LINE_SIZE 64
+
+/* L1 instruction cache: */
+#define ARM_L1ICACHE_SIZE 65536
+#define ARM_L1ICACHE_SETS 256
+#define ARM_L1ICACHE_ASSOCIATIVITY 4
+#define ARM_L1ICACHE_ATTRIBUTES 4
+#define ARM_L1ICACHE_LINE_SIZE 64
+
+/* Level 2 unified cache: */
+#define ARM_L2CACHE_SIZE 524288
+#define ARM_L2CACHE_SETS 1024
+#define ARM_L2CACHE_ASSOCIATIVITY 8
+#define ARM_L2CACHE_ATTRIBUTES 10
+#define ARM_L2CACHE_LINE_SIZE 64
+
+/* Level 3 unified cache: */
+#define ARM_L3CACHE_SIZE 33554432
+#define ARM_L3CACHE_SETS 2048
+#define ARM_L3CACHE_ASSOCIATIVITY 15
+#define ARM_L3CACHE_ATTRIBUTES 10
+#define ARM_L3CACHE_LINE_SIZE 128
+
+struct offset_status {
+ uint32_t parent;
+ uint32_t l2_offset;
+ uint32_t l1d_offset;
+ uint32_t l1i_offset;
+};
+
+#endif
+
typedef
struct CrsRangeEntry {
uint64_t base;
--
2.27.0
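Reviewer note on the hardcoded cache geometry in include/hw/acpi/aml-build.h above: for a set-associative cache one would normally expect size = sets * associativity * line size. The sketch below is a hypothetical standalone helper (its names are not part of the patch) that checks this identity. Under that assumption the L1D, L1I and L2 macros are self-consistent, while the L3 macros are not (2048 * 15 * 128 = 3932160, not 33554432). That may be intentional, for example if the values mirror what a particular host reports.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: capacity implied by geometry. */
static uint64_t cache_geometry_bytes(uint32_t sets, uint32_t ways,
                                     uint32_t line_size)
{
    return (uint64_t)sets * ways * line_size;
}

/* Returns 1 when the advertised size equals sets * ways * line size. */
static int cache_geometry_consistent(uint32_t size, uint32_t sets,
                                     uint32_t ways, uint32_t line_size)
{
    return cache_geometry_bytes(sets, ways, line_size) == size;
}
```

For example, cache_geometry_consistent(ARM_L2CACHE_SIZE, ARM_L2CACHE_SETS, ARM_L2CACHE_ASSOCIATIVITY, ARM_L2CACHE_LINE_SIZE) holds for the values defined in the header.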

@ -0,0 +1,40 @@
From c3f204e02eacdd3e9ec6ac55396ccc7f115ad63e Mon Sep 17 00:00:00 2001
From: Qiang Ning <ningqiang1@huawei.com>
Date: Mon, 12 Jul 2021 17:30:45 +0800
Subject: [PATCH] hw/net/rocker_of_dpa: fix double free bug of rocker device
The of_dpa_cmd_add_l2_flood function of the rocker device
frees group->l2_flood.group_ids before allocating new memory.
If an l2_group configured by the guest does not match the
input group->l2_flood.group_ids, the function jumps to
err_out, which frees group->l2_flood.group_ids without
setting the pointer to NULL. When the guest invokes
of_dpa_cmd_add_l2_flood again, the stale pointer is freed
a second time, which is a double free.
Fix that by setting group->l2_flood.group_ids to NULL after free.
Signed-off-by: Jiajie Li <lijiajie11@huawei.com>
Signed-off-by: Qiang Ning <ningqiang1@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
---
hw/net/rocker/rocker_of_dpa.c | 1 +
1 file changed, 1 insertion(+)
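Note for reviewers: the pattern behind this fix can be sketched in isolation. This is a minimal model, assuming a simplified stand-in struct (struct flood_group and both functions are hypothetical names) and plain free() instead of g_free(). Clearing the pointer makes a second pass through the cleanup path harmless, because freeing NULL is a no-op.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the l2_flood part of OfDpaGroup. */
struct flood_group {
    uint16_t group_count;
    uint32_t *group_ids;
};

/* Error-path cleanup: free the array and drop the stale pointer. */
static void flood_group_reset(struct flood_group *g)
{
    g->group_count = 0;
    free(g->group_ids);
    g->group_ids = NULL;   /* a later free() of NULL is a no-op */
}

/* Simulates the cleanup path running twice on the same group. */
static int flood_group_fail_twice(void)
{
    struct flood_group g = { 0, NULL };

    g.group_ids = calloc(4, sizeof(*g.group_ids));
    if (!g.group_ids) {
        return -1;
    }
    g.group_count = 4;

    flood_group_reset(&g);   /* first failing command */
    flood_group_reset(&g);   /* second one: safe, pointer is NULL */
    return g.group_ids == NULL ? 0 : -1;
}
```

Without the NULL assignment, the second call would pass an already freed pointer to free().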
diff --git a/hw/net/rocker/rocker_of_dpa.c b/hw/net/rocker/rocker_of_dpa.c
index 5e16056be6..c25438cccc 100644
--- a/hw/net/rocker/rocker_of_dpa.c
+++ b/hw/net/rocker/rocker_of_dpa.c
@@ -2070,6 +2070,7 @@ static int of_dpa_cmd_add_l2_flood(OfDpa *of_dpa, OfDpaGroup *group,
err_out:
group->l2_flood.group_count = 0;
g_free(group->l2_flood.group_ids);
+ group->l2_flood.group_ids = NULL;
g_free(tlvs);
return err;
--
2.27.0

@ -0,0 +1,459 @@
From dc7e40b2841132b0bc43d25c2c31f41ae3fa2c68 Mon Sep 17 00:00:00 2001
From: eillon <yezhenyu2@huawei.com>
Date: Tue, 8 Feb 2022 22:43:59 -0500
Subject: [PATCH] hw/usb: reduce the vcpu cost of UHCI when VNC disconnect
Reduce the vcpu cost by setting a lower FRAME_TIMER_FREQ for the UHCI
when the VNC client is disconnected. This reduces the cost of the
vcpu thread by about 3%.
Signed-off-by: eillon <yezhenyu2@huawei.com>
---
hw/usb/core.c | 5 ++--
hw/usb/desc.c | 7 +++--
hw/usb/dev-hid.c | 2 +-
hw/usb/hcd-uhci.c | 63 ++++++++++++++++++++++++++++++++++------
hw/usb/hcd-uhci.h | 1 +
hw/usb/host-libusb.c | 32 ++++++++++++++++++++
include/hw/usb.h | 1 +
include/qemu/timer.h | 28 ++++++++++++++++++
ui/vnc.c | 4 +++
util/qemu-timer.c | 69 ++++++++++++++++++++++++++++++++++++++++++++
10 files changed, 197 insertions(+), 15 deletions(-)
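Note for reviewers: the mode switching added in util/qemu-timer.c is reference counted, and the frame period is derived from the frequency as NANOSECONDS_PER_SECOND / freq. The sketch below is a simplified standalone model of that logic, not the QEMU code itself. The struct and function names are hypothetical, and only the two frequencies (1000 Hz and 10 Hz) are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NS_PER_SECOND 1000000000LL
#define MODEL_NORMAL_FREQ   1000   /* Hz, mirrors QEMU_USB_NORMAL_FREQ */
#define MODEL_LAZY_FREQ     10     /* Hz, mirrors QEMU_USB_LAZY_FREQ */

/* Hypothetical model of the per-controller timer state. */
struct usb_freq_model {
    int refs;   /* number of active "normal mode" users */
    int freq;   /* current frame timer frequency in Hz */
};

/* normal != 0: VNC connect or USB passthrough; normal == 0: the reverse. */
static void model_set_mode(struct usb_freq_model *m, int normal)
{
    if (normal) {
        if (m->refs++ > 0) {
            return;             /* already in normal mode */
        }
        m->freq = MODEL_NORMAL_FREQ;
    } else {
        if (--m->refs > 0) {
            return;             /* other users still need normal mode */
        }
        m->freq = MODEL_LAZY_FREQ;
    }
}

/* Frequency after a number of connects followed by disconnects. */
static int model_freq_after(int connects, int disconnects)
{
    struct usb_freq_model m = { 0, MODEL_LAZY_FREQ };
    int i;

    assert(connects >= 0 && disconnects >= 0);
    for (i = 0; i < connects; i++) {
        model_set_mode(&m, 1);
    }
    for (i = 0; i < disconnects; i++) {
        model_set_mode(&m, 0);
    }
    return m.freq;
}

/* Frame period in nanoseconds for a given frequency. */
static int64_t model_frame_time_ns(int freq)
{
    assert(freq > 0);
    return MODEL_NS_PER_SECOND / freq;
}
```

In this model the controller stays at 1000 Hz until the last user goes away. At 10 Hz the frame period grows from 1 ms to 100 ms, so the frame timer fires a hundred times less often.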
diff --git a/hw/usb/core.c b/hw/usb/core.c
index 975f76250a..51b36126ca 100644
--- a/hw/usb/core.c
+++ b/hw/usb/core.c
@@ -87,7 +87,7 @@ void usb_device_reset(USBDevice *dev)
return;
}
usb_device_handle_reset(dev);
- dev->remote_wakeup = 0;
+ dev->remote_wakeup &= ~USB_DEVICE_REMOTE_WAKEUP;
dev->addr = 0;
dev->state = USB_STATE_DEFAULT;
}
@@ -105,7 +105,8 @@ void usb_wakeup(USBEndpoint *ep, unsigned int stream)
*/
return;
}
- if (dev->remote_wakeup && dev->port && dev->port->ops->wakeup) {
+ if ((dev->remote_wakeup & USB_DEVICE_REMOTE_WAKEUP)
+ && dev->port && dev->port->ops->wakeup) {
dev->port->ops->wakeup(dev->port);
}
if (bus->ops->wakeup_endpoint) {
diff --git a/hw/usb/desc.c b/hw/usb/desc.c
index f2bdc05a95..333f73fff1 100644
--- a/hw/usb/desc.c
+++ b/hw/usb/desc.c
@@ -752,7 +752,7 @@ int usb_desc_handle_control(USBDevice *dev, USBPacket *p,
if (config->bmAttributes & USB_CFG_ATT_SELFPOWER) {
data[0] |= 1 << USB_DEVICE_SELF_POWERED;
}
- if (dev->remote_wakeup) {
+ if (dev->remote_wakeup & USB_DEVICE_REMOTE_WAKEUP) {
data[0] |= 1 << USB_DEVICE_REMOTE_WAKEUP;
}
data[1] = 0x00;
@@ -762,14 +762,15 @@ int usb_desc_handle_control(USBDevice *dev, USBPacket *p,
}
case DeviceOutRequest | USB_REQ_CLEAR_FEATURE:
if (value == USB_DEVICE_REMOTE_WAKEUP) {
- dev->remote_wakeup = 0;
+ dev->remote_wakeup &= ~USB_DEVICE_REMOTE_WAKEUP;
ret = 0;
}
trace_usb_clear_device_feature(dev->addr, value, ret);
break;
case DeviceOutRequest | USB_REQ_SET_FEATURE:
+ dev->remote_wakeup |= USB_DEVICE_REMOTE_WAKEUP_IS_SUPPORTED;
if (value == USB_DEVICE_REMOTE_WAKEUP) {
- dev->remote_wakeup = 1;
+ dev->remote_wakeup |= USB_DEVICE_REMOTE_WAKEUP;
ret = 0;
}
trace_usb_set_device_feature(dev->addr, value, ret);
diff --git a/hw/usb/dev-hid.c b/hw/usb/dev-hid.c
index bdd6d1ffaf..cc68d1ce9e 100644
--- a/hw/usb/dev-hid.c
+++ b/hw/usb/dev-hid.c
@@ -745,7 +745,7 @@ static int usb_ptr_post_load(void *opaque, int version_id)
{
USBHIDState *s = opaque;
- if (s->dev.remote_wakeup) {
+ if (s->dev.remote_wakeup & USB_DEVICE_REMOTE_WAKEUP) {
hid_pointer_activate(&s->hid);
}
return 0;
diff --git a/hw/usb/hcd-uhci.c b/hw/usb/hcd-uhci.c
index 6975966c3f..a92581ff5f 100644
--- a/hw/usb/hcd-uhci.c
+++ b/hw/usb/hcd-uhci.c
@@ -44,6 +44,8 @@
#include "hcd-uhci.h"
#define FRAME_TIMER_FREQ 1000
+#define FRAME_TIMER_FREQ_LAZY 10
+#define USB_DEVICE_NEED_NORMAL_FREQ "QEMU USB Tablet"
#define FRAME_MAX_LOOPS 256
@@ -109,6 +111,22 @@ static void uhci_async_cancel(UHCIAsync *async);
static void uhci_queue_fill(UHCIQueue *q, UHCI_TD *td);
static void uhci_resume(void *opaque);
+static int64_t uhci_frame_timer_freq = FRAME_TIMER_FREQ_LAZY;
+
+static void uhci_set_frame_freq(int freq)
+{
+ if (freq <= 0) {
+ return;
+ }
+
+ uhci_frame_timer_freq = freq;
+}
+
+static qemu_usb_controller qemu_uhci = {
+ .name = "uhci",
+ .qemu_set_freq = uhci_set_frame_freq,
+};
+
static inline int32_t uhci_queue_token(UHCI_TD *td)
{
if ((td->token & (0xf << 15)) == 0) {
@@ -351,7 +369,7 @@ static int uhci_post_load(void *opaque, int version_id)
if (version_id < 2) {
s->expire_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
- (NANOSECONDS_PER_SECOND / FRAME_TIMER_FREQ);
+ (NANOSECONDS_PER_SECOND / uhci_frame_timer_freq);
}
return 0;
}
@@ -392,8 +410,29 @@ static void uhci_port_write(void *opaque, hwaddr addr,
if ((val & UHCI_CMD_RS) && !(s->cmd & UHCI_CMD_RS)) {
/* start frame processing */
trace_usb_uhci_schedule_start();
- s->expire_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
- (NANOSECONDS_PER_SECOND / FRAME_TIMER_FREQ);
+
+ /*
+ * If the frame_timer frequency is too low, the guest OS (Win2012) may
+ * blue-screen after some vcpus are hotplugged.
+ * If this USB device supports remote wakeup, the UHCI controller
+ * enters global suspend mode after several seconds without input.
+ * In that case QEMU deletes the frame_timer, so it has no influence
+ * on VM performance and we can raise the frequency to 1000.
+ * The frequency is then safe when the frame_timer is triggered again.
+ * Apart from this, there are two ways to change the frequency:
+ * 1) VNC connect/disconnect; 2) attach/detach of a USB device.
+ */
+ if ((uhci_frame_timer_freq != FRAME_TIMER_FREQ)
+ && (s->ports[0].port.dev)
+ && (!memcmp(s->ports[0].port.dev->product_desc,
+ USB_DEVICE_NEED_NORMAL_FREQ, strlen(USB_DEVICE_NEED_NORMAL_FREQ)))
+ && (s->ports[0].port.dev->remote_wakeup & USB_DEVICE_REMOTE_WAKEUP_IS_SUPPORTED)) {
+ qemu_log("turn up the frequency of UHCI controller to %d\n", FRAME_TIMER_FREQ);
+ uhci_frame_timer_freq = FRAME_TIMER_FREQ;
+ }
+
+ s->frame_time = NANOSECONDS_PER_SECOND / FRAME_TIMER_FREQ;
+ s->expire_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + s->frame_time;
timer_mod(s->frame_timer, s->expire_time);
s->status &= ~UHCI_STS_HCHALTED;
} else if (!(val & UHCI_CMD_RS)) {
@@ -1083,7 +1122,6 @@ static void uhci_frame_timer(void *opaque)
UHCIState *s = opaque;
uint64_t t_now, t_last_run;
int i, frames;
- const uint64_t frame_t = NANOSECONDS_PER_SECOND / FRAME_TIMER_FREQ;
s->completions_only = false;
qemu_bh_cancel(s->bh);
@@ -1099,14 +1137,14 @@ static void uhci_frame_timer(void *opaque)
}
/* We still store expire_time in our state, for migration */
- t_last_run = s->expire_time - frame_t;
+ t_last_run = s->expire_time - s->frame_time;
t_now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
/* Process up to MAX_FRAMES_PER_TICK frames */
- frames = (t_now - t_last_run) / frame_t;
+ frames = (t_now - t_last_run) / s->frame_time;
if (frames > s->maxframes) {
int skipped = frames - s->maxframes;
- s->expire_time += skipped * frame_t;
+ s->expire_time += skipped * s->frame_time;
s->frnum = (s->frnum + skipped) & 0x7ff;
frames -= skipped;
}
@@ -1123,7 +1161,7 @@ static void uhci_frame_timer(void *opaque)
/* The spec says frnum is the frame currently being processed, and
* the guest must look at frnum - 1 on interrupt, so inc frnum now */
s->frnum = (s->frnum + 1) & 0x7ff;
- s->expire_time += frame_t;
+ s->expire_time += s->frame_time;
}
/* Complete the previous frame(s) */
@@ -1134,7 +1172,12 @@ static void uhci_frame_timer(void *opaque)
}
s->pending_int_mask = 0;
- timer_mod(s->frame_timer, t_now + frame_t);
+ /* expire_time was calculated from the previous frame_time; recalculate
+ * it according to the new frame_time, which equals
+ * NANOSECONDS_PER_SECOND / uhci_frame_timer_freq */
+ s->expire_time -= s->frame_time - NANOSECONDS_PER_SECOND / uhci_frame_timer_freq;
+ s->frame_time = NANOSECONDS_PER_SECOND / uhci_frame_timer_freq;
+ timer_mod(s->frame_timer, t_now + s->frame_time);
}
static const MemoryRegionOps uhci_ioport_ops = {
@@ -1195,8 +1238,10 @@ void usb_uhci_common_realize(PCIDevice *dev, Error **errp)
s->bh = qemu_bh_new_guarded(uhci_bh, s, &DEVICE(dev)->mem_reentrancy_guard);
s->frame_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, uhci_frame_timer, s);
s->num_ports_vmstate = NB_PORTS;
+ s->frame_time = NANOSECONDS_PER_SECOND / uhci_frame_timer_freq;
QTAILQ_INIT(&s->queues);
+ qemu_register_usb_controller(&qemu_uhci, QEMU_USB_CONTROLLER_UHCI);
memory_region_init_io(&s->io_bar, OBJECT(s), &uhci_ioport_ops, s,
"uhci", 0x20);
diff --git a/hw/usb/hcd-uhci.h b/hw/usb/hcd-uhci.h
index 69f8b40c49..0918719911 100644
--- a/hw/usb/hcd-uhci.h
+++ b/hw/usb/hcd-uhci.h
@@ -50,6 +50,7 @@ typedef struct UHCIState {
uint16_t status;
uint16_t intr; /* interrupt enable register */
uint16_t frnum; /* frame number */
+ uint64_t frame_time; /* frame time in ns */
uint32_t fl_base_addr; /* frame list base address */
uint8_t sof_timing;
uint8_t status2; /* bit 0 and 1 are used to generate UHCI_STS_USBINT */
diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index d7060a42d5..dba469c1ef 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -945,6 +945,30 @@ static void usb_host_ep_update(USBHostDevice *s)
libusb_free_config_descriptor(conf);
}
+static unsigned int usb_get_controller_type(int speed)
+{
+ unsigned int type = MAX_USB_CONTROLLER_TYPES;
+
+ switch (speed) {
+ case USB_SPEED_SUPER:
+ type = QEMU_USB_CONTROLLER_XHCI;
+ break;
+ case USB_SPEED_HIGH:
+ type = QEMU_USB_CONTROLLER_EHCI;
+ break;
+ case USB_SPEED_FULL:
+ type = QEMU_USB_CONTROLLER_UHCI;
+ break;
+ case USB_SPEED_LOW:
+ type = QEMU_USB_CONTROLLER_OHCI;
+ break;
+ default:
+ break;
+ }
+
+ return type;
+}
+
static int usb_host_open(USBHostDevice *s, libusb_device *dev, int hostfd)
{
USBDevice *udev = USB_DEVICE(s);
@@ -1054,6 +1078,12 @@ static int usb_host_open(USBHostDevice *s, libusb_device *dev, int hostfd)
}
trace_usb_host_open_success(bus_num, addr);
+
+ /* change ehci frame time freq when USB passthrough */
+ qemu_log("usb host speed is %d\n", udev->speed);
+ qemu_timer_set_mode(QEMU_TIMER_USB_NORMAL_MODE,
+ usb_get_controller_type(udev->speed));
+
return 0;
fail:
@@ -1129,6 +1159,8 @@ static int usb_host_close(USBHostDevice *s)
}
usb_host_auto_check(NULL);
+ qemu_timer_set_mode(QEMU_TIMER_USB_LAZY_MODE,
+ usb_get_controller_type(udev->speed));
return 0;
}
diff --git a/include/hw/usb.h b/include/hw/usb.h
index 32c23a5ca2..911179158d 100644
--- a/include/hw/usb.h
+++ b/include/hw/usb.h
@@ -142,6 +142,7 @@
#define USB_DEVICE_SELF_POWERED 0
#define USB_DEVICE_REMOTE_WAKEUP 1
+#define USB_DEVICE_REMOTE_WAKEUP_IS_SUPPORTED 2
#define USB_DT_DEVICE 0x01
#define USB_DT_CONFIG 0x02
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 9a366e551f..475c2a3f18 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -91,6 +91,34 @@ struct QEMUTimer {
int scale;
};
+#define QEMU_USB_NORMAL_FREQ 1000
+#define QEMU_USB_LAZY_FREQ 10
+#define MAX_USB_CONTROLLER_TYPES 4
+#define QEMU_USB_CONTROLLER_OHCI 0
+#define QEMU_USB_CONTROLLER_UHCI 1
+#define QEMU_USB_CONTROLLER_EHCI 2
+#define QEMU_USB_CONTROLLER_XHCI 3
+
+typedef void (*QEMUSetFreqHandler) (int freq);
+
+typedef struct qemu_usb_controller {
+ const char *name;
+ QEMUSetFreqHandler qemu_set_freq;
+} qemu_usb_controller;
+
+typedef qemu_usb_controller* qemu_usb_controller_ptr;
+
+enum qemu_timer_mode {
+ QEMU_TIMER_USB_NORMAL_MODE = 1 << 0, /* Set when VNC connect or
+ * with usb dev passthrough
+ */
+ QEMU_TIMER_USB_LAZY_MODE = 1 << 1, /* Set when VNC disconnect */
+};
+
+int qemu_register_usb_controller(qemu_usb_controller_ptr controller,
+ unsigned int type);
+int qemu_timer_set_mode(enum qemu_timer_mode mode, unsigned int type);
+
extern QEMUTimerListGroup main_loop_tlg;
/*
diff --git a/ui/vnc.c b/ui/vnc.c
index 4f23a0fa79..5dd77e73cb 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1365,6 +1365,8 @@ void vnc_disconnect_finish(VncState *vs)
g_free(vs->zrle);
g_free(vs->tight);
g_free(vs);
+
+ qemu_timer_set_mode(QEMU_TIMER_USB_LAZY_MODE, QEMU_USB_CONTROLLER_UHCI);
}
size_t vnc_client_io_error(VncState *vs, ssize_t ret, Error *err)
@@ -3341,6 +3343,8 @@ static void vnc_connect(VncDisplay *vd, QIOChannelSocket *sioc,
}
}
}
+
+ qemu_timer_set_mode(QEMU_TIMER_USB_NORMAL_MODE, QEMU_USB_CONTROLLER_UHCI);
}
void vnc_start_protocol(VncState *vs)
diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 6a0de33dd2..dc891cc557 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -23,6 +23,7 @@
*/
#include "qemu/osdep.h"
+#include "qemu/log.h"
#include "qemu/main-loop.h"
#include "qemu/timer.h"
#include "qemu/lockable.h"
@@ -75,6 +76,74 @@ struct QEMUTimerList {
QemuEvent timers_done_ev;
};
+typedef struct qemu_controller_timer_state {
+ qemu_usb_controller_ptr controller;
+ int refs;
+} controller_timer_state;
+
+typedef controller_timer_state* controller_timer_state_ptr;
+
+static controller_timer_state uhci_timer_state = {
+ .controller = NULL,
+ .refs = 0,
+};
+
+static controller_timer_state_ptr \
+ qemu_usb_controller_tab[MAX_USB_CONTROLLER_TYPES] = {NULL,
+ &uhci_timer_state,
+ NULL, NULL};
+
+int qemu_register_usb_controller(qemu_usb_controller_ptr controller,
+ unsigned int type)
+{
+ if (type != QEMU_USB_CONTROLLER_UHCI) {
+ return 0;
+ }
+
+ /* A companion EHCI controller creates three UHCI controllers,
+ * so register the handler only once.
+ */
+ if (!qemu_usb_controller_tab[type]->controller) {
+ qemu_log("the usb controller (%d) registered frame handler\n", type);
+ qemu_usb_controller_tab[type]->controller = controller;
+ }
+
+ return 0;
+}
+
+int qemu_timer_set_mode(enum qemu_timer_mode mode, unsigned int type)
+{
+ if (type != QEMU_USB_CONTROLLER_UHCI) {
+ qemu_log("the usb controller (%d) does not need a frame freq change\n", type);
+ return 0;
+ }
+
+ if (!qemu_usb_controller_tab[type]->controller) {
+ qemu_log("the usb controller (%d) is not registered yet\n", type);
+ return 0;
+ }
+
+ if (mode == QEMU_TIMER_USB_NORMAL_MODE) {
+ if (qemu_usb_controller_tab[type]->refs++ > 0) {
+ return 0;
+ }
+ qemu_usb_controller_tab[type]->controller->
+ qemu_set_freq(QEMU_USB_NORMAL_FREQ);
+ qemu_log("Set the controller (%d) of freq %d HZ,\n",
+ type, QEMU_USB_NORMAL_FREQ);
+ } else {
+ if (--qemu_usb_controller_tab[type]->refs > 0) {
+ return 0;
+ }
+ qemu_usb_controller_tab[type]->controller->
+ qemu_set_freq(QEMU_USB_LAZY_FREQ);
+ qemu_log("Set the controller(type:%d) of freq %d HZ,\n",
+ type, QEMU_USB_LAZY_FREQ);
+ }
+
+ return 0;
+}
+
/**
* qemu_clock_ptr:
* @type: type of clock
--
2.27.0

@ -0,0 +1,65 @@
From ff43e9201aba8f4047e6fd5edb93a4861cc8fed2 Mon Sep 17 00:00:00 2001
From: Yanan Wang <wangyanan55@huawei.com>
Date: Thu, 28 Mar 2024 18:57:56 +0800
Subject: [PATCH] i386: cache passthrough: Update AMD 8000_001D.EAX[25:14]
based on vCPU topo
On AMD targets, when host cache passthrough is disabled we
emulate the guest caches with default values and initialize the
shared CPU list of the caches based on the vCPU topology. However,
when host cache passthrough is enabled, the shared CPU list is
consistent with the host regardless of the vCPU topology.
For example, with cache passthrough enabled, a guest with
vThreads=1 running on a host with pThreads=2 will see every
*two* logical vCPUs sharing an L1/L2 cache, which is not
consistent with the vCPU topology (vThreads=1).
So let's reinitialize bits [25:14] of AMD CPUID 8000_001D.EAX
based on the actual vCPU topology instead of the host pCPU topology.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
target/i386/cpu.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
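Note for reviewers: the field arithmetic can be sketched on its own. EAX[25:14] holds the number of logical CPUs sharing the cache minus one, and the mask 0x03FFC000 covers exactly those 12 bits. The helper names below are hypothetical and the host EAX value used in the usage note is made up for illustration. Unlike the hunk, the sketch models EAX only, whereas the default case in the patch also zeroes EBX, ECX and EDX.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SHARING_MASK  0x03FFC000u   /* EAX[25:14] */
#define NUM_SHARING_SHIFT 14

/* Replace EAX[25:14] with (sharing_cpus - 1), keeping the other bits. */
static uint32_t set_num_sharing(uint32_t eax, uint32_t sharing_cpus)
{
    assert(sharing_cpus >= 1 && sharing_cpus <= 4096);
    eax &= ~NUM_SHARING_MASK;
    eax |= (sharing_cpus - 1) << NUM_SHARING_SHIFT;
    return eax;
}

/* Decode the number of logical CPUs sharing the cache. */
static uint32_t get_num_sharing(uint32_t eax)
{
    return ((eax & NUM_SHARING_MASK) >> NUM_SHARING_SHIFT) + 1;
}

/* Guest view of EAX for cache leaf 'count', given the guest topology. */
static uint32_t guest_cache_eax(uint32_t host_eax, uint32_t count,
                                uint32_t cores_per_die,
                                uint32_t threads_per_core)
{
    switch (count) {
    case 0: /* L1 dcache */
    case 1: /* L1 icache */
    case 2: /* L2 cache */
        return set_num_sharing(host_eax, threads_per_core);
    case 3: /* L3 cache */
        return set_num_sharing(host_eax, cores_per_die * threads_per_core);
    default: /* end of info */
        return 0;
    }
}
```

With a made-up host value of 0x00004121 (two host threads sharing the cache), guest_cache_eax(0x00004121, 0, 8, 1) clears bit 14 and yields 0x00000121, so the guest sees one vCPU per L1 cache.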
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f94405c02b..491cf40cc7 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6597,9 +6597,31 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
}
break;
case 0x8000001D:
+ /* Populate AMD Processor Cache Information */
*eax = 0;
if (cpu->cache_info_passthrough) {
x86_cpu_get_cache_cpuid(index, count, eax, ebx, ecx, edx);
+
+ /*
+ * Clear BITs[25:14] and then update them based on the guest
+ * vCPU topology, like what we do in encode_cache_cpuid8000001d
+ * when cache_info_passthrough is not enabled.
+ */
+ *eax &= ~0x03FFC000;
+ switch (count) {
+ case 0: /* L1 dcache info */
+ case 1: /* L1 icache info */
+ case 2: /* L2 cache info */
+ *eax |= ((topo_info.threads_per_core - 1) << 14);
+ break;
+ case 3: /* L3 cache info */
+ *eax |= ((topo_info.cores_per_die *
+ topo_info.threads_per_core - 1) << 14);
+ break;
+ default: /* end of info */
+ *eax = *ebx = *ecx = *edx = 0;
+ break;
+ }
break;
}
switch (count) {
--
2.27.0

@ -0,0 +1,42 @@
From 06fc5eb48668a1c83e6a4e76c1a71403917b1835 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Fri, 11 Feb 2022 20:33:47 +0800
Subject: [PATCH] i6300esb watchdog: bugfix: Add a runstate transition
QEMU currently abort()s with:
invalid runstate transition: 'prelaunch' -> 'postmigrate'
Aborted
This happens when:
|<- the watchdog times out and sets reset_requested to
| SHUTDOWN_CAUSE_GUEST_RESET;
|<- the live-migration thread sets the VM state to RUN_STATE_FINISH_MIGRATE
| before the final migration iteration;
|<- the main thread notices the change of reset_requested, triggers
| a reset, and sets the VM state to RUN_STATE_PRELAUNCH;
|<- the live-migration thread sets the VM state to RUN_STATE_POSTMIGRATE.
The 'prelaunch' -> 'postmigrate' runstate transition then occurs.
This sequence is legal, so add the transition to runstate_transitions_def.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
system/runstate.c | 1 +
1 file changed, 1 insertion(+)
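Note for reviewers: a minimal sketch of how a transition table like runstate_transitions_def gates state changes. It is a simplified model with hypothetical names and contains only the entries visible in the hunk below. It uses a linear scan for brevity and is not meant to reflect how QEMU performs the lookup internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of run states, for illustration only. */
typedef enum {
    RS_PRELAUNCH,
    RS_RUNNING,
    RS_PAUSED,
    RS_INMIGRATE,
    RS_FINISH_MIGRATE,
    RS_POSTMIGRATE
} ModelRunState;

typedef struct {
    ModelRunState from;
    ModelRunState to;
} ModelTransition;

static const ModelTransition model_transitions[] = {
    { RS_PRELAUNCH, RS_RUNNING },
    { RS_PRELAUNCH, RS_FINISH_MIGRATE },
    { RS_PRELAUNCH, RS_INMIGRATE },
    { RS_PRELAUNCH, RS_POSTMIGRATE },   /* the entry this patch adds */
    { RS_FINISH_MIGRATE, RS_RUNNING },
    { RS_FINISH_MIGRATE, RS_PAUSED },
};

/* Returns 1 if the table allows moving from 'from' to 'to'. */
static int model_transition_valid(ModelRunState from, ModelRunState to)
{
    size_t i;
    size_t n = sizeof(model_transitions) / sizeof(model_transitions[0]);

    for (i = 0; i < n; i++) {
        if (model_transitions[i].from == from &&
            model_transitions[i].to == to) {
            return 1;
        }
    }
    return 0;
}
```

Without the added entry, a lookup for RS_PRELAUNCH to RS_POSTMIGRATE would fail, which corresponds to the abort described in the commit message.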
diff --git a/system/runstate.c b/system/runstate.c
index ea9d6c2a32..9d3f627fee 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -116,6 +116,7 @@ static const RunStateTransition runstate_transitions_def[] = {
{ RUN_STATE_PRELAUNCH, RUN_STATE_RUNNING },
{ RUN_STATE_PRELAUNCH, RUN_STATE_FINISH_MIGRATE },
{ RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
+ { RUN_STATE_PRELAUNCH, RUN_STATE_POSTMIGRATE },
{ RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
{ RUN_STATE_FINISH_MIGRATE, RUN_STATE_PAUSED },
--
2.27.0

@ -0,0 +1,42 @@
From 6689eebbb520dc75bc65e0914c4e05e40a4efc1d Mon Sep 17 00:00:00 2001
From: Prasad J Pandit <address@hidden>
Date: Mon, 21 Jun 2021 09:22:35 +0800
Subject: [PATCH] ide: ahci: add check to avoid null dereference
(CVE-2019-12067)
Fix CVE-2019-12067
The AHCI emulator, while committing the DMA buffer in ahci_commit_buf(),
may dereference a NULL pointer if the command header 'ad->cur_cmd'
is NULL. Add a check to avoid it.
Reported-by: Bugs SysSec <address@hidden>
Signed-off-by: Prasad J Pandit <address@hidden>
Signed-off-by: Jiajie Li <lijiajie11@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: Adttil <yangtao286@huawei.com>
---
hw/ide/ahci.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
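Note for reviewers: the guard can be exercised in isolation with a small model. The struct and function names are hypothetical, and the little-endian conversions done by le32_to_cpu()/cpu_to_le32() in the real code are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the AHCI command header and device. */
struct model_cmd_hdr {
    uint32_t status;            /* bytes transferred so far */
};

struct model_ahci_dev {
    struct model_cmd_hdr *cur_cmd;   /* may be NULL */
};

/* Returns 1 if the byte count was committed, 0 if there was no command. */
static int model_commit_buf(struct model_ahci_dev *ad, uint32_t tx_bytes)
{
    if (!ad->cur_cmd) {
        return 0;               /* nothing to update, avoid the NULL deref */
    }
    ad->cur_cmd->status += tx_bytes;
    return 1;
}

/* Commit with no current command: must be a harmless no-op. */
static int model_commit_without_cmd(void)
{
    struct model_ahci_dev ad = { NULL };

    return model_commit_buf(&ad, 512);
}

/* Commit with a current command: status accumulates the byte count. */
static uint32_t model_commit_with_cmd(uint32_t initial, uint32_t tx_bytes)
{
    struct model_cmd_hdr hdr = { initial };
    struct model_ahci_dev ad = { &hdr };

    model_commit_buf(&ad, tx_bytes);
    return hdr.status;
}
```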
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index afdc44b8e0..8062e1743c 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1519,8 +1519,10 @@ static void ahci_commit_buf(const IDEDMA *dma, uint32_t tx_bytes)
{
AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
- tx_bytes += le32_to_cpu(ad->cur_cmd->status);
- ad->cur_cmd->status = cpu_to_le32(tx_bytes);
+ if (ad->cur_cmd) {
+ tx_bytes += le32_to_cpu(ad->cur_cmd->status);
+ ad->cur_cmd->status = cpu_to_le32(tx_bytes);
+ }
}
static int ahci_dma_rw_buf(const IDEDMA *dma, bool is_write)
--
2.27.0

@ -0,0 +1,82 @@
From 2ccd1ec0d18070727ad9b9647da6b6937f16de2a Mon Sep 17 00:00:00 2001
From: Zenghui Yu <yuzenghui@huawei.com>
Date: Sat, 8 May 2021 17:31:03 +0800
Subject: [PATCH] linux-headers: update against 5.10 and manual clear vfio
dirty log series
The new capability VFIO_DIRTY_LOG_MANUAL_CLEAR and the new ioctl flags
VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR and
VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP have been introduced in
the kernel. Update the header to add them.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
linux-headers/linux/vfio.h | 36 +++++++++++++++++++++++++++++++++++-
1 file changed, 35 insertions(+), 1 deletion(-)
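Note for reviewers: the header comment states that only one of the dirty-pages flags may be specified at a time. A user-space caller could sanity-check its flags word before issuing the ioctl, as sketched below. The DP_FLAG_* names and the helper are hypothetical local definitions chosen to avoid clashing with the real header. The bit positions are copied from the hunk below, and the kernel's own validation may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the patch; names are local to this sketch. */
#define DP_FLAG_START              (1u << 0)
#define DP_FLAG_STOP               (1u << 1)
#define DP_FLAG_GET_BITMAP         (1u << 2)
#define DP_FLAG_GET_BITMAP_NOCLEAR (1u << 3)
#define DP_FLAG_CLEAR_BITMAP       (1u << 4)
#define DP_FLAG_ALL                0x1fu

/* Returns 1 if exactly one known flag is set, 0 otherwise. */
static int dirty_pages_flags_valid(uint32_t flags)
{
    if (flags == 0 || (flags & ~DP_FLAG_ALL)) {
        return 0;               /* no flag at all, or an unknown bit */
    }
    return (flags & (flags - 1)) == 0;   /* power of two: a single bit */
}
```

With manual clear, a caller would issue one request with the NOCLEAR flag to fetch the bitmap and a separate request with the CLEAR flag afterwards, rather than combining both bits in one call.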
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 8e175ece31..956154e509 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -56,6 +56,16 @@
*/
#define VFIO_UPDATE_VADDR 10
+/*
+ * The vfio_iommu driver may allow the user to clear the dirty log manually:
+ * the dirty log can be requested not to be cleared automatically after it is
+ * copied to userspace, in which case clearing it is the user's duty.
+ *
+ * Note: please refer to VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR and
+ * VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP.
+ */
+#define VFIO_DIRTY_LOG_MANUAL_CLEAR 11
+
/*
* The IOCTL interface is designed for extensibility by embedding the
* structure length (argsz) and flags into structures passed between
@@ -1651,8 +1661,30 @@ struct vfio_iommu_type1_dma_unmap {
* actual bitmap. If dirty pages logging is not enabled, an error will be
* returned.
*
- * Only one of the flags _START, _STOP and _GET may be specified at a time.
+ * The VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR flag is almost the same as
+ * VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP, except that the underlying dirty
+ * bitmap is not cleared automatically. The user can clear it manually by
+ * calling the IOCTL with VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP flag set.
*
+ * Calling the IOCTL with VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP flag set,
+ * instructs the IOMMU driver to clear the dirty status of pages in a bitmap
+ * for IOMMU container for a given IOVA range. The user must specify the IOVA
+ * range, the bitmap and the pgsize through the structure
+ * vfio_iommu_type1_dirty_bitmap_get in the data[] portion. This interface
+ * supports clearing a bitmap of the smallest supported pgsize only and can be
+ * modified in future to clear a bitmap of any specified supported pgsize. The
+ * user must provide a memory area for the bitmap memory and specify its size
+ * in bitmap.size. One bit is used to represent one page consecutively starting
+ * from iova offset. The user should provide page size in bitmap.pgsize field.
+ * A bit set in the bitmap indicates that the dirty status of the page at that
+ * offset from iova is cleared, and dirty tracking is re-enabled for that page. The
+ * caller must set argsz to a value including the size of structure
+ * vfio_iommu_dirty_bitmap_get, but excluding the size of the actual bitmap. If
+ * dirty pages logging is not enabled, an error will be returned. Note: the user
+ * should clear the dirty log before handling the corresponding dirty pages.
+ *
+ * Only one of the flags _START, _STOP, _GET, _GET_NOCLEAR, and _CLEAR may be
+ * specified at a time.
*/
struct vfio_iommu_type1_dirty_bitmap {
__u32 argsz;
@@ -1660,6 +1692,8 @@ struct vfio_iommu_type1_dirty_bitmap {
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_START (1 << 0)
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP (1 << 1)
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP (1 << 2)
+#define VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR (1 << 3)
+#define VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP (1 << 4)
__u8 data[];
};
--
2.27.0

@ -0,0 +1,68 @@
From 16c4b8946903985e3dfd470d0e04b79d473505bc Mon Sep 17 00:00:00 2001
From: "wanghaibin.wang" <wanghaibin.wang@huawei.com>
Date: Sun, 17 Mar 2024 15:53:57 +0800
Subject: [PATCH] log: Add log at boot & cpu init for aarch64
Add log at boot & cpu init for aarch64
Signed-off-by: miaoyubo <miaoyubo@huawei.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/arm/boot.c | 4 ++++
hw/arm/virt.c | 3 +++
2 files changed, 7 insertions(+)
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 84ea6a807a..d1671e1d42 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -11,6 +11,7 @@
#include "qemu/datadir.h"
#include "qemu/error-report.h"
#include "qapi/error.h"
+#include "qemu/log.h"
#include <libfdt.h>
#include "hw/arm/boot.h"
#include "hw/arm/linux-boot-if.h"
@@ -1226,6 +1227,9 @@ void arm_load_kernel(ARMCPU *cpu, MachineState *ms, struct arm_boot_info *info)
* doesn't support secure.
*/
assert(!(info->secure_board_setup && kvm_enabled()));
+
+ qemu_log("load the kernel\n");
+
info->kernel_filename = ms->kernel_filename;
info->kernel_cmdline = ms->kernel_cmdline;
info->initrd_filename = ms->initrd_filename;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index c19cacec8b..f4c3d47f30 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -32,6 +32,7 @@
#include "qemu/datadir.h"
#include "qemu/units.h"
#include "qemu/option.h"
+#include "qemu/log.h"
#include "monitor/qdev.h"
#include "hw/sysbus.h"
#include "hw/arm/boot.h"
@@ -1020,6 +1021,7 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
{
VirtMachineState *s = container_of(n, VirtMachineState, powerdown_notifier);
+ qemu_log("send powerdown to vm.\n");
if (s->acpi_dev) {
acpi_send_event(s->acpi_dev, ACPI_POWER_DOWN_STATUS);
} else {
@@ -2240,6 +2242,7 @@ static void machvirt_init(MachineState *machine)
}
create_fdt(vms);
+ qemu_log("cpu init start\n");
assert(possible_cpus->len == max_cpus);
for (n = 0; n < possible_cpus->len; n++) {
--
2.27.0

@ -0,0 +1,68 @@
From 79863c5ccdd4c635657d2e32e91bc02aa49655e0 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Sat, 30 Jan 2021 16:23:15 +0800
Subject: [PATCH] migration: Add compress_level sanity check
Zlib compression has levels from 1 to 9, whereas Zstd compression has levels
from 1 to 22 (levels >= 20 are not recommended). Add a sanity check here
to make sure a valid compress_level is given by the user.
Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
Signed-off-by: Zeyu Jin <jinzeyu@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
migration/options.c | 32 ++++++++++++++++++++++++++++----
1 file changed, 28 insertions(+), 4 deletions(-)
diff --git a/migration/options.c b/migration/options.c
index 6aaee702dc..9b68962a65 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -1065,16 +1065,40 @@ void migrate_params_init(MigrationParameters *params)
params->has_mode = true;
}
+static bool compress_level_check(MigrationParameters *params, Error **errp)
+{
+ switch (params->compress_method) {
+ case COMPRESS_METHOD_ZLIB:
+ if (params->compress_level > 9 || params->compress_level < 1) {
+ error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "compress_level",
+ "a value in the range of 1 to 9 for Zlib method");
+ return false;
+ }
+ break;
+#ifdef CONFIG_ZSTD
+ case COMPRESS_METHOD_ZSTD:
+ if (params->compress_level > 19 || params->compress_level < 1) {
+ error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "compress_level",
+ "a value in the range of 1 to 19 for Zstd method");
+ return false;
+ }
+ break;
+#endif
+ default:
+ error_setg(errp, "Checking compress_level failed for unknown reason");
+ return false;
+ }
+
+ return true;
+}
+
/*
* Check whether the parameters are valid. Error will be put into errp
* (if provided). Return true if valid, otherwise false.
*/
bool migrate_params_check(MigrationParameters *params, Error **errp)
{
- if (params->has_compress_level &&
- (params->compress_level > 9)) {
- error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "compress_level",
- "a value between 0 and 9");
+ if (params->has_compress_level && !compress_level_check(params, errp)) {
return false;
}
--
2.27.0
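
The per-method range check introduced by this patch can be tried outside QEMU with a small stand-alone sketch. The enum, function name and error-buffer convention below are hypothetical stand-ins for the QAPI `CompressMethod` type and `Error **errp`; only the accepted ranges (1 to 9 for zlib, 1 to 19 for zstd) are taken from the patch. Note that level 0 is rejected by the check as written.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the QAPI CompressMethod enum. */
typedef enum { SKETCH_ZLIB, SKETCH_ZSTD } SketchMethod;

/*
 * Return true when level is acceptable for method. Otherwise write a
 * message into err (when err is non-NULL) and return false.
 */
static bool sketch_level_ok(SketchMethod method, int level,
                            char *err, size_t errlen)
{
    const char *name;
    int max;

    switch (method) {
    case SKETCH_ZLIB:
        name = "Zlib";
        max = 9;
        break;
    case SKETCH_ZSTD:
        name = "Zstd";
        max = 19;
        break;
    default:
        if (err) {
            snprintf(err, errlen, "unknown compression method");
        }
        return false;
    }

    if (level < 1 || level > max) {
        if (err) {
            snprintf(err, errlen,
                     "compress_level must be in the range 1 to %d for %s",
                     max, name);
        }
        return false;
    }
    return true;
}
```

Keeping the upper bound in one place per method means the reported range cannot drift from the range that is actually enforced.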

@ -0,0 +1,292 @@
From c2402b63ecb10b9a25695b710f2664dbcbc01ec4 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Sat, 30 Jan 2021 14:57:54 +0800
Subject: [PATCH] migration: Add multi-thread compress method
A multi-thread compress method parameter is added to hold the method we
are going to use. By default the 'zlib' method is used to maintain the
compatibility as before.
Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
Signed-off-by: Zeyu Jin <jinzeyu@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
hw/core/qdev-properties-system.c | 11 +++++++++++
include/hw/qdev-properties.h | 4 ++++
migration/migration-hmp-cmds.c | 13 +++++++++++++
migration/options.c | 15 +++++++++++++++
monitor/hmp-cmds.c | 1 +
qapi/migration.json | 32 ++++++++++++++++++++++++++++++--
util/oslib-posix.c | 2 +-
7 files changed, 75 insertions(+), 3 deletions(-)
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index f2e2718c74..cd5571fcfb 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -1202,6 +1202,17 @@ const PropertyInfo qdev_prop_uuid = {
.set_default_value = set_default_uuid_auto,
};
+/* --- CompressMethod --- */
+const PropertyInfo qdev_prop_compress_method = {
+ .name = "CompressMethod",
+ .description = "multi-thread compression method, "
+ "zlib",
+ .enum_table = &CompressMethod_lookup,
+ .get = qdev_propinfo_get_enum,
+ .set = qdev_propinfo_set_enum,
+ .set_default_value = qdev_propinfo_set_default_value_enum,
+};
+
/* --- s390 cpu entitlement policy --- */
QEMU_BUILD_BUG_ON(sizeof(CpuS390Entitlement) != sizeof(int));
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 25743a29a0..63602c2c74 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -60,6 +60,7 @@ extern const PropertyInfo qdev_prop_int64;
extern const PropertyInfo qdev_prop_size;
extern const PropertyInfo qdev_prop_string;
extern const PropertyInfo qdev_prop_on_off_auto;
+extern const PropertyInfo qdev_prop_compress_method;
extern const PropertyInfo qdev_prop_size32;
extern const PropertyInfo qdev_prop_array;
extern const PropertyInfo qdev_prop_link;
@@ -168,6 +169,9 @@ extern const PropertyInfo qdev_prop_link;
DEFINE_PROP(_n, _s, _f, qdev_prop_string, char*)
#define DEFINE_PROP_ON_OFF_AUTO(_n, _s, _f, _d) \
DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_on_off_auto, OnOffAuto)
+#define DEFINE_PROP_COMPRESS_METHOD(_n, _s, _f, _d) \
+ DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_compress_method, \
+ CompressMethod)
#define DEFINE_PROP_SIZE32(_n, _s, _f, _d) \
DEFINE_PROP_UNSIGNED(_n, _s, _f, _d, qdev_prop_size32, uint32_t)
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 86ae832176..261ec1e35c 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -22,6 +22,7 @@
#include "qapi/qapi-commands-migration.h"
#include "qapi/qapi-visit-migration.h"
#include "qapi/qmp/qdict.h"
+#include "qapi/qapi-visit-migration.h"
#include "qapi/string-input-visitor.h"
#include "qapi/string-output-visitor.h"
#include "qemu/cutils.h"
@@ -291,6 +292,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
MigrationParameter_str(MIGRATION_PARAMETER_DECOMPRESS_THREADS),
params->decompress_threads);
assert(params->has_throttle_trigger_threshold);
+ monitor_printf(mon, "%s: %s\n",
+ MigrationParameter_str(MIGRATION_PARAMETER_COMPRESS_METHOD),
+ CompressMethod_str(params->compress_method));
monitor_printf(mon, "%s: %u\n",
MigrationParameter_str(MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD),
params->throttle_trigger_threshold);
@@ -519,6 +523,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
MigrateSetParameters *p = g_new0(MigrateSetParameters, 1);
uint64_t valuebw = 0;
uint64_t cache_size;
+ CompressMethod compress_method;
Error *err = NULL;
int val, ret;
@@ -544,6 +549,14 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
p->has_decompress_threads = true;
visit_type_uint8(v, param, &p->decompress_threads, &err);
break;
+ case MIGRATION_PARAMETER_COMPRESS_METHOD:
+ p->has_compress_method = true;
+ visit_type_CompressMethod(v, param, &compress_method, &err);
+ if (err) {
+ break;
+ }
+ p->compress_method = compress_method;
+ break;
case MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD:
p->has_throttle_trigger_threshold = true;
visit_type_uint8(v, param, &p->throttle_trigger_threshold, &err);
diff --git a/migration/options.c b/migration/options.c
index 8d8ec73ad9..af7ea7b346 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -47,6 +47,7 @@
#define DEFAULT_MIGRATE_DECOMPRESS_THREAD_COUNT 2
/*0: means nocompress, 1: best speed, ... 9: best compress ratio */
#define DEFAULT_MIGRATE_COMPRESS_LEVEL 1
+#define DEFAULT_MIGRATE_COMPRESS_METHOD COMPRESS_METHOD_ZLIB
/* Define default autoconverge cpu throttle migration parameters */
#define DEFAULT_MIGRATE_THROTTLE_TRIGGER_THRESHOLD 50
#define DEFAULT_MIGRATE_CPU_THROTTLE_INITIAL 20
@@ -113,6 +114,9 @@ Property migration_properties[] = {
DEFINE_PROP_UINT8("x-decompress-threads", MigrationState,
parameters.decompress_threads,
DEFAULT_MIGRATE_DECOMPRESS_THREAD_COUNT),
+ DEFINE_PROP_COMPRESS_METHOD("compress-method", MigrationState,
+ parameters.compress_method,
+ DEFAULT_MIGRATE_COMPRESS_METHOD),
DEFINE_PROP_UINT8("x-throttle-trigger-threshold", MigrationState,
parameters.throttle_trigger_threshold,
DEFAULT_MIGRATE_THROTTLE_TRIGGER_THRESHOLD),
@@ -953,6 +957,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
params->compress_wait_thread = s->parameters.compress_wait_thread;
params->has_decompress_threads = true;
params->decompress_threads = s->parameters.decompress_threads;
+ params->has_compress_method = true;
+ params->compress_method = s->parameters.compress_method;
params->has_throttle_trigger_threshold = true;
params->throttle_trigger_threshold = s->parameters.throttle_trigger_threshold;
params->has_cpu_throttle_initial = true;
@@ -1025,6 +1031,7 @@ void migrate_params_init(MigrationParameters *params)
params->has_compress_threads = true;
params->has_compress_wait_thread = true;
params->has_decompress_threads = true;
+ params->has_compress_method = true;
params->has_throttle_trigger_threshold = true;
params->has_cpu_throttle_initial = true;
params->has_cpu_throttle_increment = true;
@@ -1259,6 +1266,10 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
dest->decompress_threads = params->decompress_threads;
}
+ if (params->has_compress_method) {
+ dest->compress_method = params->compress_method;
+ }
+
if (params->has_throttle_trigger_threshold) {
dest->throttle_trigger_threshold = params->throttle_trigger_threshold;
}
@@ -1380,6 +1391,10 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
s->parameters.decompress_threads = params->decompress_threads;
}
+ if (params->has_compress_method) {
+ s->parameters.compress_method = params->compress_method;
+ }
+
if (params->has_throttle_trigger_threshold) {
s->parameters.throttle_trigger_threshold = params->throttle_trigger_threshold;
}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 871898ac46..5bb3c9cd46 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -24,6 +24,7 @@
#include "qapi/qapi-commands-control.h"
#include "qapi/qapi-commands-misc.h"
#include "qapi/qmp/qdict.h"
+#include "qapi/qapi-visit-migration.h"
#include "qemu/cutils.h"
#include "hw/intc/intc.h"
#include "qemu/log.h"
diff --git a/qapi/migration.json b/qapi/migration.json
index eb2f883513..cafaa5ccb3 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -708,6 +708,19 @@
'bitmaps': [ 'BitmapMigrationBitmapAlias' ]
} }
+##
+# @CompressMethod:
+#
+# An enumeration of multi-thread compression methods.
+#
+# @zlib: use zlib compression method.
+#
+# Since: 5.0
+#
+##
+{ 'enum': 'CompressMethod',
+ 'data': [ 'zlib' ] }
+
##
# @MigrationParameter:
#
@@ -746,6 +759,9 @@
# fast as compression, so set the decompress-threads to the number
# about 1/4 of compress-threads is adequate.
#
+# @compress-method: Which multi-thread compression method to use.
+# Defaults to zlib. (Since 5.0)
+#
# @throttle-trigger-threshold: The ratio of bytes_dirty_period and
# bytes_xfer_period to trigger throttling. It is expressed as
# percentage. The default value is 50. (Since 5.0)
@@ -892,6 +908,7 @@
{ 'name': 'compress-level', 'features': [ 'deprecated' ] },
{ 'name': 'compress-threads', 'features': [ 'deprecated' ] },
{ 'name': 'decompress-threads', 'features': [ 'deprecated' ] },
+ { 'name': 'compress-method', 'features': [ 'deprecated' ] },
{ 'name': 'compress-wait-thread', 'features': [ 'deprecated' ] },
'throttle-trigger-threshold',
'cpu-throttle-initial', 'cpu-throttle-increment',
@@ -935,6 +952,9 @@
#
# @decompress-threads: decompression thread count
#
+# @compress-method: Set compression method to use in multi-thread compression.
+# Defaults to zlib. (Since 5.0)
+#
# @throttle-trigger-threshold: The ratio of bytes_dirty_period and
# bytes_xfer_period to trigger throttling. It is expressed as
# percentage. The default value is 50. (Since 5.0)
@@ -1066,8 +1086,9 @@
#
# @deprecated: Member @block-incremental is deprecated. Use
# blockdev-mirror with NBD instead. Members @compress-level,
-# @compress-threads, @decompress-threads and @compress-wait-thread
-# are deprecated because @compression is deprecated.
+# @compress-threads, @decompress-threads, @compress-method
+# and @compress-wait-thread are deprecated because
+# @compression is deprecated.
#
# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
# are experimental.
@@ -1090,6 +1111,8 @@
'features': [ 'deprecated' ] },
'*decompress-threads': { 'type': 'uint8',
'features': [ 'deprecated' ] },
+ '*compress-method': { 'type': 'CompressMethod',
+ 'features': [ 'deprecated' ] },
'*throttle-trigger-threshold': 'uint8',
'*cpu-throttle-initial': 'uint8',
'*cpu-throttle-increment': 'uint8',
@@ -1161,6 +1184,9 @@
#
# @decompress-threads: decompression thread count
#
+# @compress-method: Which multi-thread compression method to use.
+# Defaults to zlib. (Since 5.0)
+#
# @throttle-trigger-threshold: The ratio of bytes_dirty_period and
# bytes_xfer_period to trigger throttling. It is expressed as
# percentage. The default value is 50. (Since 5.0)
@@ -1315,6 +1341,8 @@
'features': [ 'deprecated' ] },
'*decompress-threads': { 'type': 'uint8',
'features': [ 'deprecated' ] },
+ '*compress-method': { 'type': 'CompressMethod',
+ 'features': [ 'deprecated' ] },
'*throttle-trigger-threshold': 'uint8',
'*cpu-throttle-initial': 'uint8',
'*cpu-throttle-increment': 'uint8',
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 9ca3fee2b8..43af077fed 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -346,7 +346,7 @@ static void *do_touch_pages(void *arg)
}
qemu_mutex_unlock(&page_mutex);
- while (started_num_threads != memset_args->context.num_threads) {
+ while (started_num_threads != memset_args->context->num_threads) {
smp_mb();
}
--
2.27.0

@ -0,0 +1,493 @@
From 5896dedf32c7e4417bd7f3e889ca41a34b06f5db Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Sat, 30 Jan 2021 15:57:31 +0800
Subject: [PATCH] migration: Add multi-thread compress ops
Add the MigrationCompressOps and MigrationDecompressOps structures to make
the compression method configurable for multi-thread compression migration.
Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
Signed-off-by: Zeyu Jin <jinzeyu@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
migration/options.c | 9 ++
migration/options.h | 1 +
migration/ram-compress.c | 261 ++++++++++++++++++++++++++-------------
migration/ram-compress.h | 31 ++++-
migration/ram.c | 4 +-
5 files changed, 215 insertions(+), 91 deletions(-)
diff --git a/migration/options.c b/migration/options.c
index af7ea7b346..6aaee702dc 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -799,6 +799,15 @@ int migrate_decompress_threads(void)
return s->parameters.decompress_threads;
}
+CompressMethod migrate_compress_method(void)
+{
+ MigrationState *s;
+
+ s = migrate_get_current();
+
+ return s->parameters.compress_method;
+}
+
uint64_t migrate_downtime_limit(void)
{
MigrationState *s = migrate_get_current();
diff --git a/migration/options.h b/migration/options.h
index 246c160aee..9aca5e41ad 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -78,6 +78,7 @@ uint8_t migrate_cpu_throttle_increment(void);
uint8_t migrate_cpu_throttle_initial(void);
bool migrate_cpu_throttle_tailslow(void);
int migrate_decompress_threads(void);
+CompressMethod migrate_compress_method(void);
uint64_t migrate_downtime_limit(void);
uint8_t migrate_max_cpu_throttle(void);
uint64_t migrate_max_bandwidth(void);
diff --git a/migration/ram-compress.c b/migration/ram-compress.c
index 2be344acbc..6e37b22492 100644
--- a/migration/ram-compress.c
+++ b/migration/ram-compress.c
@@ -65,26 +65,167 @@ static QemuThread *compress_threads;
static QemuMutex comp_done_lock;
static QemuCond comp_done_cond;
-struct DecompressParam {
- bool done;
- bool quit;
- QemuMutex mutex;
- QemuCond cond;
- void *des;
- uint8_t *compbuf;
- int len;
- z_stream stream;
-};
-typedef struct DecompressParam DecompressParam;
-
static QEMUFile *decomp_file;
static DecompressParam *decomp_param;
static QemuThread *decompress_threads;
+MigrationCompressOps *compress_ops;
+MigrationDecompressOps *decompress_ops;
static QemuMutex decomp_done_lock;
static QemuCond decomp_done_cond;
static CompressResult do_compress_ram_page(CompressParam *param, RAMBlock *block);
+static int zlib_save_setup(CompressParam *param)
+{
+ if (deflateInit(&param->stream,
+ migrate_compress_level()) != Z_OK) {
+ return -1;
+ }
+
+ return 0;
+}
+
+static ssize_t zlib_compress_data(CompressParam *param, size_t size)
+{
+ int err;
+ uint8_t *dest = NULL;
+ z_stream *stream = &param->stream;
+ uint8_t *p = param->originbuf;
+ QEMUFile *f = param->file;
+ ssize_t blen = qemu_put_compress_start(f, &dest);
+
+ if (blen < compressBound(size)) {
+ return -1;
+ }
+
+ err = deflateReset(stream);
+ if (err != Z_OK) {
+ return -1;
+ }
+
+ stream->avail_in = size;
+ stream->next_in = p;
+ stream->avail_out = blen;
+ stream->next_out = dest;
+
+ err = deflate(stream, Z_FINISH);
+ if (err != Z_STREAM_END) {
+ return -1;
+ }
+
+ blen = stream->next_out - dest;
+ if (blen < 0) {
+ return -1;
+ }
+
+ qemu_put_compress_end(f, blen);
+ return blen + sizeof(int32_t);
+}
+
+static void zlib_save_cleanup(CompressParam *param)
+{
+ deflateEnd(&param->stream);
+}
+
+static int zlib_load_setup(DecompressParam *param)
+{
+ if (inflateInit(&param->stream) != Z_OK) {
+ return -1;
+ }
+
+ return 0;
+}
+
+static int
+zlib_decompress_data(DecompressParam *param, uint8_t *dest, size_t size)
+{
+ int err;
+
+ z_stream *stream = &param->stream;
+
+ err = inflateReset(stream);
+ if (err != Z_OK) {
+ return -1;
+ }
+
+ stream->avail_in = param->len;
+ stream->next_in = param->compbuf;
+ stream->avail_out = size;
+ stream->next_out = dest;
+
+ err = inflate(stream, Z_NO_FLUSH);
+ if (err != Z_STREAM_END) {
+ return -1;
+ }
+
+ return stream->total_out;
+}
+
+static void zlib_load_cleanup(DecompressParam *param)
+{
+ inflateEnd(&param->stream);
+}
+
+static int zlib_check_len(int len)
+{
+ return len < 0 || len > compressBound(TARGET_PAGE_SIZE);
+}
+
+static int set_compress_ops(void)
+{
+ compress_ops = g_new0(MigrationCompressOps, 1);
+
+ switch (migrate_compress_method()) {
+ case COMPRESS_METHOD_ZLIB:
+ compress_ops->save_setup = zlib_save_setup;
+ compress_ops->save_cleanup = zlib_save_cleanup;
+ compress_ops->compress_data = zlib_compress_data;
+ break;
+ default:
+ return -1;
+ }
+
+ return 0;
+}
+
+static int set_decompress_ops(void)
+{
+ decompress_ops = g_new0(MigrationDecompressOps, 1);
+
+ switch (migrate_compress_method()) {
+ case COMPRESS_METHOD_ZLIB:
+ decompress_ops->load_setup = zlib_load_setup;
+ decompress_ops->load_cleanup = zlib_load_cleanup;
+ decompress_ops->decompress_data = zlib_decompress_data;
+ decompress_ops->check_len = zlib_check_len;
+ break;
+ default:
+ return -1;
+ }
+
+ return 0;
+}
+
+static void clean_compress_ops(void)
+{
+ compress_ops->save_setup = NULL;
+ compress_ops->save_cleanup = NULL;
+ compress_ops->compress_data = NULL;
+
+ g_free(compress_ops);
+ compress_ops = NULL;
+}
+
+static void clean_decompress_ops(void)
+{
+ decompress_ops->load_setup = NULL;
+ decompress_ops->load_cleanup = NULL;
+ decompress_ops->decompress_data = NULL;
+
+ g_free(decompress_ops);
+ decompress_ops = NULL;
+}
+
static void *do_data_compress(void *opaque)
{
CompressParam *param = opaque;
@@ -141,7 +282,7 @@ void compress_threads_save_cleanup(void)
qemu_thread_join(compress_threads + i);
qemu_mutex_destroy(&comp_param[i].mutex);
qemu_cond_destroy(&comp_param[i].cond);
- deflateEnd(&comp_param[i].stream);
+ compress_ops->save_cleanup(&comp_param[i]);
g_free(comp_param[i].originbuf);
qemu_fclose(comp_param[i].file);
comp_param[i].file = NULL;
@@ -152,6 +293,7 @@ void compress_threads_save_cleanup(void)
g_free(comp_param);
compress_threads = NULL;
comp_param = NULL;
+ clean_compress_ops();
}
int compress_threads_save_setup(void)
@@ -161,6 +303,12 @@ int compress_threads_save_setup(void)
if (!migrate_compress()) {
return 0;
}
+
+ if (set_compress_ops() < 0) {
+ clean_compress_ops();
+ return -1;
+ }
+
thread_count = migrate_compress_threads();
compress_threads = g_new0(QemuThread, thread_count);
comp_param = g_new0(CompressParam, thread_count);
@@ -172,8 +320,7 @@ int compress_threads_save_setup(void)
goto exit;
}
- if (deflateInit(&comp_param[i].stream,
- migrate_compress_level()) != Z_OK) {
+ if (compress_ops->save_setup(&comp_param[i]) < 0) {
g_free(comp_param[i].originbuf);
goto exit;
}
@@ -198,50 +345,6 @@ exit:
return -1;
}
-/*
- * Compress size bytes of data start at p and store the compressed
- * data to the buffer of f.
- *
- * Since the file is dummy file with empty_ops, return -1 if f has no space to
- * save the compressed data.
- */
-static ssize_t qemu_put_compression_data(CompressParam *param, size_t size)
-{
- int err;
- uint8_t *dest = NULL;
- z_stream *stream = &param->stream;
- uint8_t *p = param->originbuf;
- QEMUFile *f = f = param->file;
- ssize_t blen = qemu_put_compress_start(f, &dest);
-
- if (blen < compressBound(size)) {
- return -1;
- }
-
- err = deflateReset(stream);
- if (err != Z_OK) {
- return -1;
- }
-
- stream->avail_in = size;
- stream->next_in = p;
- stream->avail_out = blen;
- stream->next_out = dest;
-
- err = deflate(stream, Z_FINISH);
- if (err != Z_STREAM_END) {
- return -1;
- }
-
- blen = stream->next_out - dest;
- if (blen < 0) {
- return -1;
- }
-
- qemu_put_compress_end(f, blen);
- return blen + sizeof(int32_t);
-}
-
static CompressResult do_compress_ram_page(CompressParam *param, RAMBlock *block)
{
uint8_t *p = block->host + (param->offset & TARGET_PAGE_MASK);
@@ -260,7 +363,7 @@ static CompressResult do_compress_ram_page(CompressParam *param, RAMBlock *block
* decompression
*/
memcpy(param->originbuf, p, page_size);
- ret = qemu_put_compression_data(param, page_size);
+ ret = compress_ops->compress_data(param, page_size);
if (ret < 0) {
qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
error_report("compressed data failed!");
@@ -356,32 +459,6 @@ bool compress_page_with_multi_thread(RAMBlock *block, ram_addr_t offset,
}
}
-/* return the size after decompression, or negative value on error */
-static int
-qemu_uncompress_data(DecompressParam *param, uint8_t *dest, size_t pagesize)
-{
- int err;
-
- z_stream *stream = &param->stream;
-
- err = inflateReset(stream);
- if (err != Z_OK) {
- return -1;
- }
-
- stream->avail_in = param->len;
- stream->next_in = param->compbuf;
- stream->avail_out = pagesize;
- stream->next_out = dest;
-
- err = inflate(stream, Z_NO_FLUSH);
- if (err != Z_STREAM_END) {
- return -1;
- }
-
- return stream->total_out;
-}
-
static void *do_data_decompress(void *opaque)
{
DecompressParam *param = opaque;
@@ -398,7 +475,7 @@ static void *do_data_decompress(void *opaque)
pagesize = qemu_target_page_size();
- ret = qemu_uncompress_data(param, des, pagesize);
+ ret = decompress_ops->decompress_data(param, des, pagesize);
if (ret < 0 && migrate_get_current()->decompress_error_check) {
error_report("decompress data failed");
qemu_file_set_error(decomp_file, ret);
@@ -466,7 +543,7 @@ void compress_threads_load_cleanup(void)
qemu_thread_join(decompress_threads + i);
qemu_mutex_destroy(&decomp_param[i].mutex);
qemu_cond_destroy(&decomp_param[i].cond);
- inflateEnd(&decomp_param[i].stream);
+ decompress_ops->load_cleanup(&decomp_param[i]);
g_free(decomp_param[i].compbuf);
decomp_param[i].compbuf = NULL;
}
@@ -475,6 +552,7 @@ void compress_threads_load_cleanup(void)
decompress_threads = NULL;
decomp_param = NULL;
decomp_file = NULL;
+ clean_decompress_ops();
}
int compress_threads_load_setup(QEMUFile *f)
@@ -485,6 +563,11 @@ int compress_threads_load_setup(QEMUFile *f)
return 0;
}
+ if (set_decompress_ops() < 0) {
+ clean_decompress_ops();
+ return -1;
+ }
+
/*
* set compression_counters memory to zero for a new migration
*/
@@ -497,7 +580,7 @@ int compress_threads_load_setup(QEMUFile *f)
qemu_cond_init(&decomp_done_cond);
decomp_file = f;
for (i = 0; i < thread_count; i++) {
- if (inflateInit(&decomp_param[i].stream) != Z_OK) {
+ if (decompress_ops->load_setup(&decomp_param[i]) < 0) {
goto exit;
}
diff --git a/migration/ram-compress.h b/migration/ram-compress.h
index 0d89a2f55e..daf241987f 100644
--- a/migration/ram-compress.h
+++ b/migration/ram-compress.h
@@ -39,6 +39,20 @@ enum CompressResult {
};
typedef enum CompressResult CompressResult;
+struct DecompressParam {
+ bool done;
+ bool quit;
+ QemuMutex mutex;
+ QemuCond cond;
+ void *des;
+ uint8_t *compbuf;
+ int len;
+
+ /* for zlib compression */
+ z_stream stream;
+};
+typedef struct DecompressParam DecompressParam;
+
struct CompressParam {
bool done;
bool quit;
@@ -51,11 +65,26 @@ struct CompressParam {
ram_addr_t offset;
/* internally used fields */
- z_stream stream;
uint8_t *originbuf;
+
+ /* for zlib compression */
+ z_stream stream;
};
typedef struct CompressParam CompressParam;
+typedef struct {
+ int (*save_setup)(CompressParam *param);
+ void (*save_cleanup)(CompressParam *param);
+ ssize_t (*compress_data)(CompressParam *param, size_t size);
+} MigrationCompressOps;
+
+typedef struct {
+ int (*load_setup)(DecompressParam *param);
+ void (*load_cleanup)(DecompressParam *param);
+ int (*decompress_data)(DecompressParam *param, uint8_t *dest, size_t size);
+ int (*check_len)(int len);
+} MigrationDecompressOps;
+
void compress_threads_save_cleanup(void);
int compress_threads_save_setup(void);
diff --git a/migration/ram.c b/migration/ram.c
index 8c7886ab79..f9b2b9b985 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -96,6 +96,8 @@
XBZRLECacheStats xbzrle_counters;
+extern MigrationDecompressOps *decompress_ops;
+
/* used by the search for pages to send */
struct PageSearchStatus {
/* The migration channel used for a specific host page */
@@ -3979,7 +3981,7 @@ static int ram_load_precopy(QEMUFile *f)
case RAM_SAVE_FLAG_COMPRESS_PAGE:
len = qemu_get_be32(f);
- if (len < 0 || len > compressBound(TARGET_PAGE_SIZE)) {
+ if (decompress_ops->check_len(len)) {
error_report("Invalid compressed data length: %d", len);
ret = -EINVAL;
break;
--
2.27.0
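
The dispatch pattern this patch introduces (a table of function pointers chosen once from the configured method, with setup failing for unknown methods) can be sketched in isolation. Everything below is hypothetical: the `Sketch` types are not QEMU types, and a plain copy routine stands in for a real codec so the example has no library dependencies.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical method enum; only COPY has a backend in this sketch. */
typedef enum { SKETCH_METHOD_COPY, SKETCH_METHOD_OTHER } SketchMethod;

/* One slot per operation, loosely in the spirit of MigrationCompressOps. */
typedef struct {
    ssize_t (*compress)(const unsigned char *src, size_t len,
                        unsigned char *dst, size_t cap);
} SketchOps;

/* Stand-in "codec": copies the input, fails when dst is too small. */
static ssize_t sketch_copy_compress(const unsigned char *src, size_t len,
                                    unsigned char *dst, size_t cap)
{
    if (cap < len) {
        return -1;
    }
    memcpy(dst, src, len);
    return (ssize_t)len;
}

/* Fill ops for the given method; return -1 and leave ops zeroed if unknown. */
static int sketch_set_ops(SketchMethod method, SketchOps *ops)
{
    memset(ops, 0, sizeof(*ops));

    switch (method) {
    case SKETCH_METHOD_COPY:
        ops->compress = sketch_copy_compress;
        return 0;
    default:
        return -1;
    }
}
```

Callers then invoke `ops.compress(...)` without knowing which backend is behind it, which is what lets a later patch add another method by adding a `case` and a set of functions. Unlike the patch, this sketch fills a caller-provided structure instead of allocating a global one.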

@ -0,0 +1,229 @@
From 8c9603270184d8dadf64ec6de263268e846f8c18 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Sat, 30 Jan 2021 16:15:10 +0800
Subject: [PATCH] migration: Add zstd support in multi-thread compression
This patch enables zstd option in multi-thread compression.
Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
Signed-off-by: Zeyu Jin <jinzeyu@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
hw/core/qdev-properties-system.c | 2 +-
migration/ram-compress.c | 112 +++++++++++++++++++++++++++++++
migration/ram-compress.h | 15 +++++
qapi/migration.json | 3 +-
4 files changed, 130 insertions(+), 2 deletions(-)
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index cd5571fcfb..c581d46f2e 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -1206,7 +1206,7 @@ const PropertyInfo qdev_prop_uuid = {
const PropertyInfo qdev_prop_compress_method = {
.name = "CompressMethod",
.description = "multi-thread compression method, "
- "zlib",
+ "zlib/zstd",
.enum_table = &CompressMethod_lookup,
.get = qdev_propinfo_get_enum,
.set = qdev_propinfo_set_enum,
diff --git a/migration/ram-compress.c b/migration/ram-compress.c
index 6e37b22492..74703f0ec4 100644
--- a/migration/ram-compress.c
+++ b/migration/ram-compress.c
@@ -171,6 +171,103 @@ static int zlib_check_len(int len)
return len < 0 || len > compressBound(TARGET_PAGE_SIZE);
}
+#ifdef CONFIG_ZSTD
+static int zstd_save_setup(CompressParam *param)
+{
+ int res;
+ param->zstd_cs = ZSTD_createCStream();
+ if (!param->zstd_cs) {
+ return -1;
+ }
+ res = ZSTD_initCStream(param->zstd_cs, migrate_compress_level());
+ if (ZSTD_isError(res)) {
+ return -1;
+ }
+ return 0;
+}
+static void zstd_save_cleanup(CompressParam *param)
+{
+ ZSTD_freeCStream(param->zstd_cs);
+ param->zstd_cs = NULL;
+}
+static ssize_t zstd_compress_data(CompressParam *param, size_t size)
+{
+ int ret;
+ uint8_t *dest = NULL;
+ uint8_t *p = param->originbuf;
+ QEMUFile *f = param->file;
+ ssize_t blen = qemu_put_compress_start(f, &dest);
+ if (blen < ZSTD_compressBound(size)) {
+ return -1;
+ }
+ param->out.dst = dest;
+ param->out.size = blen;
+ param->out.pos = 0;
+ param->in.src = p;
+ param->in.size = size;
+ param->in.pos = 0;
+ do {
+ ret = ZSTD_compressStream2(param->zstd_cs, &param->out,
+ &param->in, ZSTD_e_end);
+ } while (ret > 0 && (param->in.size - param->in.pos > 0)
+ && (param->out.size - param->out.pos > 0));
+ if (ret > 0 && (param->in.size - param->in.pos > 0)) {
+ return -1;
+ }
+ if (ZSTD_isError(ret)) {
+ return -1;
+ }
+ blen = param->out.pos;
+ qemu_put_compress_end(f, blen);
+ return blen + sizeof(int32_t);
+}
+
+static int zstd_load_setup(DecompressParam *param)
+{
+ int ret;
+ param->zstd_ds = ZSTD_createDStream();
+ if (!param->zstd_ds) {
+ return -1;
+ }
+ ret = ZSTD_initDStream(param->zstd_ds);
+ if (ZSTD_isError(ret)) {
+ return -1;
+ }
+ return 0;
+}
+static void zstd_load_cleanup(DecompressParam *param)
+{
+ ZSTD_freeDStream(param->zstd_ds);
+ param->zstd_ds = NULL;
+}
+static int
+zstd_decompress_data(DecompressParam *param, uint8_t *dest, size_t size)
+{
+ int ret;
+ param->out.dst = dest;
+ param->out.size = size;
+ param->out.pos = 0;
+ param->in.src = param->compbuf;
+ param->in.size = param->len;
+ param->in.pos = 0;
+ do {
+ ret = ZSTD_decompressStream(param->zstd_ds, &param->out, &param->in);
+ } while (ret > 0 && (param->in.size - param->in.pos > 0)
+ && (param->out.size - param->out.pos > 0));
+ if (ret > 0 && (param->in.size - param->in.pos > 0)) {
+ return -1;
+ }
+ if (ZSTD_isError(ret)) {
+ return -1;
+ }
+ return ret;
+}
+static int zstd_check_len(int len)
+{
+ return len < 0 || len > ZSTD_compressBound(TARGET_PAGE_SIZE);
+}
+#endif
+
static int set_compress_ops(void)
{
compress_ops = g_new0(MigrationCompressOps, 1);
@@ -181,6 +278,13 @@ static int set_compress_ops(void)
compress_ops->save_cleanup = zlib_save_cleanup;
compress_ops->compress_data = zlib_compress_data;
break;
+#ifdef CONFIG_ZSTD
+ case COMPRESS_METHOD_ZSTD:
+ compress_ops->save_setup = zstd_save_setup;
+ compress_ops->save_cleanup = zstd_save_cleanup;
+ compress_ops->compress_data = zstd_compress_data;
+ break;
+#endif
default:
return -1;
}
@@ -199,6 +303,14 @@ static int set_decompress_ops(void)
decompress_ops->decompress_data = zlib_decompress_data;
decompress_ops->check_len = zlib_check_len;
break;
+#ifdef CONFIG_ZSTD
+ case COMPRESS_METHOD_ZSTD:
+ decompress_ops->load_setup = zstd_load_setup;
+ decompress_ops->load_cleanup = zstd_load_cleanup;
+ decompress_ops->decompress_data = zstd_decompress_data;
+ decompress_ops->check_len = zstd_check_len;
+ break;
+#endif
default:
return -1;
}
diff --git a/migration/ram-compress.h b/migration/ram-compress.h
index daf241987f..e8700eb36f 100644
--- a/migration/ram-compress.h
+++ b/migration/ram-compress.h
@@ -29,6 +29,10 @@
#ifndef QEMU_MIGRATION_COMPRESS_H
#define QEMU_MIGRATION_COMPRESS_H
+#ifdef CONFIG_ZSTD
+#include <zstd.h>
+#include <zstd_errors.h>
+#endif
#include "qemu-file.h"
#include "qapi/qapi-types-migration.h"
@@ -50,6 +54,11 @@ struct DecompressParam {
/* for zlib compression */
z_stream stream;
+#ifdef CONFIG_ZSTD
+ ZSTD_DStream *zstd_ds;
+ ZSTD_inBuffer in;
+ ZSTD_outBuffer out;
+#endif
};
typedef struct DecompressParam DecompressParam;
@@ -69,6 +78,12 @@ struct CompressParam {
/* for zlib compression */
z_stream stream;
+
+#ifdef CONFIG_ZSTD
+ ZSTD_CStream *zstd_cs;
+ ZSTD_inBuffer in;
+ ZSTD_outBuffer out;
+#endif
};
typedef struct CompressParam CompressParam;
diff --git a/qapi/migration.json b/qapi/migration.json
index cafaa5ccb3..29af841f4e 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -714,12 +714,13 @@
# An enumeration of multi-thread compression methods.
#
# @zlib: use zlib compression method.
+# @zstd: use zstd compression method.
#
# Since: 5.0
#
##
{ 'enum': 'CompressMethod',
- 'data': [ 'zlib' ] }
+ 'data': [ 'zlib', { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
##
# @MigrationParameter:
--
2.27.0

@ -0,0 +1,330 @@
From cf6f31249817380e91cbc4e55b189216645fac18 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Sat, 30 Jan 2021 15:21:17 +0800
Subject: [PATCH] migration: Refactoring multi-thread compress migration
Refactor the compression procedure:
1. Move qemu_compress_data and qemu_put_compression_data from qemu-file.c to
ram.c, because most of the code logic has nothing to do with qemu-file.
Besides, the decompression code is located in ram.c only.
2. Simplify the function input arguments for compression and decompression
by wrapping the inputs into the param structure, which already exists. This
change also makes the functions much more flexible for other compression methods.
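
The second point can be illustrated with a minimal sketch, which is not the code in this patch. It assumes hypothetical names (SketchParam, SketchOps, sketch_select_ops) and two toy methods; the idea is only that once every input travels inside one param structure, each compression method can sit behind the same function-pointer signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for CompressParam: every input the
 * compressor needs travels inside one structure. */
typedef struct SketchParam {
    const uint8_t *src;  /* data to compress */
    size_t src_len;
    uint8_t *dst;        /* output buffer */
    size_t dst_cap;
} SketchParam;

/* Per-method operations table; all methods share one signature. */
typedef struct SketchOps {
    long (*compress_data)(SketchParam *param);
} SketchOps;

/* Method 0 (illustrative): store the input unchanged. */
static long sketch_store(SketchParam *p)
{
    assert(p != NULL);
    if (p->dst_cap < p->src_len) {
        return -1;
    }
    memcpy(p->dst, p->src, p->src_len);
    return (long)p->src_len;
}

/* Method 1 (illustrative): toy run-length encoding as (count, byte) pairs. */
static long sketch_rle(SketchParam *p)
{
    size_t in = 0, out = 0;

    assert(p != NULL);
    while (in < p->src_len) {
        uint8_t byte = p->src[in];
        size_t run = 1;

        while (in + run < p->src_len && p->src[in + run] == byte && run < 255) {
            run++;
        }
        if (out + 2 > p->dst_cap) {
            return -1;
        }
        p->dst[out++] = (uint8_t)run;
        p->dst[out++] = byte;
        in += run;
    }
    return (long)out;
}

static const SketchOps sketch_ops[] = {
    { sketch_store },
    { sketch_rle },
};

/* Returns the ops for a method index, or NULL for an unknown method. */
static const SketchOps *sketch_select_ops(int method)
{
    int count = (int)(sizeof(sketch_ops) / sizeof(sketch_ops[0]));

    if (method < 0 || method >= count) {
        return NULL;
    }
    return &sketch_ops[method];
}
```

A caller fills one SketchParam and invokes ops->compress_data(&param) without knowing which method is behind the pointer, so adding a method does not change any call site.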
Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
Signed-off-by: Zeyu Jin <jinzeyu@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
migration/meson.build | 4 +-
migration/migration-hmp-cmds.c | 1 -
migration/qemu-file.c | 61 +++++-------------------
migration/qemu-file.h | 4 +-
migration/ram-compress.c | 87 ++++++++++++++++++++++++----------
5 files changed, 77 insertions(+), 80 deletions(-)
diff --git a/migration/meson.build b/migration/meson.build
index 92b1cc4297..d9b46ef0df 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -22,7 +22,6 @@ system_ss.add(files(
'migration.c',
'multifd.c',
'multifd-zlib.c',
- 'ram-compress.c',
'options.c',
'postcopy-ram.c',
'savevm.c',
@@ -43,4 +42,5 @@ system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
if_true: files('ram.c',
- 'target.c'))
+ 'target.c',
+ 'ram-compress.c'))
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 261ec1e35c..1fa6a5f478 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -22,7 +22,6 @@
#include "qapi/qapi-commands-migration.h"
#include "qapi/qapi-visit-migration.h"
#include "qapi/qmp/qdict.h"
-#include "qapi/qapi-visit-migration.h"
#include "qapi/string-input-visitor.h"
#include "qapi/string-output-visitor.h"
#include "qemu/cutils.h"
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 94231ff295..bd1dbc3db1 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -669,55 +669,6 @@ uint64_t qemu_get_be64(QEMUFile *f)
return v;
}
-/* return the size after compression, or negative value on error */
-static int qemu_compress_data(z_stream *stream, uint8_t *dest, size_t dest_len,
- const uint8_t *source, size_t source_len)
-{
- int err;
-
- err = deflateReset(stream);
- if (err != Z_OK) {
- return -1;
- }
-
- stream->avail_in = source_len;
- stream->next_in = (uint8_t *)source;
- stream->avail_out = dest_len;
- stream->next_out = dest;
-
- err = deflate(stream, Z_FINISH);
- if (err != Z_STREAM_END) {
- return -1;
- }
-
- return stream->next_out - dest;
-}
-
-/* Compress size bytes of data start at p and store the compressed
- * data to the buffer of f.
- *
- * Since the file is dummy file with empty_ops, return -1 if f has no space to
- * save the compressed data.
- */
-ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
- const uint8_t *p, size_t size)
-{
- ssize_t blen = IO_BUF_SIZE - f->buf_index - sizeof(int32_t);
-
- if (blen < compressBound(size)) {
- return -1;
- }
-
- blen = qemu_compress_data(stream, f->buf + f->buf_index + sizeof(int32_t),
- blen, p, size);
- if (blen < 0) {
- return -1;
- }
-
- qemu_put_be32(f, blen);
- add_buf_to_iovec(f, blen);
- return blen + sizeof(int32_t);
-}
/* Put the data in the buffer of f_src to the buffer of f_des, and
* then reset the buf_index of f_src to 0.
@@ -834,3 +785,15 @@ int qemu_file_get_to_fd(QEMUFile *f, int fd, size_t size)
return 0;
}
+
+ssize_t qemu_put_compress_start(QEMUFile *f, uint8_t **dest_ptr)
+{
+ *dest_ptr = f->buf + f->buf_index + sizeof(int32_t);
+ return IO_BUF_SIZE - f->buf_index - sizeof(int32_t);
+}
+
+void qemu_put_compress_end(QEMUFile *f, unsigned int v)
+{
+ qemu_put_be32(f, v);
+ add_buf_to_iovec(f, v);
+}
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 8aec9fabf7..8afa95732b 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -54,8 +54,8 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
size_t coroutine_mixed_fn qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t size, size_t offset);
size_t coroutine_mixed_fn qemu_get_buffer_in_place(QEMUFile *f, uint8_t **buf, size_t size);
-ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
- const uint8_t *p, size_t size);
+ssize_t qemu_put_compress_start(QEMUFile *f, uint8_t **dest_ptr);
+void qemu_put_compress_end(QEMUFile *f, unsigned int v);
int qemu_put_qemu_file(QEMUFile *f_des, QEMUFile *f_src);
bool qemu_file_buffer_empty(QEMUFile *file);
diff --git a/migration/ram-compress.c b/migration/ram-compress.c
index fa4388f6a6..2be344acbc 100644
--- a/migration/ram-compress.c
+++ b/migration/ram-compress.c
@@ -28,7 +28,6 @@
#include "qemu/osdep.h"
#include "qemu/cutils.h"
-
#include "ram-compress.h"
#include "qemu/error-report.h"
@@ -40,6 +39,7 @@
#include "exec/ramblock.h"
#include "ram.h"
#include "migration-stats.h"
+#include "exec/ram_addr.h"
static struct {
int64_t pages;
@@ -83,28 +83,22 @@ static QemuThread *decompress_threads;
static QemuMutex decomp_done_lock;
static QemuCond decomp_done_cond;
-static CompressResult do_compress_ram_page(QEMUFile *f, z_stream *stream,
- RAMBlock *block, ram_addr_t offset,
- uint8_t *source_buf);
+static CompressResult do_compress_ram_page(CompressParam *param, RAMBlock *block);
static void *do_data_compress(void *opaque)
{
CompressParam *param = opaque;
RAMBlock *block;
- ram_addr_t offset;
CompressResult result;
qemu_mutex_lock(&param->mutex);
while (!param->quit) {
if (param->trigger) {
block = param->block;
- offset = param->offset;
param->trigger = false;
qemu_mutex_unlock(&param->mutex);
- result = do_compress_ram_page(param->file, &param->stream,
- block, offset, param->originbuf);
-
+ result = do_compress_ram_page(param, block);
qemu_mutex_lock(&comp_done_lock);
param->done = true;
param->result = result;
@@ -204,15 +198,57 @@ exit:
return -1;
}
-static CompressResult do_compress_ram_page(QEMUFile *f, z_stream *stream,
- RAMBlock *block, ram_addr_t offset,
- uint8_t *source_buf)
+/*
+ * Compress size bytes of data start at p and store the compressed
+ * data to the buffer of f.
+ *
+ * Since the file is dummy file with empty_ops, return -1 if f has no space to
+ * save the compressed data.
+ */
+static ssize_t qemu_put_compression_data(CompressParam *param, size_t size)
+{
+ int err;
+ uint8_t *dest = NULL;
+ z_stream *stream = &param->stream;
+ uint8_t *p = param->originbuf;
+ QEMUFile *f = param->file;
+ ssize_t blen = qemu_put_compress_start(f, &dest);
+
+ if (blen < compressBound(size)) {
+ return -1;
+ }
+
+ err = deflateReset(stream);
+ if (err != Z_OK) {
+ return -1;
+ }
+
+ stream->avail_in = size;
+ stream->next_in = p;
+ stream->avail_out = blen;
+ stream->next_out = dest;
+
+ err = deflate(stream, Z_FINISH);
+ if (err != Z_STREAM_END) {
+ return -1;
+ }
+
+ blen = stream->next_out - dest;
+ if (blen < 0) {
+ return -1;
+ }
+
+ qemu_put_compress_end(f, blen);
+ return blen + sizeof(int32_t);
+}
+
+static CompressResult do_compress_ram_page(CompressParam *param, RAMBlock *block)
{
- uint8_t *p = block->host + offset;
+ uint8_t *p = block->host + (param->offset & TARGET_PAGE_MASK);
size_t page_size = qemu_target_page_size();
int ret;
- assert(qemu_file_buffer_empty(f));
+ assert(qemu_file_buffer_empty(param->file));
if (buffer_is_zero(p, page_size)) {
return RES_ZEROPAGE;
@@ -223,12 +259,12 @@ static CompressResult do_compress_ram_page(QEMUFile *f, z_stream *stream,
* so that we can catch up the error during compression and
* decompression
*/
- memcpy(source_buf, p, page_size);
- ret = qemu_put_compression_data(f, stream, source_buf, page_size);
+ memcpy(param->originbuf, p, page_size);
+ ret = qemu_put_compression_data(param, page_size);
if (ret < 0) {
qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
error_report("compressed data failed!");
- qemu_fflush(f);
+ qemu_fflush(param->file);
return RES_NONE;
}
return RES_COMPRESS;
@@ -322,19 +358,20 @@ bool compress_page_with_multi_thread(RAMBlock *block, ram_addr_t offset,
/* return the size after decompression, or negative value on error */
static int
-qemu_uncompress_data(z_stream *stream, uint8_t *dest, size_t dest_len,
- const uint8_t *source, size_t source_len)
+qemu_uncompress_data(DecompressParam *param, uint8_t *dest, size_t pagesize)
{
int err;
+ z_stream *stream = &param->stream;
+
err = inflateReset(stream);
if (err != Z_OK) {
return -1;
}
- stream->avail_in = source_len;
- stream->next_in = (uint8_t *)source;
- stream->avail_out = dest_len;
+ stream->avail_in = param->len;
+ stream->next_in = param->compbuf;
+ stream->avail_out = pagesize;
stream->next_out = dest;
err = inflate(stream, Z_NO_FLUSH);
@@ -350,20 +387,18 @@ static void *do_data_decompress(void *opaque)
DecompressParam *param = opaque;
unsigned long pagesize;
uint8_t *des;
- int len, ret;
+ int ret;
qemu_mutex_lock(&param->mutex);
while (!param->quit) {
if (param->des) {
des = param->des;
- len = param->len;
param->des = 0;
qemu_mutex_unlock(&param->mutex);
pagesize = qemu_target_page_size();
- ret = qemu_uncompress_data(&param->stream, des, pagesize,
- param->compbuf, len);
+ ret = qemu_uncompress_data(param, des, pagesize);
if (ret < 0 && migrate_get_current()->decompress_error_check) {
error_report("decompress data failed");
qemu_file_set_error(decomp_file, ret);
--
2.27.0

@ -0,0 +1,54 @@
From 7caa5d818e0fa0e1cee2513f2fde4e81f8b5cc13 Mon Sep 17 00:00:00 2001
From: zhengchuan <zhengchuan@huawei.com>
Date: Mon, 5 Dec 2022 20:52:25 +0800
Subject: [PATCH] migration: report migration related thread pid to libvirt
To control the migration thread's cgroup, we need to report the
migration-related thread PID to libvirt.
Signed-off-by: zhengchuan <zhengchuan@huawei.com>
---
migration/migration.c | 3 +++
qapi/migration.json | 12 ++++++++++++
2 files changed, 15 insertions(+)
diff --git a/migration/migration.c b/migration/migration.c
index 3ce04b2aaf..7c2fdde26b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3299,6 +3299,9 @@ static void *migration_thread(void *opaque)
MigThrError thr_error;
bool urgent = false;
+ /* report migration thread pid to libvirt */
+ qapi_event_send_migration_pid(qemu_get_thread_id());
+
thread = migration_threads_add("live_migration", qemu_get_thread_id());
rcu_register_thread();
diff --git a/qapi/migration.json b/qapi/migration.json
index 29af841f4e..b442d0d878 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1447,6 +1447,18 @@
{ 'event': 'MIGRATION_PASS',
'data': { 'pass': 'int' } }
+##
+# @MIGRATION_PID:
+#
+# Emitted when migration thread appears
+#
+# @pid: pid of migration thread
+#
+# Since: EulerOS Virtual
+##
+{ 'event': 'MIGRATION_PID',
+ 'data': { 'pid': 'int' } }
+
##
# @COLOMessage:
#
--
2.27.0

@ -0,0 +1,62 @@
From e387eaeef8845993a437ad19eaf988fb101d3fdd Mon Sep 17 00:00:00 2001
From: zhengchuan <zhengchuan@huawei.com>
Date: Mon, 5 Dec 2022 20:56:35 +0800
Subject: [PATCH] migration: report multiFd related thread pid to libvirt
Report the multifd-related thread PIDs to libvirt so that
multifd threads can be pinned to different CPUs.
Signed-off-by: zhengchuan <zhengchuan@huawei.com>
---
migration/multifd.c | 4 ++++
qapi/migration.json | 12 ++++++++++++
2 files changed, 16 insertions(+)
diff --git a/migration/multifd.c b/migration/multifd.c
index 409460684f..7d373a245e 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -17,6 +17,7 @@
#include "exec/ramblock.h"
#include "qemu/error-report.h"
#include "qapi/error.h"
+#include "qapi/qapi-events-migration.h"
#include "ram.h"
#include "migration.h"
#include "migration-stats.h"
@@ -657,6 +658,9 @@ static void *multifd_send_thread(void *opaque)
thread = migration_threads_add(p->name, qemu_get_thread_id());
+ /* report multifd thread pid to libvirt */
+ qapi_event_send_migration_multifd_pid(qemu_get_thread_id());
+
trace_multifd_send_thread_start(p->id);
rcu_register_thread();
diff --git a/qapi/migration.json b/qapi/migration.json
index b442d0d878..5d0855a1d8 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1447,6 +1447,18 @@
{ 'event': 'MIGRATION_PASS',
'data': { 'pass': 'int' } }
+##
+# @MIGRATION_MULTIFD_PID:
+#
+# Emitted when multifd thread appears
+#
+# @pid: pid of multifd thread
+#
+# Since: EulerOS Virtual
+##
+{ 'event': 'MIGRATION_MULTIFD_PID',
+ 'data': { 'pid': 'int' } }
+
##
# @MIGRATION_PID:
#
--
2.27.0

@ -0,0 +1,47 @@
From dfb9372702b2fb994392b8a6e8a39964c2656ae6 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Wed, 9 Feb 2022 08:49:41 +0800
Subject: [PATCH] migration: skip cache_drop for bios bootloader and nvram
template
QEMU enables page cache dropping for raw devices on the destination host
during shared storage migration.
However, fsync may take from 300ms to several seconds to return when multiple
migrations run concurrently, because all domains on a host share the BIOS
bootloader file. Skip cache_drop for the BIOS bootloader and NVRAM template to
avoid increasing downtime.
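
The filename check this patch relies on can be sketched in isolation. This is a minimal illustration, not the block.c code: sketch_should_invalidate and the SKETCH_* macros are hypothetical names, and the example paths are made up. The directory strings are the ones defined in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same directory strings as the defaults added to block.c in this patch. */
#define SKETCH_BIOS_DIR  "/usr/share/edk2"
#define SKETCH_NVRAM_DIR "/var/lib/libvirt/qemu/nvram"

/* Returns true when the cache should still be invalidated for this file.
 * strstr() matches the directory string anywhere in the filename, so this
 * is a substring test rather than a strict path-prefix test. */
static bool sketch_should_invalidate(const char *filename)
{
    assert(filename != NULL);
    return !strstr(filename, SKETCH_BIOS_DIR) &&
           !strstr(filename, SKETCH_NVRAM_DIR);
}
```

Because the test is a substring match, a path that merely contains one of the directory strings, for example under another mount point, is skipped as well. That may or may not be intended, and is worth keeping in mind when choosing image locations.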
---
block.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/block.c b/block.c
index b7cb963929..3bfd4be6b4 100644
--- a/block.c
+++ b/block.c
@@ -68,6 +68,9 @@
#define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
+#define DEFAULT_BIOS_BOOT_LOADER_DIR "/usr/share/edk2"
+#define DEFAULT_NVRAM_TEMPLATE_DIR "/var/lib/libvirt/qemu/nvram"
+
/* Protected by BQL */
static QTAILQ_HEAD(, BlockDriverState) graph_bdrv_states =
QTAILQ_HEAD_INITIALIZER(graph_bdrv_states);
@@ -7017,7 +7020,13 @@ int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
assert(!(bs->open_flags & BDRV_O_INACTIVE));
assert_bdrv_graph_readable();
- if (bs->drv->bdrv_co_invalidate_cache) {
+ /*
+ * The BIOS bootloader and NVRAM template do not need their cache dropped
+ * during migration; skip this step for them to avoid increasing downtime.
+ */
+ if (bs->drv->bdrv_co_invalidate_cache &&
+ !strstr(bs->filename, DEFAULT_BIOS_BOOT_LOADER_DIR) &&
+ !strstr(bs->filename, DEFAULT_NVRAM_TEMPLATE_DIR)) {
bs->drv->bdrv_co_invalidate_cache(bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
--
2.27.0

@ -0,0 +1,96 @@
From a344d8636168ba5f034a908d3394ef88d36133dd Mon Sep 17 00:00:00 2001
From: Yan Wang <wangyan122@huawei.com>
Date: Thu, 10 Feb 2022 11:18:13 +0800
Subject: [PATCH] monitor: Discard BLOCK_IO_ERROR event when VM rebooted
Throttled events such as QAPI_EVENT_BLOCK_IO_ERROR may be queued
to limit the event rate. A queued event may be delivered after the VM is
rebooted if it was still in the *monitor_qapi_event_state* hash table,
which may cause the VM to pause, along with other related problems.
For example, SeaBIOS can block during virtio-scsi initialization:
vring_add_buf(vq, sg, out_num, in_num, 0, 0);
vring_kick(vp, vq, 1);
------------> VM paused here <-----------
/* Wait for reply */
while (!vring_more_used(vq)) usleep(5);
Signed-off-by: Yan Wang <wangyan122@huawei.com>
---
include/monitor/monitor.h | 2 ++
monitor/monitor.c | 29 +++++++++++++++++++++++++++++
system/runstate.c | 1 +
3 files changed, 32 insertions(+)
diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 965f5d5450..60079086a8 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -63,4 +63,6 @@ void monitor_register_hmp_info_hrt(const char *name,
int error_vprintf_unless_qmp(const char *fmt, va_list ap) G_GNUC_PRINTF(1, 0);
int error_printf_unless_qmp(const char *fmt, ...) G_GNUC_PRINTF(1, 2);
+void monitor_qapi_event_discard_io_error(void);
+
#endif /* MONITOR_H */
diff --git a/monitor/monitor.c b/monitor/monitor.c
index e540c1334a..8d59a76612 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -34,6 +34,8 @@
#include "qemu/option.h"
#include "sysemu/qtest.h"
#include "trace.h"
+#include "qemu/log.h"
+#include "qapi/qmp/qobject.h"
/*
* To prevent flooding clients, events can be throttled. The
@@ -787,6 +789,33 @@ int monitor_init_opts(QemuOpts *opts, Error **errp)
return ret;
}
+void monitor_qapi_event_discard_io_error(void)
+{
+ GHashTableIter event_iter;
+ MonitorQAPIEventState *evstate;
+ gpointer key, value;
+ GString *json;
+
+ qemu_mutex_lock(&monitor_lock);
+ g_hash_table_iter_init(&event_iter, monitor_qapi_event_state);
+ while (g_hash_table_iter_next(&event_iter, &key, &value)) {
+ evstate = key;
+ /* Only QAPI_EVENT_BLOCK_IO_ERROR is discarded */
+ if (evstate->event == QAPI_EVENT_BLOCK_IO_ERROR) {
+ g_hash_table_iter_remove(&event_iter);
+ json = qobject_to_json(QOBJECT(evstate->qdict));
+ qemu_log(" %s event discarded\n", json->str);
+ timer_del(evstate->timer);
+ timer_free(evstate->timer);
+ qobject_unref(evstate->data);
+ qobject_unref(evstate->qdict);
+ g_string_free(json, true);
+ g_free(evstate);
+ }
+ }
+ qemu_mutex_unlock(&monitor_lock);
+}
+
QemuOptsList qemu_mon_opts = {
.name = "mon",
.implied_opt_name = "chardev",
diff --git a/system/runstate.c b/system/runstate.c
index 9d3f627fee..62e6db8d42 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -503,6 +503,7 @@ void qemu_system_reset(ShutdownCause reason)
qapi_event_send_reset(shutdown_caused_by_guest(reason), reason);
}
cpu_synchronize_all_post_reset();
+ monitor_qapi_event_discard_io_error();
}
/*
--
2.27.0

@ -0,0 +1,111 @@
From c6b183a4c3c63454dea39be26b0fb773ec04887e Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Wed, 9 Feb 2022 14:13:05 +0800
Subject: [PATCH] monitor/qmp: drop inflight rsp if qmp client broken
If libvirt restarts while QEMU is handling a QMP message, libvirt will
reconnect to the QEMU monitor socket and query the status of QEMU over QMP.
But QEMU may return the previous QMP response to the newly connected socket,
so libvirt receives an unexpected response, concludes that QEMU is abnormal,
and kills QEMU.
This patch adds a QMP connection ID, which changes when the client reconnects.
When responding to libvirt, check whether the ID is still the same; if not,
drop the response.
---
monitor/monitor-internal.h | 1 +
monitor/qmp.c | 19 +++++++++++--------
2 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
index 252de85681..d7842fa464 100644
--- a/monitor/monitor-internal.h
+++ b/monitor/monitor-internal.h
@@ -144,6 +144,7 @@ typedef struct {
const QmpCommandList *commands;
bool capab_offered[QMP_CAPABILITY__MAX]; /* capabilities offered */
bool capab[QMP_CAPABILITY__MAX]; /* offered and accepted */
+ uint64_t qmp_client_id; /* QMP client ID, updated when the peer disconnects */
/*
* Protects qmp request/response queue.
* Take monitor_lock first when you need both.
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 6eee450fe4..8f7671c5f1 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -149,18 +149,19 @@ void qmp_send_response(MonitorQMP *mon, const QDict *rsp)
* Null @rsp can only happen for commands with QCO_NO_SUCCESS_RESP.
* Nothing is emitted then.
*/
-static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
+static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp, uint64_t req_client_id)
{
- if (rsp) {
- qmp_send_response(mon, rsp);
+ if (!rsp || (mon->qmp_client_id != req_client_id)) {
+ return;
}
+ qmp_send_response(mon, rsp);
}
/*
* Runs outside of coroutine context for OOB commands, but in
* coroutine context for everything else.
*/
-static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
+static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req, uint64_t req_client_id)
{
QDict *rsp;
QDict *error;
@@ -180,7 +181,7 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
}
}
- monitor_qmp_respond(mon, rsp);
+ monitor_qmp_respond(mon, rsp, req_client_id);
qobject_unref(rsp);
}
@@ -340,13 +341,13 @@ void coroutine_fn monitor_qmp_dispatcher_co(void *data)
trace_monitor_qmp_cmd_in_band(id_json->str);
g_string_free(id_json, true);
}
- monitor_qmp_dispatch(mon, req_obj->req);
+ monitor_qmp_dispatch(mon, req_obj->req, mon->qmp_client_id);
} else {
assert(req_obj->err);
trace_monitor_qmp_err_in_band(error_get_pretty(req_obj->err));
rsp = qmp_error_response(req_obj->err);
req_obj->err = NULL;
- monitor_qmp_respond(mon, rsp);
+ monitor_qmp_respond(mon, rsp, mon->qmp_client_id);
qobject_unref(rsp);
}
@@ -402,7 +403,7 @@ static void handle_qmp_command(void *opaque, QObject *req, Error *err)
trace_monitor_qmp_cmd_out_of_band(id_json->str);
g_string_free(id_json, true);
}
- monitor_qmp_dispatch(mon, req);
+ monitor_qmp_dispatch(mon, req, mon->qmp_client_id);
qobject_unref(req);
return;
}
@@ -486,6 +487,7 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
mon_refcount++;
break;
case CHR_EVENT_CLOSED:
+ mon->qmp_client_id++;
/*
* Note: this is only useful when the output of the chardev
* backend is still open. For example, when the backend is
@@ -539,6 +541,7 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
}
qemu_chr_fe_set_echo(&mon->common.chr, true);
+ mon->qmp_client_id = 1;
/* Note: we run QMP monitor in I/O thread when @chr supports that */
monitor_data_init(&mon->common, true, false,
qemu_chr_has_feature(chr, QEMU_CHAR_FEATURE_GCONTEXT));
--
2.27.0

@ -0,0 +1,45 @@
From 81b4091eee81fe3871d836b1a684e27828cdc2be Mon Sep 17 00:00:00 2001
From: WangJian <wangjian161@huawei.com>
Date: Wed, 9 Feb 2022 10:42:33 +0800
Subject: [PATCH] nbd/server.c: fix invalid read after client was already free
When an NBD device is under stress, running qemu-nbd can cause I/O dispatch
to fail and take the out path of nbd_trip().
If two or more I/O requests take the out path, the NBD client is released in
nbd_request_put(), and it is then read again in client_close(), which is a
use-after-free.
Save the value of client->closing before the out path in nbd_trip()
to solve the use-after-free problem.
Signed-off-by: wangjian161 <wangjian161@huawei.com>
---
nbd/server.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/nbd/server.c b/nbd/server.c
index 895cf0a752..e8baed9705 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2939,6 +2939,7 @@ static coroutine_fn void nbd_trip(void *opaque)
NBDRequestData *req;
NBDRequest request = { 0 }; /* GCC thinks it can be used uninitialized */
int ret;
+ bool client_closing;
Error *local_err = NULL;
trace_nbd_trip();
@@ -3023,8 +3024,11 @@ disconnect:
if (local_err) {
error_reportf_err(local_err, "Disconnect client, due to: ");
}
+ client_closing = client->closing;
nbd_request_put(req);
- client_close(client, true);
+ if (!client_closing) {
+ client_close(client, true);
+ }
nbd_client_put(client);
}
--
2.27.0

@ -0,0 +1,51 @@
From 6999f07558308ee6b7d63e46ca554a0b702948d6 Mon Sep 17 00:00:00 2001
From: liuxiangdong <liuxiangdong5@huawei.com>
Date: Tue, 8 Feb 2022 15:10:25 +0800
Subject: [PATCH] net/dump.c: Suppress spurious compiler warning
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Compiling with gcc version 11.2.0 (Ubuntu 11.2.0-13ubuntu1) results in
a (spurious) warning:
In function dump_receive_iov,
inlined from filter_dump_receive_iov at ../net/dump.c:157:5:
../net/dump.c:89:9: error: writev specified size 18446744073709551600
exceeds maximum object size 9223372036854775807 [-Werror=stringop-overflow=]
89 | if (writev(s->fd, dumpiov, cnt + 1) != sizeof(hdr) + caplen) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/ptomsich/qemu/include/qemu/osdep.h:108,
from ../net/dump.c:25:
../net/dump.c: In function filter_dump_receive_iov:
/usr/include/x86_64-linux-gnu/sys/uio.h:52:16: note: in a call to function
writev declared with attribute read_only (2, 3)
52 | extern ssize_t writev (int __fd, const struct iovec *__iovec, int
__count)
| ^~~~~~
cc1: all warnings being treated as errors
This change helps that version of GCC to understand what is going on
and suppresses this warning.
Signed-off-by: Philipp Tomsich <philipp.toms...@vrull.eu>
---
net/dump.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/dump.c b/net/dump.c
index 16073f2458..d880a7e299 100644
--- a/net/dump.c
+++ b/net/dump.c
@@ -87,7 +87,7 @@ static ssize_t dump_receive_iov(DumpState *s, const struct iovec *iov, int cnt,
dumpiov[0].iov_len = sizeof(hdr);
cnt = iov_copy(&dumpiov[1], cnt, iov, cnt, offset, caplen);
- if (writev(s->fd, dumpiov, cnt + 1) != sizeof(hdr) + caplen) {
+ if (writev(s->fd, &dumpiov[0], cnt + 1) != sizeof(hdr) + caplen) {
error_report("network dump write error - stopping dump");
close(s->fd);
s->fd = -1;
--
2.27.0

@ -0,0 +1,58 @@
From 6e6215b3ad0c8eac918bca9e2b5bb661e27f2fed Mon Sep 17 00:00:00 2001
From: zhouli57 <zhouli57@huawei.com>
Date: Sat, 18 Dec 2021 09:39:57 +0800
Subject: [PATCH] net: eepro100: validate various address
values (CVE-2021-20255)
Fix CVE-2021-20255.
Patch link: https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg06098.html
Sync patch from ostms platform.
Signed-off-by: zhouli57 <zhouli57@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
---
hw/net/eepro100.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index 69e1c4bb89..f6204ec059 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -279,6 +279,9 @@ typedef struct {
/* Quasi static device properties (no need to save them). */
uint16_t stats_size;
bool has_extended_tcb_support;
+
+ /* Flag to avoid recursions. */
+ bool busy;
} EEPRO100State;
/* Word indices in EEPROM. */
@@ -844,6 +847,14 @@ static void action_command(EEPRO100State *s)
Therefore we limit the number of iterations. */
unsigned max_loop_count = 16;
+ if (s->busy) {
+ /* Prevent recursions. */
+ logout("recursion in %s:%u\n", __FILE__, __LINE__);
+ return;
+ }
+
+ s->busy = true;
+
for (;;) {
bool bit_el;
bool bit_s;
@@ -940,6 +951,7 @@ static void action_command(EEPRO100State *s)
}
TRACE(OTHER, logout("CU list empty\n"));
/* List is empty. Now CU is idle or suspended. */
+ s->busy = false;
}
static void eepro100_cu_command(EEPRO100State * s, uint8_t val)
--
2.27.0

@ -0,0 +1,57 @@
From b6c45f5ea5d1a379ac0a507cf59345c573b27cc8 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Wed, 9 Feb 2022 14:21:39 +0800
Subject: [PATCH] oslib-posix: optimise vm startup time for 1G hugepage
It takes quite a long time to clear a 1G hugepage, which makes glibc
pthread_create quite slow.
Create the touch_pages threads in advance, and then handle the touch_pages
callback. Only the read lock is held here.
---
util/oslib-posix.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index e86fd64e09..9ca3fee2b8 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -88,6 +88,8 @@ static QemuMutex sigbus_mutex;
static QemuMutex page_mutex;
static QemuCond page_cond;
+static int started_num_threads;
+
int qemu_get_thread_id(void)
{
#if defined(__linux__)
@@ -344,6 +346,10 @@ static void *do_touch_pages(void *arg)
}
qemu_mutex_unlock(&page_mutex);
+ while (started_num_threads != memset_args->context.num_threads) {
+ smp_mb();
+ }
+
/* unblock SIGBUS */
sigemptyset(&set);
sigaddset(&set, SIGBUS);
@@ -448,7 +454,7 @@ static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
context.threads = g_new0(MemsetThread, context.num_threads);
numpages_per_thread = numpages / context.num_threads;
leftover = numpages % context.num_threads;
- for (i = 0; i < context.num_threads; i++) {
+ for (i = 0, started_num_threads = 0; i < context.num_threads; i++) {
context.threads[i].addr = addr;
context.threads[i].numpages = numpages_per_thread + (i < leftover);
context.threads[i].hpagesize = hpagesize;
@@ -464,6 +470,7 @@ static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
QEMU_THREAD_JOINABLE);
}
addr += context.threads[i].numpages * hpagesize;
+ started_num_threads++;
}
if (!use_madv_populate_write) {
--
2.27.0

@ -0,0 +1,99 @@
From 3c4b4c4fc3c71b375490233bb9209763d7094ee9 Mon Sep 17 00:00:00 2001
From: Yan Wang <wangyan122@huawei.com>
Date: Tue, 8 Feb 2022 16:10:31 +0800
Subject: [PATCH] pcie: Add pcie-root-port fast plug/unplug feature
If a device is plugged into the pcie-root-port while the VM kernel is
booting, the kernel may wrongly disable the device.
This bug was introduced by two Linux kernel patches:
https://patchwork.kernel.org/patch/10575355/
https://patchwork.kernel.org/patch/10766219/
VM runtimes like Kata use this feature to boot microVMs,
so we must fix it. We hack into the pcie native hotplug
patch so that hotplug/unplug works under this circumstance.
Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
---
hw/core/machine.c | 2 ++
hw/pci-bridge/gen_pcie_root_port.c | 2 ++
hw/pci/pcie.c | 13 ++++++++++++-
include/hw/pci/pcie_port.h | 3 +++
4 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 0c17398141..965682619b 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -160,6 +160,8 @@ const size_t hw_compat_4_0_len = G_N_ELEMENTS(hw_compat_4_0);
GlobalProperty hw_compat_3_1[] = {
{ "pcie-root-port", "x-speed", "2_5" },
{ "pcie-root-port", "x-width", "1" },
+ { "pcie-root-port", "fast-plug", "0" },
+ { "pcie-root-port", "fast-unplug", "0" },
{ "memory-backend-file", "x-use-canonical-path-for-ramblock-id", "true" },
{ "memory-backend-memfd", "x-use-canonical-path-for-ramblock-id", "true" },
{ "tpm-crb", "ppi", "false" },
diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
index 1ce4e7beba..1e1ab5bb19 100644
--- a/hw/pci-bridge/gen_pcie_root_port.c
+++ b/hw/pci-bridge/gen_pcie_root_port.c
@@ -145,6 +145,8 @@ static Property gen_rp_props[] = {
speed, PCIE_LINK_SPEED_16),
DEFINE_PROP_PCIE_LINK_WIDTH("x-width", PCIESlot,
width, PCIE_LINK_WIDTH_32),
+ DEFINE_PROP_UINT8("fast-plug", PCIESlot, fast_plug, 0),
+ DEFINE_PROP_UINT8("fast-unplug", PCIESlot, fast_unplug, 0),
DEFINE_PROP_END_OF_LIST()
};
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index dccf204451..04fbd794a8 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -555,6 +555,7 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
uint8_t *exp_cap = hotplug_pdev->config + hotplug_pdev->exp.exp_cap;
uint32_t sltcap = pci_get_word(exp_cap + PCI_EXP_SLTCAP);
uint16_t sltctl = pci_get_word(exp_cap + PCI_EXP_SLTCTL);
+ PCIESlot *s = PCIE_SLOT(hotplug_pdev);
/* Check if hot-unplug is disabled on the slot */
if ((sltcap & PCI_EXP_SLTCAP_HPC) == 0) {
@@ -600,7 +601,17 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
return;
}
- pcie_cap_slot_push_attention_button(hotplug_pdev);
+ if ((pci_dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) && s->fast_plug) {
+ pci_word_test_and_clear_mask(pci_dev->config + pci_dev->exp.exp_cap + PCI_EXP_LNKSTA,
+ PCI_EXP_LNKSTA_DLLLA);
+ }
+
+ if (s->fast_unplug) {
+ pcie_cap_slot_event(hotplug_pdev,
+ PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP);
+ } else {
+ pcie_cap_slot_push_attention_button(hotplug_pdev);
+ }
}
/* pci express slot for pci express root/downstream port
diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h
index 90e6cf45b8..7148a0959b 100644
--- a/include/hw/pci/pcie_port.h
+++ b/include/hw/pci/pcie_port.h
@@ -56,6 +56,9 @@ struct PCIESlot {
uint8_t chassis;
uint16_t slot;
+ uint8_t fast_plug;
+ uint8_t fast_unplug;
+
PCIExpLinkSpeed speed;
PCIExpLinkWidth width;
--
2.27.0

@ -0,0 +1,50 @@
From 6c72e65d57dc2a7d811f76a126a9a006abd0ab75 Mon Sep 17 00:00:00 2001
From: fangying <fangying1@huawei.com>
Date: Wed, 18 Mar 2020 12:51:33 +0800
Subject: [PATCH] pcie: Compat with devices which do not support Link Width,
such as ioh3420
We hack into PCI_EXP_LNKCAP to support device fast plug/unplug
for pcie-root-port. However, some devices such as ioh3420 do not
support it, so PCI_EXP_LNKCAP is not set for such devices.
Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
---
hw/pci/pcie.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 6db0cf69cd..dccf204451 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -97,13 +97,6 @@ static void pcie_cap_fill_slot_lnk(PCIDevice *dev)
return;
}
- /* Clear and fill LNKCAP from what was configured above */
- pci_long_test_and_clear_mask(exp_cap + PCI_EXP_LNKCAP,
- PCI_EXP_LNKCAP_MLW | PCI_EXP_LNKCAP_SLS);
- pci_long_test_and_set_mask(exp_cap + PCI_EXP_LNKCAP,
- QEMU_PCI_EXP_LNKCAP_MLW(s->width) |
- QEMU_PCI_EXP_LNKCAP_MLS(s->speed));
-
/*
* Link bandwidth notification is required for all root ports and
* downstream ports supporting links wider than x1 or multiple link
@@ -111,6 +104,12 @@ static void pcie_cap_fill_slot_lnk(PCIDevice *dev)
*/
if (s->width > QEMU_PCI_EXP_LNK_X1 ||
s->speed > QEMU_PCI_EXP_LNK_2_5GT) {
+ /* Clear and fill LNKCAP from what was configured above */
+ pci_long_test_and_clear_mask(exp_cap + PCI_EXP_LNKCAP,
+ PCI_EXP_LNKCAP_MLW | PCI_EXP_LNKCAP_SLS);
+ pci_long_test_and_set_mask(exp_cap + PCI_EXP_LNKCAP,
+ QEMU_PCI_EXP_LNKCAP_MLW(s->width) |
+ QEMU_PCI_EXP_LNKCAP_MLS(s->speed));
pci_long_test_and_set_mask(exp_cap + PCI_EXP_LNKCAP,
PCI_EXP_LNKCAP_LBNC);
}
--
2.27.0
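
The effect of moving the LNKCAP update inside the width/speed check can be sketched as a pure function on the register value. This is a minimal model, not the QEMU code: the mask names are hypothetical, their values assume the usual Link Capabilities layout (speed in bits 3:0, width in bits 9:4, bandwidth notification at bit 21), and the encodings assume x1 = 1 and 2.5 GT/s = 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values assume the standard Link Capabilities layout. */
#define LNKCAP_SPEED_MASK 0x0000000fu /* supported link speed, bits 3:0 */
#define LNKCAP_WIDTH_MASK 0x000003f0u /* maximum link width, bits 9:4 */
#define LNKCAP_BW_NOTIFY  0x00200000u /* link bandwidth notification */

/*
 * Return the new LNKCAP value. Width and speed are only written when the
 * slot is wider than x1 or faster than 2.5 GT/s; otherwise the register
 * is returned unchanged, as it is for devices like ioh3420 in the patch.
 */
static uint32_t fill_lnkcap(uint32_t lnkcap, unsigned width, unsigned speed)
{
    assert(width <= 32 && speed <= 0xf);
    if (width > 1 || speed > 1) {
        lnkcap &= ~(LNKCAP_WIDTH_MASK | LNKCAP_SPEED_MASK);
        lnkcap |= (width << 4) & LNKCAP_WIDTH_MASK;
        lnkcap |= speed & LNKCAP_SPEED_MASK;
        lnkcap |= LNKCAP_BW_NOTIFY;
    }
    return lnkcap;
}
```

With these assumptions, an x1 slot at 2.5 GT/s keeps whatever the device model put in the register, while an x32 slot at speed encoding 4 gets both fields and the notification bit rewritten.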

@ -0,0 +1,42 @@
From e730214f4485ad444d8a1db9a284da53f407e8da Mon Sep 17 00:00:00 2001
From: Ying Fang <fangying1@huawei.com>
Date: Mon, 29 Jul 2019 16:16:35 +0800
Subject: [PATCH] pl011: reset read FIFO when UARTTIMSC=0 & UARTICR=0xffff
We can enable ACPI when AArch64 Linux is booted with QEMU and UEFI (AAVMF).
While the VM is booting and the SBSA driver has not yet initialized, writing
more than 32 bytes of data fills the read FIFO, and any subsequent data is
lost. The serial port then appears to be stuck.
A hack to reset the read FIFO when UARTTIMSC=0 & UARTICR=0xffff appears to
resolve the issue.
The issue is discussed in full at
https://www.spinics.net/lists/linux-serial/msg23163.html
Signed-off-by: Haibin Wang <wanghaibin.wang@huawei.com>
Reviewed-by: Shannon Zhao <shannon.zhaosl@gmail.com>
Reviewed-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
---
hw/char/pl011.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/hw/char/pl011.c b/hw/char/pl011.c
index 58edeb9ddb..bc65d778d2 100644
--- a/hw/char/pl011.c
+++ b/hw/char/pl011.c
@@ -314,6 +314,10 @@ static void pl011_write(void *opaque, hwaddr offset,
case 17: /* UARTICR */
s->int_level &= ~value;
pl011_update(s);
+ if (!s->int_enabled && !s->int_level) {
+ s->read_count = 0;
+ s->read_pos = 0;
+ }
break;
case 18: /* UARTDMACR */
s->dmacr = value;
--
2.27.0
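
The condition added to the UARTICR write path can be illustrated with a small stand-alone model. This is only a sketch: `UartModel` and `uart_write_icr` are hypothetical names, and the struct keeps just the four fields the hunk touches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the PL011 device state. */
typedef struct {
    uint32_t int_enabled; /* interrupt mask register */
    uint32_t int_level;   /* raw pending interrupts */
    int read_pos;         /* read FIFO head */
    int read_count;       /* bytes queued in the read FIFO */
} UartModel;

/* Model of a guest write to the interrupt clear register. */
static void uart_write_icr(UartModel *s, uint32_t value)
{
    assert(s != NULL);
    s->int_level &= ~value; /* clear the interrupts the guest acknowledged */
    if (!s->int_enabled && !s->int_level) {
        /* Nothing masked in and nothing pending: drop stale FIFO contents. */
        s->read_count = 0;
        s->read_pos = 0;
    }
}
```

In this model the FIFO is only flushed when no interrupt is enabled and none remains pending, so a driver that has already unmasked interrupts keeps its queued bytes.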

@ -0,0 +1,71 @@
From 8e30e81c4268103d502587de565842b9632a7965 Mon Sep 17 00:00:00 2001
From: Jinhao Gao <gaojinhao@huawei.com>
Date: Tue, 15 Feb 2022 17:02:08 +0800
Subject: [PATCH] pl031: support rtc-timer property for pl031
This patch adds the rtc-timer property for pl031. With this property,
the RTC time (UTC) can be read through the QMP command "qom-get date".
Signed-off-by: Haibin Wang <wanghaibin.wang@huawei.com>
Reviewed-by: Shannon Zhao <shanon.Zhaosl@gmail.com>
Reviewed-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Jinhao Gao <gaojinhao@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/rtc/pl031.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/hw/rtc/pl031.c b/hw/rtc/pl031.c
index f2e6baebba..57e9a35616 100644
--- a/hw/rtc/pl031.c
+++ b/hw/rtc/pl031.c
@@ -63,6 +63,15 @@ static uint32_t pl031_get_count(PL031State *s)
return s->tick_offset + now / NANOSECONDS_PER_SECOND;
}
+static void pl031_get_date(Object *obj, struct tm *current_tm, Error **errp)
+{
+ PL031State *s = PL031(obj);
+ time_t ti = pl031_get_count(s);
+
+ /* Changed to UTC time */
+ gmtime_r(&ti, current_tm);
+}
+
static void pl031_set_alarm(PL031State *s)
{
uint32_t ticks;
@@ -202,6 +211,20 @@ static void pl031_init(Object *obj)
qemu_clock_get_ns(rtc_clock) / NANOSECONDS_PER_SECOND;
s->timer = timer_new_ns(rtc_clock, pl031_interrupt, s);
+ object_property_add_tm(OBJECT(s), "date", pl031_get_date);
+}
+
+static void pl031_realize(DeviceState *d, Error **errp)
+{
+ object_property_add_alias(qdev_get_machine(), "rtc-time",
+ OBJECT(d), "date");
+}
+
+static void pl031_unrealize(DeviceState *d)
+{
+ if (object_property_find(qdev_get_machine(), "rtc-time")) {
+ object_property_del(qdev_get_machine(), "rtc-time");
+ }
}
static void pl031_finalize(Object *obj)
@@ -338,6 +361,8 @@ static void pl031_class_init(ObjectClass *klass, void *data)
DeviceClass *dc = DEVICE_CLASS(klass);
dc->vmsd = &vmstate_pl031;
+ dc->realize = pl031_realize;
+ dc->unrealize = pl031_unrealize;
device_class_set_props(dc, pl031_properties);
}
--
2.27.0
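
The core of `pl031_get_date()` is a conversion from the RTC's seconds counter to a broken-down UTC time. A minimal sketch of that step outside QEMU, assuming the counter holds seconds since the Unix epoch (`ticks_to_utc` is a hypothetical helper name):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Convert a 32-bit seconds counter into broken-down UTC time. */
static void ticks_to_utc(uint32_t ticks, struct tm *out)
{
    assert(out != NULL);
    time_t t = (time_t)ticks;
    gmtime_r(&t, out); /* UTC, independent of the host time zone */
}
```

Using `gmtime_r` rather than `localtime_r` keeps the reported date independent of the host's time zone setting, which matches the "Changed to UTC time" comment in the patch.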

@ -0,0 +1,35 @@
From 0a54d68547df3f276dc242b52d54e8549d0a84a0 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Wed, 9 Feb 2022 11:21:28 +0800
Subject: [PATCH] ps2: fix oob in ps2 kbd
fix oob in ps2 kbd
---
hw/input/ps2.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index c8fd23cf36..b647561069 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -167,7 +167,7 @@ void ps2_queue_noirq(PS2State *s, int b)
}
q->data[q->wptr] = b;
- if (++q->wptr == PS2_BUFFER_SIZE) {
+ if (++q->wptr >= PS2_BUFFER_SIZE) {
q->wptr = 0;
}
q->count++;
@@ -557,7 +557,7 @@ uint32_t ps2_read_data(PS2State *s)
val = q->data[index];
} else {
val = q->data[q->rptr];
- if (++q->rptr == PS2_BUFFER_SIZE) {
+ if (++q->rptr >= PS2_BUFFER_SIZE) {
q->rptr = 0;
}
q->count--;
--
2.27.0
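
The difference between the `==` and `>=` wrap checks only shows up once an index is already past the end of the buffer. The patch does not say how that happens, so the sketch below simply assumes it has; `RING_SIZE` and both helper names are hypothetical.

```c
#include <assert.h>

#define RING_SIZE 16 /* hypothetical; stands in for PS2_BUFFER_SIZE */

/* Advance an index with the original equality check. */
static int next_index_eq(int idx)
{
    assert(idx >= 0);
    if (++idx == RING_SIZE) {
        idx = 0;
    }
    return idx;
}

/* Advance an index with the patched greater-or-equal check. */
static int next_index_ge(int idx)
{
    assert(idx >= 0);
    if (++idx >= RING_SIZE) {
        idx = 0;
    }
    return idx;
}
```

For an in-range index the two behave the same. For an index that is already beyond `RING_SIZE`, the equality version keeps counting upward, while the patched version pulls it back to 0 on the next step.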

@ -0,0 +1,31 @@
From 172d79d8ebb343fa144987d2c50d90655d5aa5f9 Mon Sep 17 00:00:00 2001
From: Kunkun Jiang <jiangkunkun@huawei.com>
Date: Thu, 29 Jul 2021 15:24:48 +0800
Subject: [PATCH] qdev/monitors: Fix reundant error_setg of qdev_add_device
There is an extra "error_setg" call in qdev_add_device(). When
hot-plugging a device, if the corresponding bus doesn't exist, it
triggers the assertion "assert(*errp == NULL)".
Fixes: 515a7970490 (log: Add some logs on VM runtime path)
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
---
system/qdev-monitor.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index c885175b66..b10e483a9a 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -644,7 +644,6 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
if (path != NULL) {
bus = qbus_find(path, errp);
if (!bus) {
- error_setg(errp, "can not find bus for %s", driver);
return NULL;
}
if (!object_dynamic_cast(OBJECT(bus), dc->bus_type)) {
--
2.27.0
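
The rule behind this fix is that an error slot may be filled only once per failure. The following stand-alone model shows why the removed call tripped the assertion; `Err`, `err_set`, `find_bus` and `add_device` are hypothetical stand-ins, not QEMU's actual Error API.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { const char *msg; } Err; /* stand-in for an error object */

/* Store an error; the destination must still be empty. */
static void err_set(Err **errp, Err *e)
{
    if (!errp) {
        return;
    }
    assert(*errp == NULL); /* a second set on the same slot aborts here */
    *errp = e;
}

/* Callee: reports its own failure through errp. */
static int find_bus(const char *path, Err **errp)
{
    static Err not_found = { "bus not found" };
    if (path == NULL || path[0] == '\0') {
        err_set(errp, &not_found);
        return -1;
    }
    return 0;
}

/* Caller: on failure, just propagate; errp is already filled. */
static int add_device(const char *path, Err **errp)
{
    if (find_bus(path, errp) < 0) {
        /* Calling err_set() again here would hit the assertion above. */
        return -1;
    }
    return 0;
}
```

In this model the caller returns as soon as the callee has reported the failure, which is the shape the patch restores by deleting the second `error_setg`.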

@ -0,0 +1,35 @@
From 0e610831d584d9485eb0655168d08d8234bbb555 Mon Sep 17 00:00:00 2001
From: WangJian <wangjian161@huawei.com>
Date: Wed, 9 Feb 2022 10:48:58 +0800
Subject: [PATCH] qemu-nbd: make native as the default aio mode
When multiple threads write to the same file concurrently, file system
locking degrades performance.
At present, the default AIO mode of qemu-nbd is threads. With large blocks,
each I/O is split into small pieces spread over multiple queues, so it turns
into multithreaded concurrent writes to the same file, and the file system
lock greatly reduces performance.
Native mode does not have this problem.
Signed-off-by: wangjian161 <wangjian161@huawei.com>
---
qemu-nbd.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 186e6468b1..acccf2977f 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -843,6 +843,10 @@ int main(int argc, char **argv)
trace_init_file();
qemu_set_log(LOG_TRACE, &error_fatal);
+ if (!seen_aio && (flags & BDRV_O_NOCACHE)) {
+ flags |= BDRV_O_NATIVE_AIO;
+ }
+
socket_activation = check_socket_activation();
if (socket_activation == 0) {
if (!sockpath) {
--
2.27.0
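
Note that the hunk only switches to native AIO when two conditions hold. A small sketch of that decision, using hypothetical flag bits in place of QEMU's `BDRV_O_*` values and assuming `seen_aio` means the user selected an AIO mode explicitly:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real BDRV_O_* values live in QEMU's headers. */
enum {
    OPEN_NOCACHE    = 1 << 0, /* host page cache bypassed */
    OPEN_NATIVE_AIO = 1 << 1, /* use native AIO instead of the thread pool */
};

/* Pick native AIO by default, but only when the cache is bypassed. */
static int apply_default_aio(int flags, bool seen_aio)
{
    assert(flags >= 0);
    if (!seen_aio && (flags & OPEN_NOCACHE)) {
        flags |= OPEN_NATIVE_AIO;
    }
    return flags;
}
```

Under these assumptions, an explicit user choice always wins, and cached opens keep the thread-pool default.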

@ -0,0 +1,42 @@
From d6aa08ac3693be3e08f2c8d3ad5a356ea6e9dead Mon Sep 17 00:00:00 2001
From: WangJian <wangjian161@huawei.com>
Date: Wed, 9 Feb 2022 10:55:08 +0800
Subject: [PATCH] qemu-nbd: set timeout to qemu-nbd socket
In cases such as insufficient memory or kill -9,
the NBD socket can no longer be processed and hangs indefinitely.
Signed-off-by: wangjian161 <wangjian161@huawei.com>
---
nbd/client.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/nbd/client.c b/nbd/client.c
index 29ffc609a4..987dde43c7 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -24,6 +24,8 @@
#include "nbd-internal.h"
#include "qemu/cutils.h"
+#define NBD_TIMEOUT_SECONDS 30
+
/* Definitions for opaque data types */
static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);
@@ -1310,6 +1312,12 @@ int nbd_init(int fd, QIOChannelSocket *sioc, NBDExportInfo *info,
}
}
+ if (ioctl(fd, NBD_SET_TIMEOUT, NBD_TIMEOUT_SECONDS) < 0) {
+ int serrno = errno;
+ error_setg(errp, "Failed setting timeout");
+ return -serrno;
+ }
+
trace_nbd_init_finish();
return 0;
--
2.27.0

qemu.spec

@ -3,7 +3,7 @@
Name: qemu
Version: 8.2.0
-Release: 4
+Release: 5
Epoch: 11
Summary: QEMU is a generic and open source machine emulator and virtualizer
License: GPLv2 and BSD and MIT and CC-BY-SA-4.0
@ -97,6 +97,88 @@ Patch0080: block-disallow-block-jobs-when-there-is-a-BDRV_O_INA.patch
Patch0081: travis-ci-Rename-SOFTMMU-SYSTEM.patch
Patch0082: iotests-adapt-to-output-change-for-recently-introduc.patch
Patch0083: migration-Skip-only-empty-block-devicesi.patch
Patch0084: vhost-cancel-migration-when-vhost-user-restarted-dur.patch
Patch0085: Currently-while-kvm-and-qemu-can-not-handle-some-kvm.patch
Patch0086: ps2-fix-oob-in-ps2-kbd.patch
Patch0087: monitor-qmp-drop-inflight-rsp-if-qmp-client-broken.patch
Patch0088: oslib-posix-optimise-vm-startup-time-for-1G-hugepage.patch
Patch0089: migration-skip-cache_drop-for-bios-bootloader-and-nv.patch
Patch0090: migration-Add-multi-thread-compress-method.patch
Patch0091: migration-Refactoring-multi-thread-compress-migratio.patch
Patch0092: migration-Add-multi-thread-compress-ops.patch
Patch0093: migration-Add-zstd-support-in-multi-thread-compressi.patch
Patch0094: migration-Add-compress_level-sanity-check.patch
Patch0095: doc-Update-multi-thread-compression-doc.patch
Patch0096: cpu-features-fix-bug-for-memory-leakage.patch
Patch0097: migration-report-migration-related-thread-pid-to-lib.patch
Patch0098: migration-report-multiFd-related-thread-pid-to-libvi.patch
Patch0099: virtio-check-descriptor-numbers.patch
Patch0100: virtio-bugfix-add-rcu_read_lock-when-vring_avail_idx.patch
Patch0101: virtio-print-the-guest-virtio_net-features-that-host.patch
Patch0102: virtio-bugfix-check-the-value-of-caches-before-acces.patch
Patch0103: virtio-scsi-bugfix-fix-qemu-crash-for-hotplug-scsi-d.patch
Patch0104: nbd-server.c-fix-invalid-read-after-client-was-alrea.patch
Patch0105: qemu-nbd-make-native-as-the-default-aio-mode.patch
Patch0106: qemu-nbd-set-timeout-to-qemu-nbd-socket.patch
Patch0107: qdev-monitors-Fix-reundant-error_setg-of-qdev_add_de.patch
Patch0108: pcie-Compat-with-devices-which-do-not-support-Link-W.patch
Patch0109: pcie-Add-pcie-root-port-fast-plug-unplug-feature.patch
Patch0110: net-dump.c-Suppress-spurious-compiler-warning.patch
Patch0111: hw-net-rocker_of_dpa-fix-double-free-bug-of-rocker-d.patch
Patch0112: i6300esb-watchdog-bugfix-Add-a-runstate-transition.patch
Patch0113: vhost-user-Set-the-acked_features-to-vm-s-featrue.patch
Patch0114: vhost-user-Add-support-reconnect-vhost-user-socket.patch
Patch0115: fix-qemu-core-when-vhost-user-net-config-with-server.patch
Patch0116: vhost-user-quit-infinite-loop-while-used-memslots-is.patch
Patch0117: vhost-user-add-vhost_set_mem_table-when-vm-load_setu.patch
Patch0118: vhost-user-add-unregister_savevm-when-vhost-user-cle.patch
Patch0119: monitor-Discard-BLOCK_IO_ERROR-event-when-VM-reboote.patch
Patch0120: virtio-net-bugfix-do-not-delete-netdev-before-virtio.patch
Patch0121: virtio-net-fix-max-vring-buf-size-when-set-ring-num.patch
Patch0122: virtio-net-set-the-max-of-queue-size-to-4096.patch
Patch0123: virtio-net-update-the-default-and-max-of-rx-tx_queue.patch
Patch0124: hw-usb-reduce-the-vpcu-cost-of-UHCI-when-VNC-disconn.patch
Patch0125: vhost-vdpa-add-VHOST_BACKEND_F_BYTEMAPLOG.patch
Patch0126: vhost-vdpa-add-migration-log-ops-for-VhostOps.patch
Patch0127: vhost-introduce-bytemap-for-vhost-backend-logging.patch
Patch0128: vhost-add-vhost_dev_suspend-resume_op.patch
Patch0129: vhost-implement-vhost-vdpa-suspend-resume.patch
Patch0130: vhost-implement-vhost_vdpa_device_suspend-resume.patch
Patch0131: vhost-implement-savevm_handler-for-vdpa-device.patch
Patch0132: vhost-implement-post-resume-bh.patch
Patch0133: vhost-implement-migration-state-notifier-for-vdpa-de.patch
Patch0134: vdpa-implement-vdpa-device-migration.patch
Patch0135: vdpa-move-memory-listener-to-the-realize-stage.patch
Patch0136: vdpa-support-vdpa-device-suspend-resume.patch
Patch0137: vdpa-suspend-function-return-0-when-the-vdpa-device-.patch
Patch0138: vdpa-correct-param-passed-in-when-unregister-save.patch
Patch0139: vdpa-don-t-suspend-resume-device-when-vdpa-device-no.patch
Patch0140: docs-Add-generic-vhost-vdpa-device-documentation.patch
Patch0141: vdpa-set-vring-enable-only-if-the-vring-address-has-.patch
Patch0142: ide-ahci-add-check-to-avoid-null-dereference-CVE-201.patch
Patch0143: net-eepro100-validate-various-address-valuesi-CVE-20.patch
Patch0144: cpu-add-Kunpeng-920-cpu-support.patch
Patch0145: cpu-add-Cortex-A72-processor-kvm-target-support.patch
Patch0146: tests-virt-Allow-changes-to-PPTT-test-table.patch
Patch0147: hw-arm64-add-vcpu-cache-info-support.patch
Patch0148: arm64-Add-the-cpufreq-device-to-show-cpufreq-info-to.patch
Patch0149: tests-virt-Update-expected-ACPI-tables-for-virt-test.patch
Patch0150: pl011-reset-read-FIFO-when-UARTTIMSC-0-UARTICR-0xfff.patch
Patch0151: shadow_dev-introduce-shadow-dev-for-virtio-net-devic.patch
Patch0152: tests-Disable-filemonitor-testcase.patch
Patch0153: freeclock-add-qmp-command-to-get-time-offset-of-vm-i.patch
Patch0154: freeclock-set-rtc_date_diff-for-arm.patch
Patch0155: freeclock-set-rtc_date_diff-for-X86.patch
Patch0156: i386-cache-passthrough-Update-AMD-8000_001D.EAX-25-1.patch
Patch0157: bugfix-irq-Avoid-covering-object-refcount-of-qemu_ir.patch
Patch0158: log-Add-log-at-boot-cpu-init-for-aarch64.patch
Patch0159: feature-Add-log-for-each-modules.patch
Patch0160: feature-Add-logs-for-vm-start-and-destroy.patch
Patch0161: pl031-support-rtc-timer-property-for-pl031.patch
Patch0162: arm-acpi-Fix-when-make-qemu-system-aarch64-at-x86_64.patch
Patch0163: linux-headers-update-against-5.10-and-manual-clear-v.patch
Patch0164: vfio-Maintain-DMA-mapping-range-for-the-container.patch
Patch0165: vfio-migration-Add-support-for-manual-clear-vfio-dir.patch
BuildRequires: flex
BuildRequires: gcc
@ -311,7 +393,7 @@ qemubuilddir="build"
tar xf %{SOURCE4}
cd BinDir/
-\cp -r -a . ../
+\cp -r -a * ../
cd ../
./configure \
@ -694,6 +776,90 @@ getent passwd qemu >/dev/null || \
%endif
%changelog
* Sun Apr 7 2024 Jiabo Feng <fengjiabo1@huawei.com> - 11:8.2.0-5
- vfio/migration: Add support for manual clear vfio dirty log
- vfio: Maintain DMA mapping range for the container
- linux-headers: update against 5.10 and manual clear vfio dirty log series
- arm/acpi: Fix when make qemu-system-aarch64 at x86_64 host bios_tables_test fail reason: __aarch64__ macro let build_pptt at x86_64 and aarch64 host build different function that let bios_tables_test fail.
- pl031: support rtc-timer property for pl031
- feature: Add logs for vm start and destroy
- feature: Add log for each modules
- log: Add log at boot & cpu init for aarch64
- bugfix: irq: Avoid covering object refcount of qemu_irq
- i386: cache passthrough: Update AMD 8000_001D.EAX[25:14] based on vCPU topo
- freeclock: set rtc_date_diff for X86
- freeclock: set rtc_date_diff for arm
- freeclock: add qmp command to get time offset of vm in seconds
- tests: Disable filemonitor testcase
- shadow_dev: introduce shadow dev for virtio-net device
- pl011: reset read FIFO when UARTTIMSC=0 & UARTICR=0xffff
- tests: virt: Update expected ACPI tables for virt test(Update BinDir)
- arm64: Add the cpufreq device to show cpufreq info to guest
- hw/arm64: add vcpu cache info support
- tests: virt: Allow changes to PPTT test table
- cpu: add Cortex-A72 processor kvm target support
- cpu: add Kunpeng-920 cpu support
- net: eepro100: validate various address valuesi(CVE-2021-20255)
- ide: ahci: add check to avoid null dereference (CVE-2019-12067)
- vdpa: set vring enable only if the vring address has already been set
- docs: Add generic vhost-vdpa device documentation
- vdpa: don't suspend/resume device when vdpa device not started
- vdpa: correct param passed in when unregister save
- vdpa: suspend function return 0 when the vdpa device is stopped
- vdpa: support vdpa device suspend/resume
- vdpa: move memory listener to the realize stage
- vdpa: implement vdpa device migration
- vhost: implement migration state notifier for vdpa device
- vhost: implement post resume bh
- vhost: implement savevm_handler for vdpa device
- vhost: implement vhost_vdpa_device_suspend/resume
- vhost: implement vhost-vdpa suspend/resume
- vhost: add vhost_dev_suspend/resume_op
- vhost: introduce bytemap for vhost backend logging
- vhost-vdpa: add migration log ops for VhostOps
- vhost-vdpa: add VHOST_BACKEND_F_BYTEMAPLOG
- hw/usb: reduce the vpcu cost of UHCI when VNC disconnect
- virtio-net: update the default and max of rx/tx_queue_size
- virtio-net: set the max of queue size to 4096
- virtio-net: fix max vring buf size when set ring num
- virtio-net: bugfix: do not delete netdev before virtio net
- monitor: Discard BLOCK_IO_ERROR event when VM rebooted
- vhost-user: add unregister_savevm when vhost-user cleanup
- vhost-user: add vhost_set_mem_table when vm load_setup at destination
- vhost-user: quit infinite loop while used memslots is more than the backend limit
- fix qemu-core when vhost-user-net config with server mode
- vhost-user: Add support reconnect vhost-user socket
- vhost-user: Set the acked_features to vm's featrue
- i6300esb watchdog: bugfix: Add a runstate transition
- hw/net/rocker_of_dpa: fix double free bug of rocker device
- net/dump.c: Suppress spurious compiler warning
- pcie: Add pcie-root-port fast plug/unplug feature
- pcie: Compat with devices which do not support Link Width, such as ioh3420
- qdev/monitors: Fix reundant error_setg of qdev_add_device
- qemu-nbd: set timeout to qemu-nbd socket
- qemu-nbd: make native as the default aio mode
- nbd/server.c: fix invalid read after client was already free
- virtio-scsi: bugfix: fix qemu crash for hotplug scsi disk with dataplane
- virtio: bugfix: check the value of caches before accessing it
- virtio: print the guest virtio_net features that host does not support
- virtio: bugfix: add rcu_read_lock when vring_avail_idx is called
- virtio: check descriptor numbers
- migration: report multiFd related thread pid to libvirt
- migration: report migration related thread pid to libvirt
- cpu/features: fix bug for memory leakage
- doc: Update multi-thread compression doc
- migration: Add compress_level sanity check
- migration: Add zstd support in multi-thread compression
- migration: Add multi-thread compress ops
- migration: Refactoring multi-thread compress migration
- migration: Add multi-thread compress method
- migration: skip cache_drop for bios bootloader and nvram template
- oslib-posix: optimise vm startup time for 1G hugepage
- monitor/qmp: drop inflight rsp if qmp client broken
- ps2: fix oob in ps2 kbd
- Currently, while kvm and qemu can not handle some kvm exit, qemu will do vm_stop, which will make vm in pause state. This action make vm unrecoverable, so send guest panic to libvirt instead.
- vhost: cancel migration when vhost-user restarted during migraiton
* Mon Apr 1 2024 Jiabo Feng <fengjiabo1@huawei.com> - 11:8.2.0-4
- migration: Skip only empty block devicesi
- iotests: adapt to output change for recently introduced 'detached hea…

@ -0,0 +1,196 @@
From c4829aa6fce007c995b21cfbd86de0473263c19a Mon Sep 17 00:00:00 2001
From: Dongxu Sun <sundongxu3@huawei.com>
Date: Sat, 30 Mar 2024 12:49:05 +0800
Subject: [PATCH] shadow_dev: introduce shadow dev for virtio-net device
For virtio-net devices, create a shadow device so that
vLPI bypass injection is supported.
Signed-off-by: Wang Haibin <wanghaibin.wang@huawei.com>
Signed-off-by: Yu Zenghui <yuzenghui@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Signed-off-by: KunKun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
hw/virtio/virtio-pci.c | 32 ++++++++++++++++++++++++++
include/sysemu/kvm.h | 5 +++++
linux-headers/linux/kvm.h | 13 +++++++++++
target/arm/kvm.c | 47 +++++++++++++++++++++++++++++++++++++++
4 files changed, 97 insertions(+)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 134a8eaef6..f8adb0520a 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -922,18 +922,44 @@ undo:
}
return ret;
}
+
+#ifdef __aarch64__
+int __attribute__((weak)) kvm_create_shadow_device(PCIDevice *dev)
+{
+ return 0;
+}
+
+int __attribute__((weak)) kvm_delete_shadow_device(PCIDevice *dev)
+{
+ return 0;
+}
+#endif
+
static int kvm_virtio_pci_vector_vq_use(VirtIOPCIProxy *proxy, int nvqs)
{
int queue_no;
int ret = 0;
VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+#ifdef __aarch64__
+ if (!strcmp(vdev->name, "virtio-net")) {
+ kvm_create_shadow_device(&proxy->pci_dev);
+ }
+#endif
+
for (queue_no = 0; queue_no < nvqs; queue_no++) {
if (!virtio_queue_get_num(vdev, queue_no)) {
return -1;
}
ret = kvm_virtio_pci_vector_use_one(proxy, queue_no);
}
+
+#ifdef __aarch64__
+ if (!strcmp(vdev->name, "virtio-net") && ret != 0) {
+ kvm_delete_shadow_device(&proxy->pci_dev);
+ }
+#endif
+
return ret;
}
@@ -976,6 +1002,12 @@ static void kvm_virtio_pci_vector_vq_release(VirtIOPCIProxy *proxy, int nvqs)
}
kvm_virtio_pci_vector_release_one(proxy, queue_no);
}
+
+#ifdef __aarch64__
+ if (!strcmp(vdev->name, "virtio-net")) {
+ kvm_delete_shadow_device(&proxy->pci_dev);
+ }
+#endif
}
static void kvm_virtio_pci_vector_config_release(VirtIOPCIProxy *proxy)
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index d614878164..b46d6203b4 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -538,4 +538,9 @@ bool kvm_arch_cpu_check_are_resettable(void);
bool kvm_dirty_ring_enabled(void);
uint32_t kvm_dirty_ring_size(void);
+
+#ifdef __aarch64__
+int kvm_create_shadow_device(PCIDevice *dev);
+int kvm_delete_shadow_device(PCIDevice *dev);
+#endif
#endif
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 549fea3a97..56f6b2583f 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1198,6 +1198,8 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229
#define KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES 230
+#define KVM_CAP_ARM_VIRT_MSI_BYPASS 799
+
#ifdef KVM_CAP_IRQ_ROUTING
struct kvm_irq_routing_irqchip {
@@ -1524,6 +1526,17 @@ struct kvm_s390_ucas_mapping {
#define KVM_XEN_HVM_CONFIG _IOW(KVMIO, 0x7a, struct kvm_xen_hvm_config)
#define KVM_SET_CLOCK _IOW(KVMIO, 0x7b, struct kvm_clock_data)
#define KVM_GET_CLOCK _IOR(KVMIO, 0x7c, struct kvm_clock_data)
+
+#ifdef __aarch64__
+struct kvm_master_dev_info
+{
+ __u32 nvectors; /* number of msi vectors */
+ struct kvm_msi msi[0];
+};
+#define KVM_CREATE_SHADOW_DEV _IOW(KVMIO, 0xf0, struct kvm_master_dev_info)
+#define KVM_DEL_SHADOW_DEV _IOW(KVMIO, 0xf1, __u32)
+#endif
+
/* Available with KVM_CAP_PIT_STATE2 */
#define KVM_GET_PIT2 _IOR(KVMIO, 0x9f, struct kvm_pit_state2)
#define KVM_SET_PIT2 _IOW(KVMIO, 0xa0, struct kvm_pit_state2)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7903e2ddde..f59f4f81b2 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -26,6 +26,8 @@
#include "trace.h"
#include "internals.h"
#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
#include "exec/memattrs.h"
#include "exec/address-spaces.h"
#include "hw/boards.h"
@@ -1053,6 +1055,51 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
return 0;
}
+int kvm_create_shadow_device(PCIDevice *dev)
+{
+ KVMState *s = kvm_state;
+ struct kvm_master_dev_info *mdi;
+ MSIMessage msg;
+ uint32_t vector, nvectors = msix_nr_vectors_allocated(dev);
+ uint32_t request_id;
+ int ret;
+
+ if (!kvm_vm_check_extension(s, KVM_CAP_ARM_VIRT_MSI_BYPASS) || !nvectors) {
+ return 0;
+ }
+
+ mdi = g_malloc0(sizeof(uint32_t) + sizeof(struct kvm_msi) * nvectors);
+ mdi->nvectors = nvectors;
+ request_id = pci_requester_id(dev);
+
+ for (vector = 0; vector < nvectors; vector++) {
+ msg = msix_get_message(dev, vector);
+ mdi->msi[vector].address_lo = extract64(msg.address, 0, 32);
+ mdi->msi[vector].address_hi = extract64(msg.address, 32, 32);
+ mdi->msi[vector].data = le32_to_cpu(msg.data);
+ mdi->msi[vector].flags = KVM_MSI_VALID_DEVID;
+ mdi->msi[vector].devid = request_id;
+ memset(mdi->msi[vector].pad, 0, sizeof(mdi->msi[vector].pad));
+ }
+
+ ret = kvm_vm_ioctl(s, KVM_CREATE_SHADOW_DEV, mdi);
+ g_free(mdi);
+ return ret;
+}
+
+int kvm_delete_shadow_device(PCIDevice *dev)
+{
+ KVMState *s = kvm_state;
+ uint32_t request_id, nvectors = msix_nr_vectors_allocated(dev);
+
+ if (!kvm_vm_check_extension(s, KVM_CAP_ARM_VIRT_MSI_BYPASS) || !nvectors) {
+ return 0;
+ }
+
+ request_id = pci_requester_id(dev);
+ return kvm_vm_ioctl(s, KVM_DEL_SHADOW_DEV, &request_id);
+}
+
int kvm_arch_add_msi_route_post(struct kvm_irq_routing_entry *route,
int vector, PCIDevice *dev)
{
--
2.27.0
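
`kvm_create_shadow_device()` builds a variable-length request: a count followed by one MSI entry per vector. The allocation pattern can be sketched on its own as below. The sketch uses a C99 flexible array member where the patch uses `msi[0]`, and `MsiEntry`, `DevInfo` and `dev_info_new` are hypothetical simplifications of the kernel structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced version of a per-vector MSI description. */
typedef struct {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
    uint32_t devid;
} MsiEntry;

/* Header followed by nvectors entries in one allocation. */
typedef struct {
    uint32_t nvectors;
    MsiEntry msi[]; /* flexible array member */
} DevInfo;

/* Allocate and fill a request; the caller releases it with free(). */
static DevInfo *dev_info_new(uint32_t nvectors, uint64_t addr, uint32_t devid)
{
    assert(nvectors > 0);
    DevInfo *mdi = calloc(1, sizeof(*mdi) + nvectors * sizeof(mdi->msi[0]));
    if (!mdi) {
        return NULL;
    }
    mdi->nvectors = nvectors;
    for (uint32_t v = 0; v < nvectors; v++) {
        mdi->msi[v].address_lo = (uint32_t)(addr & 0xffffffffu);
        mdi->msi[v].address_hi = (uint32_t)(addr >> 32);
        mdi->msi[v].data = v; /* placeholder payload for the sketch */
        mdi->msi[v].devid = devid;
    }
    return mdi;
}
```

Sizing the block as header plus `nvectors` entries lets the whole request be passed to an ioctl as one contiguous buffer and freed with a single call afterwards.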

@ -0,0 +1,32 @@
From bad33579c56b73d56e0b220c98faad7893609b85 Mon Sep 17 00:00:00 2001
From: Ying Fang <fangying1@huawei.com>
Date: Mon, 18 Mar 2024 10:21:04 +0800
Subject: [PATCH] tests: Disable filemonitor testcase
The filemonitor testcase requires the host kernel to be an LTS version,
which we cannot guarantee on the OBS system. Let's disable it by default.
Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Jinhao Gao <gaojinhao@huawei.com>
Signed-off-by: Yuan Zhang <zhangyuan162@huawei.com>
---
tests/unit/meson.build | 3 ---
1 file changed, 3 deletions(-)
diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index a05d471090..598ba41bb9 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -142,9 +142,6 @@ if have_system
'test-vmstate': [migration, io],
'test-yank': ['socket-helpers.c', qom, io, chardev]
}
- if config_host_data.get('CONFIG_INOTIFY1')
- tests += {'test-util-filemonitor': []}
- endif
# Some tests: test-char, test-qdev-global-props, and test-qga,
# are not runnable under TSan due to a known issue.
--
2.27.0

@ -0,0 +1,25 @@
From 3402740cb4f6d6b9baabfde0a7667b4990b010a5 Mon Sep 17 00:00:00 2001
From: Kunkun Jiang <jiangkunkun@huawei.com>
Date: Sat, 30 Mar 2024 19:21:59 +0800
Subject: [PATCH] tests: virt: Allow changes to PPTT test table
Allow changes to tests/data/acpi/virt/PPTT* in preparation for changing
the building policy of the cluster topology.
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
tests/qtest/bios-tables-test-allowed-diff.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..18d02a710d 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,4 @@
/* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/virt/PPTT",
+"tests/data/acpi/virt/PPTT.acpihmatvirt",
+"tests/data/acpi/virt/PPTT.topology",
--
2.27.0

@ -0,0 +1,25 @@
From b062e2f182af4c44fbd3a03eda9c934686037032 Mon Sep 17 00:00:00 2001
From: Kunkun Jiang <jiangkunkun@huawei.com>
Date: Sat, 30 Mar 2024 20:16:32 +0800
Subject: [PATCH] tests: virt: Update expected ACPI tables for virt test
Update the ACPI tables according to the acpi aml_build change, also
empty bios-tables-test-allowed-diff.h.
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
tests/qtest/bios-tables-test-allowed-diff.h | 3 ---
1 files changed, 3 deletions(-)
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 18d02a710d..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,4 +1 @@
/* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/virt/PPTT",
-"tests/data/acpi/virt/PPTT.acpihmatvirt",
-"tests/data/acpi/virt/PPTT.topology",
--
2.27.0

@ -0,0 +1,30 @@
From 5714aaddcbc313e63da435a253d9d472984d7b49 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Thu, 14 Dec 2023 11:22:54 +0800
Subject: [PATCH] vdpa: correct param passed in when unregister save
The idstr passed to unregister_savevm() is inconsistent with the idstr
passed to register_savevm_live() at registration time.
Fix it; otherwise migration fails after all vdpa devices have been
hot-unplugged.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vdpa-dev-mig.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/virtio/vdpa-dev-mig.c b/hw/virtio/vdpa-dev-mig.c
index b889dd4715..1d299019da 100644
--- a/hw/virtio/vdpa-dev-mig.c
+++ b/hw/virtio/vdpa-dev-mig.c
@@ -404,6 +404,6 @@ void vdpa_migration_register(VhostVdpaDevice *vdev)
void vdpa_migration_unregister(VhostVdpaDevice *vdev)
{
migration_remove_notifier(&vdev->migration_state);
- unregister_savevm(VMSTATE_IF(&vdev->parent_obj.parent_obj), "vdpa", DEVICE(vdev));
+ unregister_savevm(NULL, "vdpa", DEVICE(vdev));
qemu_del_vm_change_state_handler(vdev->vmstate);
}
--
2.27.0

@ -0,0 +1,67 @@
From b82f02e93d5efa2ea62dd135c508cb707fdd35a7 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Tue, 19 Dec 2023 20:32:00 +0800
Subject: [PATCH] vdpa: don't suspend/resume device when vdpa device not
started
When the vdpa device is not started, there is no need to suspend it
or to send its device state. Therefore, add a suspended flag to the
vdpa device to record whether the device is suspended, and use it to
decide whether the device needs to be resumed in the destination QEMU.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vdpa-dev-mig.c | 23 +++++++++++++++--------
1 file changed, 15 insertions(+), 8 deletions(-)
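The stream layout described above (a 16-bit flag, followed by the device
buffer only when the flag is set) can be sketched as follows. This is a
minimal sketch under stated assumptions: Stream, put_be16, get_be16,
save_state and load_state are hypothetical stand-ins for QEMUFile and the
vdpa save/load handlers, not QEMU APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for QEMUFile: a fixed buffer with a cursor. */
typedef struct {
    uint8_t buf[64];
    size_t pos;
} Stream;

static void put_be16(Stream *s, uint16_t v)
{
    assert(s->pos + 2 <= sizeof(s->buf));
    s->buf[s->pos++] = (uint8_t)(v >> 8);
    s->buf[s->pos++] = (uint8_t)(v & 0xff);
}

static uint16_t get_be16(Stream *s)
{
    uint16_t v;

    assert(s->pos + 2 <= sizeof(s->buf));
    v = (uint16_t)((s->buf[s->pos] << 8) | s->buf[s->pos + 1]);
    s->pos += 2;
    return v;
}

/* Writer: always emit the flag; emit the payload only when suspended. */
static void save_state(Stream *s, int suspended,
                       const uint8_t *payload, size_t len)
{
    put_be16(s, (uint16_t)(suspended ? 1 : 0));
    if (suspended) {
        assert(s->pos + len <= sizeof(s->buf));
        memcpy(s->buf + s->pos, payload, len);
        s->pos += len;
    }
}

/* Reader: mirrors the writer and returns the payload bytes consumed. */
static size_t load_state(Stream *s, uint8_t *out, size_t len)
{
    if (!get_be16(s)) {
        return 0;
    }
    assert(s->pos + len <= sizeof(s->buf));
    memcpy(out, s->buf + s->pos, len);
    s->pos += len;
    return len;
}
```

The point of the design is that the reader never guesses: both sides
branch on the same flag, so the stream stays in sync whether or not a
device buffer was sent.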
diff --git a/hw/virtio/vdpa-dev-mig.c b/hw/virtio/vdpa-dev-mig.c
index 1d299019da..887c96a201 100644
--- a/hw/virtio/vdpa-dev-mig.c
+++ b/hw/virtio/vdpa-dev-mig.c
@@ -294,10 +294,13 @@ static int vdpa_save_complete_precopy(QEMUFile *f, void *opaque)
int ret;
qemu_put_be64(f, VDPA_MIG_FLAG_DEV_CONFIG_STATE);
- ret = vhost_vdpa_dev_buffer_save(hdev, f);
- if (ret) {
- error_report("Save vdpa device buffer failed: %d\n", ret);
- return ret;
+ qemu_put_be16(f, (uint16_t)vdev->suspended);
+ if (vdev->suspended) {
+ ret = vhost_vdpa_dev_buffer_save(hdev, f);
+ if (ret) {
+ error_report("Save vdpa device buffer failed: %d\n", ret);
+ return ret;
+ }
}
qemu_put_be64(f, VDPA_MIG_FLAG_END_OF_STATE);
@@ -311,6 +314,7 @@ static int vdpa_load_state(QEMUFile *f, void *opaque, int version_id)
int ret;
uint64_t data;
+ uint16_t suspended;
data = qemu_get_be64(f);
while (data != VDPA_MIG_FLAG_END_OF_STATE) {
@@ -323,10 +327,13 @@ static int vdpa_load_state(QEMUFile *f, void *opaque, int version_id)
return -EINVAL;
}
} else if (data == VDPA_MIG_FLAG_DEV_CONFIG_STATE) {
- ret = vhost_vdpa_dev_buffer_load(hdev, f);
- if (ret) {
- error_report("fail to restore device buffer.\n");
- return ret;
+ suspended = qemu_get_be16(f);
+ if (suspended) {
+ ret = vhost_vdpa_dev_buffer_load(hdev, f);
+ if (ret) {
+ error_report("fail to restore device buffer.\n");
+ return ret;
+ }
}
}
--
2.27.0

@ -0,0 +1,75 @@
From 4688e12c57a34801010abf2a4cf528fcef3b9ec0 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:59:56 +0800
Subject: [PATCH] vdpa: implement vdpa device migration
Integrate the live migration code: call the live migration
registration functions and mark the vhost-vdpa device as migratable.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vdpa-dev.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index f22d5d5bc0..6af78a4229 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -28,6 +28,8 @@
#include "hw/virtio/vdpa-dev.h"
#include "sysemu/sysemu.h"
#include "sysemu/runstate.h"
+#include "hw/virtio/vdpa-dev-mig.h"
+#include "migration/migration.h"
static void
vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
@@ -154,6 +156,8 @@ static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
vhost_vdpa_device_dummy_handle_output);
}
+ vdpa_migration_register(v);
+
return;
free_config:
@@ -173,6 +177,7 @@ static void vhost_vdpa_device_unrealize(DeviceState *dev)
VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
int i;
+ vdpa_migration_unregister(s);
virtio_set_status(vdev, 0);
for (i = 0; i < s->num_queues; i++) {
@@ -308,6 +313,7 @@ static void vhost_vdpa_device_stop(VirtIODevice *vdev)
static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
{
VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+ MigrationState *ms = migrate_get_current();
bool should_start = virtio_device_started(vdev, status);
Error *local_err = NULL;
int ret;
@@ -320,6 +326,11 @@ static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
return;
}
+ if (ms->state == RUN_STATE_PAUSED ||
+ ms->state == RUN_STATE_RESTORE_VM) {
+ return;
+ }
+
if (should_start) {
ret = vhost_vdpa_device_start(vdev, &local_err);
if (ret < 0) {
@@ -338,7 +349,7 @@ static Property vhost_vdpa_device_properties[] = {
static const VMStateDescription vmstate_vhost_vdpa_device = {
.name = "vhost-vdpa-device",
- .unmigratable = 1,
+ .unmigratable = 0,
.minimum_version_id = 1,
.version_id = 1,
.fields = (VMStateField[]) {
--
2.27.0

@ -0,0 +1,91 @@
From 587f42300488af4478d7aa1b62e2b351155621db Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 16:01:16 +0800
Subject: [PATCH] vdpa: move memory listener to the realize stage
Move the vdpa memory listener registration from the start stage to
the realize stage, so that the memory listener callbacks have already
been processed by the time the device starts.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vdpa-dev.c | 4 ++++
hw/virtio/vhost-vdpa.c | 5 -----
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 6af78a4229..877bf7464f 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -30,6 +30,7 @@
#include "sysemu/runstate.h"
#include "hw/virtio/vdpa-dev-mig.h"
#include "migration/migration.h"
+#include "exec/address-spaces.h"
static void
vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
@@ -125,6 +126,7 @@ static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
goto free_vqs;
}
+ memory_listener_register(&v->vdpa.listener, &address_space_memory);
v->config_size = vhost_vdpa_device_get_u32(v->vhostfd,
VHOST_VDPA_GET_CONFIG_SIZE,
errp);
@@ -163,6 +165,7 @@ static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
free_config:
g_free(v->config);
vhost_cleanup:
+ memory_listener_unregister(&v->vdpa.listener);
vhost_dev_cleanup(&v->dev);
free_vqs:
g_free(vqs);
@@ -188,6 +191,7 @@ static void vhost_vdpa_device_unrealize(DeviceState *dev)
g_free(s->config);
g_free(s->dev.vqs);
+ memory_listener_unregister(&s->vdpa.listener);
vhost_dev_cleanup(&s->dev);
qemu_close(s->vhostfd);
s->vhostfd = -1;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 063e941544..30408f2069 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1320,8 +1320,6 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
"IOMMU and try again");
return -1;
}
- memory_listener_register(&v->listener, dev->vdev->dma_as);
-
return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
}
@@ -1515,7 +1513,6 @@ static bool vhost_vdpa_force_iommu(struct vhost_dev *dev)
static int vhost_vdpa_suspend_device(struct vhost_dev *dev)
{
- struct vhost_vdpa *v = dev->opaque;
int ret;
vhost_vdpa_svqs_stop(dev);
@@ -1526,7 +1523,6 @@ static int vhost_vdpa_suspend_device(struct vhost_dev *dev)
}
ret = vhost_vdpa_call(dev, VHOST_VDPA_SUSPEND, NULL);
- memory_listener_unregister(&v->listener);
return ret;
}
@@ -1548,7 +1544,6 @@ static int vhost_vdpa_resume_device(struct vhost_dev *dev)
return 0;
}
- memory_listener_register(&v->listener, &address_space_memory);
return vhost_vdpa_call(dev, VHOST_VDPA_RESUME, NULL);
}
--
2.27.0

@ -0,0 +1,38 @@
From 0f515ff831f46ef34cd83aa145e547e48d8b3b56 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Thu, 14 Dec 2023 11:05:52 +0800
Subject: [PATCH] vdpa: set vring enable only if the vring address has already
been set
Currently, vhost-vdpa does not check the status of each vring when
enabling vrings. While the vBIOS (EDK2) is running, the driver does not
enable all vrings, so enabling every vring is inconsistent with the
actual situation.
When enabling a vring, check its status first: if the vring address is
not set, the vring is not enabled.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vhost-vdpa.c | 5 +++++
1 file changed, 5 insertions(+)
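The check described above can be sketched in isolation. This is a minimal
sketch, assuming the descriptor addresses are available as a plain array;
enable_configured_vrings is a hypothetical helper, not part of vhost-vdpa.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mark a queue as enabled only when its descriptor address is set,
 * and return how many queues were enabled.
 */
static size_t enable_configured_vrings(const uint64_t *desc_addr, size_t n,
                                       uint8_t *enabled)
{
    size_t count = 0;
    size_t i;

    assert(desc_addr && enabled);
    for (i = 0; i < n; i++) {
        /* An address of 0 means the driver never set up this queue. */
        enabled[i] = desc_addr[i] != 0;
        count += enabled[i];
    }
    return count;
}
```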
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 30408f2069..d49826845f 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -890,6 +890,11 @@ int vhost_vdpa_set_vring_ready(struct vhost_vdpa *v, unsigned idx)
.index = idx,
.num = 1,
};
+ hwaddr addr = virtio_queue_get_desc_addr(dev->vdev, idx);
+ if (addr == 0) {
+ return 0;
+ }
+
int r = vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
trace_vhost_vdpa_set_vring_ready(dev, idx, r);
--
2.27.0

@ -0,0 +1,120 @@
From e58b48ab2bb679f4c661301019d6f94bd39f93e5 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Tue, 19 Dec 2023 20:18:03 +0800
Subject: [PATCH] vdpa: support vdpa device suspend/resume
Only the suspend and resume interfaces used for migration are
implemented. The current implementation still has bugs when
suspending/resuming a virtual machine. Fix them.
Fixes: 4c5a9a0703 ("vhost: implement vhost_vdpa_device_suspend/resume")
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vdpa-dev-mig.c | 16 +++++++++++-----
hw/virtio/vdpa-dev.c | 8 +-------
include/hw/virtio/vdpa-dev.h | 1 +
3 files changed, 13 insertions(+), 12 deletions(-)
diff --git a/hw/virtio/vdpa-dev-mig.c b/hw/virtio/vdpa-dev-mig.c
index 9b47e3ed45..8b13f89c85 100644
--- a/hw/virtio/vdpa-dev-mig.c
+++ b/hw/virtio/vdpa-dev-mig.c
@@ -143,6 +143,7 @@ static int vhost_vdpa_device_suspend(VhostVdpaDevice *vdpa)
}
vdpa->started = false;
+ vdpa->suspended = true;
ret = vhost_dev_suspend(&vdpa->dev, vdev, false);
if (ret) {
@@ -165,6 +166,7 @@ set_guest_notifiers_fail:
}
suspend_fail:
+ vdpa->suspended = false;
vdpa->started = true;
return ret;
}
@@ -201,6 +203,7 @@ static int vhost_vdpa_device_resume(VhostVdpaDevice *vdpa)
goto err_guest_notifiers;
}
vdpa->started = true;
+ vdpa->suspended = false;
/*
* guest_notifier_mask/pending not used yet, so just unmask
@@ -241,7 +244,7 @@ static void vdpa_dev_vmstate_change(void *opaque, bool running, RunState state)
MigrationIncomingState *mis = migration_incoming_get_current();
if (!running) {
- if (ms->state == RUN_STATE_PAUSED) {
+ if (ms->state == MIGRATION_STATUS_ACTIVE || state == RUN_STATE_PAUSED) {
ret = vhost_vdpa_device_suspend(vdpa);
if (ret) {
error_report("suspend vdpa device failed: %d\n", ret);
@@ -251,16 +254,19 @@ static void vdpa_dev_vmstate_change(void *opaque, bool running, RunState state)
}
}
} else {
- if (ms->state == RUN_STATE_RESTORE_VM) {
+ if (vdpa->suspended) {
ret = vhost_vdpa_device_resume(vdpa);
if (ret) {
- error_report("migration dest resume device failed, abort!\n");
- exit(EXIT_FAILURE);
+ error_report("vhost vdpa device resume failed: %d\n", ret);
}
}
if (mis->state == RUN_STATE_RESTORE_VM) {
- vhost_vdpa_call(hdev, VHOST_VDPA_RESUME, NULL);
+ ret = vhost_vdpa_call(hdev, VHOST_VDPA_RESUME, NULL);
+ if (ret) {
+ error_report("migration dest resume device failed: %d\n", ret);
+ exit(EXIT_FAILURE);
+ }
/* post resume */
mis->bh = qemu_bh_new(vdpa_dev_migration_handle_incoming_bh,
hdev);
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 877bf7464f..91e71847b0 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -317,7 +317,6 @@ static void vhost_vdpa_device_stop(VirtIODevice *vdev)
static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
{
VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
- MigrationState *ms = migrate_get_current();
bool should_start = virtio_device_started(vdev, status);
Error *local_err = NULL;
int ret;
@@ -326,12 +325,7 @@ static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
should_start = false;
}
- if (s->started == should_start) {
- return;
- }
-
- if (ms->state == RUN_STATE_PAUSED ||
- ms->state == RUN_STATE_RESTORE_VM) {
+ if (s->started == should_start || s->suspended) {
return;
}
diff --git a/include/hw/virtio/vdpa-dev.h b/include/hw/virtio/vdpa-dev.h
index 20f50c76c6..60e9c3f3fe 100644
--- a/include/hw/virtio/vdpa-dev.h
+++ b/include/hw/virtio/vdpa-dev.h
@@ -37,6 +37,7 @@ struct VhostVdpaDevice {
int config_size;
uint16_t queue_size;
bool started;
+ bool suspended;
int (*post_init)(VhostVdpaDevice *v, Error **errp);
VMChangeStateEntry *vmstate;
Notifier migration_state;
--
2.27.0

@ -0,0 +1,45 @@
From a78602118043eb9923996504d5b2e1b14a1ec38d Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Thu, 21 Dec 2023 11:03:37 +0800
Subject: [PATCH] vdpa: suspend function return 0 when the vdpa device is
stopped
When the vhost vdpa device is stopped (vdpa->started is false), the
suspend operation does nothing and returns success instead of failure.
The same goes for the resume function.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vdpa-dev-mig.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
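The resulting behaviour is that suspend and resume become idempotent. A
minimal sketch, assuming the device is reduced to its two flags; Dev,
dev_suspend, dev_resume and backend_calls are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduction of the device to the two flags involved. */
typedef struct {
    bool started;
    bool suspended;
    int backend_calls; /* how often the backend would have been touched */
} Dev;

static int dev_suspend(Dev *d)
{
    assert(d);
    if (!d->started || d->suspended) {
        return 0; /* nothing to do; not an error */
    }
    d->backend_calls++;
    d->started = false;
    d->suspended = true;
    return 0;
}

static int dev_resume(Dev *d)
{
    assert(d);
    if (d->started || !d->suspended) {
        return 0; /* only a suspended device needs resuming */
    }
    d->backend_calls++;
    d->started = true;
    d->suspended = false;
    return 0;
}
```

Calling either function repeatedly, or on a device that was never
started, leaves the state unchanged and reports success.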
diff --git a/hw/virtio/vdpa-dev-mig.c b/hw/virtio/vdpa-dev-mig.c
index 8b13f89c85..b889dd4715 100644
--- a/hw/virtio/vdpa-dev-mig.c
+++ b/hw/virtio/vdpa-dev-mig.c
@@ -134,8 +134,8 @@ static int vhost_vdpa_device_suspend(VhostVdpaDevice *vdpa)
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
int ret;
- if (!vdpa->started) {
- return -EFAULT;
+ if (!vdpa->started || vdpa->suspended) {
+ return 0;
}
if (!k->set_guest_notifiers) {
@@ -178,6 +178,10 @@ static int vhost_vdpa_device_resume(VhostVdpaDevice *vdpa)
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
int i, ret;
+ if (vdpa->started || !vdpa->suspended) {
+ return 0;
+ }
+
if (!k->set_guest_notifiers) {
error_report("binding does not support guest notifiers\n");
return -ENOSYS;
--
2.27.0

@ -0,0 +1,204 @@
From bd2d81775edf285149346bf793d9b71236d7cf34 Mon Sep 17 00:00:00 2001
From: Zenghui Yu <yuzenghui@huawei.com>
Date: Sat, 8 May 2021 17:31:04 +0800
Subject: [PATCH] vfio: Maintain DMA mapping range for the container
When synchronizing dirty bitmap from kernel VFIO we do it in a
per-iova-range fashion and we allocate the userspace bitmap for each of the
ioctl. This patch introduces `struct VFIODMARange` to describe a range of
the given DMA mapping with respect to a VFIO_IOMMU_MAP_DMA operation, and
make the bitmap cache of this range be persistent so that we don't need to
g_try_malloc0() every time. Note that the new structure is almost a copy of
`struct vfio_iommu_type1_dma_map` but only internally used by QEMU.
More importantly, the cached per-iova-range dirty bitmap will be further
used when we want to add support for the CLEAR_BITMAP and this cached
bitmap will be used to guarantee we don't clear any unknown dirty bits
otherwise that can be a severe data loss issue for migration code.
It's pretty intuitive to maintain a bitmap per container since we perform
log_sync at this granule. But I don't know how to deal with things like
memory hot-{un}plug, sparse DMA mappings, etc. Suggestions welcome.
* yet something to-do:
- can't work with guest viommu
- no locks
- etc
[ The idea and even the commit message are largely inherited from kvm side.
See commit 9f4bf4baa8b820c7930e23c9566c9493db7e1d25. ]
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jinagkunkun@huawei.com>
---
hw/vfio/common.c | 9 +++++--
hw/vfio/container.c | 49 +++++++++++++++++++++++++++++++++++
include/hw/vfio/vfio-common.h | 12 +++++++++
3 files changed, 68 insertions(+), 2 deletions(-)
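The size of the cached per-range bitmap follows from one bit per host
page, padded to whole 64-bit words. A minimal sketch of that arithmetic,
assuming the page size is passed in explicitly; round_up and
dirty_bitmap_bytes are hypothetical helpers, not the QEMU macros.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next multiple of d (d must be non-zero). */
static uint64_t round_up(uint64_t n, uint64_t d)
{
    assert(d != 0);
    return (n + d - 1) / d * d;
}

/* Bytes needed for a one-bit-per-page bitmap, padded to 64-bit words. */
static uint64_t dirty_bitmap_bytes(uint64_t range_size, uint64_t page_size)
{
    uint64_t pages = round_up(range_size, page_size) / page_size;

    return round_up(pages, 64) / 8;
}
```

For instance, a 1 GiB mapping with 4 KiB pages covers 262144 pages and
therefore needs a 32 KiB bitmap, which is the allocation this patch
keeps alive for the lifetime of the mapping.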
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index e70fdf5e0c..564e933135 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1156,6 +1156,7 @@ int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
vfio_devices_all_device_dirty_tracking(container);
uint64_t dirty_pages;
VFIOBitmap vbmap;
+ VFIODMARange *qrange;
int ret;
if (!container->dirty_pages_supported && !all_device_dirty_tracking) {
@@ -1165,10 +1166,16 @@ int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
return 0;
}
+ qrange = vfio_lookup_match_range(container, iova, size);
+ /* the same as vfio_dma_unmap() */
+ assert(qrange);
+
ret = vfio_bitmap_alloc(&vbmap, size);
if (ret) {
return ret;
}
+ g_free(vbmap.bitmap);
+ vbmap.bitmap = qrange->bitmap;
if (all_device_dirty_tracking) {
ret = vfio_devices_query_dirty_bitmap(container, &vbmap, iova, size);
@@ -1186,8 +1193,6 @@ int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
trace_vfio_get_dirty_bitmap(container->fd, iova, size, vbmap.size,
ram_addr, dirty_pages);
out:
- g_free(vbmap.bitmap);
-
return ret;
}
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 242010036a..9a176a0d33 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -112,6 +112,29 @@ unmap_exit:
return ret;
}
+VFIODMARange *vfio_lookup_match_range(VFIOContainer *container,
+ hwaddr start_addr, hwaddr size)
+{
+ VFIODMARange *qrange;
+
+ QLIST_FOREACH(qrange, &container->dma_list, next) {
+ if (qrange->iova == start_addr && qrange->size == size) {
+ return qrange;
+ }
+ }
+ return NULL;
+}
+
+void vfio_dma_range_init_dirty_bitmap(VFIODMARange *qrange)
+{
+ uint64_t pages, size;
+
+ pages = REAL_HOST_PAGE_ALIGN(qrange->size) / qemu_real_host_page_size();
+ size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) / BITS_PER_BYTE;
+
+ qrange->bitmap = g_malloc0(size);
+}
+
/*
* DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86
*/
@@ -124,6 +147,7 @@ int vfio_dma_unmap(VFIOContainer *container, hwaddr iova,
.iova = iova,
.size = size,
};
+ VFIODMARange *qrange;
bool need_dirty_sync = false;
int ret;
@@ -136,6 +160,22 @@ int vfio_dma_unmap(VFIOContainer *container, hwaddr iova,
need_dirty_sync = true;
}
+ /*
+ * unregister the DMA range
+ *
+ * It seems that the memory layer will give us the same section as the one
+ * used in region_add(). Otherwise it'll be complicated to manipulate the
+ * bitmap across region_{add,del}. Is there any guarantee?
+ *
+ * But there is really not such a restriction on the kernel interface
+ * (VFIO_IOMMU_DIRTY_PAGES_FLAG_{UN}MAP_DMA, etc).
+ */
+ qrange = vfio_lookup_match_range(container, iova, size);
+ assert(qrange);
+ g_free(qrange->bitmap);
+ QLIST_REMOVE(qrange, next);
+ g_free(qrange);
+
while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
/*
* The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
@@ -180,6 +220,14 @@ int vfio_dma_map(VFIOContainer *container, hwaddr iova,
.iova = iova,
.size = size,
};
+ VFIODMARange *qrange;
+
+ qrange = g_malloc0(sizeof(*qrange));
+ qrange->iova = iova;
+ qrange->size = size;
+ QLIST_INSERT_HEAD(&container->dma_list, qrange, next);
+ /* XXX allocate the dirty bitmap on demand */
+ vfio_dma_range_init_dirty_bitmap(qrange);
if (!readonly) {
map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
@@ -552,6 +600,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
container->iova_ranges = NULL;
QLIST_INIT(&container->giommu_list);
QLIST_INIT(&container->vrdl_list);
+ QLIST_INIT(&container->dma_list);
ret = vfio_init_container(container, group->fd, errp);
if (ret) {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index a4a22accb9..b131d04c9c 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -80,6 +80,14 @@ typedef struct VFIOAddressSpace {
struct VFIOGroup;
+typedef struct VFIODMARange {
+ QLIST_ENTRY(VFIODMARange) next;
+ hwaddr iova;
+ size_t size;
+ void *vaddr; /* unused */
+ unsigned long *bitmap; /* dirty bitmap cache for this range */
+} VFIODMARange;
+
typedef struct VFIOContainer {
VFIOAddressSpace *space;
int fd; /* /dev/vfio/vfio, empowered by the attached groups */
@@ -97,6 +105,7 @@ typedef struct VFIOContainer {
QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
QLIST_HEAD(, VFIOGroup) group_list;
QLIST_HEAD(, VFIORamDiscardListener) vrdl_list;
+ QLIST_HEAD(, VFIODMARange) dma_list;
QLIST_ENTRY(VFIOContainer) next;
QLIST_HEAD(, VFIODevice) device_list;
GList *iova_ranges;
@@ -212,6 +221,9 @@ void vfio_put_address_space(VFIOAddressSpace *space);
bool vfio_devices_all_running_and_saving(VFIOContainer *container);
/* container->fd */
+VFIODMARange *vfio_lookup_match_range(VFIOContainer *container,
+ hwaddr start_addr, hwaddr size);
+void vfio_dma_range_init_dirty_bitmap(VFIODMARange *qrange);
int vfio_dma_unmap(VFIOContainer *container, hwaddr iova,
ram_addr_t size, IOMMUTLBEntry *iotlb);
int vfio_dma_map(VFIOContainer *container, hwaddr iova,
--
2.27.0

@ -0,0 +1,229 @@
From 24c3ff779f35b40967d195e4764d4cb605c1a304 Mon Sep 17 00:00:00 2001
From: Zenghui Yu <yuzenghui@huawei.com>
Date: Sat, 8 May 2021 17:31:05 +0800
Subject: [PATCH] vfio/migration: Add support for manual clear vfio dirty log
The new capability VFIO_DIRTY_LOG_MANUAL_CLEAR and the new ioctl
VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR and
VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP have been introduced in
the kernel; tweak the userspace side to use them.
Check if the kernel supports VFIO_DIRTY_LOG_MANUAL_CLEAR and
provide the log_clear() hook for vfio_memory_listener. If the
kernel supports it, deliver the clear message to the kernel.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
hw/vfio/common.c | 136 ++++++++++++++++++++++++++++++++++
hw/vfio/container.c | 13 +++-
include/hw/vfio/vfio-common.h | 1 +
3 files changed, 148 insertions(+), 2 deletions(-)
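The clear request is widened to a 64-page boundary before it is sent to
the kernel. A minimal sketch of that window calculation, assuming a
power-of-two page size; ClearWindow and clear_window are hypothetical
names, and the sketch covers only the arithmetic, not the ioctl.

```c
#include <assert.h>
#include <stdint.h>

/* Page-granular window that a clear request is widened to. */
typedef struct {
    uint64_t first_page; /* first page of the 64-page-aligned window */
    uint64_t npages;     /* window length, clamped to the range end */
    uint64_t skip_pages; /* leading pages that were not requested */
} ClearWindow;

static ClearWindow clear_window(uint64_t start, uint64_t size,
                                uint64_t range_size, uint64_t psize)
{
    const uint64_t align = psize << 6; /* 64 pages */
    uint64_t aligned_start, delta, end;
    ClearWindow w;

    /* The mask trick below requires a power-of-two page size. */
    assert(psize && (psize & (psize - 1)) == 0);

    aligned_start = start & ~(align - 1);
    delta = start - aligned_start;

    w.first_page = aligned_start / psize;
    w.npages = ((size + delta + align - 1) / align) << 6;

    /* Never extend the window past the end of the mapped range. */
    end = range_size / psize;
    if (w.npages > end - w.first_page) {
        w.npages = end - w.first_page;
    }
    w.skip_pages = delta / psize;
    return w;
}
```

With 4 KiB pages, a request for 10 pages starting at page 70 becomes a
64-page window starting at page 64, with the first 6 pages masked out
before the bitmap is handed to the kernel.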
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 564e933135..e08b147b3d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1344,6 +1344,141 @@ static void vfio_listener_log_sync(MemoryListener *listener,
}
}
+/*
+ * I'm not sure if there's any alignment requirement for the CLEAR_BITMAP
+ * ioctl. But copy from kvm side and align {start, size} with 64 pages.
+ *
+ * I think the code can be simplified a lot if no alignment requirement.
+ */
+#define VFIO_CLEAR_LOG_SHIFT 6
+#define VFIO_CLEAR_LOG_ALIGN (qemu_real_host_page_size() << VFIO_CLEAR_LOG_SHIFT)
+#define VFIO_CLEAR_LOG_MASK (-VFIO_CLEAR_LOG_ALIGN)
+
+static int vfio_log_clear_one_range(VFIOContainer *container,VFIODMARange *qrange,
+ uint64_t start, uint64_t size)
+{
+ struct vfio_iommu_type1_dirty_bitmap *dbitmap;
+ struct vfio_iommu_type1_dirty_bitmap_get *range;
+
+ dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
+
+ dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
+ dbitmap->flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP;
+ range = (struct vfio_iommu_type1_dirty_bitmap_get *)&dbitmap->data;
+
+ /*
+ * Now let's deal with the actual bitmap, which is almost the same
+ * as the kvm side.
+ */
+ uint64_t end, bmap_start, start_delta, bmap_npages;
+ unsigned long *bmap_clear = NULL, psize = qemu_real_host_page_size();
+ int ret;
+
+ bmap_start = start & VFIO_CLEAR_LOG_MASK;
+ start_delta = start - bmap_start;
+ bmap_start /= psize;
+
+ bmap_npages = DIV_ROUND_UP(size + start_delta, VFIO_CLEAR_LOG_ALIGN)
+ << VFIO_CLEAR_LOG_SHIFT;
+ end = qrange->size / psize;
+ if (bmap_npages > end - bmap_start) {
+ bmap_npages = end - bmap_start;
+ }
+ start_delta /= psize;
+
+ if (start_delta) {
+ bmap_clear = bitmap_new(bmap_npages);
+ bitmap_copy_with_src_offset(bmap_clear, qrange->bitmap,
+ bmap_start, start_delta + size / psize);
+ bitmap_clear(bmap_clear, 0, start_delta);
+ range->bitmap.data = (__u64 *)bmap_clear;
+ } else {
+ range->bitmap.data = (__u64 *)(qrange->bitmap + BIT_WORD(bmap_start));
+ }
+
+ range->iova = qrange->iova + bmap_start * psize;
+ range->size = bmap_npages * psize;
+ range->bitmap.size = ROUND_UP(bmap_npages, sizeof(__u64) * BITS_PER_BYTE) /
+ BITS_PER_BYTE;
+ range->bitmap.pgsize = qemu_real_host_page_size();
+
+ ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
+ if (ret) {
+ error_report("Failed to clear dirty log for iova: 0x%"PRIx64
+ " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
+ (uint64_t)range->size, errno);
+ goto err_out;
+ }
+
+ bitmap_clear(qrange->bitmap, bmap_start + start_delta, size / psize);
+err_out:
+ g_free(bmap_clear);
+ g_free(dbitmap);
+ return 0;
+}
+
+static int vfio_physical_log_clear(VFIOContainer *container,
+ MemoryRegionSection *section)
+{
+ uint64_t start, size, offset, count;
+ VFIODMARange *qrange;
+ int ret = 0;
+
+ if (!container->dirty_log_manual_clear) {
+ /* No need to do explicit clear */
+ return ret;
+ }
+
+ start = section->offset_within_address_space;
+ size = int128_get64(section->size);
+
+ if (!size) {
+ return ret;
+ }
+
+ QLIST_FOREACH(qrange, &container->dma_list, next) {
+ /*
+ * Discard ranges that do not overlap the section (e.g., the
+ * Memory BAR regions of the device)
+ */
+ if (qrange->iova > start + size - 1 ||
+ start > qrange->iova + qrange->size - 1) {
+ continue;
+ }
+
+ if (start >= qrange->iova) {
+ /* The range starts before section or is aligned to it. */
+ offset = start - qrange->iova;
+ count = MIN(qrange->size - offset, size);
+ } else {
+ /* The range starts after section. */
+ offset = 0;
+ count = MIN(qrange->size, size - (qrange->iova - start));
+ }
+ ret = vfio_log_clear_one_range(container, qrange, offset, count);
+ if (ret < 0) {
+ break;
+ }
+ }
+
+ return ret;
+}
+
+static void vfio_listener_log_clear(MemoryListener *listener,
+ MemoryRegionSection *section)
+{
+ VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+
+ if (vfio_listener_skipped_section(section) ||
+ !container->dirty_pages_supported) {
+ return;
+ }
+
+ if (vfio_devices_all_dirty_tracking(container)) {
+ vfio_physical_log_clear(container, section);
+ }
+}
+
const MemoryListener vfio_memory_listener = {
.name = "vfio",
.region_add = vfio_listener_region_add,
@@ -1351,6 +1486,7 @@ const MemoryListener vfio_memory_listener = {
.log_global_start = vfio_listener_log_global_start,
.log_global_stop = vfio_listener_log_global_stop,
.log_sync = vfio_listener_log_sync,
+ .log_clear = vfio_listener_log_clear,
};
void vfio_reset_handler(void *opaque)
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 9a176a0d33..d8b9117f4f 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -285,7 +285,9 @@ int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap,
dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
- dbitmap->flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP;
+ dbitmap->flags = container->dirty_log_manual_clear ?
+ VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR :
+ VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP;
range = (struct vfio_iommu_type1_dirty_bitmap_get *)&dbitmap->data;
range->iova = iova;
range->size = size;
@@ -409,7 +411,7 @@ static int vfio_get_iommu_type(VFIOContainer *container,
static int vfio_init_container(VFIOContainer *container, int group_fd,
Error **errp)
{
- int iommu_type, ret;
+ int iommu_type, dirty_log_manual_clear, ret;
iommu_type = vfio_get_iommu_type(container, errp);
if (iommu_type < 0) {
@@ -438,6 +440,13 @@ static int vfio_init_container(VFIOContainer *container, int group_fd,
}
container->iommu_type = iommu_type;
+
+ dirty_log_manual_clear = ioctl(container->fd, VFIO_CHECK_EXTENSION,
+ VFIO_DIRTY_LOG_MANUAL_CLEAR);
+ if (dirty_log_manual_clear) {
+ container->dirty_log_manual_clear = dirty_log_manual_clear;
+ }
+
return 0;
}
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index b131d04c9c..fd9828d50b 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -97,6 +97,7 @@ typedef struct VFIOContainer {
Error *error;
bool initialized;
bool dirty_pages_supported;
+ bool dirty_log_manual_clear;
uint64_t dirty_pgsizes;
uint64_t max_dirty_bitmap_size;
unsigned long pgsizes;
--
2.27.0

@ -0,0 +1,38 @@
From b0a62a84bd1c6ad5d4c11463371fcf267b56d902 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:13:41 +0800
Subject: [PATCH] vhost: add vhost_dev_suspend/resume_op
Introduce new vhost interface to support vhost device suspend & resume
Signed-off-by: libai <libai12@huawei.com>
---
include/hw/virtio/vhost-backend.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index 71b02e4a12..84b8fa1075 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -155,6 +155,9 @@ typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
Error **errp);
typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error **errp);
+typedef int (*vhost_dev_suspend_op)(struct vhost_dev *dev);
+typedef int (*vhost_dev_resume_op)(struct vhost_dev *dev);
+
typedef struct VhostOps {
VhostBackendType backend_type;
vhost_backend_init vhost_backend_init;
@@ -208,6 +211,8 @@ typedef struct VhostOps {
vhost_supports_device_state_op vhost_supports_device_state;
vhost_set_device_state_fd_op vhost_set_device_state_fd;
vhost_check_device_state_op vhost_check_device_state;
+ vhost_dev_suspend_op vhost_dev_suspend;
+ vhost_dev_resume_op vhost_dev_resume;
} VhostOps;
int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
--
2.27.0

@ -0,0 +1,86 @@
From 302401ee7eb437712b69caff44ce684c88573dc6 Mon Sep 17 00:00:00 2001
From: Chuan Zheng <zhengchuan@huawei.com>
Date: Mon, 29 Jul 2019 16:22:12 +0800
Subject: [PATCH] vhost: cancel migration when vhost-user restarted during
migration
QEMU aborts when the vhost-user process is restarted during migration
and vhost_log_global_start/stop is called. The reason is that
vhost_dev_set_log returns -1 because the network connection is
temporarily lost. Cancel the migration and report the error to the user
in this abnormal situation.
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
hw/virtio/vhost.c | 9 +++++++--
migration/migration.c | 2 +-
migration/migration.h | 1 +
3 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2c9ac79468..a8adc149ad 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -26,6 +26,7 @@
#include "hw/mem/memory-device.h"
#include "migration/blocker.h"
#include "migration/qemu-file-types.h"
+#include "migration/migration.h"
#include "sysemu/dma.h"
#include "trace.h"
@@ -1047,20 +1048,24 @@ check_dev_state:
static void vhost_log_global_start(MemoryListener *listener)
{
int r;
+ Error *errp = NULL;
r = vhost_migration_log(listener, true);
if (r < 0) {
- abort();
+ error_setg(&errp, "Failed to start vhost migration log");
+ migrate_fd_error(migrate_get_current(), errp);
}
}
static void vhost_log_global_stop(MemoryListener *listener)
{
int r;
+ Error *errp = NULL;
r = vhost_migration_log(listener, false);
if (r < 0) {
- abort();
+ error_setg(&errp, "Failed to stop vhost migration log");
+ migrate_fd_error(migrate_get_current(), errp);
}
}
diff --git a/migration/migration.c b/migration/migration.c
index 3ce04b2aaf..71a03b3248 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1377,7 +1377,7 @@ static void migrate_error_free(MigrationState *s)
}
}
-static void migrate_fd_error(MigrationState *s, const Error *error)
+void migrate_fd_error(MigrationState *s, const Error *error)
{
trace_migrate_fd_error(error_get_pretty(error));
assert(s->to_dst_file == NULL);
diff --git a/migration/migration.h b/migration/migration.h
index cf2c9c88e0..6aafa04314 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -482,6 +482,7 @@ bool migration_has_all_channels(void);
uint64_t migrate_max_downtime(void);
+void migrate_fd_error(MigrationState *s, const Error *error);
void migrate_set_error(MigrationState *s, const Error *error);
bool migrate_has_error(MigrationState *s);
--
2.27.0

@ -0,0 +1,87 @@
From 3ef6dc341d6921a95564e9089f41ddbd79cd2a94 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:55:53 +0800
Subject: [PATCH] vhost: implement migration state notifier for vdpa device
Register a migration state notifier so that the vdpa device can react
to migration exceptions (cancellation or failure).
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vdpa-dev-mig.c | 29 +++++++++++++++++++++++++++++
include/hw/virtio/vdpa-dev.h | 1 +
2 files changed, 30 insertions(+)
diff --git a/hw/virtio/vdpa-dev-mig.c b/hw/virtio/vdpa-dev-mig.c
index 1872f11f3f..9b47e3ed45 100644
--- a/hw/virtio/vdpa-dev-mig.c
+++ b/hw/virtio/vdpa-dev-mig.c
@@ -23,6 +23,7 @@
#include "hw/virtio/virtio-bus.h"
#include "migration/register.h"
#include "migration/migration.h"
+#include "migration/misc.h"
#include "qemu/error-report.h"
#include "hw/virtio/vdpa-dev-mig.h"
#include "migration/qemu-file-types.h"
@@ -354,6 +355,31 @@ static SaveVMHandlers savevm_vdpa_handlers = {
.load_setup = vdpa_load_setup,
};
+static void vdpa_migration_state_notifier(Notifier *notifier, void *data)
+{
+ MigrationState *s = data;
+ VhostVdpaDevice *vdev = container_of(notifier,
+ VhostVdpaDevice,
+ migration_state);
+ struct vhost_dev *hdev = &vdev->dev;
+ int ret;
+
+ switch (s->state) {
+ case MIGRATION_STATUS_CANCELLING:
+ case MIGRATION_STATUS_CANCELLED:
+ case MIGRATION_STATUS_FAILED:
+ ret = vhost_vdpa_set_mig_state(hdev, VDPA_DEVICE_CANCEL);
+ if (ret) {
+ error_report("Failed to set state CANCEL\n");
+ }
+
+ break;
+ case MIGRATION_STATUS_COMPLETED:
+ default:
+ break;
+ }
+}
+
void vdpa_migration_register(VhostVdpaDevice *vdev)
{
vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
@@ -361,10 +387,13 @@ void vdpa_migration_register(VhostVdpaDevice *vdev)
DEVICE(vdev));
register_savevm_live("vdpa", -1, 1,
&savevm_vdpa_handlers, DEVICE(vdev));
+ vdev->migration_state.notify = vdpa_migration_state_notifier;
+ migration_add_notifier(&vdev->migration_state, vdpa_migration_state_notifier);
}
void vdpa_migration_unregister(VhostVdpaDevice *vdev)
{
+ migration_remove_notifier(&vdev->migration_state);
unregister_savevm(VMSTATE_IF(&vdev->parent_obj.parent_obj), "vdpa", DEVICE(vdev));
qemu_del_vm_change_state_handler(vdev->vmstate);
}
diff --git a/include/hw/virtio/vdpa-dev.h b/include/hw/virtio/vdpa-dev.h
index 43cbcef81b..20f50c76c6 100644
--- a/include/hw/virtio/vdpa-dev.h
+++ b/include/hw/virtio/vdpa-dev.h
@@ -39,6 +39,7 @@ struct VhostVdpaDevice {
bool started;
int (*post_init)(VhostVdpaDevice *v, Error **errp);
VMChangeStateEntry *vmstate;
+ Notifier migration_state;
};
#endif
--
2.27.0
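
The notifier pattern used by the patch above can be sketched in isolation. This is a minimal model, not QEMU code: `struct notifier`, `struct demo_vdpa_dev`, `enum mig_status`, `demo_container_of` and `demo_notify` are hypothetical stand-ins for `Notifier`, `VhostVdpaDevice`, `MigrationStatus`, `container_of` and the migration core's notifier list, and a counter replaces the `VDPA_DEVICE_CANCEL` ioctl.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few MigrationStatus values. */
enum mig_status {
    MIG_ACTIVE,
    MIG_CANCELLING,
    MIG_CANCELLED,
    MIG_FAILED,
    MIG_COMPLETED
};

/* Simplified notifier: a callback embedded in the owning object. */
struct notifier {
    void (*notify)(struct notifier *n, void *data);
};

/* Hypothetical device that embeds the notifier, as VhostVdpaDevice does. */
struct demo_vdpa_dev {
    int cancel_requests;            /* stands in for the CANCEL ioctl */
    struct notifier migration_state;
};

/* Recover the enclosing object from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void demo_state_notifier(struct notifier *n, void *data)
{
    enum mig_status *status = data;
    struct demo_vdpa_dev *dev =
        demo_container_of(n, struct demo_vdpa_dev, migration_state);

    switch (*status) {
    case MIG_CANCELLING:
    case MIG_CANCELLED:
    case MIG_FAILED:
        /* Every abnormal outcome maps to the same device action. */
        dev->cancel_requests++;
        break;
    default:
        /* Normal progress and completion need no device action here. */
        break;
    }
}

/* Deliver one status change to one registered notifier. */
static void demo_notify(struct notifier *n, enum mig_status status)
{
    assert(n->notify != NULL);
    n->notify(n, &status);
}
```

Embedding the notifier in the device structure means the callback needs no separate opaque pointer; the device is recovered from the notifier address alone.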

@ -0,0 +1,57 @@
From 229737ca91d4e81b4a14143da9981bd59b80a539 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:57:35 +0800
Subject: [PATCH] vhost: implement post resume bh
Set the vdpa device migration state to POST_START once the VM has
resumed after an incoming migration.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vdpa-dev-mig.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/hw/virtio/vdpa-dev-mig.c b/hw/virtio/vdpa-dev-mig.c
index 662d4a29dc..1872f11f3f 100644
--- a/hw/virtio/vdpa-dev-mig.c
+++ b/hw/virtio/vdpa-dev-mig.c
@@ -26,6 +26,7 @@
#include "qemu/error-report.h"
#include "hw/virtio/vdpa-dev-mig.h"
#include "migration/qemu-file-types.h"
+#include "qemu/main-loop.h"
/*
* Flags used as delimiter:
@@ -218,6 +219,18 @@ err_host_notifiers:
return ret;
}
+static void vdpa_dev_migration_handle_incoming_bh(void *opaque)
+{
+ struct vhost_dev *hdev = opaque;
+ int ret;
+
+ /* Post-start the device; rollback is not supported if this fails! */
+ ret = vhost_vdpa_set_mig_state(hdev, VDPA_DEVICE_POST_START);
+ if (ret) {
+ error_report("Failed to set state: POST_START\n");
+ }
+}
+
static void vdpa_dev_vmstate_change(void *opaque, bool running, RunState state)
{
VhostVdpaDevice *vdpa = VHOST_VDPA_DEVICE(opaque);
@@ -247,6 +260,10 @@ static void vdpa_dev_vmstate_change(void *opaque, bool running, RunState state)
if (mis->state == RUN_STATE_RESTORE_VM) {
vhost_vdpa_call(hdev, VHOST_VDPA_RESUME, NULL);
+ /* post resume */
+ mis->bh = qemu_bh_new(vdpa_dev_migration_handle_incoming_bh,
+ hdev);
+ qemu_bh_schedule(mis->bh);
}
}
}
--
2.27.0

@ -0,0 +1,270 @@
From 556aaa9632862505548d5083d369e92590fb2087 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:53:28 +0800
Subject: [PATCH] vhost: implement savevm_handler for vdpa device
Register savevm_handler ops for vdpa devices to support migration.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vdpa-dev-mig.c | 175 +++++++++++++++++++++++++++++++
include/hw/virtio/vdpa-dev-mig.h | 13 +++
linux-headers/linux/vhost.h | 9 ++
3 files changed, 197 insertions(+)
diff --git a/hw/virtio/vdpa-dev-mig.c b/hw/virtio/vdpa-dev-mig.c
index 1d2bed2571..662d4a29dc 100644
--- a/hw/virtio/vdpa-dev-mig.c
+++ b/hw/virtio/vdpa-dev-mig.c
@@ -21,9 +21,21 @@
#include "hw/virtio/vhost.h"
#include "hw/virtio/vdpa-dev.h"
#include "hw/virtio/virtio-bus.h"
+#include "migration/register.h"
#include "migration/migration.h"
#include "qemu/error-report.h"
#include "hw/virtio/vdpa-dev-mig.h"
+#include "migration/qemu-file-types.h"
+
+/*
+ * Flags used as delimiter:
+ * 0xffffffff => MSB 32-bit all 1s
+ * 0xef10 => emulated (virtual) function IO
+ * 0x0000 => 16-bits reserved for flags
+ */
+#define VDPA_MIG_FLAG_END_OF_STATE (0xffffffffef100001ULL)
+#define VDPA_MIG_FLAG_DEV_CONFIG_STATE (0xffffffffef100002ULL)
+#define VDPA_MIG_FLAG_DEV_SETUP_STATE (0xffffffffef100003ULL)
static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
void *arg)
@@ -39,6 +51,80 @@ static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
return ioctl(fd, request, arg);
}
+static int vhost_vdpa_set_mig_state(struct vhost_dev *dev, uint8_t state)
+{
+ return vhost_vdpa_call(dev, VHOST_VDPA_SET_MIG_STATE, &state);
+}
+
+static int vhost_vdpa_dev_buffer_size(struct vhost_dev *dev, uint32_t *size)
+{
+ return vhost_vdpa_call(dev, VHOST_GET_DEV_BUFFER_SIZE, size);
+}
+
+static int vhost_vdpa_dev_buffer_save(struct vhost_dev *dev, QEMUFile *f)
+{
+ struct vhost_vdpa_config *config;
+ unsigned long config_size = offsetof(struct vhost_vdpa_config, buf);
+ uint32_t buffer_size = 0;
+ int ret;
+
+ ret = vhost_vdpa_dev_buffer_size(dev, &buffer_size);
+ if (ret) {
+ error_report("get dev buffer size failed: %d\n", ret);
+ return ret;
+ }
+
+ qemu_put_be32(f, buffer_size);
+
+ config = g_malloc(buffer_size + config_size);
+ config->off = 0;
+ config->len = buffer_size;
+
+ ret = vhost_vdpa_call(dev, VHOST_GET_DEV_BUFFER, config);
+ if (ret) {
+ error_report("get dev buffer failed: %d\n", ret);
+ goto free;
+ }
+
+ qemu_put_buffer(f, config->buf, buffer_size);
+free:
+ g_free(config);
+
+ return ret;
+}
+
+static int vhost_vdpa_dev_buffer_load(struct vhost_dev *dev, QEMUFile *f)
+{
+ struct vhost_vdpa_config *config;
+ unsigned long config_size = offsetof(struct vhost_vdpa_config, buf);
+ uint32_t buffer_size, recv_size;
+ int ret;
+
+ buffer_size = qemu_get_be32(f);
+
+ config = g_malloc(buffer_size + config_size);
+ config->off = 0;
+ config->len = buffer_size;
+
+ recv_size = qemu_get_buffer(f, config->buf, buffer_size);
+ if (recv_size != buffer_size) {
+ error_report("read dev mig buffer failed, buffer_size: %u, "
+ "recv_size: %u\n", buffer_size, recv_size);
+ ret = -EINVAL;
+ goto free;
+ }
+
+ ret = vhost_vdpa_call(dev, VHOST_SET_DEV_BUFFER, config);
+ if (ret) {
+ error_report("set dev buffer failed: %d\n", ret);
+ }
+
+free:
+ g_free(config);
+
+ return ret;
+}
+
static int vhost_vdpa_device_suspend(VhostVdpaDevice *vdpa)
{
VirtIODevice *vdev = VIRTIO_DEVICE(vdpa);
@@ -165,14 +251,103 @@ static void vdpa_dev_vmstate_change(void *opaque, bool running, RunState state)
}
}
+static int vdpa_save_setup(QEMUFile *f, void *opaque)
+{
+ qemu_put_be64(f, VDPA_MIG_FLAG_DEV_SETUP_STATE);
+ qemu_put_be64(f, VDPA_MIG_FLAG_END_OF_STATE);
+
+ return qemu_file_get_error(f);
+}
+
+static int vdpa_save_complete_precopy(QEMUFile *f, void *opaque)
+{
+ VhostVdpaDevice *vdev = VHOST_VDPA_DEVICE(opaque);
+ struct vhost_dev *hdev = &vdev->dev;
+ int ret;
+
+ qemu_put_be64(f, VDPA_MIG_FLAG_DEV_CONFIG_STATE);
+ ret = vhost_vdpa_dev_buffer_save(hdev, f);
+ if (ret) {
+ error_report("Save vdpa device buffer failed: %d\n", ret);
+ return ret;
+ }
+ qemu_put_be64(f, VDPA_MIG_FLAG_END_OF_STATE);
+
+ return qemu_file_get_error(f);
+}
+
+static int vdpa_load_state(QEMUFile *f, void *opaque, int version_id)
+{
+ VhostVdpaDevice *vdev = VHOST_VDPA_DEVICE(opaque);
+ struct vhost_dev *hdev = &vdev->dev;
+
+ int ret;
+ uint64_t data;
+
+ data = qemu_get_be64(f);
+ while (data != VDPA_MIG_FLAG_END_OF_STATE) {
+ if (data == VDPA_MIG_FLAG_DEV_SETUP_STATE) {
+ data = qemu_get_be64(f);
+ if (data == VDPA_MIG_FLAG_END_OF_STATE) {
+ return 0;
+ } else {
+ error_report("SETUP STATE: EOS not found 0x%lx\n", data);
+ return -EINVAL;
+ }
+ } else if (data == VDPA_MIG_FLAG_DEV_CONFIG_STATE) {
+ ret = vhost_vdpa_dev_buffer_load(hdev, f);
+ if (ret) {
+ error_report("fail to restore device buffer.\n");
+ return ret;
+ }
+ }
+
+ ret = qemu_file_get_error(f);
+ if (ret) {
+ error_report("qemu file error: %d\n", ret);
+ return ret;
+ }
+ data = qemu_get_be64(f);
+ }
+
+ return 0;
+}
+
+static int vdpa_load_setup(QEMUFile *f, void *opaque)
+{
+ VhostVdpaDevice *v = VHOST_VDPA_DEVICE(opaque);
+ struct vhost_dev *hdev = &v->dev;
+ int ret = 0;
+
+ ret = vhost_vdpa_set_mig_state(hdev, VDPA_DEVICE_PRE_START);
+ if (ret) {
+ error_report("pre start device failed: %d\n", ret);
+ goto out;
+ }
+
+ return qemu_file_get_error(f);
+out:
+ return ret;
+}
+
+static SaveVMHandlers savevm_vdpa_handlers = {
+ .save_setup = vdpa_save_setup,
+ .save_live_complete_precopy = vdpa_save_complete_precopy,
+ .load_state = vdpa_load_state,
+ .load_setup = vdpa_load_setup,
+};
+
void vdpa_migration_register(VhostVdpaDevice *vdev)
{
vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
vdpa_dev_vmstate_change,
DEVICE(vdev));
+ register_savevm_live("vdpa", -1, 1,
+ &savevm_vdpa_handlers, DEVICE(vdev));
}
void vdpa_migration_unregister(VhostVdpaDevice *vdev)
{
+ unregister_savevm(VMSTATE_IF(&vdev->parent_obj.parent_obj), "vdpa", DEVICE(vdev));
qemu_del_vm_change_state_handler(vdev->vmstate);
}
diff --git a/include/hw/virtio/vdpa-dev-mig.h b/include/hw/virtio/vdpa-dev-mig.h
index 89665ca747..adc1d657f7 100644
--- a/include/hw/virtio/vdpa-dev-mig.h
+++ b/include/hw/virtio/vdpa-dev-mig.h
@@ -9,6 +9,19 @@
#include "hw/virtio/vdpa-dev.h"
+enum {
+ VDPA_DEVICE_START,
+ VDPA_DEVICE_STOP,
+ VDPA_DEVICE_PRE_START,
+ VDPA_DEVICE_PRE_STOP,
+ VDPA_DEVICE_CANCEL,
+ VDPA_DEVICE_POST_START,
+ VDPA_DEVICE_START_ASYNC,
+ VDPA_DEVICE_STOP_ASYNC,
+ VDPA_DEVICE_PRE_START_ASYNC,
+ VDPA_DEVICE_QUERY_OP_STATE,
+};
+
void vdpa_migration_register(VhostVdpaDevice *vdev);
void vdpa_migration_unregister(VhostVdpaDevice *vdev);
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index 19dc7fd36c..a08e980a1e 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -231,4 +231,13 @@
*/
#define VHOST_VDPA_GET_VRING_DESC_GROUP _IOWR(VHOST_VIRTIO, 0x7F, \
struct vhost_vring_state)
+
+/* set and get device buffer */
+#define VHOST_GET_DEV_BUFFER _IOR(VHOST_VIRTIO, 0xb0, struct vhost_vdpa_config)
+#define VHOST_SET_DEV_BUFFER _IOW(VHOST_VIRTIO, 0xb1, struct vhost_vdpa_config)
+#define VHOST_GET_DEV_BUFFER_SIZE _IOR(VHOST_VIRTIO, 0xb3, __u32)
+
+/* set device migration state */
+#define VHOST_VDPA_SET_MIG_STATE _IOW(VHOST_VIRTIO, 0xb2, __u8)
+
#endif
--
2.27.0
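
The stream framing used by the savevm handlers above (a 64-bit flag, a 32-bit big-endian length, the payload, then an end-of-state flag) can be sketched without QEMU. This is an illustration under assumptions: `struct demo_stream`, `put_be`, `get_be`, `demo_save` and `demo_load` are hypothetical and only mimic `QEMUFile` with a fixed in-memory buffer. Unlike `vdpa_load_state`, which skips flags it does not recognise, this sketch rejects them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same delimiter values as the patch, under demo names. */
#define DEMO_FLAG_END_OF_STATE 0xffffffffef100001ULL
#define DEMO_FLAG_DEV_CONFIG   0xffffffffef100002ULL

/* Hypothetical in-memory stand-in for a QEMUFile. */
struct demo_stream {
    uint8_t buf[256];
    size_t wpos;    /* next byte to write */
    size_t rpos;    /* next byte to read */
};

/* Append the low 'bytes' bytes of v in big-endian order. */
static void put_be(struct demo_stream *s, uint64_t v, int bytes)
{
    assert(s->wpos + (size_t)bytes <= sizeof(s->buf));
    for (int i = bytes - 1; i >= 0; i--) {
        s->buf[s->wpos++] = (uint8_t)(v >> (8 * i));
    }
}

/* Read a big-endian integer of 'bytes' bytes. */
static uint64_t get_be(struct demo_stream *s, int bytes)
{
    uint64_t v = 0;

    assert(s->rpos + (size_t)bytes <= s->wpos);
    for (int i = 0; i < bytes; i++) {
        v = (v << 8) | s->buf[s->rpos++];
    }
    return v;
}

/* Save side: CONFIG flag, 32-bit length, payload, END_OF_STATE flag. */
static void demo_save(struct demo_stream *s, const uint8_t *data, uint32_t len)
{
    put_be(s, DEMO_FLAG_DEV_CONFIG, 8);
    put_be(s, len, 4);
    assert(s->wpos + len <= sizeof(s->buf));
    memcpy(s->buf + s->wpos, data, len);
    s->wpos += len;
    put_be(s, DEMO_FLAG_END_OF_STATE, 8);
}

/*
 * Load side: consume sections until END_OF_STATE.
 * Returns the payload length, or -1 for an unknown flag or a bad length.
 */
static int demo_load(struct demo_stream *s, uint8_t *out, uint32_t out_cap)
{
    int loaded = 0;
    uint64_t flag = get_be(s, 8);

    while (flag != DEMO_FLAG_END_OF_STATE) {
        uint32_t len;

        if (flag != DEMO_FLAG_DEV_CONFIG) {
            return -1;
        }
        len = (uint32_t)get_be(s, 4);
        if (len > out_cap || s->rpos + len > s->wpos) {
            return -1;
        }
        memcpy(out, s->buf + s->rpos, len);
        s->rpos += len;
        loaded = (int)len;
        flag = get_be(s, 8);
    }
    return loaded;
}
```

Validating the length read from the stream before copying is a choice made in this sketch; the patch itself allocates whatever size the stream announces.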

@ -0,0 +1,80 @@
From a7f9a67ee98a5261f7639619055034f40bccfef0 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:22:20 +0800
Subject: [PATCH] vhost: implement vhost-vdpa suspend/resume
vhost-vdpa implements the vhost_dev_suspend interface,
which is called during the shutdown phase of the live
migration source virtual machine to suspend the device
without resetting the device information.
vhost-vdpa also implements the vhost_dev_resume interface.
If the live migration fails, it is called during the
startup phase of the source virtual machine to re-enable
the device without setting the device status again.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vhost-vdpa.c | 41 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 037a9c6e4c..063e941544 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1513,6 +1513,45 @@ static bool vhost_vdpa_force_iommu(struct vhost_dev *dev)
return true;
}
+static int vhost_vdpa_suspend_device(struct vhost_dev *dev)
+{
+ struct vhost_vdpa *v = dev->opaque;
+ int ret;
+
+ vhost_vdpa_svqs_stop(dev);
+ vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
+
+ if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
+ return 0;
+ }
+
+ ret = vhost_vdpa_call(dev, VHOST_VDPA_SUSPEND, NULL);
+ memory_listener_unregister(&v->listener);
+ return ret;
+}
+
+static int vhost_vdpa_resume_device(struct vhost_dev *dev)
+{
+ struct vhost_vdpa *v = dev->opaque;
+ bool ok;
+
+ vhost_vdpa_host_notifiers_init(dev);
+ ok = vhost_vdpa_svqs_start(dev);
+ if (unlikely(!ok)) {
+ return -1;
+ }
+ for (int i = 0; i < v->dev->nvqs; ++i) {
+ vhost_vdpa_set_vring_ready(v, v->dev->vq_index + i);
+ }
+
+ if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
+ return 0;
+ }
+
+ memory_listener_register(&v->listener, &address_space_memory);
+ return vhost_vdpa_call(dev, VHOST_VDPA_RESUME, NULL);
+}
+
static int vhost_vdpa_log_sync(struct vhost_dev *dev)
{
struct vhost_vdpa *v = dev->opaque;
@@ -1559,4 +1598,6 @@ const VhostOps vdpa_ops = {
.vhost_log_sync = vhost_vdpa_log_sync,
.vhost_set_config_call = vhost_vdpa_set_config_call,
.vhost_reset_status = vhost_vdpa_reset_status,
+ .vhost_dev_suspend = vhost_vdpa_suspend_device,
+ .vhost_dev_resume = vhost_vdpa_resume_device,
};
--
2.27.0
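
Both callbacks above do their per-queue work unconditionally and issue the device-wide ioctl only when `dev->vq_index + dev->nvqs == dev->vq_index_end`. A minimal sketch of that check follows, assuming one vDPA device is served by several vhost_dev instances that each cover a slice of the virtqueues. `struct demo_vhost_dev`, `demo_is_last_dev` and `demo_suspend` are hypothetical names, and a counter stands in for the `VHOST_VDPA_SUSPEND` ioctl.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of struct vhost_dev: the queue slice this dev owns. */
struct demo_vhost_dev {
    int vq_index;       /* first virtqueue handled by this vhost_dev */
    int nvqs;           /* number of virtqueues it handles */
    int vq_index_end;   /* one past the last virtqueue of the whole device */
};

/* True when this vhost_dev covers the final slice of virtqueues. */
static bool demo_is_last_dev(const struct demo_vhost_dev *dev)
{
    return dev->vq_index + dev->nvqs == dev->vq_index_end;
}

/*
 * Per-slice teardown would run for every vhost_dev; the device-wide
 * suspend is counted only once, when the last slice is reached.
 */
static int demo_suspend(const struct demo_vhost_dev *dev, int *device_suspends)
{
    assert(dev->nvqs > 0);

    /* Per-slice work (stopping queues, removing notifiers) would go here. */

    if (!demo_is_last_dev(dev)) {
        return 0;
    }

    (*device_suspends)++;
    return 0;
}
```

With two slices covering queues 0-1 and 2-3 of a four-queue device, only the second call reaches the device-wide step.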

@ -0,0 +1,447 @@
From 4c5a9a0703e227186639124f09cdf7214e40ea7d Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:27:34 +0800
Subject: [PATCH] vhost: implement vhost_vdpa_device_suspend/resume
Implement vhost device suspend & resume interface
Signed-off-by: jiangdongxu <jiangdongxu1@huawei.com>
Signed-off-by: fangyi <eric.fangyi@huawei.com>
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/meson.build | 2 +-
hw/virtio/vdpa-dev-mig.c | 178 +++++++++++++++++++++++++++++++
hw/virtio/vhost.c | 138 ++++++++++++++++++++++++
include/hw/virtio/vdpa-dev-mig.h | 16 +++
include/hw/virtio/vdpa-dev.h | 1 +
include/hw/virtio/vhost.h | 3 +
migration/migration.c | 3 +-
migration/migration.h | 2 +
8 files changed, 340 insertions(+), 3 deletions(-)
create mode 100644 hw/virtio/vdpa-dev-mig.c
create mode 100644 include/hw/virtio/vdpa-dev-mig.h
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index c0055a7832..596651d113 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -5,7 +5,7 @@ system_virtio_ss.add(when: 'CONFIG_VIRTIO_MMIO', if_true: files('virtio-mmio.c')
system_virtio_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('virtio-crypto.c'))
system_virtio_ss.add(when: 'CONFIG_VHOST_VSOCK_COMMON', if_true: files('vhost-vsock-common.c'))
system_virtio_ss.add(when: 'CONFIG_VIRTIO_IOMMU', if_true: files('virtio-iommu.c'))
-system_virtio_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: files('vdpa-dev.c'))
+system_virtio_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: files('vdpa-dev.c', 'vdpa-dev-mig.c'))
specific_virtio_ss = ss.source_set()
specific_virtio_ss.add(files('virtio.c'))
diff --git a/hw/virtio/vdpa-dev-mig.c b/hw/virtio/vdpa-dev-mig.c
new file mode 100644
index 0000000000..1d2bed2571
--- /dev/null
+++ b/hw/virtio/vdpa-dev-mig.c
@@ -0,0 +1,178 @@
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2023. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <sys/ioctl.h>
+#include <linux/vhost.h>
+#include "qemu/osdep.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vdpa-dev.h"
+#include "hw/virtio/virtio-bus.h"
+#include "migration/migration.h"
+#include "qemu/error-report.h"
+#include "hw/virtio/vdpa-dev-mig.h"
+
+static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
+ void *arg)
+{
+ struct vhost_vdpa *v = dev->opaque;
+ int fd = v->device_fd;
+
+ if (dev->vhost_ops->backend_type != VHOST_BACKEND_TYPE_VDPA) {
+ error_report("backend type isn't VDPA. Operation not permitted!\n");
+ return -EPERM;
+ }
+
+ return ioctl(fd, request, arg);
+}
+
+static int vhost_vdpa_device_suspend(VhostVdpaDevice *vdpa)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(vdpa);
+ BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+ VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+ int ret;
+
+ if (!vdpa->started) {
+ return -EFAULT;
+ }
+
+ if (!k->set_guest_notifiers) {
+ return -EFAULT;
+ }
+
+ vdpa->started = false;
+
+ ret = vhost_dev_suspend(&vdpa->dev, vdev, false);
+ if (ret) {
+ goto suspend_fail;
+ }
+
+ ret = k->set_guest_notifiers(qbus->parent, vdpa->dev.nvqs, false);
+ if (ret < 0) {
+ error_report("vhost guest notifier cleanup failed: %d\n", ret);
+ goto set_guest_notifiers_fail;
+ }
+
+ vhost_dev_disable_notifiers(&vdpa->dev, vdev);
+ return ret;
+
+set_guest_notifiers_fail:
+ ret = k->set_guest_notifiers(qbus->parent, vdpa->dev.nvqs, true);
+ if (ret) {
+ error_report("vhost guest notifier restore failed: %d\n", ret);
+ }
+
+suspend_fail:
+ vdpa->started = true;
+ return ret;
+}
+
+static int vhost_vdpa_device_resume(VhostVdpaDevice *vdpa)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(vdpa);
+ BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+ VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+ int i, ret;
+
+ if (!k->set_guest_notifiers) {
+ error_report("binding does not support guest notifiers\n");
+ return -ENOSYS;
+ }
+
+ ret = vhost_dev_enable_notifiers(&vdpa->dev, vdev);
+ if (ret < 0) {
+ error_report("Error enabling host notifiers: %d\n", ret);
+ return ret;
+ }
+
+ ret = k->set_guest_notifiers(qbus->parent, vdpa->dev.nvqs, true);
+ if (ret < 0) {
+ error_report("Error binding guest notifier: %d\n", ret);
+ goto err_host_notifiers;
+ }
+
+ vdpa->dev.acked_features = vdev->guest_features;
+
+ ret = vhost_dev_resume(&vdpa->dev, vdev, false);
+ if (ret < 0) {
+ error_report("Error starting vhost: %d\n", ret);
+ goto err_guest_notifiers;
+ }
+ vdpa->started = true;
+
+ /*
+ * guest_notifier_mask/pending not used yet, so just unmask
+ * everything here. virtio-pci will do the right thing by
+ * enabling/disabling irqfd.
+ */
+ for (i = 0; i < vdpa->dev.nvqs; i++) {
+ vhost_virtqueue_mask(&vdpa->dev, vdev, i, false);
+ }
+
+ return ret;
+
+err_guest_notifiers:
+ k->set_guest_notifiers(qbus->parent, vdpa->dev.nvqs, false);
+err_host_notifiers:
+ vhost_dev_disable_notifiers(&vdpa->dev, vdev);
+ return ret;
+}
+
+static void vdpa_dev_vmstate_change(void *opaque, bool running, RunState state)
+{
+ VhostVdpaDevice *vdpa = VHOST_VDPA_DEVICE(opaque);
+ struct vhost_dev *hdev = &vdpa->dev;
+ int ret;
+ MigrationState *ms = migrate_get_current();
+ MigrationIncomingState *mis = migration_incoming_get_current();
+
+ if (!running) {
+ if (ms->state == RUN_STATE_PAUSED) {
+ ret = vhost_vdpa_device_suspend(vdpa);
+ if (ret) {
+ error_report("suspend vdpa device failed: %d\n", ret);
+ if (ms->migration_thread_running) {
+ migrate_fd_cancel(ms);
+ }
+ }
+ }
+ } else {
+ if (ms->state == RUN_STATE_RESTORE_VM) {
+ ret = vhost_vdpa_device_resume(vdpa);
+ if (ret) {
+ error_report("migration dest resume device failed, abort!\n");
+ exit(EXIT_FAILURE);
+ }
+ }
+
+ if (mis->state == RUN_STATE_RESTORE_VM) {
+ vhost_vdpa_call(hdev, VHOST_VDPA_RESUME, NULL);
+ }
+ }
+}
+
+void vdpa_migration_register(VhostVdpaDevice *vdev)
+{
+ vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
+ vdpa_dev_vmstate_change,
+ DEVICE(vdev));
+}
+
+void vdpa_migration_unregister(VhostVdpaDevice *vdev)
+{
+ qemu_del_vm_change_state_handler(vdev->vmstate);
+}
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 438182d850..d073a6d5a5 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2492,3 +2492,141 @@ bool used_memslots_is_exceeded(void)
{
return used_memslots_exceeded;
}
+
+int vhost_dev_resume(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
+{
+ int i, r;
+ EventNotifier *e = &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier;
+
+ /* should only be called after backend is connected */
+ if (!hdev->vhost_ops) {
+ error_report("Missing vhost_ops! Operation not permitted!\n");
+ return -EPERM;
+ }
+
+ vdev->vhost_started = true;
+ hdev->started = true;
+ hdev->vdev = vdev;
+
+ if (vhost_dev_has_iommu(hdev)) {
+ memory_listener_register(&hdev->iommu_listener, vdev->dma_as);
+ }
+
+ r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_mem_table failed");
+ goto fail_mem;
+ }
+ for (i = 0; i < hdev->nvqs; ++i) {
+ r = vhost_virtqueue_start(hdev,
+ vdev,
+ hdev->vqs + i,
+ hdev->vq_index + i);
+ if (r < 0) {
+ goto fail_vq;
+ }
+ }
+
+ r = event_notifier_init(e, 0);
+ if (r < 0) {
+ return r;
+ }
+ event_notifier_test_and_clear(e);
+ if (!vdev->use_guest_notifier_mask) {
+ vhost_config_mask(hdev, vdev, true);
+ }
+ if (vrings) {
+ r = vhost_dev_set_vring_enable(hdev, true);
+ if (r) {
+ goto fail_vq;
+ }
+ }
+ if (hdev->vhost_ops->vhost_dev_resume) {
+ r = hdev->vhost_ops->vhost_dev_resume(hdev);
+ if (r) {
+ goto fail_start;
+ }
+ }
+ if (vhost_dev_has_iommu(hdev)) {
+ hdev->vhost_ops->vhost_set_iotlb_callback(hdev, true);
+
+ /*
+ * Update used ring information for IOTLB to work correctly,
+ * vhost-kernel code requires for this.
+ */
+ for (i = 0; i < hdev->nvqs; ++i) {
+ struct vhost_virtqueue *vq = hdev->vqs + i;
+ vhost_device_iotlb_miss(hdev, vq->used_phys, true);
+ }
+ }
+ vhost_start_config_intr(hdev);
+ return 0;
+fail_start:
+ if (vrings) {
+ vhost_dev_set_vring_enable(hdev, false);
+ }
+fail_vq:
+ while (--i >= 0) {
+ vhost_virtqueue_stop(hdev,
+ vdev,
+ hdev->vqs + i,
+ hdev->vq_index + i);
+ }
+
+fail_mem:
+ vdev->vhost_started = false;
+ hdev->started = false;
+ return r;
+}
+
+int vhost_dev_suspend(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
+{
+ int i;
+ int ret = 0;
+ EventNotifier *e = &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier;
+
+ /* should only be called after backend is connected */
+ if (!hdev->vhost_ops) {
+ error_report("Missing vhost_ops! Operation not permitted!\n");
+ return -EPERM;
+ }
+
+ event_notifier_test_and_clear(e);
+ event_notifier_test_and_clear(&vdev->config_notifier);
+
+ if (hdev->vhost_ops->vhost_dev_suspend) {
+ ret = hdev->vhost_ops->vhost_dev_suspend(hdev);
+ if (ret) {
+ goto fail_suspend;
+ }
+ }
+ if (vrings) {
+ ret = vhost_dev_set_vring_enable(hdev, false);
+ if (ret) {
+ goto fail_suspend;
+ }
+ }
+ for (i = 0; i < hdev->nvqs; ++i) {
+ vhost_virtqueue_stop(hdev,
+ vdev,
+ hdev->vqs + i,
+ hdev->vq_index + i);
+ }
+
+ if (vhost_dev_has_iommu(hdev)) {
+ hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
+ memory_listener_unregister(&hdev->iommu_listener);
+ }
+ vhost_stop_config_intr(hdev);
+ vhost_log_put(hdev, true);
+ hdev->started = false;
+ vdev->vhost_started = false;
+ hdev->vdev = NULL;
+
+ return ret;
+
+fail_suspend:
+ event_notifier_test_and_clear(e);
+
+ return ret;
+}
diff --git a/include/hw/virtio/vdpa-dev-mig.h b/include/hw/virtio/vdpa-dev-mig.h
new file mode 100644
index 0000000000..89665ca747
--- /dev/null
+++ b/include/hw/virtio/vdpa-dev-mig.h
@@ -0,0 +1,16 @@
+/*
+ * Vhost Vdpa Device Migration Header
+ *
+ * Copyright (c) Huawei Technologies Co., Ltd. 2023. All Rights Reserved.
+ */
+
+#ifndef _VHOST_VDPA_MIGRATION_H
+#define _VHOST_VDPA_MIGRATION_H
+
+#include "hw/virtio/vdpa-dev.h"
+
+void vdpa_migration_register(VhostVdpaDevice *vdev);
+
+void vdpa_migration_unregister(VhostVdpaDevice *vdev);
+
+#endif /* _VHOST_VDPA_MIGRATION_H */
diff --git a/include/hw/virtio/vdpa-dev.h b/include/hw/virtio/vdpa-dev.h
index 4dbf98195c..43cbcef81b 100644
--- a/include/hw/virtio/vdpa-dev.h
+++ b/include/hw/virtio/vdpa-dev.h
@@ -38,6 +38,7 @@ struct VhostVdpaDevice {
uint16_t queue_size;
bool started;
int (*post_init)(VhostVdpaDevice *v, Error **errp);
+ VMChangeStateEntry *vmstate;
};
#endif
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 6ae86833e3..9ca5819deb 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -466,4 +466,7 @@ int vhost_save_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp);
*/
int vhost_load_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp);
+int vhost_dev_resume(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
+int vhost_dev_suspend(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
+
#endif
diff --git a/migration/migration.c b/migration/migration.c
index 23d9233bbe..dce22c2da5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -99,7 +99,6 @@ static bool migration_object_check(MigrationState *ms, Error **errp);
static int migration_maybe_pause(MigrationState *s,
int *current_active_state,
int new_state);
-static void migrate_fd_cancel(MigrationState *s);
static bool close_return_path_on_source(MigrationState *s);
static void migration_downtime_start(MigrationState *s)
@@ -1386,7 +1385,7 @@ void migrate_fd_error(MigrationState *s, const Error *error)
migrate_set_error(s, error);
}
-static void migrate_fd_cancel(MigrationState *s)
+void migrate_fd_cancel(MigrationState *s)
{
int old_state ;
diff --git a/migration/migration.h b/migration/migration.h
index 6aafa04314..2f26c9509b 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -551,4 +551,6 @@ void migration_rp_kick(MigrationState *s);
int migration_stop_vm(RunState state);
+void migrate_fd_cancel(MigrationState *s);
+
#endif
--
2.27.0

@ -0,0 +1,304 @@
From 962acd498b11ae5ccc040d76ec89990add119dec Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:09:26 +0800
Subject: [PATCH] vhost: introduce bytemap for vhost backend logging
A vhost backend may use a bytemap instead of a bitmap for dirty logging.
When computing the log_size of a vhost device, check whether the device
supports VHOST_BACKEND_F_BYTEMAPLOG and, if it does, use a bytemap for logging.
Also add a function pointer check in the log resize path and check the
return value of vhost_log_sync.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vhost.c | 89 ++++++++++++++++++++++++++++++++++++---
include/exec/memory.h | 9 ++++
include/exec/ram_addr.h | 44 +++++++++++++++++++
include/hw/virtio/vhost.h | 1 +
system/physmem.c | 11 +++++
5 files changed, 148 insertions(+), 6 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 038ac37dd0..438182d850 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -29,6 +29,7 @@
#include "migration/migration.h"
#include "sysemu/dma.h"
#include "trace.h"
+#include "qapi/qapi-commands-migration.h"
/* enabled until disconnected backend stabilizes */
#define _VHOST_DEBUG 1
@@ -44,6 +45,11 @@
do { } while (0)
#endif
+static inline bool vhost_bytemap_log_support(struct vhost_dev *dev)
+{
+ return (dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_BYTEMAPLOG));
+}
+
static struct vhost_log *vhost_log;
static struct vhost_log *vhost_log_shm;
@@ -232,12 +238,40 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
return 0;
}
+static int vhost_sync_dirty_bytemap(struct vhost_dev *dev,
+ MemoryRegionSection *section)
+{
+ unsigned long *bytemap = dev->log->log;
+ return memory_section_set_dirty_bytemap(section, bytemap);
+}
+
static void vhost_log_sync(MemoryListener *listener,
MemoryRegionSection *section)
{
struct vhost_dev *dev = container_of(listener, struct vhost_dev,
memory_listener);
- vhost_sync_dirty_bitmap(dev, section, 0x0, ~0x0ULL);
+ MigrationState *ms = migrate_get_current();
+
+ if (!dev->log_enabled || !dev->started) {
+ return;
+ }
+
+ if (dev->vhost_ops->vhost_log_sync) {
+ int r = dev->vhost_ops->vhost_log_sync(dev);
+ if (r < 0) {
+ error_report("Failed to sync dirty log: 0x%x\n", r);
+ if (migration_is_running(ms->state)) {
+ qmp_migrate_cancel(NULL);
+ }
+ return;
+ }
+ }
+
+ if (vhost_bytemap_log_support(dev)) {
+ vhost_sync_dirty_bytemap(dev, section);
+ } else {
+ vhost_sync_dirty_bitmap(dev, section, 0x0, ~0x0ULL);
+ }
}
static void vhost_log_sync_range(struct vhost_dev *dev,
@@ -247,7 +281,11 @@ static void vhost_log_sync_range(struct vhost_dev *dev,
/* FIXME: this is N^2 in number of sections */
for (i = 0; i < dev->n_mem_sections; ++i) {
MemoryRegionSection *section = &dev->mem_sections[i];
- vhost_sync_dirty_bitmap(dev, section, first, last);
+ if (vhost_bytemap_log_support(dev)) {
+ vhost_sync_dirty_bytemap(dev, section);
+ } else {
+ vhost_sync_dirty_bitmap(dev, section, first, last);
+ }
}
}
@@ -255,11 +293,19 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
{
uint64_t log_size = 0;
int i;
+ uint64_t vhost_log_chunk_size;
+
+ if (vhost_bytemap_log_support(dev)) {
+ vhost_log_chunk_size = VHOST_LOG_CHUNK_BYTES;
+ } else {
+ vhost_log_chunk_size = VHOST_LOG_CHUNK;
+ }
+
for (i = 0; i < dev->mem->nregions; ++i) {
struct vhost_memory_region *reg = dev->mem->regions + i;
uint64_t last = range_get_last(reg->guest_phys_addr,
reg->memory_size);
- log_size = MAX(log_size, last / VHOST_LOG_CHUNK + 1);
+ log_size = MAX(log_size, last / vhost_log_chunk_size + 1);
}
return log_size;
}
@@ -377,12 +423,21 @@ static bool vhost_dev_log_is_shared(struct vhost_dev *dev)
dev->vhost_ops->vhost_requires_shm_log(dev);
}
-static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
+static inline int vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
{
struct vhost_log *log = vhost_log_get(size, vhost_dev_log_is_shared(dev));
- uint64_t log_base = (uintptr_t)log->log;
+ uint64_t log_base;
+ int log_fd;
int r;
+ if (!log) {
+ r = -ENOMEM;
+ goto out;
+ }
+
+ log_base = (uintptr_t)log->log;
+ log_fd = log_fd;
+
/* inform backend of log switching, this must be done before
releasing the current log, to ensure no logging is lost */
r = dev->vhost_ops->vhost_set_log_base(dev, log_base, log);
@@ -390,9 +445,19 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
VHOST_OPS_DEBUG(r, "vhost_set_log_base failed");
}
+ if (dev->vhost_ops->vhost_set_log_size) {
+ r = dev->vhost_ops->vhost_set_log_size(dev, size, dev->log);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_log_size failed");
+ }
+ }
+
vhost_log_put(dev, true);
dev->log = log;
dev->log_size = size;
+
+out:
+ return r;
}
static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
@@ -1018,7 +1083,11 @@ static int vhost_migration_log(MemoryListener *listener, bool enable)
}
vhost_log_put(dev, false);
} else {
- vhost_dev_log_resize(dev, vhost_get_log_size(dev));
+ r = vhost_dev_log_resize(dev, vhost_get_log_size(dev));
+ if ( r < 0 ) {
+ return r;
+ }
+
r = vhost_dev_set_log(dev, true);
if (r < 0) {
goto check_dev_state;
@@ -2057,6 +2126,14 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
VHOST_OPS_DEBUG(r, "vhost_set_log_base failed");
goto fail_log;
}
+
+ if (hdev->vhost_ops->vhost_set_log_size) {
+ r = hdev->vhost_ops->vhost_set_log_size(hdev, hdev->log_size, hdev->log);
+ if (r < 0) {
+ VHOST_OPS_DEBUG(r, "vhost_set_log_size failed");
+ goto fail_log;
+ }
+ }
}
if (vrings) {
r = vhost_dev_set_vring_enable(hdev, true);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 831f7c996d..e131c2682c 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2594,6 +2594,15 @@ MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
MemOp op,
MemTxAttrs attrs);
+/**
+ * memory_section_set_dirty_bytemap: Mark a range of bytes as dirty for a memory section
+ * using a bytemap
+ *
+ * @section: the memory section being dirtied.
+ * @bytemap: bytemap that stores dirty page range information.
+ */
+int64_t memory_section_set_dirty_bytemap(MemoryRegionSection *section, unsigned long *bytemap);
+
/**
* address_space_init: initializes an address space
*
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 90676093f5..ef6988b445 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -535,5 +535,49 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
return num_dirty;
}
+
+#define BYTES_PER_LONG (sizeof(unsigned long))
+#define BYTE_WORD(nr) ((nr) / BYTES_PER_LONG)
+#define BYTES_TO_LONGS(nr) DIV_ROUND_UP(nr, BYTES_PER_LONG)
+
+static inline int64_t _set_dirty_bytemap_atomic(unsigned long *bytemap, unsigned long cur_pfn)
+{
+ char *byte_of_long = (char *)bytemap;
+ int i;
+ int64_t dirty_num = 0;
+
+ for (i = 0; i < BYTES_PER_LONG; i++) {
+ if (byte_of_long[i]) {
+ cpu_physical_memory_set_dirty_range((cur_pfn + i) << TARGET_PAGE_BITS,
+ TARGET_PAGE_SIZE,
+ 1 << DIRTY_MEMORY_MIGRATION);
+ /* Per byte ops, no need to atomic_xchg */
+ byte_of_long[i] = 0;
+ dirty_num++;
+ }
+ }
+
+ return dirty_num;
+}
+
+static inline int64_t cpu_physical_memory_set_dirty_bytemap(unsigned long *bytemap,
+ ram_addr_t start,
+ ram_addr_t pages)
+{
+ unsigned long i;
+ unsigned long len = BYTES_TO_LONGS(pages);
+ unsigned long pfn = (start >> TARGET_PAGE_BITS) /
+ BYTES_PER_LONG * BYTES_PER_LONG;
+ int64_t dirty_mig_bits = 0;
+
+ for (i = 0; i < len; i++) {
+ if (bytemap[i]) {
+ dirty_mig_bits += _set_dirty_bytemap_atomic(&bytemap[i],
+ pfn + BYTES_PER_LONG * i);
+ }
+ }
+
+ return dirty_mig_bits;
+}
#endif
#endif
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 444ca0ad42..6ae86833e3 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -43,6 +43,7 @@ typedef unsigned long vhost_log_chunk_t;
#define VHOST_LOG_PAGE 0x1000
#define VHOST_LOG_BITS (8 * sizeof(vhost_log_chunk_t))
#define VHOST_LOG_CHUNK (VHOST_LOG_PAGE * VHOST_LOG_BITS)
+#define VHOST_LOG_CHUNK_BYTES (VHOST_LOG_PAGE * sizeof(vhost_log_chunk_t))
#define VHOST_INVALID_FEATURE_BIT (0xff)
#define VHOST_QUEUE_NUM_CONFIG_INR 0
diff --git a/system/physmem.c b/system/physmem.c
index f14d64819b..247c252e53 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2602,6 +2602,17 @@ static void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
cpu_physical_memory_set_dirty_range(addr, length, dirty_log_mask);
}
+int64_t memory_section_set_dirty_bytemap(MemoryRegionSection *section, unsigned long *bytemap)
+{
+ ram_addr_t start = section->offset_within_region +
+ memory_region_get_ram_addr(section->mr);
+ ram_addr_t pages = int128_get64(section->size) >> TARGET_PAGE_BITS;
+
+ hwaddr idx = BYTE_WORD(
+ section->offset_within_address_space >> TARGET_PAGE_BITS);
+ return cpu_physical_memory_set_dirty_bytemap(bytemap + idx, start, pages);
+}
+
void memory_region_flush_rom_device(MemoryRegion *mr, hwaddr addr, hwaddr size)
{
/*
--
2.27.0
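
The bytemap helpers in the patch above keep one byte per guest page and scan the map one `unsigned long` at a time, so that all-zero words are skipped cheaply. The following is a minimal standalone sketch of that scan, not the QEMU implementation: `mark_page_dirty()`, `dirty_pages` and `scan_bytemap()` are hypothetical names standing in for `cpu_physical_memory_set_dirty_range()` and its callers, and the sketch assumes `first_pfn` is already aligned to `BYTES_PER_LONG`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BYTES_PER_LONG (sizeof(unsigned long))
#define MAX_PAGES 64

/* Hypothetical stand-in for the real dirty-memory tracking: one flag per page. */
static unsigned char dirty_pages[MAX_PAGES];

static void mark_page_dirty(unsigned long pfn)
{
    if (pfn < MAX_PAGES) {
        dirty_pages[pfn] = 1;
    }
}

/*
 * Scan a bytemap (one byte per page) one long at a time. Words that are
 * entirely zero are skipped; in any other word, each non-zero byte is
 * reported as a dirty page and then cleared.
 * Returns the number of dirty pages found.
 */
static int64_t scan_bytemap(unsigned long *bytemap, unsigned long first_pfn,
                            size_t pages)
{
    size_t words = (pages + BYTES_PER_LONG - 1) / BYTES_PER_LONG;
    int64_t dirty = 0;

    for (size_t w = 0; w < words; w++) {
        if (!bytemap[w]) {
            continue; /* no dirty page in this group of pages */
        }
        unsigned char *bytes = (unsigned char *)&bytemap[w];
        for (size_t b = 0; b < BYTES_PER_LONG; b++) {
            if (bytes[b]) {
                mark_page_dirty(first_pfn + w * BYTES_PER_LONG + b);
                bytes[b] = 0;
                dirty++;
            }
        }
    }
    return dirty;
}
```

Clearing each byte after reporting it means a second scan of the same map finds nothing until the backend logs new writes.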

@ -0,0 +1,168 @@
From 0bc608ab4117818b32d2a1aaf2d4f5c2aeb54af7 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Fri, 11 Feb 2022 18:05:47 +0800
Subject: [PATCH] vhost-user: Add support reconnect vhost-user socket
Add support for reconnecting the vhost-user socket. The reconnect time
is set to 3 seconds.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
chardev/char-socket.c | 19 ++++++++++++++++++-
hw/net/vhost_net.c | 4 +++-
hw/virtio/vhost-user.c | 6 ++++++
include/chardev/char.h | 16 ++++++++++++++++
net/vhost-user.c | 3 +++
5 files changed, 46 insertions(+), 2 deletions(-)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 034840593d..9c60e15c8e 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -337,6 +337,22 @@ static GSource *tcp_chr_add_watch(Chardev *chr, GIOCondition cond)
return qio_channel_create_watch(s->ioc, cond);
}
+static void tcp_chr_set_reconnect_time(Chardev *chr,
+ int64_t reconnect_time)
+{
+ SocketChardev *s = SOCKET_CHARDEV(chr);
+ s->reconnect_time = reconnect_time;
+}
+
+void qemu_chr_set_reconnect_time(Chardev *chr, int64_t reconnect_time)
+{
+ ChardevClass *cc = CHARDEV_GET_CLASS(chr);
+
+ if (cc->chr_set_reconnect_time) {
+ cc->chr_set_reconnect_time(chr, reconnect_time);
+ }
+}
+
static void remove_hup_source(SocketChardev *s)
{
if (s->hup_source != NULL) {
@@ -537,7 +553,7 @@ static int tcp_chr_sync_read(Chardev *chr, const uint8_t *buf, int len)
if (s->state != TCP_CHARDEV_STATE_DISCONNECTED) {
qio_channel_set_blocking(s->ioc, false, NULL);
}
- if (size == 0) {
+ if (size == 0 && chr->chr_for_flag != CHR_FOR_VHOST_USER) {
/* connection closed */
tcp_chr_disconnect(chr);
}
@@ -1543,6 +1559,7 @@ static void char_socket_class_init(ObjectClass *oc, void *data)
cc->set_msgfds = tcp_set_msgfds;
cc->chr_add_client = tcp_chr_add_client;
cc->chr_add_watch = tcp_chr_add_watch;
+ cc->chr_set_reconnect_time = tcp_chr_set_reconnect_time;
cc->chr_update_read_handler = tcp_chr_update_read_handler;
object_class_property_add(oc, "addr", "SocketAddress",
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 1b08b02477..e48c373b14 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -459,7 +459,9 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
peer = qemu_get_peer(ncs, n->max_queue_pairs);
}
- if (peer->vring_enable) {
+ /* ovs needs to restore all states of vring */
+ if (peer->vring_enable ||
+ ncs[i].peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
/* restore vring enable state */
r = vhost_set_vring_enable(peer, peer->vring_enable);
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index f214df804b..05e14e1eff 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2126,9 +2126,15 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
struct vhost_user *u;
VhostUserState *vus = (VhostUserState *) opaque;
int err;
+ Chardev *chr;
assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
+ chr = qemu_chr_fe_get_driver(((VhostUserState *)opaque)->chr);
+ if (chr) {
+ chr->chr_for_flag = CHR_FOR_VHOST_USER;
+ }
+
u = g_new0(struct vhost_user, 1);
u->user = vus;
u->dev = dev;
diff --git a/include/chardev/char.h b/include/chardev/char.h
index 01df55f9e8..f8bd469466 100644
--- a/include/chardev/char.h
+++ b/include/chardev/char.h
@@ -14,6 +14,8 @@
#define IAC_SB 250
#define IAC 255
+#define CHR_FOR_VHOST_USER 0x32a1
+
/* character device */
typedef struct CharBackend CharBackend;
@@ -70,6 +72,7 @@ struct Chardev {
GSource *gsource;
GMainContext *gcontext;
DECLARE_BITMAP(features, QEMU_CHAR_FEATURE_LAST);
+ int chr_for_flag;
};
/**
@@ -227,6 +230,16 @@ int qemu_chr_write(Chardev *s, const uint8_t *buf, int len, bool write_all);
#define qemu_chr_write_all(s, buf, len) qemu_chr_write(s, buf, len, true)
int qemu_chr_wait_connected(Chardev *chr, Error **errp);
+/**
+ * @qemu_chr_set_reconnect_time:
+ *
+ * Set reconnect time for char disconnect.
+ * Currently, only vhost user will call it.
+ *
+ * @reconnect_time the reconnect_time to be set
+ */
+void qemu_chr_set_reconnect_time(Chardev *chr, int64_t reconnect_time);
+
#define TYPE_CHARDEV "chardev"
OBJECT_DECLARE_TYPE(Chardev, ChardevClass, CHARDEV)
@@ -306,6 +319,9 @@ struct ChardevClass {
/* handle various events */
void (*chr_be_event)(Chardev *s, QEMUChrEvent event);
+
+ /* set reconnect time */
+ void (*chr_set_reconnect_time)(Chardev *chr, int64_t reconnect_time);
};
Chardev *qemu_chardev_new(const char *id, const char *typename,
diff --git a/net/vhost-user.c b/net/vhost-user.c
index 12555518e8..51fa8c678f 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -21,6 +21,8 @@
#include "qemu/option.h"
#include "trace.h"
+#define VHOST_USER_RECONNECT_TIME (3)
+
typedef struct NetVhostUserState {
NetClientState nc;
CharBackend chr; /* only queue index 0 */
@@ -292,6 +294,7 @@ static void net_vhost_user_event(void *opaque, QEMUChrEvent event)
trace_vhost_user_event(chr->label, event);
switch (event) {
case CHR_EVENT_OPENED:
+ qemu_chr_set_reconnect_time(chr, VHOST_USER_RECONNECT_TIME);
if (vhost_user_start(queues, ncs, s->vhost_user) < 0) {
qemu_chr_fe_disconnect(&s->chr);
return;
--
2.27.0

@ -0,0 +1,96 @@
From 0154183e118169be5945cb5ebec2b79379071591 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Fri, 11 Feb 2022 18:49:21 +0800
Subject: [PATCH] vhost-user: Set the acked_features to vm's featrue
Fix the problem where OVS restarts while the VM is restarting, leaving
the network unreachable. The solution is to set acked_features to the VM's
features, the same as when the guest loads the virtio-net module.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/net/vhost_net.c | 58 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 57 insertions(+), 1 deletion(-)
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index e8e1661646..1b08b02477 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -167,9 +167,26 @@ static int vhost_net_get_fd(NetClientState *backend)
}
}
+static uint64_t vhost_get_mask_features(const int *feature_bits, uint64_t features)
+{
+ const int *bit = feature_bits;
+ uint64_t out_features = 0;
+
+ while (*bit != VHOST_INVALID_FEATURE_BIT) {
+ uint64_t bit_mask = (1ULL << *bit);
+ if (features & bit_mask) {
+ out_features |= bit_mask;
+ }
+ bit++;
+ }
+ return out_features;
+}
+
struct vhost_net *vhost_net_init(VhostNetOptions *options)
{
int r;
+ VirtIONet *n;
+ VirtIODevice *vdev;
bool backend_kernel = options->backend_type == VHOST_BACKEND_TYPE_KERNEL;
struct vhost_net *net = g_new0(struct vhost_net, 1);
uint64_t features = 0;
@@ -195,7 +212,46 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
net->backend = r;
net->dev.protocol_features = 0;
} else {
- net->dev.backend_features = 0;
+ /* for ovs restart when vm start.
+ * Normal situation:
+ * 1.vm start.
+ * 2.vhost_net_init init ok, then dev.acked_features is 0x40000000.
+ * 3.guest virtio-net mod load. qemu will call virtio_net_set_features set
+ * dev.acked_features to 0x40408000.
+ * 4.feature set to ovs's vhostuser(0x40408000).
+ * 5.ovs restart.
+ * 6.vhost_user_stop will save net->dev.acked_features(0x40408000) to
+ * VhostUserState's acked_features(0x40408000).
+ * 7.restart ok.
+ * 8.vhost_net_init fun call vhost_user_get_acked_features get the save
+ * features, and set to net->dev.acked_features.
+ * Abnormal situation:
+ * 1.vm start.
+ * 2.vhost_net_init init ok, then dev.acked_features is 0x40000000.
+ * 3.ovs restart.
+ * 4.vhost_user_stop will save net->dev.acked_features(0x40000000) to
+ * VhostUserState's acked_features(0x40000000).
+ * 5.guest virtio-net mod load. qemu will call virtio_net_set_features set
+ * dev.acked_features to 0x40408000.
+ * 6.restart ok.
+ * 7.vhost_net_init fun call vhost_user_get_acked_features get the save
+ * features(0x40000000), and set to net->dev.acked_features(0x40000000).
+ * 8.feature set to ovs's vhostuser(0x40000000).
+ *
+ * In the abnormal situation, qemu sets the wrong features on ovs's
+ * vhostuser, and the vm's network goes down.
+ * In that case only the guest features are lost from acked_features,
+ * so here we set acked_features to the vm's features, the same as when
+ * the guest loads the virtio-net module.
+ */
+ if (options->net_backend->peer) {
+ n = qemu_get_nic_opaque(options->net_backend->peer);
+ vdev = VIRTIO_DEVICE(n);
+ net->dev.backend_features = vhost_get_mask_features(vhost_net_get_feature_bits(net),
+ vdev->guest_features);
+ } else {
+ net->dev.backend_features = 0;
+ }
net->dev.protocol_features = 0;
net->backend = -1;
--
2.27.0
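
The `vhost_get_mask_features()` helper added above keeps only those guest feature bits that appear in a backend's feature-bit list. A small standalone sketch of the same masking follows; `mask_features()` and `INVALID_FEATURE_BIT` are illustrative names, and the bit list used in the usage note is made up for the example rather than taken from `vhost_net_get_feature_bits()`.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors VHOST_INVALID_FEATURE_BIT, the terminator of a feature-bit list. */
#define INVALID_FEATURE_BIT 0xff

/*
 * Return the subset of 'features' whose bit numbers are listed in
 * 'feature_bits'. The list must end with INVALID_FEATURE_BIT.
 */
static uint64_t mask_features(const int *feature_bits, uint64_t features)
{
    uint64_t out = 0;

    for (const int *bit = feature_bits; *bit != INVALID_FEATURE_BIT; bit++) {
        uint64_t mask = 1ULL << *bit;
        if (features & mask) {
            out |= mask;
        }
    }
    return out;
}
```

With a list containing bits 15 and 30, the guest value 0x40408000 from the comment above is reduced to 0x40008000, because bit 22 is not in the list.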

@ -0,0 +1,32 @@
From c65ff10063a6c599b88cba27fd70a72e2e0cc0ff Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Thu, 10 Feb 2022 20:21:33 +0800
Subject: [PATCH] vhost-user: add unregister_savevm when vhost-user cleanup
commit 12cf5e9ece ("vhost-user: add vhost_set_mem_table
when vm load_setup at destination") only registers the savevm
handler but never unregisters it, so the number of handlers
grows each time a vhost-user device is hotplugged. This commit
adds unregister_savevm to the vhost-user cleanup path.
Fixes: 12cf5e9ece ("vhost-user: add vhost_set_mem_table when vm load_setup at destination")
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/virtio/vhost-user.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 6739dfc98e..e589ee3572 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2310,6 +2310,7 @@ static int vhost_user_backend_cleanup(struct vhost_dev *dev)
u->region_rb_len = 0;
g_free(u);
dev->opaque = 0;
+ unregister_savevm(NULL, "vhost-user", dev);
return 0;
}
--
2.27.0

@ -0,0 +1,130 @@
From 12cf5e9ece9cb0825f14ca80f6b1c5d1eb95c3e5 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Fri, 11 Feb 2022 18:59:34 +0800
Subject: [PATCH] vhost-user: add vhost_set_mem_table when vm load_setup at
destination
When migrating a huge VM, more than 90 packets are lost.
During load_setup on the destination VM, pass the VM memory
structure to OVS so that the network card can be enabled as
soon as the migration finishes its state transition.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/virtio/vhost-user.c | 24 ++++++++++++++++++++++++
tests/qtest/vhost-user-test.c | 35 ++++++++++++++++++-----------------
2 files changed, 42 insertions(+), 17 deletions(-)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index f214df804b..6739dfc98e 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -28,6 +28,7 @@
#include "sysemu/cryptodev.h"
#include "migration/migration.h"
#include "migration/postcopy-ram.h"
+#include "migration/register.h"
#include "trace.h"
#include "exec/ramblock.h"
@@ -2119,6 +2120,28 @@ static int vhost_user_postcopy_notifier(NotifierWithReturn *notifier,
return 0;
}
+static int vhost_user_load_setup(QEMUFile *f, void *opaque)
+{
+ struct vhost_dev *hdev = opaque;
+ int r;
+
+ if (hdev->vhost_ops && hdev->vhost_ops->vhost_set_mem_table) {
+ r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
+ if (r < 0) {
+ qemu_log("error: vhost_set_mem_table failed: %s(%d)\n",
+ strerror(errno), errno);
+ return r;
+ } else {
+ qemu_log("info: vhost_set_mem_table OK\n");
+ }
+ }
+ return 0;
+}
+
+SaveVMHandlers savevm_vhost_user_handlers = {
+ .load_setup = vhost_user_load_setup,
+};
+
static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
Error **errp)
{
@@ -2255,6 +2278,7 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
u->postcopy_notifier.notify = vhost_user_postcopy_notifier;
postcopy_add_notifier(&u->postcopy_notifier);
+ register_savevm_live("vhost-user", -1, 1, &savevm_vhost_user_handlers, dev);
return 0;
}
diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index d4e437265f..fadf3f0f2e 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -799,6 +799,23 @@ static void test_read_guest_mem(void *obj, void *arg, QGuestAllocator *alloc)
read_guest_mem_server(global_qtest, server);
}
+static void wait_for_rings_started(TestServer *s, size_t count)
+{
+ gint64 end_time;
+
+ g_mutex_lock(&s->data_mutex);
+ end_time = g_get_monotonic_time() + 5 * G_TIME_SPAN_SECOND;
+ while (ctpop64(s->rings) != count) {
+ if (!g_cond_wait_until(&s->data_cond, &s->data_mutex, end_time)) {
+ /* timeout has passed */
+ g_assert_cmpint(ctpop64(s->rings), ==, count);
+ break;
+ }
+ }
+
+ g_mutex_unlock(&s->data_mutex);
+}
+
static void test_migrate(void *obj, void *arg, QGuestAllocator *alloc)
{
TestServer *s = arg;
@@ -869,6 +886,7 @@ static void test_migrate(void *obj, void *arg, QGuestAllocator *alloc)
qtest_qmp_eventwait(to, "RESUME");
g_assert(wait_for_fds(dest));
+ wait_for_rings_started(dest, 2);
read_guest_mem_server(to, dest);
g_source_destroy(source);
@@ -880,23 +898,6 @@ static void test_migrate(void *obj, void *arg, QGuestAllocator *alloc)
g_string_free(dest_cmdline, true);
}
-static void wait_for_rings_started(TestServer *s, size_t count)
-{
- gint64 end_time;
-
- g_mutex_lock(&s->data_mutex);
- end_time = g_get_monotonic_time() + 5 * G_TIME_SPAN_SECOND;
- while (ctpop64(s->rings) != count) {
- if (!g_cond_wait_until(&s->data_cond, &s->data_mutex, end_time)) {
- /* timeout has passed */
- g_assert_cmpint(ctpop64(s->rings), ==, count);
- break;
- }
- }
-
- g_mutex_unlock(&s->data_mutex);
-}
-
static inline void test_server_connect(TestServer *server)
{
test_server_create_chr(server, ",reconnect=1");
--
2.27.0

@ -0,0 +1,89 @@
From 90d4333d4bbde45a10892bf9004979d239d39e28 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Fri, 11 Feb 2022 19:24:30 +0800
Subject: [PATCH] vhost-user: quit infinite loop while used memslots is more
than the backend limit
When the number of used memslots exceeds the backend limit,
attaching the vhost-user network card fails. Quit the
infinite loop in that case.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/virtio/vhost.c | 10 ++++++++++
include/hw/virtio/vhost.h | 1 +
net/vhost-user.c | 5 +++++
3 files changed, 16 insertions(+)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index a8adc149ad..038ac37dd0 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -56,6 +56,8 @@ static unsigned int used_shared_memslots;
static QLIST_HEAD(, vhost_dev) vhost_devices =
QLIST_HEAD_INITIALIZER(vhost_devices);
+bool used_memslots_exceeded;
+
unsigned int vhost_get_max_memslots(void)
{
unsigned int max = UINT_MAX;
@@ -1569,8 +1571,11 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
error_setg(errp, "vhost backend memory slots limit (%d) is less"
" than current number of used (%d) and reserved (%d)"
" memory slots for memory devices.", limit, used, reserved);
+ used_memslots_exceeded = true;
r = -EINVAL;
goto fail_busyloop;
+ } else {
+ used_memslots_exceeded = false;
}
return 0;
@@ -2405,3 +2410,8 @@ fail:
return ret;
}
+
+bool used_memslots_is_exceeded(void)
+{
+ return used_memslots_exceeded;
+}
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 02477788df..444ca0ad42 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -340,6 +340,7 @@ int vhost_dev_set_inflight(struct vhost_dev *dev,
struct vhost_inflight *inflight);
int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
struct vhost_inflight *inflight);
+bool used_memslots_is_exceeded(void);
bool vhost_dev_has_iommu(struct vhost_dev *dev);
#ifdef CONFIG_VHOST
diff --git a/net/vhost-user.c b/net/vhost-user.c
index 51fa8c678f..86fd5056ab 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -20,6 +20,7 @@
#include "qemu/error-report.h"
#include "qemu/option.h"
#include "trace.h"
+#include "include/hw/virtio/vhost.h"
#define VHOST_USER_RECONNECT_TIME (3)
@@ -373,6 +374,10 @@ static int net_vhost_user_init(NetClientState *peer, const char *device,
qemu_chr_fe_set_handlers(&s->chr, NULL, NULL,
net_vhost_user_event, NULL, nc0->name, NULL,
true);
+ if (used_memslots_is_exceeded()) {
+ error_report("used memslots exceeded the backend limit, quit loop");
+ goto err;
+ }
} while (!s->started);
assert(s->vhost_net);
--
2.27.0

@ -0,0 +1,49 @@
From 3fe9a15feba924675ffcc5b797185091cfb8a007 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 14:49:53 +0800
Subject: [PATCH] vhost-vdpa: add VHOST_BACKEND_F_BYTEMAPLOG
Support VHOST_BACKEND_F_BYTEMAPLOG to enable bytemap logging
for vhost devices.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vhost-vdpa.c | 9 +++++----
include/standard-headers/linux/vhost_types.h | 2 ++
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 819b2d811a..ce8ff7f417 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -829,10 +829,11 @@ static int vhost_vdpa_set_features(struct vhost_dev *dev,
static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
{
uint64_t features;
- uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
- 0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
- 0x1ULL << VHOST_BACKEND_F_IOTLB_ASID |
- 0x1ULL << VHOST_BACKEND_F_SUSPEND;
+ uint64_t f = BIT_ULL(VHOST_BACKEND_F_IOTLB_MSG_V2) |
+ BIT_ULL(VHOST_BACKEND_F_IOTLB_BATCH) |
+ BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID) |
+ BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
+ BIT_ULL(VHOST_BACKEND_F_BYTEMAPLOG);
int r;
if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h
index fd54044936..46fc53cd83 100644
--- a/include/standard-headers/linux/vhost_types.h
+++ b/include/standard-headers/linux/vhost_types.h
@@ -192,5 +192,7 @@ struct vhost_vdpa_iova_range {
#define VHOST_BACKEND_F_DESC_ASID 0x7
/* IOTLB don't flush memory mapping across device reset */
#define VHOST_BACKEND_F_IOTLB_PERSIST 0x8
+/* device can use bytemap log */
+#define VHOST_BACKEND_F_BYTEMAPLOG 0x3f
#endif
--
2.27.0
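
The new `VHOST_BACKEND_F_BYTEMAPLOG` value is a bit number, not a mask: 0x3f selects bit 63, the top bit of the 64-bit capability word. The sketch below shows that arithmetic and a typical "wanted AND offered" negotiation. `negotiate_backend_cap()` is a hypothetical helper, and the assumption that the backend's offered capabilities are intersected with the requested set is not visible in the hunk above.

```c
#include <assert.h>
#include <stdint.h>

#define BIT_ULL(nr) (1ULL << (nr))

/* Bit numbers as they appear in the vhost_types.h hunk above. */
#define VHOST_BACKEND_F_IOTLB_PERSIST 0x8
#define VHOST_BACKEND_F_BYTEMAPLOG    0x3f

/* Hypothetical helper: keep only the capabilities both sides support. */
static uint64_t negotiate_backend_cap(uint64_t wanted, uint64_t offered)
{
    return wanted & offered;
}
```

A backend that does not advertise bit 63 simply ends up without bytemap logging after negotiation; the other capabilities are unaffected.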

@ -0,0 +1,127 @@
From 3bc7a4e430e01fd90b427bf74a904664eda9ece6 Mon Sep 17 00:00:00 2001
From: libai <libai12@huawei.com>
Date: Mon, 4 Dec 2023 15:04:25 +0800
Subject: [PATCH] vhost-vdpa: add migration log ops for VhostOps
Implement vhost_set_log_size for setting buffer size for logging.
Implement vhost_set_log_fd to specify an eventfd to signal on log write.
Implement vhost_log_sync for getting dirtymap logged by vhost backend.
Signed-off-by: libai <libai12@huawei.com>
---
hw/virtio/vhost-vdpa.c | 37 +++++++++++++++++++++++++++++++
include/hw/virtio/vhost-backend.h | 8 +++++++
linux-headers/linux/vhost.h | 4 ++++
3 files changed, 49 insertions(+)
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index ce8ff7f417..037a9c6e4c 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1355,6 +1355,30 @@ static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
return vhost_vdpa_call(dev, VHOST_SET_LOG_BASE, &base);
}
+static int vhost_vdpa_set_log_fd(struct vhost_dev *dev, int fd,
+ struct vhost_log *log)
+{
+ struct vhost_vdpa *v = dev->opaque;
+ if (v->shadow_vqs_enabled || !vhost_vdpa_first_dev(dev)) {
+ return 0;
+ }
+
+ return vhost_vdpa_call(dev, VHOST_SET_LOG_FD, &fd);
+}
+
+static int vhost_vdpa_set_log_size(struct vhost_dev *dev, uint64_t size,
+ struct vhost_log *log)
+{
+ struct vhost_vdpa *v = dev->opaque;
+ uint64_t logsize = size * sizeof(*(log->log));
+
+ if (v->shadow_vqs_enabled || !vhost_vdpa_first_dev(dev)) {
+ return 0;
+ }
+
+ return vhost_vdpa_call(dev, VHOST_SET_LOG_SIZE, &logsize);
+}
+
static int vhost_vdpa_set_vring_addr(struct vhost_dev *dev,
struct vhost_vring_addr *addr)
{
@@ -1489,11 +1513,23 @@ static bool vhost_vdpa_force_iommu(struct vhost_dev *dev)
return true;
}
+static int vhost_vdpa_log_sync(struct vhost_dev *dev)
+{
+ struct vhost_vdpa *v = dev->opaque;
+ if (v->shadow_vqs_enabled || !vhost_vdpa_first_dev(dev)) {
+ return 0;
+ }
+
+ return vhost_vdpa_call(dev, VHOST_LOG_SYNC, NULL);
+}
+
const VhostOps vdpa_ops = {
.backend_type = VHOST_BACKEND_TYPE_VDPA,
.vhost_backend_init = vhost_vdpa_init,
.vhost_backend_cleanup = vhost_vdpa_cleanup,
.vhost_set_log_base = vhost_vdpa_set_log_base,
+ .vhost_set_log_size = vhost_vdpa_set_log_size,
+ .vhost_set_log_fd = vhost_vdpa_set_log_fd,
.vhost_set_vring_addr = vhost_vdpa_set_vring_addr,
.vhost_set_vring_num = vhost_vdpa_set_vring_num,
.vhost_set_vring_base = vhost_vdpa_set_vring_base,
@@ -1520,6 +1556,7 @@ const VhostOps vdpa_ops = {
.vhost_get_device_id = vhost_vdpa_get_device_id,
.vhost_vq_get_addr = vhost_vdpa_vq_get_addr,
.vhost_force_iommu = vhost_vdpa_force_iommu,
+ .vhost_log_sync = vhost_vdpa_log_sync,
.vhost_set_config_call = vhost_vdpa_set_config_call,
.vhost_reset_status = vhost_vdpa_reset_status,
};
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index a86d103f82..71b02e4a12 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -65,6 +65,11 @@ typedef int (*vhost_scsi_get_abi_version_op)(struct vhost_dev *dev,
int *version);
typedef int (*vhost_set_log_base_op)(struct vhost_dev *dev, uint64_t base,
struct vhost_log *log);
+typedef int (*vhost_set_log_size_op)(struct vhost_dev *dev, uint64_t size,
+ struct vhost_log *log);
+typedef int (*vhost_set_log_fd_op)(struct vhost_dev *dev, int fd,
+ struct vhost_log *log);
+typedef int (*vhost_log_sync_op)(struct vhost_dev *dev);
typedef int (*vhost_set_mem_table_op)(struct vhost_dev *dev,
struct vhost_memory *mem);
typedef int (*vhost_set_vring_addr_op)(struct vhost_dev *dev,
@@ -162,6 +167,9 @@ typedef struct VhostOps {
vhost_scsi_clear_endpoint_op vhost_scsi_clear_endpoint;
vhost_scsi_get_abi_version_op vhost_scsi_get_abi_version;
vhost_set_log_base_op vhost_set_log_base;
+ vhost_set_log_size_op vhost_set_log_size;
+ vhost_set_log_fd_op vhost_set_log_fd;
+ vhost_log_sync_op vhost_log_sync;
vhost_set_mem_table_op vhost_set_mem_table;
vhost_set_vring_addr_op vhost_set_vring_addr;
vhost_set_vring_endian_op vhost_set_vring_endian;
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index 649560c685..19dc7fd36c 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -43,6 +43,10 @@
* The bit is set using an atomic 32 bit operation. */
/* Set base address for logging. */
#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
+/* Set buffer size for logging */
+#define VHOST_SET_LOG_SIZE _IOW(VHOST_VIRTIO, 0x05, __u64)
+/* Logging sync */
+#define VHOST_LOG_SYNC _IO(VHOST_VIRTIO, 0x06)
/* Specify an eventfd file descriptor to signal on log write. */
#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
/* By default, a device gets one vhost_worker that its virtqueues share. This
--
2.27.0

@ -0,0 +1,38 @@
From 7b4a9547e68147291e68258db9415ef5a20fe06b Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Thu, 10 Feb 2022 11:16:26 +0800
Subject: [PATCH] virtio: bugfix: add rcu_read_lock when vring_avail_idx is
called
vring_avail_idx() should be called within rcu_read_lock();
otherwise vring_get_region_caches() may return NULL caches and
trigger an assert().
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/virtio/virtio.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 27ceab92be..ec09d515c2 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2801,6 +2801,7 @@ static void check_vring_avail_num(VirtIODevice *vdev, int index)
{
uint16_t nheads;
+ rcu_read_lock();
/* Check it isn't doing strange things with descriptor numbers. */
nheads = vring_avail_idx(&vdev->vq[index]) - vdev->vq[index].last_avail_idx;
if (nheads > vdev->vq[index].vring.num) {
@@ -2811,6 +2812,7 @@ static void check_vring_avail_num(VirtIODevice *vdev, int index)
vring_avail_idx(&vdev->vq[index]),
vdev->vq[index].last_avail_idx, nheads);
}
+ rcu_read_unlock();
}
int virtio_save(VirtIODevice *vdev, QEMUFile *f)
--
2.27.0

@ -0,0 +1,42 @@
From f6b3e8ea39d00d25ab979f7b24842dc24e263ed8 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Thu, 10 Feb 2022 14:37:52 +0800
Subject: [PATCH] virtio: bugfix: check the value of caches before accessing it
Vring caches may be NULL in check_vring_avail_num() if
virtio_reset() is called at the same time, such as when
the virtual machine starts.
So check it before accessing it in vring_avail_idx().
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/virtio/virtio.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 1f78b74c00..d93ea62723 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2800,8 +2800,19 @@ static const VMStateDescription vmstate_virtio = {
static void check_vring_avail_num(VirtIODevice *vdev, int index)
{
uint16_t nheads;
+ VRingMemoryRegionCaches *caches;
rcu_read_lock();
+ caches = qatomic_rcu_read(&vdev->vq[index].vring.caches);
+ if (caches == NULL) {
+ /*
+ * caches may be NULL if virtio_reset is called at the same time,
+ * such as when the virtual machine starts.
+ */
+ rcu_read_unlock();
+ return;
+ }
+
/* Check it isn't doing strange things with descriptor numbers. */
nheads = vring_avail_idx(&vdev->vq[index]) - vdev->vq[index].last_avail_idx;
if (nheads > vdev->vq[index].vring.num) {
--
2.27.0

@ -0,0 +1,52 @@
From b57e956ea522b487081d1c94aa2e4af6a3314d20 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Thu, 10 Feb 2022 11:09:36 +0800
Subject: [PATCH] virtio: check descriptor numbers
Check whether the vring num is sane in virtio_save(), and log when
the VM pushes a wrong vring num down by writing to the IO port.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/virtio/virtio.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index a9aa0c4f66..27ceab92be 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2797,6 +2797,22 @@ static const VMStateDescription vmstate_virtio = {
}
};
+static void check_vring_avail_num(VirtIODevice *vdev, int index)
+{
+ uint16_t nheads;
+
+ /* Check it isn't doing strange things with descriptor numbers. */
+ nheads = vring_avail_idx(&vdev->vq[index]) - vdev->vq[index].last_avail_idx;
+ if (nheads > vdev->vq[index].vring.num) {
+ qemu_log("VQ %d size 0x%x Guest index 0x%x "
+ "inconsistent with Host index 0x%x: "
+ "delta 0x%x\n",
+ index, vdev->vq[index].vring.num,
+ vring_avail_idx(&vdev->vq[index]),
+ vdev->vq[index].last_avail_idx, nheads);
+ }
+}
+
int virtio_save(VirtIODevice *vdev, QEMUFile *f)
{
BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
@@ -2827,6 +2843,8 @@ int virtio_save(VirtIODevice *vdev, QEMUFile *f)
if (vdev->vq[i].vring.num == 0)
break;
+ check_vring_avail_num(vdev, i);
+
qemu_put_be32(f, vdev->vq[i].vring.num);
if (k->has_variable_vring_alignment) {
qemu_put_be32(f, vdev->vq[i].vring.align);
--
2.27.0
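
The check in `check_vring_avail_num()` relies on 16-bit wraparound: both ring indices are free-running `uint16_t` counters, so their difference is still correct after the guest index wraps past 65535. A minimal sketch of that arithmetic follows; `pending_heads()` and `avail_idx_is_sane()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Number of descriptor heads the guest has made available but the host has
 * not yet consumed. The cast truncates to 16 bits, so wraparound is handled.
 */
static uint16_t pending_heads(uint16_t avail_idx, uint16_t last_avail_idx)
{
    return (uint16_t)(avail_idx - last_avail_idx);
}

/* The indices are consistent if the pending count fits into the ring. */
static bool avail_idx_is_sane(uint16_t avail_idx, uint16_t last_avail_idx,
                              unsigned int vring_num)
{
    return pending_heads(avail_idx, last_avail_idx) <= vring_num;
}
```

For example, a guest index of 2 and a host index of 65534 give 4 pending heads, which is fine for a 256-entry ring, while a guest index of 1000 against a host index of 0 exceeds the ring size and would be logged.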

@ -0,0 +1,38 @@
From 3cd74fd83d58aa88f9a006980c73844d6b79d1fb Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Thu, 10 Feb 2022 10:31:38 +0800
Subject: [PATCH] virtio-net: bugfix: do not delete netdev before virtio net
For a vhost-user network card, it is currently allowed to delete
the network backend while the virtio-net device still exists.
However, when the status of the device changes in the guest,
QEMU assumes the network backend exists and crashes if it
does not.
So do not allow the network backend to be deleted directly
without first deleting the virtio-net device.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
net/net.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/net.c b/net/net.c
index 0520bc1681..bcd3d7e04c 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1322,6 +1322,12 @@ void qmp_netdev_del(const char *id, Error **errp)
return;
}
+ if (nc->info->type == NET_CLIENT_DRIVER_VHOST_USER && nc->peer) {
+ error_setg(errp, "Device '%s' is a netdev for vhostuser, "
+ "please delete the peer front-end device (virtio-net) first.", id);
+ return;
+ }
+
qemu_del_net_client(nc);
/*
--
2.27.0

@ -0,0 +1,52 @@
From 4321c9f8b85c6a4c1549399aa11e351b66bd1879 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Thu, 10 Feb 2022 10:48:27 +0800
Subject: [PATCH] virtio-net: fix max vring buf size when set ring num
Set the max vring buf size of virtio-net devices to 4096
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/virtio/virtio.c | 9 +++++++--
include/hw/virtio/virtio.h | 1 +
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index d93ea62723..267c1e6fd0 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2196,12 +2196,17 @@ void virtio_queue_set_rings(VirtIODevice *vdev, int n, hwaddr desc,
void virtio_queue_set_num(VirtIODevice *vdev, int n, int num)
{
+ int vq_max_size = VIRTQUEUE_MAX_SIZE;
+
+ if (!strcmp(vdev->name, "virtio-net")) {
+ vq_max_size = VIRTIO_NET_VQ_MAX_SIZE;
+ }
+
/* Don't allow guest to flip queue between existent and
* nonexistent states, or to set it to an invalid size.
*/
if (!!num != !!vdev->vq[n].vring.num ||
- num > VIRTQUEUE_MAX_SIZE ||
- num < 0) {
+ num > vq_max_size || num < 0) {
return;
}
vdev->vq[n].vring.num = num;
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 7c35bb841b..e612441357 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -60,6 +60,7 @@ size_t virtio_get_config_size(const VirtIOConfigSizeParams *params,
typedef struct VirtQueue VirtQueue;
#define VIRTQUEUE_MAX_SIZE 1024
+#define VIRTIO_NET_VQ_MAX_SIZE (4096)
typedef struct VirtQueueElement
{
--
2.27.0

@ -0,0 +1,58 @@
From 58fe483bf5824db177843675629ed955051078fd Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Sat, 12 Feb 2022 17:22:38 +0800
Subject: [PATCH] virtio-net: set the max of queue size to 4096
Set the maximum virtio-net queue size to 4096. The queue size of
virtio-net is now set by rx_queue_size and tx_queue_size.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/net/virtio-net.c | 5 +++--
hw/virtio/virtio.c | 2 +-
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7f69a4b842..0ae2ddc002 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -710,6 +710,7 @@ static int virtio_net_max_tx_queue_size(VirtIONet *n)
switch(peer->info->type) {
case NET_CLIENT_DRIVER_VHOST_USER:
+ return VIRTIO_NET_VQ_MAX_SIZE;
case NET_CLIENT_DRIVER_VHOST_VDPA:
return VIRTQUEUE_MAX_SIZE;
default:
@@ -3638,12 +3639,12 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
* help from us (using virtio 1 and up).
*/
if (n->net_conf.rx_queue_size < VIRTIO_NET_RX_QUEUE_MIN_SIZE ||
- n->net_conf.rx_queue_size > VIRTQUEUE_MAX_SIZE ||
+ n->net_conf.rx_queue_size > VIRTIO_NET_VQ_MAX_SIZE ||
!is_power_of_2(n->net_conf.rx_queue_size)) {
error_setg(errp, "Invalid rx_queue_size (= %" PRIu16 "), "
"must be a power of 2 between %d and %d.",
n->net_conf.rx_queue_size, VIRTIO_NET_RX_QUEUE_MIN_SIZE,
- VIRTQUEUE_MAX_SIZE);
+ VIRTIO_NET_VQ_MAX_SIZE);
virtio_cleanup(vdev);
return;
}
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 267c1e6fd0..d00effe4d5 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2338,7 +2338,7 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
break;
}
- if (i == VIRTIO_QUEUE_MAX || queue_size > VIRTQUEUE_MAX_SIZE) {
+ if (i == VIRTIO_QUEUE_MAX) {
qemu_log("unacceptable queue_size (%d) or num (%d)\n",
queue_size, i);
abort();
--
2.27.0
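
The rx_queue_size validation after this patch amounts to "a power of 2 between 256 and 4096". A minimal sketch of that rule, assuming a hypothetical helper `rx_queue_size_valid` and a local `is_pow2` in place of QEMU's `is_power_of_2()`:

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_RX_QUEUE_MIN_SIZE 256
#define VIRTIO_NET_VQ_MAX_SIZE       4096

/* Local replacement for QEMU's is_power_of_2(), for illustration only. */
static bool is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Hypothetical helper mirroring the realize-time rx_queue_size check. */
static bool rx_queue_size_valid(uint16_t size)
{
    return size >= VIRTIO_NET_RX_QUEUE_MIN_SIZE &&
           size <= VIRTIO_NET_VQ_MAX_SIZE &&
           is_pow2(size);
}
```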

@ -0,0 +1,110 @@
From c2221815b79be9847c4729709809779b4b0550a7 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Thu, 10 Feb 2022 17:28:49 +0800
Subject: [PATCH] virtio-net: update the default and max of rx/tx_queue_size
Set the max of tx_queue_size to 4096 even if the backends
are not vhost-user.
Set the default of rx/tx_queue_size to 2048 if the backends
are vhost-user, otherwise to 4096.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/net/virtio-net.c | 43 ++++++++++++++++++++++++++++++++-----------
1 file changed, 32 insertions(+), 11 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 0ae2ddc002..523d01746d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -50,12 +50,11 @@
#define VIRTIO_NET_VM_VERSION 11
/* previously fixed value */
-#define VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE 256
-#define VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE 256
+#define VIRTIO_NET_VHOST_USER_DEFAULT_SIZE 2048
/* for now, only allow larger queue_pairs; with virtio-1, guest can downsize */
-#define VIRTIO_NET_RX_QUEUE_MIN_SIZE VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE
-#define VIRTIO_NET_TX_QUEUE_MIN_SIZE VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE
+#define VIRTIO_NET_RX_QUEUE_MIN_SIZE 256
+#define VIRTIO_NET_TX_QUEUE_MIN_SIZE 256
#define VIRTIO_NET_IP4_ADDR_SIZE 8 /* ipv4 saddr + daddr */
@@ -696,6 +695,28 @@ static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int mergeable_rx_bufs,
}
}
+static void virtio_net_set_default_queue_size(VirtIONet *n)
+{
+ NetClientState *peer = n->nic_conf.peers.ncs[0];
+
+ /* Default value is 0 if not set */
+ if (n->net_conf.rx_queue_size == 0) {
+ if (peer && peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
+ n->net_conf.rx_queue_size = VIRTIO_NET_VHOST_USER_DEFAULT_SIZE;
+ } else {
+ n->net_conf.rx_queue_size = VIRTIO_NET_VQ_MAX_SIZE;
+ }
+ }
+
+ if (n->net_conf.tx_queue_size == 0) {
+ if (peer && peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
+ n->net_conf.tx_queue_size = VIRTIO_NET_VHOST_USER_DEFAULT_SIZE;
+ } else {
+ n->net_conf.tx_queue_size = VIRTIO_NET_VQ_MAX_SIZE;
+ }
+ }
+}
+
static int virtio_net_max_tx_queue_size(VirtIONet *n)
{
NetClientState *peer = n->nic_conf.peers.ncs[0];
@@ -705,16 +726,16 @@ static int virtio_net_max_tx_queue_size(VirtIONet *n)
* size.
*/
if (!peer) {
- return VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE;
+ return VIRTIO_NET_VQ_MAX_SIZE;
}
switch(peer->info->type) {
case NET_CLIENT_DRIVER_VHOST_USER:
return VIRTIO_NET_VQ_MAX_SIZE;
case NET_CLIENT_DRIVER_VHOST_VDPA:
- return VIRTQUEUE_MAX_SIZE;
+ return VIRTIO_NET_VQ_MAX_SIZE;
default:
- return VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE;
+ return VIRTIO_NET_VQ_MAX_SIZE;
};
}
@@ -3633,6 +3654,8 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
virtio_net_set_config_size(n, n->host_features);
virtio_init(vdev, VIRTIO_ID_NET, n->config_size);
+ virtio_net_set_default_queue_size(n);
+
/*
* We set a lower limit on RX queue size to what it always was.
* Guests that want a smaller ring can always resize it without
@@ -3934,10 +3957,8 @@ static Property virtio_net_properties[] = {
TX_TIMER_INTERVAL),
DEFINE_PROP_INT32("x-txburst", VirtIONet, net_conf.txburst, TX_BURST),
DEFINE_PROP_STRING("tx", VirtIONet, net_conf.tx),
- DEFINE_PROP_UINT16("rx_queue_size", VirtIONet, net_conf.rx_queue_size,
- VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE),
- DEFINE_PROP_UINT16("tx_queue_size", VirtIONet, net_conf.tx_queue_size,
- VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE),
+ DEFINE_PROP_UINT16("rx_queue_size", VirtIONet, net_conf.rx_queue_size, 0),
+ DEFINE_PROP_UINT16("tx_queue_size", VirtIONet, net_conf.tx_queue_size, 0),
DEFINE_PROP_UINT16("host_mtu", VirtIONet, net_conf.mtu, 0),
DEFINE_PROP_BOOL("x-mtu-bypass-backend", VirtIONet, mtu_bypass_backend,
true),
--
2.27.0
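
The default selection added by virtio_net_set_default_queue_size() can be summarised as: keep an explicitly configured size, otherwise use 2048 for vhost-user and 4096 for everything else. The sketch below assumes a hypothetical `enum backend_kind` in place of `peer->info->type`, and a hypothetical helper name.

```c
#include <stdint.h>

#define VIRTIO_NET_VHOST_USER_DEFAULT_SIZE 2048
#define VIRTIO_NET_VQ_MAX_SIZE             4096

/* Hypothetical backend tag standing in for peer->info->type. */
enum backend_kind { BACKEND_NONE, BACKEND_TAP, BACKEND_VHOST_USER };

/* A configured value of 0 means "not set", as in the patch above. */
static uint16_t effective_queue_size(uint16_t configured,
                                     enum backend_kind backend)
{
    if (configured != 0) {
        return configured;
    }
    return backend == BACKEND_VHOST_USER
               ? VIRTIO_NET_VHOST_USER_DEFAULT_SIZE
               : VIRTIO_NET_VQ_MAX_SIZE;
}
```

Using 0 as the property default is what lets the device tell "unset" apart from a user-supplied size; the size is still validated against the minimum and maximum afterwards.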

@ -0,0 +1,112 @@
From b24730e9abe34898483fa62b24c26abb9d98570c Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Thu, 10 Feb 2022 14:16:17 +0800
Subject: [PATCH] virtio: print the guest virtio_net features that host does
not support
Print the guest virtio_net features that the host does not support.
For example:
Please check host config, because host does not support required feature bits 0x1983
virtio_net_feature: csum, guest_csum, guest_tso4, guest_tso6, host_tso4, host_tso6
Features 0xef99a3 unsupported. Allowed features: 0x40ff8024
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/net/virtio-net.c | 41 ++++++++++++++++++++++++++++++++++++++
hw/virtio/virtio.c | 7 +++++++
include/hw/virtio/virtio.h | 1 +
3 files changed, 49 insertions(+)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 80c56f0cfc..7f69a4b842 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3952,6 +3952,46 @@ static Property virtio_net_properties[] = {
DEFINE_PROP_END_OF_LIST(),
};
+static void virtio_net_print_features(uint64_t features)
+{
+ Property *props = virtio_net_properties;
+ int feature_cnt = 0;
+
+ if (!features) {
+ return;
+ }
+ printf("virtio_net_feature: ");
+
+ for (; features && props->name; props++) {
+ /* The bitnr of property may be default(0) besides 'csum' property. */
+ if (props->bitnr == 0 && strcmp(props->name, "csum")) {
+ continue;
+ }
+
+ /* Features only support 64bit. */
+ if (props->bitnr > 63) {
+ continue;
+ }
+
+ if (virtio_has_feature(features, props->bitnr)) {
+ virtio_clear_feature(&features, props->bitnr);
+ if (feature_cnt != 0) {
+ printf(", ");
+ }
+ printf("%s", props->name);
+ feature_cnt++;
+ }
+ }
+
+ if (features) {
+ if (feature_cnt != 0) {
+ printf(", ");
+ }
+ printf("unknown bits 0x%" PRIx64, features);
+ }
+ printf("\n");
+}
+
static void virtio_net_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
@@ -3966,6 +4006,7 @@ static void virtio_net_class_init(ObjectClass *klass, void *data)
vdc->set_config = virtio_net_set_config;
vdc->get_features = virtio_net_get_features;
vdc->set_features = virtio_net_set_features;
+ vdc->print_features = virtio_net_print_features;
vdc->bad_features = virtio_net_bad_features;
vdc->reset = virtio_net_reset;
vdc->queue_reset = virtio_net_queue_reset;
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index ec09d515c2..1f78b74c00 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2905,6 +2905,13 @@ static int virtio_set_features_nocheck(VirtIODevice *vdev, uint64_t val)
{
VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
bool bad = (val & ~(vdev->host_features)) != 0;
+ uint64_t feat = val & ~(vdev->host_features);
+
+ if (bad && k->print_features) {
+ qemu_log("error: Please check host config, "\
+ "because host does not support required feature bits 0x%" PRIx64 "\n", feat);
+ k->print_features(feat);
+ }
val &= vdev->host_features;
if (k->set_features) {
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index c8f72850bc..7c35bb841b 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -182,6 +182,7 @@ struct VirtioDeviceClass {
int (*validate_features)(VirtIODevice *vdev);
void (*get_config)(VirtIODevice *vdev, uint8_t *config);
void (*set_config)(VirtIODevice *vdev, const uint8_t *config);
+ void (*print_features)(uint64_t features);
void (*reset)(VirtIODevice *vdev);
void (*set_status)(VirtIODevice *vdev, uint8_t val);
/* Device must validate queue_index. */
--
2.27.0
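
The example in the commit message can be checked by hand: 0x1983 has bits 0, 1, 7, 8, 11 and 12 set, which are the virtio-net feature bits for csum, guest_csum, guest_tso4, guest_tso6, host_tso4 and host_tso6. The sketch below decodes a mask the same way, but it is an illustration only: it uses a small hypothetical table instead of walking virtio_net_properties, and writes into a buffer instead of calling printf().

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical table; the patch walks virtio_net_properties instead. */
struct feature_name {
    unsigned bit;
    const char *name;
};

static const struct feature_name net_features[] = {
    { 0, "csum" },        { 1, "guest_csum" },
    { 7, "guest_tso4" },  { 8, "guest_tso6" },
    { 11, "host_tso4" },  { 12, "host_tso6" },
};

/* Write a comma-separated list of the feature names set in 'features'. */
static void describe_features(uint64_t features, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;
    int n;

    if (len == 0) {
        return;
    }
    buf[0] = '\0';

    for (i = 0; i < sizeof(net_features) / sizeof(net_features[0]); i++) {
        uint64_t mask = UINT64_C(1) << net_features[i].bit;

        if (!(features & mask) || used >= len) {
            continue;
        }
        features &= ~mask; /* leftover bits are reported as unknown */
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? ", " : "", net_features[i].name);
        if (n > 0) {
            used += (size_t)n;
        }
    }

    if (features && used < len) {
        snprintf(buf + used, len - used, "%sunknown bits 0x%" PRIx64,
                 used ? ", " : "", features);
    }
}
```

Clearing each bit as it is named leaves only the bits without a table entry in `features` at the end, so they can be reported together.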

@ -0,0 +1,37 @@
From 4e5de00fb124d82f9c4ce2ac433ed3d691783c01 Mon Sep 17 00:00:00 2001
From: Jinhua Cao <caojinhua1@huawei.com>
Date: Wed, 9 Feb 2022 19:58:21 +0800
Subject: [PATCH] virtio-scsi: bugfix: fix qemu crash for hotplug scsi disk
with dataplane
After a controller whose I/O type is iothread is plugged in, the
VM triggers a disk scan. If a SCSI disk is attached immediately
afterwards, the sg_inquiry request in the VM triggers the assertion
in virtio_scsi_ctx_check(), which is called by
virtio_scsi_handle_cmd_req_prepare().
Add a check in virtio_scsi_handle_cmd_req_prepare() and return an
I/O error directly if the device has not been initialized.
Signed-off-by: Jinhua Cao <caojinhua1@huawei.com>
---
hw/scsi/virtio-scsi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 9c751bf296..bc7feb404a 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -781,7 +781,7 @@ static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
req->req.cmd.tag, req->req.cmd.cdb[0]);
d = virtio_scsi_device_get(s, req->req.cmd.lun);
- if (!d) {
+ if (!d || !d->qdev.realized) {
req->resp.cmd.response = VIRTIO_SCSI_S_BAD_TARGET;
virtio_scsi_complete_cmd_req(req);
return -ENOENT;
--
2.27.0
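
The guard added above treats a device that exists but is not yet realized the same as a missing device. A minimal sketch of that pattern, assuming a hypothetical `struct scsi_dev_model` with a `realized` flag in place of `SCSIDevice` and its `qdev.realized` field:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a hot-plugged device; not a QEMU type. */
struct scsi_dev_model {
    bool realized;
};

/* Return -ENOENT for a missing or not-yet-realized device, else 0. */
static int prepare_cmd(const struct scsi_dev_model *d)
{
    if (!d || !d->realized) {
        return -ENOENT;
    }
    return 0;
}
```

Failing the request early means a command that races with hotplug completes with an error instead of reaching the context assertion.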