Update the script to import the new iommu.h uapi header.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 683ab306ab1d149c11d4326e5459a5312405c4de)
The new capability VFIO_DIRTY_LOG_MANUAL_CLEAR and the new ioctl
VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR and
VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP have been introduced in
the kernel, tweak the userspace side to use them.
Check if the kernel supports VFIO_DIRTY_LOG_MANUAL_CLEAR and
provide the log_clear() hook for vfio_memory_listener. If the
kernel supports it, deliever the clear message to kernel.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit eb15c358d8310a03e5eb4cf957f30314fa41d4a0)
When synchronizing dirty bitmap from kernel VFIO we do it in a
per-iova-range fashion and we allocate the userspace bitmap for each of the
ioctl. This patch introduces `struct VFIODMARange` to describe a range of
the given DMA mapping with respect to a VFIO_IOMMU_MAP_DMA operation, and
make the bitmap cache of this range be persistent so that we don't need to
g_try_malloc0() every time. Note that the new structure is almost a copy of
`struct vfio_iommu_type1_dma_map` but only internally used by QEMU.
More importantly, the cached per-iova-range dirty bitmap will be further
used when we want to add support for the CLEAR_BITMAP and this cached
bitmap will be used to guarantee we don't clear any unknown dirty bits
otherwise that can be a severe data loss issue for migration code.
It's pretty intuitive to maintain a bitmap per container since we perform
log_sync at this granule. But I don't know how to deal with things like
memory hot-{un}plug, sparse DMA mappings, etc. Suggestions welcome.
* yet something to-do:
- can't work with guest viommu
- no locks
- etc
[ The idea and even the commit message are largely inherited from kvm side.
See commit 9f4bf4baa8b820c7930e23c9566c9493db7e1d25. ]
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jinagkunkun@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 54787195fc22365d254d8843f6d154fb0ee07ee9)
The new capability VFIO_DIRTY_LOG_MANUAL_CLEAR and the new ioctl
VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR and
VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP have been introduced in
the kernel, update the header to add them.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 7518c53b639053d5535b3c4e3aeb4a21950f9042)
log: Add some logs on VM runtime path
qdev/monitors: Fix reundant error_setg of qdev_add_device
bios-tables-test: Allow changes to q35/SSDT.dimmpxm file
smbios: Add missing member of type 4 for smbios 3.0
bios-tables-test: Update expected q35/SSDT.dimmpxm file
net: eepro100: validate various address valuesi(CVE-2021-20255)
pci: check bus pointer before dereference
ide: ahci: add check to avoid null dereference (CVE-2019-12067)
tap: return err when tap TUNGETIFF fail
xhci: check reg to avoid OOB read
monitor: Discard BLOCK_IO_ERROR event when VM rebooted
monitor: limit io error qmp event to at most once per 60s
Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 3cc842b5237fe9681d6eb2f59fca0652eb0ab0c3)
The speed of BLOCK IO ERROR event maybe very high (thousands per
second). If we report all BLOCK IO ERRORs, the log file will be flooded
with BLOCK IO ERROR event. So throttle it to at most once per 60s.
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 381b95fdf20ab5326ca1811155134a23fbc2046e)
Throttled event like QAPI_EVENT_BLOCK_IO_ERROR may be queued
to limit event rate. Event may be delivered when VM is rebooted
if the event was queued in the *monitor_qapi_event_state* hash table.
Which may casue VM pause and other related problems.
Such as seabios blocked during virtio-scsi initialization:
vring_add_buf(vq, sg, out_num, in_num, 0, 0);
vring_kick(vp, vq, 1);
------------> VM paused here <-----------
/* Wait for reply */
while (!vring_more_used(vq)) usleep(5);
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 42aa18057deead287b570fc44caa8ed4f897c878)
Add a sanity check to fix OOB read access.
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 9d077b427a8779826def993be0c36f365e072f67)
When hotplug ovs kernel netcard, even tap TUNGETIFF failed,
the hotplug would go on and would lead to qemu assert.
The failure should lead to the free_fail.
Signed-off-by: miaoyubo <miaoyubo@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit fcfc664bacbb7d51d667dd6d0c20ce088bc7effe)
Fix CVE-2019-12067
AHCI emulator while committing DMA buffer in ahci_commit_buf()
may do a NULL dereference if the command header 'ad->cur_cmd'
is null. Add check to avoid it.
Reported-by: Bugs SysSec <address@hidden>
Signed-off-by: Prasad J Pandit <address@hidden>
Signed-off-by: Jiajie Li <lijiajie11@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 51b23b8b7cc4aac66e472f5ac448084981b0cc3b)
fix CVE-2020-25742
patch link: https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg05294.html
While mapping IRQ level in pci_change_irq_level() routine,
it does not check if pci_get_bus() returned a valid pointer.
It may lead to a NULL pointer dereference issue. Add check to
avoid it.
-> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1
==1183858==Hint: address points to the zero page.
#0 pci_change_irq_level hw/pci/pci.c:259
#1 pci_irq_handler hw/pci/pci.c:1445
#2 pci_set_irq hw/pci/pci.c:1463
#3 lsi_set_irq hw/scsi/lsi53c895a.c:488
#4 lsi_update_irq hw/scsi/lsi53c895a.c:523
#5 lsi_script_scsi_interrupt hw/scsi/lsi53c895a.c:554
#6 lsi_execute_script hw/scsi/lsi53c895a.c:1149
#7 lsi_reg_writeb hw/scsi/lsi53c895a.c:1984
#8 lsi_io_write hw/scsi/lsi53c895a.c:2146
...
Reported-by: Ruhr-University <bugs-syssec@rub.de>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit da4953b1dfdacc1a60c48e5de2146795725e1155)
According to smbios 3.0 spec, for processor information (type 4),
it adds three new members (Core Count 2, Core enabled 2, thread count 2) for 3.0, Without this three members, we can not get correct cpu frequency from dmi,
Because it will failed to check the length of Processor Infomation in DMI.
The corresponding codes in kernel is like:
if (dm->type == DMI_ENTRY_PROCESSOR &&
dm->length >= DMI_ENTRY_PROCESSOR_MIN_LENGTH) {
u16 val = (u16)get_unaligned((const u16 *)
(dmi_data + DMI_PROCESSOR_MAX_SPEED));
*mhz = val > *mhz ? val : *mhz;
}
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 457ab195e6fed9e1971e10547e1a6d550c0d0b3a)
List test/data/acpi/q35/SSDT.dimmpxm as the expected files allowed to
be changed in tests/qtest/bios-tables-test-allowed-diff.h
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit bbbc6a1a9ca0ae046d5f43e5e5005dbe00796cd6)
There is an extra log "error_setg" in qdev_add_device(). When
hot-plug a device, if the corresponding bus doesn't exist, it
will trigger an asseration "assert(*errp == NULL)".
Fixes: 515a7970490 (log: Add some logs on VM runtime path)
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 4a946ee5713758ec120126384e76e8eb8f6059a0)
Add logs on VM runtime path, to make it easier to do trouble shooting.
Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit dfca9d4ba6b13b1b939a97fa7127821799593185)
Using CONFIG_DISABLE_QEMU_LOG macro to control
qemu_log function.
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
(cherry picked from commit 0cea596fd445015a851dbd2bfe634644ae30883a)
seabios-convert-value-of-be16_to_cpu-to-u64-before-s.patch:
be16_to_cpu(scsi_lun->lun[i]) is 16 bits and left shifting by more than 16
will have undefined behaviour. convert it to u64 before shifting.
seabios-do-not-give-back-high-ram.patch:
fix bug of Oracle 6 and 7 series virtual machines using the high ram returned by
sebios.
seabios-drop-yield-in-smp_setup.patch:
Fix SeaBIOS stuck problem becuase SeaBIOS open hardware interrupt
by invoking yield(). That's dangerous and unnecessary. Let's drop
it, and make the processing of setup smp more security in SeaBIOS.
seabios-fix-memory-leak-when-pci-check.patch:
fix code memory leak when pci check failed
free busses memory when pci_bios_check_devices function returns error in pci_setup()
seabios-increase-the-seabios-high-mem-zone-size.patch:
In terms of version and specification, under the maximum configuration
specification of the number of vcpus, virtio blocks and other features,
there exists bottleneck in seabios high_mem_zone, which results in the
memory application failure and causes the vm to fail to start.
Increase BUILD_MAX_HIGHTABLE to 512k.
seabios-increase-the-seabios-minibiostable.patch:
Increase the BUILD_MIN_BIOSTABLE to 4096;
support 25 virtio-blk(data) + 1 virtio-scsi(sys) + 1 virtio-net
Increase the BUILD_MIN_BIOSTABLE to 5120;
support 18 virtio-scsi while vm starts with IDE boot disk
Signed-off-by: jiangdongxu <jiangdongxu1@huawei.com>
Since filemonitor testcase requires that host kernel being a LTS version,
we cannot guarantee that on OBS system. Lets disable it by default.
Signed-off-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Jinhao Gao <gaojinhao@huawei.com>
Reduce the vpcu cost by set a lower FRAME_TIMER_FREQ of the UHCI
when VNC client disconnected. This can reduce about 3% cost of
vcpu thread.
Signed-off-by: eillon <yezhenyu2@huawei.com>
freeclock: add qmp command to get time offset of vm in seconds
freeclock: set rtc_date_diff for arm
freeclock: set rtc_date_diff for X86
Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
When setting the system time in VM, a RTC_CHANGE event will be reported.
However, if libvirt is restarted while the event is be reporting, the
event will be lost and we will get the old time (not the time we set in
VM) after rebooting the VM.
We save the delta time in QEMU and add a rtc-date-diff qmp to get the
delta time so that libvirt can get the latest time in VM according to
the qmp after libvirt is restarted.
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: zhangxinhao <zhangxinhao1@huawei.com>
target/arm: convert isar regs to array
target/arm: parse cpu feature related options
target/arm: register CPU features for property
target/arm: Allow ID registers to synchronize to KVM
target/arm: introduce CPU feature dependency mechanism
target/arm: introduce KVM_CAP_ARM_CPU_FEATURE
target/arm: Add CPU features to query-cpu-model-expansion
target/arm: Add more CPU features
target/arm: ignore evtstrm and cpuid CPU features
target/arm: only set ID_PFR1_EL1.GIC for AArch32 guest
target/arm: Fix write redundant values to kvm
target/arm: clear EL2 and EL3 only when kvm is not enabled
target/arm: Update the ID registers of Kunpeng-920
Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
The values of some ID registers in Kunpeng-920 are not exactly correct.
Let's update them. The values are read from Kunpeng-920 by calling
read_sysreg_s.
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
When has_el2 and has_el3 are disabled, which is the default value for
virt machine, QEMU will clear the corresponding field in ID_PFR1_EL1 and
ID_AA64PFR0_EL1 to not expose EL3 and EL2 to guest. Because KVM doesn't
support to emulate ID registers in AArch64 before, it will not take
effect. Hence, clear EL2 and EL3 only when kvm is not enabled for
backwards compatibility.
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
After modifying the value of a ID register, we'd better to try to write
it to KVM so that we can known the value is acceptable for KVM.
Because it may modify the registers' values of KVM, it's not suitable
for other registers.
(cherry-picked from a0d7a9de807639fcfcbe1fe037cb8772d459a9cf)
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
Some AArch64 CPU doesn't support AArch32 mode, and the values of AArch32
registers are all 0. Hence, We'd better not to modify AArch32 registers
in AArch64 mode.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
evtstrm and cpuid cann't be controlled by VMM:
1. evtstrm: The generic timer is configured to generate events at a
frequency of approximately 100KHz. It's controlled by the linux
kernel config CONFIG_ARM_ARCH_TIMER_EVTSTREAM.
2. cpuid: EL0 access to certain ID registers is available. It's always
set by linux kernel after 77c97b4ee2129 ("arm64: cpufeature: Expose
CPUID registers by emulation").
However, they are exposed by getauxval() and /proc/cpuinfo. Hence,
let's report and ignore the CPU features if someone set them.
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
Add i8mm, bf16, and dgh CPU features for AArch64.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
Add CPU features to the result of query-cpu-model-expansion so that
other applications (such as libvirt) can know the supported CPU
features.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
Introduce KVM_CAP_ARM_CPU_FEATURE to check whether KVM supports to set
CPU features in ARM.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
Some CPU features are dependent on other CPU features. For example,
ID_AA64PFR0_EL1.FP field and ID_AA64PFR0_EL1.AdvSIMD must have the same
value, which means FP and ADVSIMD are dependent on each other, FPHP and
ADVSIMDHP are dependent on each other.
This commit introduces a mechanism for CPU feature dependency in
AArch64. We build a directed graph from the CPU feature dependency
relationship, each edge from->to means the `to` CPU feature is dependent
on the `from` CPU feature. And we will automatically enable/disable CPU
feature according to the directed graph.
For example, a, b, and c CPU features are in relationship a->b->c, which
means c is dependent on b and b is dependent on a. If c is enabled by
user, then a and b is enabled automatically. And if a is disabled by
user, then b and c is disabled automatically.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
There are 2 steps to synchronize the values of system registers from
CPU state to KVM:
1. write to the values of system registers from CPU state to
(index,value) list by write_cpustate_to_list;
2. write the values in (index,value) list to KVM by
write_list_to_kvmstate;
In step 1, the values of constant system registers are not allowed to
write to (index,value) list. However, a constant system register is
CONSTANT for guest but not for QEMU, which means, QEMU can set/modify
the value of constant system registers that is different from phsical
registers when startup. But if KVM is enabled, guest can not read the
values of the system registers which QEMU set unless they can be written
to (index,value) list. And why not try to write to KVM if kvm_sync is
true?
At the moment we call write_cpustate_to_list, all ID registers are
contant, including ID_PFR1_EL1 and ID_AA64PFR0_EL1 because GIC has been
initialized. Hence, let's give all ID registers a chance to write to
KVM. If the write is successful, then write to (index,value) list.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
The Arm architecture specifies a number of ID registers that are
characterized as comprising a set of 4-bit ID fields. Each ID field
identifies the presence, and possibly the level of support for, a
particular feature in an implementation of the architecture. [1]
For most of the ID fields, there is a minimum presence value, equal to
or higher than which means the corresponding CPU feature is implemented.
Hence, we can use the minimum presence value to determine whether a CPU
feature is enabled and enable a CPU feature.
To disable a CPU feature, setting the corresponding ID field to 0x0/0xf
(for unsigned/signed field) seems as a good idea. However, it maybe
lead to some problems. For example, ID_AA64PFR0_EL1.FP is a signed ID
field. ID_AA64PFR0_EL1.FP == 0x0 represents the implementation of FP
(floating-point) and ID_AA64PFR0_EL1.FP == 0x1 represents the
implementation of FPHP (half-precision floating-point). If
ID_AA64PFR0_EL1.FP is set to 0xf when FPHP is disabled (which is also
disable FP), guest kernel maybe stuck. Hence, we add a ni_value (means
not-implemented value) to disable a CPU feature safely.
[1] D13.1.3 Principles of the ID scheme for fields in ID registers in
DDI.0487
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
The implementation of CPUClass::parse_features only supports CPU
features in "feature=value" format. However, libvirt maybe send us a
CPU feature string in "+feature/-feature" format. Hence, we need to
override CPUClass::parse_features to support CPU feature string in both
"feature=value" and "+feature/-feature" format.
The logic of AArch64CPUClass::parse_features is similar to that of
X86CPUClass::parse_features.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
The isar in ARMCPU is a struct, each field of which represents an ID
register. It's not convenient for us to support CPU feature in AArch64.
So let's change it to an array first and add an enum as the index of the
array for convenience. Since we will never access high 32-bits of ID
registers in AArch32, it's harmless to change the ID registers in
AArch32 to 64-bits.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
On AMD target, when host cache passthrough is disabled we will
emulate the guest caches with default values and initialize the
shared cpu list of the caches based on vCPU topology. However
when host cache passthrough is enabled, the shared cpu list is
consistent with host regardless what the vCPU topology is.
For example, when cache passthrough is enabled, running a guest
with vThreads=1 on a host with pThreads=2, we will get that there
are every *two* logical vCPUs sharing a L1/L2 cache, which is not
consistent with the vCPU topology (vThreads=1).
So let's reinitialize BITs[25:14] of AMD CPUID 8000_001D.EAX
based on the actual vCPU topology instead of host pCPU topology.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
On Intel target, when host cache passthrough is disabled we will
emulate the guest caches with default values and initialize the
shared cpu list of the caches based on vCPU topology. However when
host cache passthrough is enabled, the shared cpu list is consistent
with host regardless what the vCPU topology is.
For example, when cache passthrough is enabled, running a guest
with vThreads=1 on a host with pThreads=2, we will get that there
are every *two* logical vCPUs sharing a L1/L2 cache, which is not
consistent with the vCPU topology (vThreads=1).
So let's reinitialize BITs[25:14] of Intel CPUID 4 based on the
actual vCPU topology instead of host pCPU topology.
Signed-off-by: Jian Wang <wangjian161@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
nbd/server.c: fix invalid read after client was already free
qemu-nbd: make native as the default aio mode
qemu-nbd: set timeout to qemu-nbd socket
qemu-pr: fixed ioctl failed for multipath disk
block: enable cache mode of empty cdrom
block: disallow block jobs when there is a BDRV_O_INACTIVE flag
scsi: cdrom: Fix crash after remote cdrom detached
block: bugfix: disable process AIO when attach scsi disk
block: bugfix: Don't pause vm when NOSPACE EIO happened
scsi: bugfix: fix division by zero
Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
When backend disk is FULL and disk IO type is 'dataplane',
QEMU will pause the vm, and this may cause endless-loop in
QEMU main thread if we do the snapshot merge now.
When backend disk is FULL, only reporting an error rather
than pausing the virtual machine.
Signed-off-by: wangjian161 <wangjian161@huawei.com>