537 Commits

Author SHA1 Message Date
tangzhongrui
29a47b91a6 Remove duplicate and disorderly changelogs which will cause compilation errors.
Signed-off-by:  Zhongrui Tang <tangzhongrui@cmss.chinamobile.com>
2021-08-30 16:51:48 +08:00
openeuler-ci-bot
2a25f30bbe !354 【SP1分支同步】block_curl: add bolck_curl package
From: @lijiajie128
Reviewed-by: @imxcc
Signed-off-by: @imxcc
2021-08-20 02:27:12 +00:00
Jiajie Li
0978d96786 block_curl: add bolck_curl package
Signed-off-by: Jiajie Li <lijiajie11@huawei.com>
2021-08-19 13:43:41 +08:00
openeuler-ci-bot
d60ae9a499 !349 Automatically generate code patches with openeuler !185
From: @kuhnchen18
Reviewed-by: @imxcc
Signed-off-by: @imxcc
2021-08-16 10:45:43 +00:00
Chen Qun
0e5958c788 spec: Update release version with !185
increase release verison by one

Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
2021-08-16 11:29:58 +08:00
Chen Qun
ce72a2174d spec: Update patch and changelog with !185 fix CVE-2021-3682 #I45H4H !185
usbredir: fix free call

Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
2021-08-16 11:29:37 +08:00
Chen Qun
47c21ec4a9 usbredir: fix free call
data might point into the middle of a larger buffer, there is a separate
free_on_destroy pointer passed into bufp_alloc() to handle that.  It is
only used in the normal workflow though, not when dropping packets due
to the queue being full.  Fix that.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/491
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210722072756.647673-1-kraxel@redhat.com>
Signed-off-by: imxcc <xingchaochao@huawei.com>
2021-08-16 11:29:37 +08:00
openeuler-ci-bot
d1cc9da786 !346 Automatically generate code patches with openeuler !183
From: @kuhnchen18
Reviewed-by: @imxcc
Signed-off-by: @imxcc
2021-08-05 07:54:15 +00:00
Chen Qun
216918bb04 spec: Update release version with !183
increase release verison by one

Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
b94d8926ee spec: Update patch and changelog with !183 Support VFIO migration manual clear interface & vSMMUv3/pSMMUv3 2 stage VFIO integration & Support migration in SMMUv3 nested mode !183
vfio: Support host translation granule size
vfio/migrate: Move switch of dirty tracking into vfio_memory_listener
vfio: Fix unregister SaveVMHandler in vfio_migration_finalize
migration/ram: Reduce unnecessary rate limiting
migration/ram: Optimize ram_save_host_page()
qdev/monitors: Fix reundant error_setg of qdev_add_device
linux-headers: update against 5.10 and manual clear vfio dirty log series
vfio: Maintain DMA mapping range for the container
vfio/migration: Add support for manual clear vfio dirty log
hw/arm/smmuv3: Support 16K translation granule
hw/arm/smmuv3: Set the restoration priority of the vSMMUv3 explicitly
hw/vfio/common: trace vfio_connect_container operations
update-linux-headers: Import iommu.h
vfio.h and iommu.h header update against 5.10
memory: Add new fields in IOTLBEntry
hw/arm/smmuv3: Improve stage1 ASID invalidation
hw/arm/smmu-common: Allow domain invalidation for NH_ALL/NSNH_ALL
memory: Add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute
memory: Add IOMMU_ATTR_MSI_TRANSLATE IOMMU memory region attribute
memory: Introduce IOMMU Memory Region inject_faults API
iommu: Introduce generic header
pci: introduce PCIPASIDOps to PCIDevice
vfio: Force nested if iommu requires it
vfio: Introduce hostwin_from_range helper
vfio: Introduce helpers to DMA map/unmap a RAM section
vfio: Set up nested stage mappings
vfio: Pass stage 1 MSI bindings to the host
vfio: Helper to get IRQ info including capabilities
vfio/pci: Register handler for iommu fault
vfio/pci: Set up the DMA FAULT region
vfio/pci: Implement the DMA fault handler
hw/arm/smmuv3: Advertise MSI_TRANSLATE attribute
hw/arm/smmuv3: Store the PASID table GPA in the translation config
hw/arm/smmuv3: Fill the IOTLBEntry arch_id on NH_VA invalidation
hw/arm/smmuv3: Fill the IOTLBEntry leaf field on NH_VA invalidation
hw/arm/smmuv3: Pass stage 1 configurations to the host
hw/arm/smmuv3: Implement fault injection
hw/arm/smmuv3: Allow MAP notifiers
pci: Add return_page_response pci ops
vfio/pci: Implement return_page_response page response callback
vfio/common: Avoid unmap ram section at vfio_listener_region_del() in nested mode
vfio: Introduce helpers to mark dirty pages of a RAM section
vfio: Add vfio_prereg_listener_log_sync in nested stage
vfio: Add vfio_prereg_listener_log_clear to re-enable mark dirty pages
vfio: Add vfio_prereg_listener_global_log_start/stop in nested stage
hw/arm/smmuv3: Post-load stage 1 configurations to the host

Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
b06e551676 hw/arm/smmuv3: Post-load stage 1 configurations to the host
In nested mode, we call the set_pasid_table() callback on each
STE update to pass the guest stage 1 configuration to the host
and apply it at physical level.

In the case of live migration, we need to manually call the
set_pasid_table() to load the guest stage 1 configurations to
the host. If this operation fails, the migration fails.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
7644dd1549 vfio: Add vfio_prereg_listener_global_log_start/stop in nested stage
In nested mode, we set up the stage 2 and stage 1 separately. In my
opinion, vfio_memory_prereg_listener is used for stage 2 and
vfio_memory_listener is used for stage 1. So it feels weird to call
the global_log_start/stop interface in vfio_memory_listener to switch
dirty tracking, although this won't cause any errors. Add
global_log_start/stop interface in vfio_memory_prereg_listener
can separate stage 2 from stage 1.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
38c3954435 vfio: Add vfio_prereg_listener_log_clear to re-enable mark dirty pages
When tracking dirty pages, we just need to pay attention to stage 2
mappings. Legacy vfio_listener_log_clear cannot be used in nested
stage. This patch adds vfio_prereg_listener_log_clear to re-enable
dirty pages in nested mode.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
eae456de7c vfio: Add vfio_prereg_listener_log_sync in nested stage
In nested mode, we set up the stage 2 (gpa->hpa)and stage 1
(giova->gpa) separately by vfio_prereg_listener_region_add()
and vfio_listener_region_add(). So when marking dirty pages
we just need to pay attention to stage 2 mappings.

Legacy vfio_listener_log_sync cannot be used in nested stage.
This patch adds vfio_prereg_listener_log_sync to mark dirty
pages in nested mode.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
e4d4275433 vfio: Introduce helpers to mark dirty pages of a RAM section
Extract part of the code from vfio_sync_dirty_bitmap to form a
new helper, which allows to mark dirty pages of a RAM section.
This helper will be called for nested stage.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
48e4f1552b vfio/common: Avoid unmap ram section at vfio_listener_region_del() in nested mode
The ram section will be unmapped at vfio_prereg_listener_region_del()
in nested mode. So let's avoid unmap ram section at
vfio_listener_region_dev().

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
518ab37de3 vfio/pci: Implement return_page_response page response callback
This patch implements the page response path. The
response is written into the page response ring buffer and then
update header's head index is updated. This path is not used
by this series. It is introduced here as a POC for vSVA/ARM
integration.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
2117b42cb1 pci: Add return_page_response pci ops
Add a new PCI operation that allows to return page responses
to registered VFIO devices

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
c2ecdaca13 hw/arm/smmuv3: Allow MAP notifiers
We now have all bricks to support nested paging. This
uses MAP notifiers to map the MSIs. So let's allow MAP
notifiers to be registered.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
b81289b550 hw/arm/smmuv3: Implement fault injection
We convert iommu_fault structs received from the kernel
into the data struct used by the emulation code and record
the evnts into the virtual event queue.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
1e1a34cb30 hw/arm/smmuv3: Pass stage 1 configurations to the host
In case PASID PciOps are set for the device we call
the set_pasid_table() callback on each STE update.

This allows to pass the guest stage 1 configuration
to the host and apply it at physical level.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
bcfe3f1d19 hw/arm/smmuv3: Fill the IOTLBEntry leaf field on NH_VA invalidation
Let's propagate the leaf attribute throughout the invalidation path.
This hint is used to reduce the scope of the invalidations to the
last level of translation. Not enforcing it induces large performance
penalties in nested mode.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
4b4532b880 hw/arm/smmuv3: Fill the IOTLBEntry arch_id on NH_VA invalidation
When the guest invalidates one S1 entry, it passes the asid.
When propagating this invalidation downto the host, the asid
information also must be passed. So let's fill the arch_id field
introduced for that purpose and accordingly set the flags to
indicate its presence.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
cf3014ddc1 hw/arm/smmuv3: Store the PASID table GPA in the translation config
For VFIO integration we will need to pass the Context Descriptor (CD)
table GPA to the host. The CD table is also referred to as the PASID
table. Its GPA corresponds to the s1ctrptr field of the Stream Table
Entry. So let's decode and store it in the configuration structure.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
a782347b03 hw/arm/smmuv3: Advertise MSI_TRANSLATE attribute
The SMMUv3 has the peculiarity to translate MSI
transactionss. let's advertise the corresponding
attribute.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
93c6df7e5a vfio/pci: Implement the DMA fault handler
Whenever the eventfd is triggered, we retrieve the DMA fault(s)
from the mmapped fault region and inject them in the iommu
memory region.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
58476d1b47 vfio/pci: Set up the DMA FAULT region
Set up the fault region which is composed of the actual fault
queue (mmappable) and a header used to handle it. The fault
queue is mmapped.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
98dfb30ca5 vfio/pci: Register handler for iommu fault
We use the new extended IRQ VFIO_IRQ_TYPE_NESTED type and
VFIO_IRQ_SUBTYPE_DMA_FAULT subtype to set/unset
a notifier for physical DMA faults. The associated eventfd is
triggered, in nested mode, whenever a fault is detected at IOMMU
physical level.

The actual handler will be implemented in subsequent patches.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
f4fce522f0 vfio: Helper to get IRQ info including capabilities
As done for vfio regions, add helpers to retrieve irq info
including their optional capabilities.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:28 +08:00
Chen Qun
d159b4edba vfio: Pass stage 1 MSI bindings to the host
We register the stage1 MSI bindings when enabling the vectors
and we unregister them on msi disable.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
39db503d4d vfio: Set up nested stage mappings
In nested mode, legacy vfio_iommu_map_notify cannot be used as
there is no "caching" mode and we do not trap on map.

On Intel, vfio_iommu_map_notify was used to DMA map the RAM
through the host single stage.

With nested mode, we need to setup the stage 2 and the stage 1
separately. This patch introduces a prereg_listener to setup
the stage 2 mapping.

The stage 1 mapping, owned by the guest, is passed to the host
when the guest invalidates the stage 1 configuration, through
a dedicated PCIPASIDOps callback. Guest IOTLB invalidations
are cascaded downto the host through another IOMMU MR UNMAP
notifier.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
89a04f5725 vfio: Introduce helpers to DMA map/unmap a RAM section
Let's introduce two helpers that allow to DMA map/unmap a RAM
section. Those helpers will be called for nested stage setup in
another call site. Also the vfio_listener_region_add/del()
structure may be clearer.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
6c756b3010 vfio: Introduce hostwin_from_range helper
Let's introduce a hostwin_from_range() helper that returns the
hostwin encapsulating an IOVA range or NULL if none is found.

This improves the readibility of callers and removes the usage
of hostwin_found.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
cfcd9dd6b1 vfio: Force nested if iommu requires it
In case we detect the address space is translated by
a virtual IOMMU which requires HW nested paging to
integrate with VFIO, let's set up the container with
the VFIO_TYPE1_NESTING_IOMMU iommu_type.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
a0fc43becb pci: introduce PCIPASIDOps to PCIDevice
This patch introduces PCIPASIDOps for IOMMU related operations.

https://lists.gnu.org/archive/html/qemu-devel/2018-03/msg00078.html
https://lists.gnu.org/archive/html/qemu-devel/2018-03/msg00940.html

So far, to setup virt-SVA for assigned SVA capable device, needs to
configure host translation structures for specific pasid. (e.g. bind
guest page table to host and enable nested translation in host).
Besides, vIOMMU emulator needs to forward guest's cache invalidation
to host since host nested translation is enabled. e.g. on VT-d, guest
owns 1st level translation table, thus cache invalidation for 1st
level should be propagated to host.

This patch adds two functions: alloc_pasid and free_pasid to support
guest pasid allocation and free. The implementations of the callbacks
would be device passthru modules. Like vfio.

Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Yi Sun <yi.y.sun@linux.intel.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
3ff7455793 iommu: Introduce generic header
This header is meant to exposes data types used by
several IOMMU devices such as struct for SVA and
nested stage configuration.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
42c9d0a3d0 memory: Introduce IOMMU Memory Region inject_faults API
This new API allows to inject @count iommu_faults into
the IOMMU memory region.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
f026ae6d35 memory: Add IOMMU_ATTR_MSI_TRANSLATE IOMMU memory region attribute
We introduce a new IOMMU Memory Region attribute, IOMMU_ATTR_MSI_TRANSLATE
which tells whether the virtual IOMMU translates MSIs. ARM SMMU
will expose this attribute since, as opposed to Intel DMAR, MSIs
are translated as any other DMA requests.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
6ddda50d8a memory: Add IOMMU_ATTR_VFIO_NESTED IOMMU memory region attribute
We introduce a new IOMMU Memory Region attribute,
IOMMU_ATTR_VFIO_NESTED that tells whether the virtual IOMMU
requires HW nested paging for VFIO integration.

Current Intel virtual IOMMU device supports "Caching
Mode" and does not require 2 stages at physical level to be
integrated with VFIO. However SMMUv3 does not implement such
"caching mode" and requires to use HW nested paging.

As such SMMUv3 is the first IOMMU device to advertise this
attribute.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
30cc257bc5 hw/arm/smmu-common: Allow domain invalidation for NH_ALL/NSNH_ALL
NH_ALL/NSNH_ALL corresponds to a domain granularity invalidation,
ie. all the notifier range gets invalidation, whatever the ASID.
So let's set the granularity to IOMMU_INV_GRAN_DOMAIN to allow
the consumer to benefit from the info if it can.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: chenxiang (M) <chenxiang66@hisilicon.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
be7575f6ad hw/arm/smmuv3: Improve stage1 ASID invalidation
At the moment ASID invalidation command (CMD_TLBI_NH_ASID) is
propagated as a domain invalidation (the whole notifier range
is invalidated independently on any ASID information).

The new granularity field now allows to be more precise and
restrict the invalidation to a peculiar ASID. Set the corresponding
fields and flag.

We still keep the iova and addr_mask settings for consumers that
do not support the new fields, like VHOST.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
24678ea00b memory: Add new fields in IOTLBEntry
The current IOTLBEntry becomes too simple to interact with
some physical IOMMUs. IOTLBs can be invalidated with different
granularities: domain, pasid, addr. Current IOTLB entry only offers
page selective invalidation. Let's add a granularity field
that conveys this information.

TLB entries are usually tagged with some ids such as the asid
or pasid. When propagating an invalidation command from the
guest to the host, we need to pass those IDs.

Also we add a leaf field which indicates, in case of invalidation
notification, whether only cache entries for the last level of
translation are required to be invalidated.

A flag field is introduced to inform whether those fields are set.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
3e52dd6203 vfio.h and iommu.h header update against 5.10
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
12e533eab4 update-linux-headers: Import iommu.h
Update the script to import the new iommu.h uapi header.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
483badb73d hw/vfio/common: trace vfio_connect_container operations
We currently trace vfio_disconnect_container() but we do not trace
the container <-> group creation, which can be useful to understand
the VFIO topology.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
f5fd61400b hw/arm/smmuv3: Set the restoration priority of the vSMMUv3 explicitly
Ensure the vSMMUv3 will be restored before all PCIe devices so that DMA
translation can work properly during migration.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Message-id: 20201019091508.197-1-yuzenghui@huawei.com
Acked-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
3c9cc492f2 hw/arm/smmuv3: Support 16K translation granule
The driver can query some bits in SMMUv3 IDR5 to learn which
translation granules are supported. Arm recommends that SMMUv3
implementations support at least 4K and 64K granules. But in
the vSMMUv3, there seems to be no reason not to support 16K
translation granule. In addition, if 16K is not supported,
vSVA will failed to be enabled in the future for 16K guest
kernel. So it'd better to support it.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-08-04 11:28:27 +08:00
Chen Qun
c127034a6d vfio/migration: Add support for manual clear vfio dirty log
The new capability VFIO_DIRTY_LOG_MANUAL_CLEAR and the new ioctl
VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR and
VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP have been introduced in
the kernel, tweak the userspace side to use them.

Check if the kernel supports VFIO_DIRTY_LOG_MANUAL_CLEAR and
provide the log_clear() hook for vfio_memory_listener. If the
kernel supports it, deliever the clear message to kernel.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
be1ff354a7 vfio: Maintain DMA mapping range for the container
When synchronizing dirty bitmap from kernel VFIO we do it in a
per-iova-range fashion and we allocate the userspace bitmap for each of the
ioctl. This patch introduces `struct VFIODMARange` to describe a range of
the given DMA mapping with respect to a VFIO_IOMMU_MAP_DMA operation, and
make the bitmap cache of this range be persistent so that we don't need to
g_try_malloc0() every time. Note that the new structure is almost a copy of
`struct vfio_iommu_type1_dma_map` but only internally used by QEMU.

More importantly, the cached per-iova-range dirty bitmap will be further
used when we want to add support for the CLEAR_BITMAP and this cached
bitmap will be used to guarantee we don't clear any unknown dirty bits
otherwise that can be a severe data loss issue for migration code.

It's pretty intuitive to maintain a bitmap per container since we perform
log_sync at this granule. But I don't know how to deal with things like
memory hot-{un}plug, sparse DMA mappings, etc. Suggestions welcome.

* yet something to-do:
  - can't work with guest viommu
  - no locks
  - etc

[ The idea and even the commit message are largely inherited from kvm side.
  See commit 9f4bf4baa8b820c7930e23c9566c9493db7e1d25. ]

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jinagkunkun@huawei.com>
2021-08-04 11:28:27 +08:00
Chen Qun
18c2252e32 linux-headers: update against 5.10 and manual clear vfio dirty log series
The new capability VFIO_DIRTY_LOG_MANUAL_CLEAR and the new ioctl
VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP_NOCLEAR and
VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP have been introduced in
the kernel, update the header to add them.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
2021-08-04 11:28:27 +08:00