qemu/docs-devel-Add-VFIO-iommufd-backend-documentation.patch

221 lines
8.2 KiB
Diff
Raw Permalink Normal View History

QEMU update to version 8.2.0-30: - Revert "linux-user: Print tid not pid with strace" - gpex-acpi: Remove duplicate DSM #5 - smmuv3: Use default bus for arm-smmuv3-accel - smmuv3: Change arm-smmuv3-nested name to arm-smmuv3-accel - smmu-common: Return sysmem address space only for vfio-pci - smmuv3: realize get_pasid_cap and set ssidsize with pasid - vfio: Synthesize vPASID capability to VM - backend/iommufd: Report PASID capability - pci: Get pasid capability from vIOMMU - smmuv3: Add support for page fault handling - kvm: Translate MSI doorbell address only if it is valid - hw/arm/smmuv3: Enable sva/stall IDR features - iommufd.h: Updated to openeuler olk-6.6 kernel - tests/data/acpi/virt: Update IORT acpi table - hw/arm/virt-acpi-build: Add IORT RMR regions to handle MSI nested binding - tests/qtest: Allow IORT acpi table to change - hw/arm/virt-acpi-build: Build IORT with multiple SMMU nodes - hw/arm/smmuv3: Associate a pci bus with a SMMUv3 Nested device - hw/arm/smmuv3: Add initial support for SMMUv3 Nested device - hw/arm/virt: Add an SMMU_IO_LEN macro - hw/pci-host/gpex: [needs kernel fix] Allow to generate preserve boot config DSM #5 - tests/data/acpi: Update DSDT acpi tables - acpi/gpex: Fix PCI Express Slot Information function 0 returned value - tests/qtest: Allow DSDT acpi tables to change - hw/arm/smmuv3: Forward cache invalidate commands via iommufd - hw/arm/smmu-common: Replace smmu_iommu_mr with smmu_find_sdev - hw/arm/smmuv3: Add missing STE invalidation - hw/arm/smmuv3: Add smmu_dev_install_nested_ste() for CFGI_STE - hw/arm/smmuv3: Check idr registers for STE_S1CDMAX and STE_S1STALLD - hw/arm/smmuv3: Read host SMMU device info - hw/arm/smmuv3: Ignore IOMMU_NOTIFIER_MAP for nested-smmuv3 - hw/arm/smmu-common: Return sysmem if stage-1 is bypassed - hw/arm/smmu-common: Add iommufd helpers - hw/arm/smmu-common: Add set/unset_iommu_device callback - hw/arm/smmu-common: Extract smmu_get_sbus and smmu_get_sdev helpers - hw/arm/smmu-common: Bypass emulated IOTLB for a nested SMMU - hw/arm/smmu-common: Add a nested flag to SMMUState - backends/iommufd: Introduce iommufd_viommu_invalidate_cache - backends/iommufd: Introduce iommufd_vdev_alloc - backends/iommufd: Introduce iommufd_backend_alloc_viommu - vfio/iommufd: Implement [at|de]tach_hwpt handlers - vfio/iommufd: Implement HostIOMMUDeviceClass::realize_late() handler - HostIOMMUDevice: Introduce realize_late callback - vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD - backends/iommufd: Add helpers for invalidating user-managed HWPT - Update iommufd.h header for vSVA - vfio/common: Allow disabling device dirty page tracking - vfio/migration: Don't block migration device dirty tracking is unsupported - vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support - vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support - vfio/iommufd: Probe and request hwpt dirty tracking capability - vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device() - vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps - vfio/{iommufd,container}: Remove caps::aw_bits - HostIOMMUDevice: Store the VFIO/VDPA agent - vfio/iommufd: Introduce auto domain creation - vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev - vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev - vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt() - backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities - vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev - vfio/pci: Extract mdev check into an helper - intel_iommu: Check compatibility with host IOMMU capabilities - intel_iommu: Implement [set|unset]_iommu_device() callbacks - intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap - vfio/pci: Pass HostIOMMUDevice to vIOMMU - hw/pci: Introduce pci_device_[set|unset]_iommu_device() - hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() - vfio: Create host IOMMU device instance - backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler - vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler - vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler - backends/iommufd: Introduce helper function iommufd_backend_get_device_info() - vfio/container: Implement HostIOMMUDeviceClass::realize() handler - range: Introduce range_get_last_bit() - backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO] devices - vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device - backends/host_iommu_device: Introduce HostIOMMUDeviceCaps - backends: Introduce HostIOMMUDevice abstract - vfio/iommufd: Remove CONFIG_IOMMUFD usage - vfio/spapr: Extend VFIOIOMMUOps with a release handler - vfio/spapr: Only compile sPAPR IOMMU support when needed - vfio/iommufd: Introduce a VFIOIOMMU iommufd QOM interface - vfio/spapr: Introduce a sPAPR VFIOIOMMU QOM interface - vfio/container: Intoduce a new VFIOIOMMUClass::setup handler - vfio/container: Introduce a VFIOIOMMU legacy QOM interface - vfio/container: Introduce a VFIOIOMMU QOM interface - vfio/container: Initialize VFIOIOMMUOps under vfio_init_container() - vfio/container: Introduce vfio_legacy_setup() for further cleanups - docs/devel: Add VFIO iommufd backend documentation - vfio: Introduce a helper function to initialize VFIODevice - vfio/ccw: Move VFIODevice initializations in vfio_ccw_instance_init - vfio/ap: Move VFIODevice initializations in vfio_ap_instance_init - vfio/platform: Move VFIODevice initializations in vfio_platform_instance_init - vfio/pci: Move VFIODevice initializations in vfio_instance_init - hw/i386: Activate IOMMUFD for q35 machines - kconfig: Activate IOMMUFD for s390x machines - hw/arm: Activate IOMMUFD for virt machines - vfio: Make VFIOContainerBase poiner parameter const in VFIOIOMMUOps callbacks - vfio/ccw: Make vfio cdev pre-openable by passing a file handle - vfio/ccw: Allow the selection of a given iommu backend - vfio/ap: Make vfio cdev pre-openable by passing a file handle - vfio/ap: Allow the selection of a given iommu backend - vfio/platform: Make vfio cdev pre-openable by passing a file handle - vfio/platform: Allow the selection of a given iommu backend - vfio/pci: Make vfio cdev pre-openable by passing a file handle - vfio/pci: Allow the selection of a given iommu backend - vfio/iommufd: Enable pci hot reset through iommufd cdev interface - vfio/pci: Introduce a vfio pci hot reset interface - vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_info - vfio/iommufd: Add support for iova_ranges and pgsizes - vfio/iommufd: Relax assert check for iommufd backend - vfio/iommufd: Implement the iommufd backend - vfio/common: return early if space isn't empty - util/char_dev: Add open_cdev() - backends/iommufd: Introduce the iommufd object - vfio/spapr: Move hostwin_list into spapr container - vfio/spapr: Move prereg_listener into spapr container - vfio/spapr: switch to spapr IOMMU BE add/del_section_window - vfio/spapr: Introduce spapr backend and target interface - vfio/container: Implement attach/detach_device - vfio/container: Move iova_ranges to base container - vfio/container: Move dirty_pgsizes and max_dirty_bitmap_size to base container - vfio/container: Move listener to base container - vfio/container: Move vrdl_list to base container - vfio/container: Move pgsizes and dma_max_mappings to base container - vfio/container: Convert functions to base container - vfio/container: Move per container device list in base container - vfio/container: Switch to IOMMU BE set_dirty_page_tracking/query_dirty_bitmap API - vfio/container: Move space field to base container - vfio/common: Move giommu_list in base container - vfio/common: Introduce vfio_container_init/destroy helper - vfio/container: Switch to dma_map|unmap API - vfio/container: Introduce a empty VFIOIOMMUOps - vfio: Introduce base object for VFIOContainer and targeted interface - cryptodev: Fix error handling in cryptodev_lkcf_execute_task() - hw/xen: Fix xen_bus_realize() error handling - hw/misc/aspeed_hace: Fix buffer overflow in has_padding function - target/s390x: Fix a typo in s390_cpu_class_init() - hw/sd/sdhci: free irq on exit - hw/ufs: free irq on exit - hw/pci-host/designware: Fix ATU_UPPER_TARGET register access - target/i386: Make invtsc migratable when user sets tsc-khz explicitly - target/i386: Construct CPUID 2 as stateful iff times > 1 - target/i386: Enable fdp-excptn-only and zero-fcs-fds - target/i386: Don't construct a all-zero entry for CPUID[0xD 0x3f] - i386/cpuid: Remove subleaf constraint on CPUID leaf 1F - target/i386: pass X86CPU to x86_cpu_get_supported_feature_word - target/i386: Raise the highest index value used for any VMCS encoding - target/i386: Add VMX control bits for nested FRED support - target/i386: Delete duplicated macro definition CR4_FRED_MASK - target/i386: Add get/set/migrate support for FRED MSRs - target/i386: enumerate VMX nested-exception support - vmxcap: add support for VMX FRED controls - target/i386: mark CR4.FRED not reserved - target/i386: add support for FRED in CPUID enumeration - target/i386: fix feature dependency for WAITPKG - target/i386: Add more features enumerated by CPUID.7.2.EDX - net: fix build when libbpf is disabled, but libxdp is enabled - hw/nvme: fix invalid endian conversion - hw/nvme: fix invalid check on mcl - backends/cryptodev: Do not ignore throttle/backends Errors - backends/cryptodev: Do not abort for invalid session ID - virtcca: add kvm isolation when get tmi version. - qga: Don't daemonize before channel is initialized - qga: Add log to guest-fsfreeze-thaw command - backends: VirtCCA: cvm_gpa_start supports both 1GB and 3GB - BUGFIX: Enforce isolation for virtcca_shared_hugepage - arm: VirtCCA: qemu CoDA support UEFI boot - arm: VirtCCA: Compatibility with older versions of TMM and the kernel - arm: VirtCCA: qemu uefi boot support kae - arm: VirtCCA: CVM support UEFI boot Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com> (cherry picked from commit 85fd7a435d8203dde56fedc4c8f500e41faf132c)
2025-04-22 14:34:58 +08:00
From fd1d6d64803a052adcab8c7993ca40cabc9c926d Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Sat, 11 Jan 2025 10:53:03 +0800
Subject: [PATCH] docs/devel: Add VFIO iommufd backend documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
MAINTAINERS | 1 +
docs/devel/index-internals.rst | 1 +
docs/devel/vfio-iommufd.rst | 166 +++++++++++++++++++++++++++++++++
3 files changed, 168 insertions(+)
create mode 100644 docs/devel/vfio-iommufd.rst
diff --git a/MAINTAINERS b/MAINTAINERS
index ca70bb4e64..0ddb20a35f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2176,6 +2176,7 @@ F: backends/iommufd.c
F: include/sysemu/iommufd.h
F: include/qemu/chardev_open.h
F: util/chardev_open.c
+F: docs/devel/vfio-iommufd.rst
vhost
M: Michael S. Tsirkin <mst@redhat.com>
diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 6f81df92bc..3def4a138b 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -18,5 +18,6 @@ Details about QEMU's various subsystems including how to add features to them.
s390-dasd-ipl
tracing
vfio-migration
+ vfio-iommufd
writing-monitor-commands
virtio-backends
diff --git a/docs/devel/vfio-iommufd.rst b/docs/devel/vfio-iommufd.rst
new file mode 100644
index 0000000000..3d1c11f175
--- /dev/null
+++ b/docs/devel/vfio-iommufd.rst
@@ -0,0 +1,166 @@
+===============================
+IOMMUFD BACKEND usage with VFIO
+===============================
+
+(Same meaning for backend/container/BE)
+
+With the introduction of iommufd, the Linux kernel provides a generic
+interface for user space drivers to propagate their DMA mappings to kernel
+for assigned devices. While the legacy kernel interface is group-centric,
+the new iommufd interface is device-centric, relying on device fd and iommufd.
+
+To support both interfaces in the QEMU VFIO device, introduce a base container
+to abstract the common part of VFIO legacy and iommufd container. So that the
+generic VFIO code can use either container.
+
+The base container implements generic functions such as memory_listener and
+address space management whereas the derived container implements callbacks
+specific to either legacy or iommufd. Each container has its own way to setup
+secure context and dma management interface. The below diagram shows how it
+looks like with both containers.
+
+::
+
+ VFIO AddressSpace/Memory
+ +-------+ +----------+ +-----+ +-----+
+ | pci | | platform | | ap | | ccw |
+ +---+---+ +----+-----+ +--+--+ +--+--+ +----------------------+
+ | | | | | AddressSpace |
+ | | | | +------------+---------+
+ +---V-----------V-----------V--------V----+ /
+ | VFIOAddressSpace | <------------+
+ | | | MemoryListener
+ | VFIOContainerBase list |
+ +-------+----------------------------+----+
+ | |
+ | |
+ +-------V------+ +--------V----------+
+ | iommufd | | vfio legacy |
+ | container | | container |
+ +-------+------+ +--------+----------+
+ | |
+ | /dev/iommu | /dev/vfio/vfio
+ | /dev/vfio/devices/vfioX | /dev/vfio/$group_id
+ Userspace | |
+ ============+============================+===========================
+ Kernel | device fd |
+ +---------------+ | group/container fd
+ | (BIND_IOMMUFD | | (SET_CONTAINER/SET_IOMMU)
+ | ATTACH_IOAS) | | device fd
+ | | |
+ | +-------V------------V-----------------+
+ iommufd | | vfio |
+ (map/unmap | +---------+--------------------+-------+
+ ioas_copy) | | | map/unmap
+ | | |
+ +------V------+ +-----V------+ +------V--------+
+ | iommfd core | | device | | vfio iommu |
+ +-------------+ +------------+ +---------------+
+
+* Secure Context setup
+
+ - iommufd BE: uses device fd and iommufd to setup secure context
+ (bind_iommufd, attach_ioas)
+ - vfio legacy BE: uses group fd and container fd to setup secure context
+ (set_container, set_iommu)
+
+* Device access
+
+ - iommufd BE: device fd is opened through ``/dev/vfio/devices/vfioX``
+ - vfio legacy BE: device fd is retrieved from group fd ioctl
+
+* DMA Mapping flow
+
+ 1. VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
+ 2. VFIO populates DMA map/unmap via the container BEs
+ * iommufd BE: uses iommufd
+ * vfio legacy BE: uses container fd
+
+Example configuration
+=====================
+
+Step 1: configure the host device
+---------------------------------
+
+It's exactly same as the VFIO device with legacy VFIO container.
+
+Step 2: configure QEMU
+----------------------
+
+Interactions with the ``/dev/iommu`` are abstracted by a new iommufd
+object (compiled in with the ``CONFIG_IOMMUFD`` option).
+
+Any QEMU device (e.g. VFIO device) wishing to use ``/dev/iommu`` must
+be linked with an iommufd object. It gets a new optional property
+named iommufd which allows to pass an iommufd object. Take ``vfio-pci``
+device for example:
+
+.. code-block:: bash
+
+ -object iommufd,id=iommufd0
+ -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0
+
+Note the ``/dev/iommu`` and VFIO cdev can be externally opened by a
+management layer. In such a case the fd is passed, the fd supports a
+string naming the fd or a number, for example:
+
+.. code-block:: bash
+
+ -object iommufd,id=iommufd0,fd=22
+ -device vfio-pci,iommufd=iommufd0,fd=23
+
+If the ``fd`` property is not passed, the fd is opened by QEMU.
+
+If no ``iommufd`` object is passed to the ``vfio-pci`` device, iommufd
+is not used and the user gets the behavior based on the legacy VFIO
+container:
+
+.. code-block:: bash
+
+ -device vfio-pci,host=0000:02:00.0
+
+Supported platform
+==================
+
+Supports x86, ARM and s390x currently.
+
+Caveats
+=======
+
+Dirty page sync
+---------------
+
+Dirty page sync with iommufd backend is unsupported yet, live migration is
+disabled by default. But it can be force enabled like below, low efficient
+though.
+
+.. code-block:: bash
+
+ -object iommufd,id=iommufd0
+ -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0,enable-migration=on
+
+P2P DMA
+-------
+
+PCI p2p DMA is unsupported as IOMMUFD doesn't support mapping hardware PCI
+BAR region yet. Below warning shows for assigned PCI device, it's not a bug.
+
+.. code-block:: none
+
+ qemu-system-x86_64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
+ qemu-system-x86_64: vfio_container_dma_map(0x560cb6cb1620, 0xe000000021000, 0x3000, 0x7f32ed55c000) = -14 (Bad address)
+
+FD passing with mdev
+--------------------
+
+``vfio-pci`` device checks sysfsdev property to decide if backend is a mdev.
+If FD passing is used, there is no way to know that and the mdev is treated
+like a real PCI device. There is an error as below if user wants to enable
+RAM discarding for mdev.
+
+.. code-block:: none
+
+ qemu-system-x86_64: -device vfio-pci,iommufd=iommufd0,x-balloon-allowed=on,fd=9: vfio VFIO_FD9: x-balloon-allowed only potentially compatible with mdev devices
+
+``vfio-ap`` and ``vfio-ccw`` devices don't have same issue as their backend
+devices are always mdev and RAM discarding is force enabled.
--
2.41.0.windows.1