qemu/vfio-pci-Make-vfio-cdev-pre-openable-by-passing-a-fi.patch
Jiabo Feng 4aa730192e QEMU update to version 8.2.0-30:
- Revert "linux-user: Print tid not pid with strace"
- gpex-acpi: Remove duplicate DSM #5
- smmuv3: Use default bus for arm-smmuv3-accel
- smmuv3: Change arm-smmuv3-nested name to arm-smmuv3-accel
- smmu-common: Return sysmem address space only for vfio-pci
- smmuv3: realize get_pasid_cap and set ssidsize with pasid
- vfio: Synthesize vPASID capability to VM
- backend/iommufd: Report PASID capability
- pci: Get pasid capability from vIOMMU
- smmuv3: Add support for page fault handling
- kvm: Translate MSI doorbell address only if it is valid
- hw/arm/smmuv3: Enable sva/stall IDR features
- iommufd.h: Updated to openeuler olk-6.6 kernel
- tests/data/acpi/virt: Update IORT acpi table
- hw/arm/virt-acpi-build: Add IORT RMR regions to handle MSI nested binding
- tests/qtest: Allow IORT acpi table to change
- hw/arm/virt-acpi-build: Build IORT with multiple SMMU nodes
- hw/arm/smmuv3: Associate a pci bus with a SMMUv3 Nested device
- hw/arm/smmuv3: Add initial support for SMMUv3 Nested device
- hw/arm/virt: Add an SMMU_IO_LEN macro
- hw/pci-host/gpex: [needs kernel fix] Allow to generate preserve boot config DSM #5
- tests/data/acpi: Update DSDT acpi tables
- acpi/gpex: Fix PCI Express Slot Information function 0 returned value
- tests/qtest: Allow DSDT acpi tables to change
- hw/arm/smmuv3: Forward cache invalidate commands via iommufd
- hw/arm/smmu-common: Replace smmu_iommu_mr with smmu_find_sdev
- hw/arm/smmuv3: Add missing STE invalidation
- hw/arm/smmuv3: Add smmu_dev_install_nested_ste() for CFGI_STE
- hw/arm/smmuv3: Check idr registers for STE_S1CDMAX and STE_S1STALLD
- hw/arm/smmuv3: Read host SMMU device info
- hw/arm/smmuv3: Ignore IOMMU_NOTIFIER_MAP for nested-smmuv3
- hw/arm/smmu-common: Return sysmem if stage-1 is bypassed
- hw/arm/smmu-common: Add iommufd helpers
- hw/arm/smmu-common: Add set/unset_iommu_device callback
- hw/arm/smmu-common: Extract smmu_get_sbus and smmu_get_sdev helpers
- hw/arm/smmu-common: Bypass emulated IOTLB for a nested SMMU
- hw/arm/smmu-common: Add a nested flag to SMMUState
- backends/iommufd: Introduce iommufd_viommu_invalidate_cache
- backends/iommufd: Introduce iommufd_vdev_alloc
- backends/iommufd: Introduce iommufd_backend_alloc_viommu
- vfio/iommufd: Implement [at|de]tach_hwpt handlers
- vfio/iommufd: Implement HostIOMMUDeviceClass::realize_late() handler
- HostIOMMUDevice: Introduce realize_late callback
- vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
- backends/iommufd: Add helpers for invalidating user-managed HWPT
- Update iommufd.h header for vSVA
- vfio/common: Allow disabling device dirty page tracking
- vfio/migration: Don't block migration device dirty tracking is unsupported
- vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support
- vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support
- vfio/iommufd: Probe and request hwpt dirty tracking capability
- vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device()
- vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps
- vfio/{iommufd,container}: Remove caps::aw_bits
- HostIOMMUDevice: Store the VFIO/VDPA agent
- vfio/iommufd: Introduce auto domain creation
- vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev
- vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev
- vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt()
- backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities
- vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev
- vfio/pci: Extract mdev check into an helper
- intel_iommu: Check compatibility with host IOMMU capabilities
- intel_iommu: Implement [set|unset]_iommu_device() callbacks
- intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
- vfio/pci: Pass HostIOMMUDevice to vIOMMU
- hw/pci: Introduce pci_device_[set|unset]_iommu_device()
- hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
- vfio: Create host IOMMU device instance
- backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
- vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
- vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
- backends/iommufd: Introduce helper function iommufd_backend_get_device_info()
- vfio/container: Implement HostIOMMUDeviceClass::realize() handler
- range: Introduce range_get_last_bit()
- backends/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD[_VFIO] devices
- vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
- backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
- backends: Introduce HostIOMMUDevice abstract
- vfio/iommufd: Remove CONFIG_IOMMUFD usage
- vfio/spapr: Extend VFIOIOMMUOps with a release handler
- vfio/spapr: Only compile sPAPR IOMMU support when needed
- vfio/iommufd: Introduce a VFIOIOMMU iommufd QOM interface
- vfio/spapr: Introduce a sPAPR VFIOIOMMU QOM interface
- vfio/container: Intoduce a new VFIOIOMMUClass::setup handler
- vfio/container: Introduce a VFIOIOMMU legacy QOM interface
- vfio/container: Introduce a VFIOIOMMU QOM interface
- vfio/container: Initialize VFIOIOMMUOps under vfio_init_container()
- vfio/container: Introduce vfio_legacy_setup() for further cleanups
- docs/devel: Add VFIO iommufd backend documentation
- vfio: Introduce a helper function to initialize VFIODevice
- vfio/ccw: Move VFIODevice initializations in vfio_ccw_instance_init
- vfio/ap: Move VFIODevice initializations in vfio_ap_instance_init
- vfio/platform: Move VFIODevice initializations in vfio_platform_instance_init
- vfio/pci: Move VFIODevice initializations in vfio_instance_init
- hw/i386: Activate IOMMUFD for q35 machines
- kconfig: Activate IOMMUFD for s390x machines
- hw/arm: Activate IOMMUFD for virt machines
- vfio: Make VFIOContainerBase poiner parameter const in VFIOIOMMUOps callbacks
- vfio/ccw: Make vfio cdev pre-openable by passing a file handle
- vfio/ccw: Allow the selection of a given iommu backend
- vfio/ap: Make vfio cdev pre-openable by passing a file handle
- vfio/ap: Allow the selection of a given iommu backend
- vfio/platform: Make vfio cdev pre-openable by passing a file handle
- vfio/platform: Allow the selection of a given iommu backend
- vfio/pci: Make vfio cdev pre-openable by passing a file handle
- vfio/pci: Allow the selection of a given iommu backend
- vfio/iommufd: Enable pci hot reset through iommufd cdev interface
- vfio/pci: Introduce a vfio pci hot reset interface
- vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_info
- vfio/iommufd: Add support for iova_ranges and pgsizes
- vfio/iommufd: Relax assert check for iommufd backend
- vfio/iommufd: Implement the iommufd backend
- vfio/common: return early if space isn't empty
- util/char_dev: Add open_cdev()
- backends/iommufd: Introduce the iommufd object
- vfio/spapr: Move hostwin_list into spapr container
- vfio/spapr: Move prereg_listener into spapr container
- vfio/spapr: switch to spapr IOMMU BE add/del_section_window
- vfio/spapr: Introduce spapr backend and target interface
- vfio/container: Implement attach/detach_device
- vfio/container: Move iova_ranges to base container
- vfio/container: Move dirty_pgsizes and max_dirty_bitmap_size to base container
- vfio/container: Move listener to base container
- vfio/container: Move vrdl_list to base container
- vfio/container: Move pgsizes and dma_max_mappings to base container
- vfio/container: Convert functions to base container
- vfio/container: Move per container device list in base container
- vfio/container: Switch to IOMMU BE set_dirty_page_tracking/query_dirty_bitmap API
- vfio/container: Move space field to base container
- vfio/common: Move giommu_list in base container
- vfio/common: Introduce vfio_container_init/destroy helper
- vfio/container: Switch to dma_map|unmap API
- vfio/container: Introduce a empty VFIOIOMMUOps
- vfio: Introduce base object for VFIOContainer and targeted interface
- cryptodev: Fix error handling in cryptodev_lkcf_execute_task()
- hw/xen: Fix xen_bus_realize() error handling
- hw/misc/aspeed_hace: Fix buffer overflow in has_padding function
- target/s390x: Fix a typo in s390_cpu_class_init()
- hw/sd/sdhci: free irq on exit
- hw/ufs: free irq on exit
- hw/pci-host/designware: Fix ATU_UPPER_TARGET register access
- target/i386: Make invtsc migratable when user sets tsc-khz explicitly
- target/i386: Construct CPUID 2 as stateful iff times > 1
- target/i386: Enable fdp-excptn-only and zero-fcs-fds
- target/i386: Don't construct a all-zero entry for CPUID[0xD 0x3f]
- i386/cpuid: Remove subleaf constraint on CPUID leaf 1F
- target/i386: pass X86CPU to x86_cpu_get_supported_feature_word
- target/i386: Raise the highest index value used for any VMCS encoding
- target/i386: Add VMX control bits for nested FRED support
- target/i386: Delete duplicated macro definition CR4_FRED_MASK
- target/i386: Add get/set/migrate support for FRED MSRs
- target/i386: enumerate VMX nested-exception support
- vmxcap: add support for VMX FRED controls
- target/i386: mark CR4.FRED not reserved
- target/i386: add support for FRED in CPUID enumeration
- target/i386: fix feature dependency for WAITPKG
- target/i386: Add more features enumerated by CPUID.7.2.EDX
- net: fix build when libbpf is disabled, but libxdp is enabled
- hw/nvme: fix invalid endian conversion
- hw/nvme: fix invalid check on mcl
- backends/cryptodev: Do not ignore throttle/backends Errors
- backends/cryptodev: Do not abort for invalid session ID
- virtcca: add kvm isolation when get tmi version.
- qga: Don't daemonize before channel is initialized
- qga: Add log to guest-fsfreeze-thaw command
- backends: VirtCCA: cvm_gpa_start supports both 1GB and 3GB
- BUGFIX: Enforce isolation for virtcca_shared_hugepage
- arm: VirtCCA: qemu CoDA support UEFI boot
- arm: VirtCCA: Compatibility with older versions of TMM and the kernel
- arm: VirtCCA: qemu uefi boot support kae
- arm: VirtCCA: CVM support UEFI boot

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
(cherry picked from commit 85fd7a435d8203dde56fedc4c8f500e41faf132c)
2025-05-14 15:07:14 +08:00

From 008d4e37fe67c7f81920efe862352c4b1f3cd1b0 Mon Sep 17 00:00:00 2001
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Sat, 11 Jan 2025 10:52:47 +0800
Subject: [PATCH] vfio/pci: Make vfio cdev pre-openable by passing a file
handle
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass the FD to QEMU. This way QEMU never needs
to have privilege to open a VFIO or iommu cdev node.

Together with the earlier support for pre-opening the /dev/iommu device,
we now have full support for passing a vfio device to an unprivileged
QEMU from a management tool. This mode is no longer considered for the
legacy backend, so let's remove the "TODO" comment.

Add helper functions vfio_device_set_fd() and vfio_device_get_name()
to set the fd and get the device name; they will also be used by other
vfio devices.

There is no easy way to check whether a device is an mdev with FD
passing, so fail the x-balloon-allowed check unconditionally in this
case.

There is also no easy way to get the BDF as a name with FD passing, so
we fake a name of the form VFIO_FD[fd].

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
hw/vfio/helpers.c | 43 +++++++++++++++++++++++++++++++++++
hw/vfio/iommufd.c | 12 ++++++----
hw/vfio/pci.c | 28 +++++++++++++----------
include/hw/vfio/vfio-common.h | 4 ++++
4 files changed, 71 insertions(+), 16 deletions(-)
diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c
index 168847e7c5..3592c3d54e 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -27,6 +27,7 @@
#include "trace.h"
#include "qapi/error.h"
#include "qemu/error-report.h"
+#include "monitor/monitor.h"
/*
* Common VFIO interrupt disable
@@ -609,3 +610,45 @@ bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type)
return ret;
}
+
+int vfio_device_get_name(VFIODevice *vbasedev, Error **errp)
+{
+ struct stat st;
+
+ if (vbasedev->fd < 0) {
+ if (stat(vbasedev->sysfsdev, &st) < 0) {
+ error_setg_errno(errp, errno, "no such host device");
+ error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->sysfsdev);
+ return -errno;
+ }
+ /* User may specify a name, e.g: VFIO platform device */
+ if (!vbasedev->name) {
+ vbasedev->name = g_path_get_basename(vbasedev->sysfsdev);
+ }
+ } else {
+ if (!vbasedev->iommufd) {
+ error_setg(errp, "Use FD passing only with iommufd backend");
+ return -EINVAL;
+ }
+ /*
+ * Give a name with fd so any function printing out vbasedev->name
+ * will not break.
+ */
+ if (!vbasedev->name) {
+ vbasedev->name = g_strdup_printf("VFIO_FD%d", vbasedev->fd);
+ }
+ }
+
+ return 0;
+}
+
+void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp)
+{
+ int fd = monitor_fd_param(monitor_cur(), str, errp);
+
+ if (fd < 0) {
+ error_prepend(errp, "Could not parse remote object fd %s:", str);
+ return;
+ }
+ vbasedev->fd = fd;
+}
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 6e53e013ef..5accd26484 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -320,11 +320,15 @@ static int iommufd_cdev_attach(const char *name, VFIODevice *vbasedev,
uint32_t ioas_id;
Error *err = NULL;
- devfd = iommufd_cdev_getfd(vbasedev->sysfsdev, errp);
- if (devfd < 0) {
- return devfd;
+ if (vbasedev->fd < 0) {
+ devfd = iommufd_cdev_getfd(vbasedev->sysfsdev, errp);
+ if (devfd < 0) {
+ return devfd;
+ }
+ vbasedev->fd = devfd;
+ } else {
+ devfd = vbasedev->fd;
}
- vbasedev->fd = devfd;
ret = iommufd_cdev_connect_and_bind(vbasedev, errp);
if (ret) {
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c5984b0598..445d58c8e5 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2944,17 +2944,19 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
VFIODevice *vbasedev = &vdev->vbasedev;
char *tmp, *subsys;
Error *err = NULL;
- struct stat st;
int i, ret;
bool is_mdev;
char uuid[UUID_STR_LEN];
char *name;
- if (!vbasedev->sysfsdev) {
+ if (vbasedev->fd < 0 && !vbasedev->sysfsdev) {
if (!(~vdev->host.domain || ~vdev->host.bus ||
~vdev->host.slot || ~vdev->host.function)) {
error_setg(errp, "No provided host device");
error_append_hint(errp, "Use -device vfio-pci,host=DDDD:BB:DD.F "
+#ifdef CONFIG_IOMMUFD
+ "or -device vfio-pci,fd=DEVICE_FD "
+#endif
"or -device vfio-pci,sysfsdev=PATH_TO_DEVICE\n");
return;
}
@@ -2964,13 +2966,9 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
vdev->host.slot, vdev->host.function);
}
- if (stat(vbasedev->sysfsdev, &st) < 0) {
- error_setg_errno(errp, errno, "no such host device");
- error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->sysfsdev);
+ if (vfio_device_get_name(vbasedev, errp) < 0) {
return;
}
-
- vbasedev->name = g_path_get_basename(vbasedev->sysfsdev);
vbasedev->ops = &vfio_pci_ops;
vbasedev->type = VFIO_DEVICE_TYPE_PCI;
vbasedev->dev = DEVICE(vdev);
@@ -3330,6 +3328,7 @@ static void vfio_instance_init(Object *obj)
vdev->host.bus = ~0U;
vdev->host.slot = ~0U;
vdev->host.function = ~0U;
+ vdev->vbasedev.fd = -1;
vdev->nv_gpudirect_clique = 0xFF;
@@ -3383,11 +3382,6 @@ static Property vfio_pci_dev_properties[] = {
qdev_prop_nv_gpudirect_clique, uint8_t),
DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo,
OFF_AUTOPCIBAR_OFF),
- /*
- * TODO - support passed fds... is this necessary?
- * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
- * DEFINE_PROP_STRING("vfiogroupfd, VFIOPCIDevice, vfiogroupfd_name),
- */
#ifdef CONFIG_IOMMUFD
DEFINE_PROP_LINK("iommufd", VFIOPCIDevice, vbasedev.iommufd,
TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
@@ -3395,6 +3389,13 @@ static Property vfio_pci_dev_properties[] = {
DEFINE_PROP_END_OF_LIST(),
};
+#ifdef CONFIG_IOMMUFD
+static void vfio_pci_set_fd(Object *obj, const char *str, Error **errp)
+{
+ vfio_device_set_fd(&VFIO_PCI(obj)->vbasedev, str, errp);
+}
+#endif
+
static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
@@ -3402,6 +3403,9 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
dc->reset = vfio_pci_reset;
device_class_set_props(dc, vfio_pci_dev_properties);
+#ifdef CONFIG_IOMMUFD
+ object_class_property_add_str(klass, "fd", NULL, vfio_pci_set_fd);
+#endif
dc->desc = "VFIO-based PCI device assignment";
set_bit(DEVICE_CATEGORY_MISC, dc->categories);
pdc->realize = vfio_realize;
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 9b9fd7b461..5f35f2900b 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -265,4 +265,8 @@ int vfio_devices_query_dirty_bitmap(VFIOContainerBase *bcontainer,
hwaddr size);
int vfio_get_dirty_bitmap(VFIOContainerBase *bcontainer, uint64_t iova,
uint64_t size, ram_addr_t ram_addr);
+
+/* Returns 0 on success, or a negative errno. */
+int vfio_device_get_name(VFIODevice *vbasedev, Error **errp);
+void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp);
#endif /* HW_VFIO_VFIO_COMMON_H */
--
2.41.0.windows.1