qemu/hw-vfio-pci-quirks-Support-alternate-offset-for-GPUD.patch
Jiabo Feng 0b0e518312 QEMU update to version 6.2.0-82(master)
- thread-pool: optimize scheduling of completion bottom half
- migration/rdma: zore out head.repeat to make the error more clear
- vhost-user-fs: Back up vqs before cleaning up vhost_dev
- hw/vfio/pci-quirks: Sanitize capability pointer
- hw/vfio/pci-quirks: Support alternate offset for GPUDirect Cliques
- replay: fix event queue flush for qemu shutdown
- hw/net: Fix read of uninitialized memory in ftgmac100
- target/ppc: Fix tlbie
- target/i386: fix INVD vmexit
- qtest/npcm7xx_pwm-test: Fix memory leak in mft_qom_set
- aio-posix: zero out io_uring sqe user_data

Signed-off-by: Jiabo Feng <fengjiabo1@huawei.com>
2023-10-30 16:56:48 +08:00

96 lines
3.6 KiB
Diff

From d672e2f137933b26bd9b3488a873830435eadba5 Mon Sep 17 00:00:00 2001
From: tangzhongrui <tangzhongrui@cmss.chinamobile.com>
Date: Thu, 3 Aug 2023 15:10:16 +0800
Subject: [PATCH] hw/vfio/pci-quirks: Support alternate offset for GPUDirect
Cliques
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
NVIDIA Turing and newer GPUs implement the MSI-X capability at the offset
previously reserved for use by hypervisors to implement the GPUDirect
Cliques capability. A revised specification provides an alternate
location. Add a config space walk to the quirk to check for conflicts,
allowing us to fall back to the new location or generate an error at the
quirk setup rather than when the real conflicting capability is added
should there be no available location.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhongrui Tang <tangzhongrui_yewu@cmss.chinamobile.com>
---
hw/vfio/pci-quirks.c | 41 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 40 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 1222ccff0b..7a8e6efcdc 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1677,6 +1677,9 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev)
* +---------------------------------+---------------------------------+
*
* https://lists.gnu.org/archive/html/qemu-devel/2017-08/pdfUda5iEpgOS.pdf
+ *
+ * Specification for Turning and later GPU architectures:
+ * https://lists.gnu.org/archive/html/qemu-devel/2023-06/pdf142OR4O4c2.pdf
*/
static void get_nv_gpudirect_clique_id(Object *obj, Visitor *v,
const char *name, void *opaque,
@@ -1717,7 +1720,9 @@ const PropertyInfo qdev_prop_nv_gpudirect_clique = {
static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
{
PCIDevice *pdev = &vdev->pdev;
- int ret, pos = 0xC8;
+ int ret, pos;
+ bool c8_conflict = false, d4_conflict = false;
+ uint8_t tmp;
if (vdev->nv_gpudirect_clique == 0xFF) {
return 0;
@@ -1734,6 +1739,40 @@ static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
return -EINVAL;
}
+ /*
+ * Per the updated specification above, it's recommended to use offset
+ * D4h for Turing and later GPU architectures due to a conflict of the
+ * MSI-X capability at C8h. We don't know how to determine the GPU
+ * architecture, instead we walk the capability chain to mark conflicts
+ * and choose one or error based on the result.
+ *
+ * NB. Cap list head in pdev->config is already cleared, read from device.
+ */
+ ret = pread(vdev->vbasedev.fd, &tmp, 1,
+ vdev->config_offset + PCI_CAPABILITY_LIST);
+ if (ret != 1 || !tmp) {
+ error_setg(errp, "NVIDIA GPUDirect Clique ID: error getting cap list");
+ return -EINVAL;
+ }
+
+ do {
+ if (tmp == 0xC8) {
+ c8_conflict = true;
+ } else if (tmp == 0xD4) {
+ d4_conflict = true;
+ }
+ tmp = pdev->config[tmp + PCI_CAP_LIST_NEXT];
+ } while (tmp);
+
+ if (!c8_conflict) {
+ pos = 0xC8;
+ } else if (!d4_conflict) {
+ pos = 0xD4;
+ } else {
+ error_setg(errp, "NVIDIA GPUDirect Clique ID: invalid config space");
+ return -EINVAL;
+ }
+
ret = pci_add_capability(pdev, PCI_CAP_ID_VNDR, pos, 8, errp);
if (ret < 0) {
error_prepend(errp, "Failed to add NVIDIA GPUDirect cap: ");
--
2.41.0.windows.1