Bugfix for wrong timing of modifying ibv_qp state to err

Currently the QPC state in HW is modified inside the spinlock critical
section, but the ibv_qp state is modified outside it. This leaves a short
window in which the QPC state is already err while the ibv_qp state is
still RTS. WQEs posted during this window are accepted because the ibv_qp
still appears to be in RTS, but are then dropped by the err-state HW, and
no flush CQEs are generated.

To fix this problem, modify both the QPC state in HW and the ibv_qp state
to err inside the spinlock critical section.
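
The window described above can be modelled with a small standalone sketch.
Everything in it is a hypothetical stand-in and not the libhns code:
struct mock_qp, mock_hw_modify() and post_send() are invented for
illustration, a pthread mutex plays the role of the driver spinlock, and the
"flushed" counter only assumes that a post-send which sees the err state
takes a flush path. The old ordering is split into two steps so the window
can be shown deterministically, without threads.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the driver's types; not the libhns definitions. */
enum qp_state { QPS_RTS, QPS_ERR };

struct mock_qp {
	pthread_mutex_t lock;   /* plays the role of the sq/rq spinlocks */
	enum qp_state hw_state; /* models the QPC state held by hardware */
	enum qp_state sw_state; /* models ibv_qp->state seen by post-send */
	int dropped;            /* WQEs silently dropped by err-state HW */
	int flushed;            /* WQEs handled through the (assumed) flush path */
};

/* Models the hardware command; returns 0 on success. */
static int mock_hw_modify(struct mock_qp *qp, enum qp_state s)
{
	qp->hw_state = s;
	return 0;
}

/* Old ordering, step 1: only the HW state changes under the lock. */
static int racy_modify_hw_only(struct mock_qp *qp)
{
	int ret;

	pthread_mutex_lock(&qp->lock);
	ret = mock_hw_modify(qp, QPS_ERR);
	pthread_mutex_unlock(&qp->lock);
	return ret;
}

/* Old ordering, step 2: the software state is updated after the unlock. */
static void racy_modify_sw_later(struct mock_qp *qp)
{
	qp->sw_state = QPS_ERR;
}

/* Fixed ordering: both states change before the lock is released. */
static int fixed_modify_to_err(struct mock_qp *qp)
{
	int ret;

	pthread_mutex_lock(&qp->lock);
	ret = mock_hw_modify(qp, QPS_ERR);
	if (!ret)
		qp->sw_state = QPS_ERR;
	pthread_mutex_unlock(&qp->lock);
	return ret;
}

/* Post-send path: decides under the same lock, based on the software state. */
static void post_send(struct mock_qp *qp)
{
	pthread_mutex_lock(&qp->lock);
	if (qp->sw_state == QPS_ERR)
		qp->flushed++;  /* software knows about err: flush path */
	else if (qp->hw_state == QPS_ERR)
		qp->dropped++;  /* the window: SW says RTS, HW drops the WQE */
	pthread_mutex_unlock(&qp->lock);
}

/* Usage: a post-send between the two racy steps is dropped; with the
 * fixed ordering no such interleaving exists. */
static void demo(void)
{
	struct mock_qp a = { .lock = PTHREAD_MUTEX_INITIALIZER };
	struct mock_qp b = { .lock = PTHREAD_MUTEX_INITIALIZER };

	racy_modify_hw_only(&a);
	post_send(&a);              /* lands inside the window */
	racy_modify_sw_later(&a);
	assert(a.dropped == 1 && a.flushed == 0);

	fixed_modify_to_err(&b);
	post_send(&b);              /* can only run before or after the change */
	assert(b.dropped == 0 && b.flushed == 1);
}
```

Because post_send() takes the same lock, under the fixed ordering it can only
observe the QP before the transition (both states RTS) or after it (both
states err), never the mixed state.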

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
(cherry picked from commit e221c2d5c69b00dc1f1aca09781cc3ebed23b5f3)
Ran Zhou 2023-12-01 17:51:21 +08:00 committed by openeuler-sync-bot
parent d9b79ed5fe
commit 3a00a8a05a
2 changed files with 51 additions and 1 deletions

@ -0,0 +1,43 @@
From 324cd24a22256d964689bf528b643ae06d5a4e58 Mon Sep 17 00:00:00 2001
From: Yangyang Li <liyangyang20@huawei.com>
Date: Fri, 1 Dec 2023 10:43:23 +0800
Subject: [PATCH] libhns: Bugfix for wrong timing of modifying ibv_qp state to
err
driver inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/I8L4YU
--------------------------------------------------------------------------
Currently the QPC state in HW is modified inside the critical section of
spinlock but the ibv_qp state is modified outside. There will be a short
period when QPC state has been modified to err with ibv_qp state still
remaining RTS. WQEs during this period will still be post-send by RTS-state
ibv_qp but then dropped by err-state HW with no flush CQEs generated.
To fix this problem, the QPC state in HW and ibv_qp state should be both
modified to err inside the critical section of spinlock.
Fixes: f1a80cc3dfe2 ("libhns: Bugfix for flush cqe in case multi-process")
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
---
providers/hns/hns_roce_u_hw_v2.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
index 2fb738d..68d7110 100644
--- a/providers/hns/hns_roce_u_hw_v2.c
+++ b/providers/hns/hns_roce_u_hw_v2.c
@@ -1936,6 +1936,8 @@ static int hns_roce_u_v2_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
sizeof(resp_ex));
if (flag) {
+ if (!ret)
+ qp->state = IBV_QPS_ERR;
hns_roce_spin_unlock(&hr_qp->sq.hr_lock);
hns_roce_spin_unlock(&hr_qp->rq.hr_lock);
}
--
2.25.1

@ -1,6 +1,6 @@
 Name: rdma-core
 Version: 41.0
-Release: 21
+Release: 22
 Summary: RDMA core userspace libraries and daemons
 License: GPLv2 or BSD
 Url: https://github.com/linux-rdma/rdma-core
@ -81,6 +81,7 @@ patch72: 0072-libhns-Add-input-parameter-check-for-hnsdv_query_dev.patch
 patch73: 0073-libhns-Fix-uninitialized-qp-attr-when-flush-cqe.patch
 patch74: 0074-libhns-Fix-possible-overflow-in-cq-clean.patch
 patch75: 0075-libhns-Fix-unnecessary-dca-memory-detach.patch
+patch76: 0076-libhns-Bugfix-for-wrong-timing-of-modifying-ibv_qp-s.patch
 BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
 BuildRequires: pkgconfig(libnl-route-3.0) valgrind-devel systemd systemd-devel
@ -328,6 +329,12 @@ fi
 %{_mandir}/*
 %changelog
+* Fri Dec 1 2023 Ran Zhou <zhouran10@h-partners.com> - 41.0-22
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Bugfix for wrong timing of modifying ibv_qp state to err
 * Mon Nov 27 2023 Ran Zhou <zhouran10@h-partners.com> - 41.0-21
 - Type: bugfix
 - ID: NA