Bugfix for wrong timing of modifying ibv_qp state to err
Currently the QPC state in HW is modified inside the spinlock critical section, but the ibv_qp state is modified outside it. This leaves a short window in which the QPC state is already err while the ibv_qp state is still RTS. WQEs posted during this window are still accepted by the RTS-state ibv_qp but are then dropped by the err-state HW, and no flush CQEs are generated.

To fix this, modify both the QPC state in HW and the ibv_qp state to err inside the spinlock critical section.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
(cherry picked from commit e221c2d5c69b00dc1f1aca09781cc3ebed23b5f3)
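The window described above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the libhns implementation: `struct mock_qp`, `mock_post_send()` and the `buggy_*`/`fixed_*` helpers are hypothetical names, a pthread mutex stands in for the SQ/RQ spinlocks, and the model assumes the post-send path takes a flush path whenever it sees the software state already in err.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real libhns structures. */
enum mock_state { MOCK_RTS, MOCK_ERR };

struct mock_qp {
	pthread_mutex_t lock;     /* plays the role of sq.hr_lock / rq.hr_lock */
	enum mock_state hw_state; /* models the QPC state held by the HW */
	enum mock_state sw_state; /* models ibv_qp->state seen by post-send */
	int sent;                 /* WQEs handled normally */
	int flushed;              /* WQEs that got a flush CQE */
	int dropped;              /* WQEs lost without any CQE */
};

static void mock_qp_init(struct mock_qp *qp)
{
	pthread_mutex_init(&qp->lock, NULL);
	qp->hw_state = MOCK_RTS;
	qp->sw_state = MOCK_RTS;
	qp->sent = qp->flushed = qp->dropped = 0;
}

/* Buggy ordering, step 1: only the HW state changes under the lock. */
static void buggy_modify_hw_only(struct mock_qp *qp)
{
	pthread_mutex_lock(&qp->lock);
	qp->hw_state = MOCK_ERR;
	pthread_mutex_unlock(&qp->lock);
}

/* Buggy ordering, step 2: the SW state is updated after the unlock. */
static void buggy_modify_sw_later(struct mock_qp *qp)
{
	qp->sw_state = MOCK_ERR;
}

/* Fixed ordering: both states change inside one critical section. */
static void fixed_modify_to_err(struct mock_qp *qp)
{
	pthread_mutex_lock(&qp->lock);
	qp->hw_state = MOCK_ERR;
	qp->sw_state = MOCK_ERR;
	pthread_mutex_unlock(&qp->lock);
}

/* Modeled post-send: the decision is based on the SW state only. */
static void mock_post_send(struct mock_qp *qp)
{
	assert(qp != NULL);
	pthread_mutex_lock(&qp->lock);
	if (qp->sw_state == MOCK_ERR)
		qp->flushed++;  /* assumed flush path: a flush CQE is produced */
	else if (qp->hw_state == MOCK_ERR)
		qp->dropped++;  /* SW still says RTS, HW silently drops the WQE */
	else
		qp->sent++;
	pthread_mutex_unlock(&qp->lock);
}
```

Calling `mock_post_send()` between `buggy_modify_hw_only()` and `buggy_modify_sw_later()` increments `dropped`, which is the lost-WQE case. After `fixed_modify_to_err()` the same call increments `flushed` instead, because no caller can take the lock and observe the two states disagreeing.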
parent d9b79ed5fe
commit 3a00a8a05a
@@ -0,0 +1,43 @@
+From 324cd24a22256d964689bf528b643ae06d5a4e58 Mon Sep 17 00:00:00 2001
+From: Yangyang Li <liyangyang20@huawei.com>
+Date: Fri, 1 Dec 2023 10:43:23 +0800
+Subject: [PATCH] libhns: Bugfix for wrong timing of modifying ibv_qp state to
+ err
+
+driver inclusion
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/I8L4YU
+
+--------------------------------------------------------------------------
+
+Currently the QPC state in HW is modified inside the critical section of
+spinlock but the ibv_qp state is modified outside. There will be a short
+period when QPC state has been modified to err with ibv_qp state still
+remaining RTS. WQEs during this period will still be post-send by RTS-state
+ibv_qp but then dropped by err-state HW with no flush CQEs generated.
+
+To fix this problem, the QPC state in HW and ibv_qp state should be both
+modified to err inside the critical section of spinlock.
+
+Fixes: f1a80cc3dfe2 ("libhns: Bugfix for flush cqe in case multi-process")
+Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 2fb738d..68d7110 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -1936,6 +1936,8 @@ static int hns_roce_u_v2_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ 				   sizeof(resp_ex));
+
+ 	if (flag) {
++		if (!ret)
++			qp->state = IBV_QPS_ERR;
+ 		hns_roce_spin_unlock(&hr_qp->sq.hr_lock);
+ 		hns_roce_spin_unlock(&hr_qp->rq.hr_lock);
+ 	}
+--
+2.25.1
+
@@ -1,6 +1,6 @@
 Name: rdma-core
 Version: 41.0
-Release: 21
+Release: 22
 Summary: RDMA core userspace libraries and daemons
 License: GPLv2 or BSD
 Url: https://github.com/linux-rdma/rdma-core
@@ -81,6 +81,7 @@ patch72: 0072-libhns-Add-input-parameter-check-for-hnsdv_query_dev.patch
 patch73: 0073-libhns-Fix-uninitialized-qp-attr-when-flush-cqe.patch
 patch74: 0074-libhns-Fix-possible-overflow-in-cq-clean.patch
 patch75: 0075-libhns-Fix-unnecessary-dca-memory-detach.patch
+patch76: 0076-libhns-Bugfix-for-wrong-timing-of-modifying-ibv_qp-s.patch

 BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
 BuildRequires: pkgconfig(libnl-route-3.0) valgrind-devel systemd systemd-devel
@@ -328,6 +329,12 @@ fi
 %{_mandir}/*

 %changelog
+* Fri Dec 1 2023 Ran Zhou <zhouran10@h-partners.com> - 41.0-22
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Bugfix for wrong timing of modifying ibv_qp state to err
+
 * Mon Nov 27 2023 Ran Zhou <zhouran10@h-partners.com> - 41.0-21
 - Type: bugfix
 - ID: NA