60 Commits

Author SHA1 Message Date
Xin Tian
cfb78c8d80 libxscale: update to version 2412GA
new feature:
- support diamond products
- support ibv_wr apis
- support extended CQ poll apis

bugfix:
- imm data endian error

Signed-off-by: Xin Tian <tianx@yunsilicon.com>
(cherry picked from commit d12a87881cb9254bb46cfdde6131428a139f15bb)
2025-05-09 15:10:36 +08:00
Xinghai Cen
c35fab9925 libhns: Bugfixes and one debug improvement
The last commit was found when I created an XRC SRQ in
lock-free mode but failed to destroy it because of the
refcnt check added in the previous commit.

The failure was because the PAD was acquired through
ibv_srq->pd in destroy_srq(), while ibv_srq->pd wasn't
assigned when the SRQ was created by ibv_create_srq_ex().
So let's assign ibv_srq->pd in the common ibv_icmd_create_srq(),
so that drivers can get the correct pd no matter
which api the SRQ is created by.

Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com>
(cherry picked from commit 3ac30fc125c7cff122f21ff8593294060c92429f)
2025-04-29 09:55:30 +08:00
Xinghai Cen
cfd3a00018 libhns: Add support for LTTng tracing
Add support for LTTng tracing. For now it is used for post_send, post_recv and poll_cq.

Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com>
(cherry picked from commit 76d4645e3b8a9e57b723e3f7297bda62e934d6f2)
2025-04-25 15:53:05 +08:00
Xinghai Cen
bf019764ad libhns: Cleanup and Bugfixes
Cleanup and Bugfixes:
        0053-libhns-Clean-up-data-type-issues.patch
        0054-libhns-Fix-wrong-max-inline-data-value.patch
        0055-libhns-Fix-wrong-order-of-spin-unlock-in-modify-qp.patch

Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com>
2025-04-23 15:04:31 +08:00
Xinghai Cen
c72f2000d2 libxscale: Match dev by vid and did
Match dev by vid and did.

Signed-off-by: Xin Tian <tianx@yunsilicon.com>
2025-04-23 15:01:36 +08:00
Xinghai Cen
dff5546a58 libhns: Fixes some bugs for libhns
driver inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IB66RT

------------------------------------------------------------------

Changes to be committed:
      modified:   0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
      new file:   0045-libhns-fix-incorrectly-using-fixed-pagesize.patch
      new file:   0046-libhns-fix-missing-new-IO-support-for-DCA.patch

Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com>
2025-03-31 18:14:11 +08:00
Xin Tian
e30985bd56 libxscale: Add Yunsilicon User Space RDMA Driver
Introduce xscale provider for Yunsilicon devices.

Signed-off-by: Xin Tian <tianx@yunsilicon.com>
2025-03-04 10:11:56 +08:00
Xinghai Cen
101e41698a libhns: Fix missing fields for SRQ WC
mainline inclusion:
libhns: Fix missing fields for SRQ WC

Modify the information of some patch Fixes

Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com>
2025-01-23 14:41:44 +08:00
Funda Wang
3c9e565e44 Try sync with master codebase
(cherry picked from commit 6ff361b3e9766a469223a94dc4aad32a7dddc20e)
2025-01-10 10:58:41 +08:00
Xinghai Cen
b6ac52e2e9 libhns: Fixed several bugs in libhns to openEuler-24.03-LTS
Fixed several bugs in libhns:
libhns: Add error logs to help diagnosis
libhns: Fix coredump during QP destruction when send_cq == recv_cq
libhns: Fix memory leakage when DCA is enabled
libhns: Fix the exception branch of wr_start() is not locked
libhns: Fix reference to uninitialized cq pointer
libhns: Fix out-of-order issue of requester when setting FENCE
2025-01-09 20:02:07 +08:00
dufuhang
e785d29dff Fix the stride calculation for MSN/PSN area
[ Upstream commit 65197a4 ]

Library expects ilog2 of psn_size while calculating the stride.
ilog32 returns log2(v) + 1 and the calculation fails since
the psn size is a power of 2 value. Fix by passing psn_size - 1.

Fixes: 0a0e0d0 ("bnxt_re/lib: Adds MSN table capability for Gen P7 adapters")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Nicolas Morey <nmorey@suse.com>
(cherry picked from commit ac4be7ab029ef79b45c1055fb7504643fd129194)
2024-07-24 10:47:32 +08:00
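The off-by-one described in e785d29dff can be reproduced in isolation. The sketch below assumes only what the commit message states: ilog32(v) returns log2(v) + 1 for a power-of-two v. `ilog32_sketch` is a local stand-in written for this example, not the library routine, and `psn_stride_log2` is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Local stand-in for ilog32(): the number of bits needed to hold v,
 * i.e. floor(log2(v)) + 1 for v > 0 and 0 for v == 0.
 */
static int ilog32_sketch(uint32_t v)
{
    int bits = 0;

    while (v) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/*
 * Hypothetical helper: log2 of a power-of-two PSN entry size.
 * Passing psn_size - 1 compensates for the "+ 1" in ilog32().
 */
static int psn_stride_log2(uint32_t psn_size)
{
    return ilog32_sketch(psn_size - 1);
}
```

For psn_size = 8, ilog32_sketch(8) gives 4, while psn_stride_log2(8) gives the intended 3.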
zhangyaqi
658eb46f18 Fix an overflow bug in qsort comparison function
(cherry picked from commit 7bbc9554fe6ed4bf441bbcde2c1b72b04ce88d8c)
2024-07-24 09:38:17 +08:00
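The message for 658eb46f18 does not show the code it changed. A common form of this class of bug is a comparator that returns `a - b`, which overflows when the operands are far apart. The sketch below shows an overflow-free comparator under that assumption; `cmp_int_safe` and `sort_ints` are hypothetical names, and the element type is assumed to be `int`.

```c
#include <stdlib.h>

/*
 * Overflow-free comparator: each comparison yields 0 or 1, so the
 * difference is always -1, 0 or 1 and never wraps, unlike "a - b".
 */
static int cmp_int_safe(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;

    return (a > b) - (a < b);
}

/* Sort an int array in ascending order with the safe comparator. */
static void sort_ints(int *v, size_t n)
{
    qsort(v, n, sizeof(*v), cmp_int_safe);
}
```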
Yinsist
505cb66c30 first check if Valgrind supports the architecture
2024-05-12 07:43:31 +00:00
Juan Zhou
fea67b4274 Some bugfixes and cleanups
1#2. Replace private patch
3. Remove unused return value
4. Fix several context locks issue
5. libhns: Clean up signed-unsigned mix with relational issue
6. libhns: Fix missing flag when creating qp with hnsdv interface

Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit 43ec513a2eec4e13e258257bf1daa1a1b71ff1e4)
2024-05-11 23:02:02 +08:00
Juan Zhou
058cb01e3d Fix flexible WQE buffer page related issues
1. Fix missing flexible WQE buffer page flag
2. Fix ext_sge page size

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit e8a0671cf69d32baa72c4430f2b3ac279fbce147)
2024-05-07 16:43:00 +08:00
Ke Chen
8146c3b058 Support hns ROH mode
These patches support running the roce function in hns roh mode

Signed-off-by: Ke Chen <chenke54@huawei.com>
(cherry picked from commit 1938be0036f3cfe14d1b5a77b03884a06f9009e5)
2024-04-12 17:07:53 +08:00
Ran Zhou
b95203b1fa Support hns roce DCA
DCA (Dynamic Context Attachment) allows many RC QPs to share the WQE
buffers in a memory pool, which helps reduce memory consumption
when many QPs are inactive.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit 994c08d7e68ba906b7f7c16e8528700508af94b1)
2024-04-11 20:05:17 +08:00
Ran Zhou
e56042b4e2 Support reporting wc as software mode.
When HW is in resetting stage, we could not poll back all the
expected work completions as the HW won't generate cqe anymore.
This patch allows driver to compose the expected wc instead of the HW
during resetting stage. Once the hardware finished resetting, we can
poll cq from hardware again.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit 5494e44cf97e65d858c8f7376c0424a833dc8323)
2024-03-28 20:21:14 +08:00
Ran Zhou
4e8e4d8653 Support thread domain and parent domain for lock-free
Add support for thread domain (TD) and parent domain (PAD).
When a parent domain holds a thread domain, the associated
data path will be set to lock-free mode to improve performance.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit 60b829d79704e6b4611d7040265a7cf852057931)
2024-03-28 19:19:24 +08:00
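The lock-free idea in 4e8e4d8653 can be pictured as a lock that becomes a no-op when the parent domain holds a thread domain. This is a minimal sketch, not the libhns code: `struct sketch_lock` and its helpers are hypothetical, and a C11 `atomic_flag` stands in for whatever lock the driver actually uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock wrapper; need_lock is decided once at creation. */
struct sketch_lock {
    atomic_flag flag;
    bool need_lock; /* false when the parent domain holds a thread domain */
};

static void sketch_lock_init(struct sketch_lock *l, bool pad_has_td)
{
    atomic_flag_clear(&l->flag);
    l->need_lock = !pad_has_td;
}

static void sketch_lock_acquire(struct sketch_lock *l)
{
    if (!l->need_lock)
        return; /* lock-free mode: the caller guarantees single-threaded use */
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire))
        ; /* spin until the flag is released */
}

static void sketch_lock_release(struct sketch_lock *l)
{
    if (l->need_lock)
        atomic_flag_clear_explicit(&l->flag, memory_order_release);
}
```

Deciding the mode once at init keeps the data path down to a single branch per acquire and release.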
Ran Zhou
5e22b65799 Backport congestion control from mainline
Add support for configuration of congestion control algorithms in QP
granularity with direct verbs hnsdv_create_qp().

Reference: https://github.com/linux-rdma/rdma-core/pull/1426/commits

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit f4a8396bcf41ea12bf3e7b73793e60bfba097377)
2024-03-15 17:17:10 +08:00
Ran Zhou
0ecff9e585 Support DSCP
Add user mode DSCP function through the
mapping of dscp-tc configured in kernel mode.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
2024-02-22 17:35:22 +08:00
Ran Zhou
66a7e0b9a7 Update to v50.0
Update version of rdma-core to v50.0. The subsequent maintenance
and upgrade will be performed based on this baseline.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
2024-02-06 20:18:17 +08:00
Ran Zhou
652f065dfe Add necessary dependencies for rdma-core-devel
Add necessary dependencies for the rdma-core-devel to
avoid missing link to shared object after packaging.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
2024-01-25 20:48:02 +08:00
孙苏皖
daa7746018 Separate some packages from rdma-core
Separate some functions from the rdma-core main package to reduce the
dependencies required by the main package.

Signed-off-by: 孙苏皖 <sunsuwan3@huawei.com>
Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
2023-12-22 13:48:17 +08:00
Ran Zhou
4a7be9bb67 Fix congest type flags error and replace a corrupt patch
Currently, check_qp_congest_type contains a repeated judgement:
whether LDCP or HC3 is enabled, the congest type flags are always
set to LDCP.

This patch fixes this bug and replaces a corrupt patch, 0077, which
contains a change that acts directly on a patch rather than on code.
Such a change disrupts the patch format.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
2023-12-12 19:03:06 +08:00
Ran Zhou
19429f69c9 Fix missing DB when compiler does not support SVE
Currently, if compiler does not support SVE, hns_roce_sve_write512() will
be an empty function, which means that this doorbell will be missed when
HNS_ROCE_QP_CAP_SVE_DIRECT_WQE is set in qp flag.

This patch ensures that driver will at least generate the DB regardless
of whether SVE DWQE is supported or not.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit e5fcbc2552eda0d654e55ae0758280d6e51804ea)
2023-12-08 17:23:51 +08:00
Ran Zhou
626267239b Bugfix for lock and owner bit
Correct the returned error code, add init of pthread spinlock and mutex
judgement, remove a repeated pthread lock init, and fix the owner bit
when the SQ wraps.

Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit 794f3792a7267d0586bfac7d67507a27a5e61305)
2023-12-07 19:10:53 +08:00
Ran Zhou
3a00a8a05a Bugfix for wrong timing of modifying ibv_qp state to err
Currently the QPC state in HW is modified inside the critical section of
spinlock but the ibv_qp state is modified outside. There will be a short
period when QPC state has been modified to err with ibv_qp state still
remaining RTS. WQEs posted during this period are still accepted by the
RTS-state ibv_qp but then dropped by the err-state HW, with no flush CQEs
generated.

To fix this problem, the QPC state in HW and ibv_qp state should be both
modified to err inside the critical section of spinlock.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
(cherry picked from commit e221c2d5c69b00dc1f1aca09781cc3ebed23b5f3)
2023-12-03 15:26:46 +08:00
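The fix in 3a00a8a05a amounts to updating both states inside one critical section. The sketch below illustrates that ordering only. All names are hypothetical, `hw_state` is a plain field standing in for the QPC state held by hardware, and a C11 `atomic_flag` spinlock stands in for the driver's spinlock.

```c
#include <stdatomic.h>

enum sketch_qp_state { SKETCH_QPS_RTS, SKETCH_QPS_ERR };

struct sketch_qp {
    atomic_flag lock;
    enum sketch_qp_state sw_state; /* stands in for the ibv_qp state */
    enum sketch_qp_state hw_state; /* stands in for the QPC state in HW */
};

static void sketch_qp_lock(struct sketch_qp *qp)
{
    while (atomic_flag_test_and_set_explicit(&qp->lock, memory_order_acquire))
        ; /* spin */
}

static void sketch_qp_unlock(struct sketch_qp *qp)
{
    atomic_flag_clear_explicit(&qp->lock, memory_order_release);
}

static void sketch_qp_init(struct sketch_qp *qp)
{
    atomic_flag_clear(&qp->lock);
    qp->sw_state = SKETCH_QPS_RTS;
    qp->hw_state = SKETCH_QPS_RTS;
}

/* Both states change under the lock, so no poster can see them disagree. */
static void sketch_qp_to_err(struct sketch_qp *qp)
{
    sketch_qp_lock(qp);
    qp->hw_state = SKETCH_QPS_ERR;
    qp->sw_state = SKETCH_QPS_ERR;
    sketch_qp_unlock(qp);
}

/* Returns 0 if the post is accepted, -1 if the QP is no longer in RTS. */
static int sketch_post_send(struct sketch_qp *qp)
{
    int ret;

    sketch_qp_lock(qp);
    ret = (qp->sw_state == SKETCH_QPS_RTS) ? 0 : -1;
    sketch_qp_unlock(qp);
    return ret;
}
```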
Ran Zhou
ba7a351bb3 Corrects several minor issues found in review
The issues mainly lie in the memory empty check, variable range
inconsistency, parameter verification, and print format.

Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit 918525387673e173835fd287995470cbaccad784)
2023-11-28 13:19:20 +08:00
Ran Zhou
e6e062a6d7 Get dmac from kernel driver
As dmac is already resolved in kernel while creating AH, there is no
need to repeat the resolving in userspace. Prioritize getting dmac
from the kernel driver, unless the kernel driver didn't respond with one.

Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit e6ea204613586ab8d53dadee2b83eccadde447ec)
2023-11-23 13:52:17 +08:00
Ran Zhou
c350a79ab7 STARS is a HW scheduler. These patches support hns RoCE working in STARS mode which means RoCE will be scheduled by STARS.
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit 6407ae1c796015fecebc9c82cfbc8f2988e23d43)
2023-11-03 11:54:55 +08:00
Juan Zhou
4cc4e9a1ad Skip resolving MAC for RDMA over UBLink
For RDMA over UBLink, the MAC layer is replaced by UBLink, and thus the
MAC addr is not needed. So skip the MAC addr resolving for this mode.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit 333b7848bd0c6a33c5bcfdef18fa6bae578fd7cc)
2023-11-03 11:34:53 +08:00
Ran Zhou
ed62e6ed00 Support SRQ record doorbell
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8A08Z

Compared with normal doorbell, using record doorbell can shorten the
process of ringing the doorbell and reduce the latency.

Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit 23f6e3ca5e9fa66b41e3d33a9c4a88429cfc61ab)
2023-11-03 10:55:35 +08:00
Ran Zhou
98e759e379 Support flexible WQE buffer page size
In order to improve performance, we allow user-mode drivers to use a
larger page size to allocate WQE buffers, thereby reducing the latency
introduced by HW page switching. User-mode drivers will be allowed to
allocate WQE buffers with page sizes between 4K and the system page size. During
ibv_create_qp(), the driver will dynamically select the appropriate page
size based on ibv_qp_cap, thus reducing memory consumption while improving
performance.

Signed-off-by: Ran Zhou <zhouran10@h-partners.com>
(cherry picked from commit 1a21f45d978a8c469d128838bfd6ef5a72d335e8)
2023-11-03 10:11:54 +08:00
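The message for 98e759e379 gives the bounds (4K up to the system page size) but not the selection rule. The sketch below assumes one plausible rule: take the smallest power-of-two page that covers the WQE buffer, clamped to those bounds. `sketch_pick_page_shift` is a hypothetical name, and the system page shift is passed in as a parameter so the function can be exercised without querying the OS.

```c
#include <stddef.h>

#define SKETCH_MIN_PAGE_SHIFT 12 /* 4K lower bound from the commit message */

/*
 * Return log2 of the chosen page size: the smallest page in
 * [4K, system page size] that covers buf_size, or the system page
 * size if the buffer is larger than that.
 */
static unsigned int sketch_pick_page_shift(size_t buf_size,
                                           unsigned int sys_page_shift)
{
    unsigned int shift = SKETCH_MIN_PAGE_SHIFT;

    while (shift < sys_page_shift && ((size_t)1 << shift) < buf_size)
        shift++;
    return shift;
}
```

With a 64K system page (shift 16), a 10000-byte buffer would get a 16K page under this rule, so a small QP does not consume a full 64K page.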
Juan Zhou
9169b77cd3 Support reporting wc as software mode
1.libhns: Support reporting wc as software mode
2.libhns: return error when post send in reset state
3.libhns: separate the initialization steps of lock
4.libhns: assign doorbell to zero when allocate it
5.libhns: Fix missing reset notification

Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit e1b479184479d826a5f78b43e832c667e138ca72)
2023-11-03 09:46:44 +08:00
Juan Zhou
39730a19bc Two patches are uploaded from rdma-core mainline
1.Remove unnecessary QP checks
2.Fix reference to uninitialized cq pointer

Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit 3ab0271a03e49392018be33a06b3078559250a1c)
2023-11-03 09:18:50 +08:00
Zhou Juan
ef2166f6b8 Support user to choose using UD sl or pktype to adapt MPI APP
According to Annex17_RoCEv2 (A17.4.5.2), for RoCEv2 UD, a CQE should
carry a flag that indicates if the received frame is an IPv4, IPv6 or
RoCE packet. But currently, the values of the flag corresponding to
these packet types haven't been defined yet in WC.

In UCX, 'sl' in ibv_wc for UD is used as the packet type flag, and the
packet type values have already been defined in the UCX patch of
ed28845b88

Therefore, to adapt UCX, add a create flag to hnsdv_create_qp() to allow
users to choose whether they use 'sl' in ibv_wc as service level or
packet type for UD. For the latter, obtain and translate the packet type
from CQE and fill it to 'sl' in ibv_wc.

Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit e102d4c9aa2992c125b26ad5cc237ae002bc6541)
2023-11-02 20:18:35 +08:00
Zhou Juan
dabf4d530e Backport bugfix for hns
1.Fix the owner bit error of sq in new io
2.Fix incorrect post-send with direct wqe of
3.Add a judgment to the congestion control algorithm

Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit 092143ba858a7aba0630fadd416faa2a4e7eaf06)
2023-10-31 19:20:53 +08:00
Zhou Juan
bade32c716 Fix the sge number related errors and remove local invalidate operation
1. The hns hardware logic requires wr->num_sge to be 1 when
performing atomic operations. The code does not judge this
condition, and the current patch adds this constraint.

2. In the sq inline scenario, when num_sge in post_send is not 1, sge
array appears in the for loop without rotation and directly copy
out of bounds.

3. Currently the local invalidate operation doesn't work properly.
Disable it for the time being.
HIP08 and HIP09 hardware does not support this feature, so
delete the associated code.

Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit 43c14b73409cf6e63278d5ff68e2694e592e9015)
2023-10-31 16:20:15 +08:00
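Points 1 and 2 of bade32c716 are both input checks on the scatter/gather list. The sketch below shows the two checks in isolation and is not the libhns code: `struct sketch_sge` and both helpers are hypothetical, and the inline buffer is modelled as a flat array. The copy advances through the SGE array one entry at a time and refuses to write past the destination.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scatter/gather entry. */
struct sketch_sge {
    const void *addr;
    size_t length;
};

/* Atomic operations are only valid with exactly one SGE. */
static int sketch_check_atomic(int num_sge)
{
    return num_sge == 1 ? 0 : EINVAL;
}

/*
 * Copy every SGE into the inline buffer, moving to the next SGE on each
 * iteration, and fail instead of copying beyond dst_len.
 */
static int sketch_copy_inline(void *dst, size_t dst_len,
                              const struct sketch_sge *sgl, int num_sge)
{
    size_t off = 0;

    for (int i = 0; i < num_sge; i++) {
        if (sgl[i].length > dst_len - off)
            return EINVAL;
        memcpy((char *)dst + off, sgl[i].addr, sgl[i].length);
        off += sgl[i].length;
    }
    return 0;
}
```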
Zhou Juan
97581b9828 Add support for SVE Direct WQE
Some Kunpeng SoCs do not support the DWQE through NEON
instructions. In this case, the IO path works normally,
but the performance will deteriorate.

For these SoCs that do not support NEON DWQE, they support
DWQE through SVE instructions. This patch supports SVE DWQE
to guarantee the performance of these SoCs. In addition, in
this scenario, DWQE only supports acceleration through SVE's
ldr and str instructions. Other load and store instructions
also cause performance degradation.

Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit 268e25f9374021fc4c0d6dabd62e0f360193081f)
2023-07-28 14:50:22 +08:00
Zhou Juan
e73844b3ec Support congestion control algorithm configuration
Added the use of direct verbs to implement QP-level
user-configurable congestion control algorithms. Among them,
the user mode driver mainly provides interfaces for users to
choose, and the kernel mode driver is responsible for filling
the resources of different algorithms and providing the
supported algorithm types for user mode.

At the same time, provide a direct verbs interface for users to
query the type of congestion control algorithm.

Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
(cherry picked from commit 48ccef6ba011468fe91056398aad863f998569d2)
2023-06-02 11:57:20 +08:00
Yixing Liu
8b56ab8b70 Support libhns stop sending db mechanism after reset
Add an interface to the user space, which is used to receive
the kernel reset state. After receiving the reset flag, the
user space stops sending db.

Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
2022-12-14 20:23:39 +08:00
Chengchang Tang
b88a370b79 Support hns roce DCA
DCA (Dynamic Context Attachment) allows many RC QPs to share the WQE
buffers in a memory pool, which helps reduce memory consumption
when many QPs are inactive.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2022-11-30 20:34:42 +08:00
Yixing Liu
1c38175fa1 Add support libhns td unlock
This patch adds the libhns td unlock function.

Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
2022-11-29 14:53:10 +08:00
Guofeng Yue
648d17f1ef Support hns RoH mode
These patches support running the roce function in hns roh mode

Signed-off-by: Guofeng Yue <yueguofeng@hisilicon.com>
2022-11-29 14:25:05 +08:00
Chengchang Tang
6f27f67e51 Backport patches from 41.1
Backport patches from rdma-core 41.1.

Bugfix patches reported by #I5Q3S5 have also been included.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2022-11-06 23:17:09 +08:00
Chengchang Tang
cc1006c917 Fix missing patch list in spec
Patches for hns DSCP were not added to the spec. Some mistakes
in the DSCP patch are also fixed.

Fixes: 68f61fd0a1a8 ("Add support for hns DSCP")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2022-11-01 20:59:41 +08:00
Chengchang Tang
484504d625 Add support for hns DSCP
Support DSCP for hns RoCE.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2022-10-31 14:28:29 +08:00
Luoyouming
0759ff83eb Bugfix for sge num and support inline feature
Fix sge num bug, add compatibility for rq inline, support cqe inline

Signed-off-by: Luoyouming <luoyouming@huawei.com>
2022-10-31 14:28:19 +08:00
Chengchang Tang
060786cc7e Update to 41.0
Update rdma-core version from 35.1 to 41.0.

Version 41.0 is the latest version in community until
2022/7/27. It includes some new bugfixes and new features,
we choose this version to facilitate future development.

The patches added to this repo are already included in the
new version, so remove them.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
2022-07-27 18:26:26 +08:00