aarch64: strcmp delete align for better unixbench performance
parent
4cadcf76e2
commit
eb2c7a9ee8
@@ -65,7 +65,7 @@
 ##############################################################################
 Name: glibc
 Version: 2.36
-Release: 1
+Release: 2
 Summary: The GNU libc libraries
 License: %{all_license}
 URL: http://www.gnu.org/software/glibc/
@@ -99,6 +99,7 @@ Patch9001: locale-delete-no-hard-link-to-avoid-all_language-pac.patch
 Patch9011: use-region-to-instead-of-country-for-extract-timezon.patch
 Patch9012: malloc-use-__get_nprocs-replace-__get_nprocs_sched.patch
 Patch9013: x86-use-total-l3cache-for-non_temporal_threshold.patch
+Patch9014: strcmp-delete-align-for-loop_aligned.patch
 
 Provides: ldconfig rtld(GNU_HASH) bundled(gnulib)
 
@@ -1257,6 +1258,9 @@ fi
 %endif
 
 %changelog
+* Wed Aug 10 2022 Qingqing Li <liqingqing3@huawei.com> - 2.36-2
+- aarch64: strcmp delete align for better unixbench performance
+
 * Tue Aug 2 2022 Qingqing Li <liqingqing3@huawei.com> - 2.36-1
 - upgrade to 2.36
 
32
strcmp-delete-align-for-loop_aligned.patch
Normal file
@@ -0,0 +1,32 @@
+From 9bbffed83b93f633b272368fc536a4f24e9942e6 Mon Sep 17 00:00:00 2001
+From: Yang Yanchao <yangyanchao6@huawei.com>
+Date: Mon, 21 Feb 2022 14:25:25 +0800
+Subject: [PATCH] strcmp: delete align for loop_aligned
+
+In Kunpeng-920, the performance of strcmp deteriorates only
+when the 16 to 23 characters are different. Or the string is
+only 16-23 characters. That shows 2 misses per iteration which
+means this is a branch predictor issue indeed.
+In the preceding scenario, strcmp performance is 300% worse than expected.
+
+Fortunately, this problem can be solved by modifying the alignment of the functions.
+---
+ sysdeps/aarch64/strcmp.S | 2 --
+ 1 file changed, 2 deletions(-)
+
+diff --git a/sysdeps/aarch64/strcmp.S b/sysdeps/aarch64/strcmp.S
+index f225d718..7a048b66 100644
+--- a/sysdeps/aarch64/strcmp.S
++++ b/sysdeps/aarch64/strcmp.S
+@@ -71,8 +71,6 @@ ENTRY(strcmp)
+ 	b.ne	L(misaligned8)
+ 	cbnz	tmp, L(mutual_align)
+ 
+-	.p2align 4
+-
+ L(loop_aligned):
+ 	ldr	data2, [src1, off2]
+ 	ldr	data1, [src1], 8
+-- 
+2.33.0
+
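The patch description singles out strings that differ somewhere in characters 16 to 23, or that are only 16 to 23 characters long. The following is a minimal sketch, assuming a POSIX system with `clock_gettime`, of how that case might be timed. `make_pair` and `time_strcmp` are hypothetical helpers written for illustration; they are not part of glibc, of this commit, or of unixbench.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of glibc or this commit): fill a and b with
   the same len-character string, then make them differ at index diff_at.
   Pass diff_at >= len to leave the two strings equal.
   Both buffers must hold at least len + 1 bytes. */
static void make_pair(char *a, char *b, size_t len, size_t diff_at)
{
    assert(a != NULL && b != NULL);
    memset(a, 'a', len);
    memset(b, 'a', len);
    a[len] = '\0';
    b[len] = '\0';
    if (diff_at < len)
        b[diff_at] = 'b';
}

/* Hypothetical helper: average wall-clock nanoseconds per strcmp call.
   iters must be greater than zero. The volatile function pointer keeps the
   compiler from inlining the call or hoisting it out of the loop. */
static double time_strcmp(const char *a, const char *b, long iters)
{
    int (*volatile cmp)(const char *, const char *) = strcmp;
    struct timespec t0, t1;
    volatile int sink = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        sink = cmp(a, b);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iters;
}
```

Calling `make_pair(a, b, 32, d)` for `d` between 16 and 23 and comparing `time_strcmp(a, b, iters)` against a value of `d` outside that range, on builds with and without Patch9014, is one way to check for the regression. Results depend on the machine and the build, so no figures are given here.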