backport SM3/SM4 optimization

Xu Yizhou 2023-03-16 09:45:55 +08:00
parent 51515d7131
commit 592d5b6da7
14 changed files with 8571 additions and 1 deletions

@ -0,0 +1,74 @@
From 06f13f85ee86cd7fbc546060fbe2d077176b0be4 Mon Sep 17 00:00:00 2001
From: Xu Yizhou <xuyizhou1@huawei.com>
Date: Mon, 31 Oct 2022 11:28:15 +0800
Subject: [PATCH 11/13] Apply SM4 optimization patch to Kunpeng-920
In the ideal scenario, throughput improves by up to 2.2x.
However, for single-block input, CFB/OFB mode, or CBC encryption,
throughput can drop by about 50%.
Perf data on Kunpeng-920 2.6GHz hardware, before and after optimization:
Before:
type         16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
SM4-CTR     75318.96k    79089.62k    79736.15k    79934.12k    80325.44k    80068.61k
SM4-ECB     80211.39k    84998.36k    86472.28k    87024.93k    87144.80k    86862.51k
SM4-GCM     72156.19k    82012.08k    83848.02k    84322.65k    85103.65k    84896.43k
SM4-CBC     77956.13k    80638.81k    81976.17k    81606.31k    82078.91k    81750.70k
SM4-CFB     78078.20k    81054.87k    81841.07k    82396.38k    82203.99k    82236.76k
SM4-OFB     78282.76k    82074.03k    82765.74k    82989.06k    83200.68k    83487.17k
After:
type         16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
SM4-CTR     35678.07k   120687.25k   176632.27k   177192.62k   177586.18k   178295.18k
SM4-ECB     35540.32k   122628.07k   175067.90k   178007.84k   178298.88k   178328.92k
SM4-GCM     34215.75k   116720.50k   170275.16k   171770.88k   172714.21k   172272.30k
SM4-CBC     35645.60k    36544.86k    36515.50k    36732.15k    36618.24k    36629.16k
SM4-CFB     35528.14k    35690.99k    35954.86k    35843.42k    35809.18k    35809.96k
SM4-OFB     35563.55k    35853.56k    35963.05k    36203.52k    36233.85k    36307.82k
Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19547)
---
crypto/arm_arch.h | 4 ++++
include/crypto/sm4_platform.h | 3 ++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/crypto/arm_arch.h b/crypto/arm_arch.h
index 5b5af31d92..c10748e5f8 100644
--- a/crypto/arm_arch.h
+++ b/crypto/arm_arch.h
@@ -98,9 +98,13 @@ extern unsigned int OPENSSL_armv8_rsa_neonized;
*/
# define ARM_CPU_IMP_ARM 0x41
+# define HISI_CPU_IMP 0x48
# define ARM_CPU_PART_CORTEX_A72 0xD08
# define ARM_CPU_PART_N1 0xD0C
+# define ARM_CPU_PART_V1 0xD40
+# define ARM_CPU_PART_N2 0xD49
+# define HISI_CPU_PART_KP920 0xD01
# define MIDR_PARTNUM_SHIFT 4
# define MIDR_PARTNUM_MASK (0xfffU << MIDR_PARTNUM_SHIFT)
diff --git a/include/crypto/sm4_platform.h b/include/crypto/sm4_platform.h
index 11f9b9d88b..15d8abbcb1 100644
--- a/include/crypto/sm4_platform.h
+++ b/include/crypto/sm4_platform.h
@@ -20,7 +20,8 @@ static inline int vpsm4_capable(void)
{
return (OPENSSL_armcap_P & ARMV8_CPUID) &&
(MIDR_IS_CPU_MODEL(OPENSSL_arm_midr, ARM_CPU_IMP_ARM, ARM_CPU_PART_V1) ||
- MIDR_IS_CPU_MODEL(OPENSSL_arm_midr, ARM_CPU_IMP_ARM, ARM_CPU_PART_N1));
+ MIDR_IS_CPU_MODEL(OPENSSL_arm_midr, ARM_CPU_IMP_ARM, ARM_CPU_PART_N1) ||
+ MIDR_IS_CPU_MODEL(OPENSSL_arm_midr, HISI_CPU_IMP, HISI_CPU_PART_KP920));
}
# if defined(VPSM4_ASM)
# define VPSM4_CAPABLE vpsm4_capable()
--
2.37.3.windows.1
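
The `vpsm4_capable()` change in the patch above comes down to comparing two fields of the MIDR_EL1 value against known constants. The following is a minimal sketch of that idea, not the OpenSSL macros themselves. The helper names `midr_implementer`, `midr_partnum` and `is_cpu_model` are hypothetical. The part-number shift and mask mirror `MIDR_PARTNUM_SHIFT` and `MIDR_PARTNUM_MASK` from the hunk, while the implementer position (bits 31:24) is an assumption taken from the Arm definition of MIDR_EL1 and is not shown in this patch. The real `MIDR_IS_CPU_MODEL` macro may compare further fields.

```c
#include <stdint.h>

/* Constants copied from the arm_arch.h hunk above. */
#define ARM_CPU_IMP_ARM     0x41u
#define HISI_CPU_IMP        0x48u
#define ARM_CPU_PART_N1     0xD0Cu
#define HISI_CPU_PART_KP920 0xD01u

/* Hypothetical helper: implementer code, assumed to sit in bits 31:24. */
static uint32_t midr_implementer(uint32_t midr)
{
    return (midr >> 24) & 0xffu;
}

/* Hypothetical helper: part number in bits 15:4 (shift 4, 12-bit mask). */
static uint32_t midr_partnum(uint32_t midr)
{
    return (midr >> 4) & 0xfffu;
}

/* Simplified model check: both implementer and part number must match. */
static int is_cpu_model(uint32_t midr, uint32_t imp, uint32_t part)
{
    return midr_implementer(midr) == imp && midr_partnum(midr) == part;
}
```

With a MIDR value constructed for illustration, such as `0x481FD010`, `is_cpu_model(midr, HISI_CPU_IMP, HISI_CPU_PART_KP920)` is true, while the same value does not match the `ARM_CPU_IMP_ARM`/`ARM_CPU_PART_N1` pair.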

@ -0,0 +1,60 @@
From d7d5490d7201dcfb1f3811ad1bfc57ed9b2c0b77 Mon Sep 17 00:00:00 2001
From: "fangming.fang" <fangming.fang@arm.com>
Date: Thu, 8 Dec 2022 10:46:27 +0000
Subject: [PATCH 09/13] Fix SM4-CBC regression on Armv8
Fixes #19858
During decryption, the last ciphertext block is not correctly carried
forward to the next block when the number of input blocks is exactly 4.
Fix this and add the corresponding test cases.
Thanks to xu-yi-zhou for reporting this issue and proposing the fix.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19872)
---
crypto/sm4/asm/vpsm4-armv8.pl | 2 +-
test/recipes/30-test_evp_data/evpciph_sm4.txt | 12 ++++++++++++
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/crypto/sm4/asm/vpsm4-armv8.pl b/crypto/sm4/asm/vpsm4-armv8.pl
index 095d9dae64..c842ef61d5 100755
--- a/crypto/sm4/asm/vpsm4-armv8.pl
+++ b/crypto/sm4/asm/vpsm4-armv8.pl
@@ -880,7 +880,7 @@ $code.=<<___;
subs $blocks,$blocks,#4
b.gt .Lcbc_4_blocks_dec
// save back IV
- st1 {@vtmp[3].16b}, [$ivp]
+ st1 {@data[3].16b}, [$ivp]
b 100f
1: // last block
subs $blocks,$blocks,#1
diff --git a/test/recipes/30-test_evp_data/evpciph_sm4.txt b/test/recipes/30-test_evp_data/evpciph_sm4.txt
index 9fb16ca15c..e9a98c9898 100644
--- a/test/recipes/30-test_evp_data/evpciph_sm4.txt
+++ b/test/recipes/30-test_evp_data/evpciph_sm4.txt
@@ -19,6 +19,18 @@ IV = 0123456789ABCDEFFEDCBA9876543210
Plaintext = 0123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA9876543210
Ciphertext = 2677F46B09C122CC975533105BD4A22AF6125F7275CE552C3A2BBCF533DE8A3B
+Cipher = SM4-CBC
+Key = 0123456789ABCDEFFEDCBA9876543210
+IV = 0123456789ABCDEFFEDCBA9876543210
+Plaintext = 0123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA9876543210
+Ciphertext = 2677F46B09C122CC975533105BD4A22AF6125F7275CE552C3A2BBCF533DE8A3BFFF5A4F208092C0901BA02D5772977369915E3FA2356C9F4EB6460ECC457E7f8E3CFA3DEEBFE9883E3A48BCF7C4A11AA3EC9E0D317C5D319BE72A5CDDDEC640C
+
+Cipher = SM4-CBC
+Key = 0123456789ABCDEFFEDCBA9876543210
+IV = 0123456789ABCDEFFEDCBA9876543210
+Plaintext = 0123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA9876543210
+Ciphertext = 2677f46b09c122cc975533105bd4a22af6125f7275ce552c3a2bbcf533de8a3bfff5a4f208092c0901ba02d5772977369915e3fa2356c9f4eb6460ecc457e7f8e3cfa3deebfe9883e3a48bcf7c4a11aa3ec9e0d317c5d319be72a5cdddec640c6fc70bfa3ddaafffdd7c09b2774dcb2cec29f0c6f0b6773e985b3e395e924238505a8f120d9ca84de5c3cf7e45f097b14b3a46c5b1068669982a5c1f5f61be291b984f331d44ffb2758f771672448fc957fa1416c446427a41e25d5524a2418b9d96b2f17582f0f1aa9c204c6807f54f7b6833c5f00856659ddabc245936868c
+
Cipher = SM4-OFB
Key = 0123456789ABCDEFFEDCBA9876543210
IV = 0123456789ABCDEFFEDCBA9876543210
--
2.37.3.windows.1
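
The one-line fix in the patch above concerns which register is written back as the IV. In CBC decryption, the IV handed to the next call has to be the last ciphertext block of the current call. The sketch below illustrates that chaining rule in plain C under loud assumptions: the block transform is a toy stand-in and is NOT SM4, and every function name (`toy_encrypt_block`, `toy_decrypt_block`, `cbc_encrypt`, `cbc_decrypt`) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16

/* Toy invertible transform standing in for the block cipher; NOT SM4. */
static void toy_encrypt_block(const uint8_t *key, const uint8_t *in, uint8_t *out)
{
    for (size_t i = 0; i < BLK; i++)
        out[i] = (uint8_t)((in[i] ^ key[i]) + 1);
}

static void toy_decrypt_block(const uint8_t *key, const uint8_t *in, uint8_t *out)
{
    for (size_t i = 0; i < BLK; i++)
        out[i] = (uint8_t)((uint8_t)(in[i] - 1) ^ key[i]);
}

/* CBC encryption; iv is updated so a later call continues the chain. */
static void cbc_encrypt(const uint8_t *key, uint8_t *iv,
                        const uint8_t *in, uint8_t *out, size_t blocks)
{
    uint8_t tmp[BLK];

    for (size_t b = 0; b < blocks; b++, in += BLK, out += BLK) {
        for (size_t i = 0; i < BLK; i++)
            tmp[i] = (uint8_t)(in[i] ^ iv[i]);
        toy_encrypt_block(key, tmp, out);
        memcpy(iv, out, BLK);
    }
}

/* CBC decryption; the saved IV is the last ciphertext block, not plaintext. */
static void cbc_decrypt(const uint8_t *key, uint8_t *iv,
                        const uint8_t *in, uint8_t *out, size_t blocks)
{
    uint8_t ct[BLK], tmp[BLK];

    for (size_t b = 0; b < blocks; b++, in += BLK, out += BLK) {
        memcpy(ct, in, BLK);            /* keep a copy: in and out may alias */
        toy_decrypt_block(key, ct, tmp);
        for (size_t i = 0; i < BLK; i++)
            out[i] = (uint8_t)(tmp[i] ^ iv[i]);
        memcpy(iv, ct, BLK);            /* carry the ciphertext forward */
    }
}
```

Decrypting eight blocks as two calls of four blocks each only round-trips if the first call leaves the fourth ciphertext block in `iv`, which is the case the added test vectors are meant to exercise.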

@ -0,0 +1,87 @@
From 6df7707fb22e8bd1c7d778a2041c1403f9852060 Mon Sep 17 00:00:00 2001
From: Xu Yizhou <xuyizhou1@huawei.com>
Date: Fri, 3 Feb 2023 15:59:59 +0800
Subject: [PATCH 13/13] Fix SM4-XTS build failure on Mac mini M1
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20202)
---
crypto/sm4/asm/vpsm4-armv8.pl | 4 +++-
crypto/sm4/asm/vpsm4_ex-armv8.pl | 23 ++++++++++++++++-------
2 files changed, 19 insertions(+), 8 deletions(-)
diff --git a/crypto/sm4/asm/vpsm4-armv8.pl b/crypto/sm4/asm/vpsm4-armv8.pl
index e19de30901..d30e78f3ce 100755
--- a/crypto/sm4/asm/vpsm4-armv8.pl
+++ b/crypto/sm4/asm/vpsm4-armv8.pl
@@ -524,7 +524,7 @@ sub compute_tweak_vec() {
my $std = shift;
&rbit(@vtmp[2],$src,$std);
$code.=<<___;
- ldr @qtmp[0], =0x01010101010101010101010101010187
+ ldr @qtmp[0], .Lxts_magic
shl $des.16b, @vtmp[2].16b, #1
ext @vtmp[1].16b, @vtmp[2].16b, @vtmp[2].16b,#15
ushr @vtmp[1].16b, @vtmp[1].16b, #7
@@ -572,6 +572,8 @@ _vpsm4_consts:
.dword 0x56aa3350a3b1bac6,0xb27022dc677d9197
.Lshuffles:
.dword 0x0B0A090807060504,0x030201000F0E0D0C
+.Lxts_magic:
+ .dword 0x0101010101010187,0x0101010101010101
.size _vpsm4_consts,.-_vpsm4_consts
___
diff --git a/crypto/sm4/asm/vpsm4_ex-armv8.pl b/crypto/sm4/asm/vpsm4_ex-armv8.pl
index 3d094aa535..f2d5b6debf 100644
--- a/crypto/sm4/asm/vpsm4_ex-armv8.pl
+++ b/crypto/sm4/asm/vpsm4_ex-armv8.pl
@@ -475,12 +475,12 @@ sub load_sbox () {
my $data = shift;
$code.=<<___;
- ldr $MaskQ, =0x0306090c0f0205080b0e0104070a0d00
- ldr $TAHMatQ, =0x22581a6002783a4062185a2042387a00
- ldr $TALMatQ, =0xc10bb67c4a803df715df62a89e54e923
- ldr $ATAHMatQ, =0x1407c6d56c7fbeadb9aa6b78c1d21300
- ldr $ATALMatQ, =0xe383c1a1fe9edcbc6404462679195b3b
- ldr $ANDMaskQ, =0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f
+ ldr $MaskQ, .Lsbox_magic
+ ldr $TAHMatQ, .Lsbox_magic+16
+ ldr $TALMatQ, .Lsbox_magic+32
+ ldr $ATAHMatQ, .Lsbox_magic+48
+ ldr $ATALMatQ, .Lsbox_magic+64
+ ldr $ANDMaskQ, .Lsbox_magic+80
___
}
@@ -525,7 +525,7 @@ sub compute_tweak_vec() {
my $std = shift;
&rbit(@vtmp[2],$src,$std);
$code.=<<___;
- ldr @qtmp[0], =0x01010101010101010101010101010187
+ ldr @qtmp[0], .Lxts_magic
shl $des.16b, @vtmp[2].16b, #1
ext @vtmp[1].16b, @vtmp[2].16b, @vtmp[2].16b,#15
ushr @vtmp[1].16b, @vtmp[1].16b, #7
@@ -556,6 +556,15 @@ _${prefix}_consts:
.dword 0x56aa3350a3b1bac6,0xb27022dc677d9197
.Lshuffles:
.dword 0x0B0A090807060504,0x030201000F0E0D0C
+.Lxts_magic:
+ .dword 0x0101010101010187,0x0101010101010101
+.Lsbox_magic:
+ .dword 0x0b0e0104070a0d00,0x0306090c0f020508
+ .dword 0x62185a2042387a00,0x22581a6002783a40
+ .dword 0x15df62a89e54e923,0xc10bb67c4a803df7
+ .dword 0xb9aa6b78c1d21300,0x1407c6d56c7fbead
+ .dword 0x6404462679195b3b,0xe383c1a1fe9edcbc
+ .dword 0x0f0f0f0f0f0f0f0f,0x0f0f0f0f0f0f0f0f
.size _${prefix}_consts,.-_${prefix}_consts
___
--
2.37.3.windows.1
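
The patch above replaces each `ldr <reg>, =<128-bit literal>` with a load from a labelled pair of `.dword` values. In every pair the low 64 bits of the old literal come first: for example, `0x0306090c0f0205080b0e0104070a0d00` becomes `.dword 0x0b0e0104070a0d00,0x0306090c0f020508`. The sketch below shows the byte layout that this ordering corresponds to, assuming a little-endian target. The helper `store_le128` is hypothetical and only models the layout; it is not part of the patch.

```c
#include <stdint.h>

/*
 * Write a 128-bit value, given as its high and low 64-bit halves, into
 * 16 bytes in little-endian order: the low half occupies bytes 0..7 and
 * the high half bytes 8..15, matching ".dword lo,hi".
 */
static void store_le128(uint64_t hi, uint64_t lo, uint8_t out[16])
{
    for (int i = 0; i < 8; i++) {
        out[i]     = (uint8_t)(lo >> (8 * i));
        out[8 + i] = (uint8_t)(hi >> (8 * i));
    }
}
```

For the XTS constant, `store_le128(0x0101010101010101, 0x0101010101010187, buf)` places the `0x87` byte at `buf[0]`, which is consistent with `.Lxts_magic` listing `0x0101010101010187` first.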

@ -0,0 +1,207 @@
From b8f24cb95dbe70cbeef08b41f35018141b6ce994 Mon Sep 17 00:00:00 2001
From: Xu Yizhou <xuyizhou1@huawei.com>
Date: Thu, 15 Dec 2022 10:21:07 +0800
Subject: [PATCH 10/13] Fix SM4 test failures on big-endian ARM processors
Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>
Reviewed-by: Paul Yang <kaishen.yy@antfin.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19910)
---
crypto/sm4/asm/vpsm4-armv8.pl | 52 +++++++++++++++++------------------
1 file changed, 26 insertions(+), 26 deletions(-)
diff --git a/crypto/sm4/asm/vpsm4-armv8.pl b/crypto/sm4/asm/vpsm4-armv8.pl
index c842ef61d5..73797af582 100755
--- a/crypto/sm4/asm/vpsm4-armv8.pl
+++ b/crypto/sm4/asm/vpsm4-armv8.pl
@@ -45,7 +45,7 @@ sub rev32() {
if ($src and ("$src" ne "$dst")) {
$code.=<<___;
-#ifndef __ARMEB__
+#ifndef __AARCH64EB__
rev32 $dst.16b,$src.16b
#else
mov $dst.16b,$src.16b
@@ -53,7 +53,7 @@ $code.=<<___;
___
} else {
$code.=<<___;
-#ifndef __ARMEB__
+#ifndef __AARCH64EB__
rev32 $dst.16b,$dst.16b
#endif
___
@@ -428,10 +428,10 @@ sub load_sbox () {
$code.=<<___;
adr $ptr,.Lsbox
- ld1 {@sbox[0].4s,@sbox[1].4s,@sbox[2].4s,@sbox[3].4s},[$ptr],#64
- ld1 {@sbox[4].4s,@sbox[5].4s,@sbox[6].4s,@sbox[7].4s},[$ptr],#64
- ld1 {@sbox[8].4s,@sbox[9].4s,@sbox[10].4s,@sbox[11].4s},[$ptr],#64
- ld1 {@sbox[12].4s,@sbox[13].4s,@sbox[14].4s,@sbox[15].4s},[$ptr]
+ ld1 {@sbox[0].16b,@sbox[1].16b,@sbox[2].16b,@sbox[3].16b},[$ptr],#64
+ ld1 {@sbox[4].16b,@sbox[5].16b,@sbox[6].16b,@sbox[7].16b},[$ptr],#64
+ ld1 {@sbox[8].16b,@sbox[9].16b,@sbox[10].16b,@sbox[11].16b},[$ptr],#64
+ ld1 {@sbox[12].16b,@sbox[13].16b,@sbox[14].16b,@sbox[15].16b},[$ptr]
___
}
@@ -492,9 +492,9 @@ ___
&rev32($vkey,$vkey);
$code.=<<___;
adr $pointer,.Lshuffles
- ld1 {$vmap.4s},[$pointer]
+ ld1 {$vmap.2d},[$pointer]
adr $pointer,.Lfk
- ld1 {$vfk.4s},[$pointer]
+ ld1 {$vfk.2d},[$pointer]
eor $vkey.16b,$vkey.16b,$vfk.16b
mov $schedules,#32
adr $pointer,.Lck
@@ -615,7 +615,7 @@ $code.=<<___;
.align 5
${prefix}_${dir}crypt:
AARCH64_VALID_CALL_TARGET
- ld1 {@data[0].16b},[$inp]
+ ld1 {@data[0].4s},[$inp]
___
&load_sbox();
&rev32(@data[0],@data[0]);
@@ -624,7 +624,7 @@ $code.=<<___;
___
&encrypt_1blk(@data[0]);
$code.=<<___;
- st1 {@data[0].16b},[$outp]
+ st1 {@data[0].4s},[$outp]
ret
.size ${prefix}_${dir}crypt,.-${prefix}_${dir}crypt
___
@@ -692,12 +692,12 @@ $code.=<<___;
cmp $blocks,#1
b.lt 100f
b.gt 1f
- ld1 {@data[0].16b},[$inp]
+ ld1 {@data[0].4s},[$inp]
___
&rev32(@data[0],@data[0]);
&encrypt_1blk(@data[0]);
$code.=<<___;
- st1 {@data[0].16b},[$outp]
+ st1 {@data[0].4s},[$outp]
b 100f
1: // process last 2 blocks
ld4 {@data[0].s,@data[1].s,@data[2].s,@data[3].s}[0],[$inp],#16
@@ -798,11 +798,11 @@ ___
&rev32($ivec0,$ivec0);
&encrypt_1blk($ivec0);
$code.=<<___;
- st1 {$ivec0.16b},[$outp],#16
+ st1 {$ivec0.4s},[$outp],#16
b 1b
2:
// save back IV
- st1 {$ivec0.16b},[$ivp]
+ st1 {$ivec0.4s},[$ivp]
ret
.Ldec:
@@ -834,7 +834,7 @@ ___
&transpose(@vtmp,@datax);
&transpose(@data,@datax);
$code.=<<___;
- ld1 {$ivec1.16b},[$ivp]
+ ld1 {$ivec1.4s},[$ivp]
ld1 {@datax[0].4s,@datax[1].4s,@datax[2].4s,@datax[3].4s},[$inp],#64
// note ivec1 and vtmpx[3] are resuing the same register
// care needs to be taken to avoid conflict
@@ -844,7 +844,7 @@ $code.=<<___;
eor @vtmp[2].16b,@vtmp[2].16b,@datax[1].16b
eor @vtmp[3].16b,$vtmp[3].16b,@datax[2].16b
// save back IV
- st1 {$vtmpx[3].16b}, [$ivp]
+ st1 {$vtmpx[3].4s}, [$ivp]
eor @data[0].16b,@data[0].16b,$datax[3].16b
eor @data[1].16b,@data[1].16b,@vtmpx[0].16b
eor @data[2].16b,@data[2].16b,@vtmpx[1].16b
@@ -855,7 +855,7 @@ $code.=<<___;
b.gt .Lcbc_8_blocks_dec
b.eq 100f
1:
- ld1 {$ivec1.16b},[$ivp]
+ ld1 {$ivec1.4s},[$ivp]
.Lcbc_4_blocks_dec:
cmp $blocks,#4
b.lt 1f
@@ -880,7 +880,7 @@ $code.=<<___;
subs $blocks,$blocks,#4
b.gt .Lcbc_4_blocks_dec
// save back IV
- st1 {@data[3].16b}, [$ivp]
+ st1 {@data[3].4s}, [$ivp]
b 100f
1: // last block
subs $blocks,$blocks,#1
@@ -888,13 +888,13 @@ $code.=<<___;
b.gt 1f
ld1 {@data[0].4s},[$inp],#16
// save back IV
- st1 {$data[0].16b}, [$ivp]
+ st1 {$data[0].4s}, [$ivp]
___
&rev32(@datax[0],@data[0]);
&encrypt_1blk(@datax[0]);
$code.=<<___;
eor @datax[0].16b,@datax[0].16b,$ivec1.16b
- st1 {@datax[0].16b},[$outp],#16
+ st1 {@datax[0].4s},[$outp],#16
b 100f
1: // last two blocks
ld4 {@data[0].s,@data[1].s,@data[2].s,@data[3].s}[0],[$inp]
@@ -917,7 +917,7 @@ $code.=<<___;
eor @vtmp[1].16b,@vtmp[1].16b,@data[0].16b
st1 {@vtmp[0].4s,@vtmp[1].4s},[$outp],#32
// save back IV
- st1 {@data[1].16b}, [$ivp]
+ st1 {@data[1].4s}, [$ivp]
b 100f
1: // last 3 blocks
ld4 {@data[0].s,@data[1].s,@data[2].s,@data[3].s}[2],[$ptr]
@@ -937,7 +937,7 @@ $code.=<<___;
eor @vtmp[2].16b,@vtmp[2].16b,@data[1].16b
st1 {@vtmp[0].4s,@vtmp[1].4s,@vtmp[2].4s},[$outp],#48
// save back IV
- st1 {@data[2].16b}, [$ivp]
+ st1 {@data[2].4s}, [$ivp]
100:
ldp d10,d11,[sp,#16]
ldp d12,d13,[sp,#32]
@@ -973,9 +973,9 @@ $code.=<<___;
___
&encrypt_1blk($ivec);
$code.=<<___;
- ld1 {@data[0].16b},[$inp]
+ ld1 {@data[0].4s},[$inp]
eor @data[0].16b,@data[0].16b,$ivec.16b
- st1 {@data[0].16b},[$outp]
+ st1 {@data[0].4s},[$outp]
ret
1:
AARCH64_SIGN_LINK_REGISTER
@@ -1053,9 +1053,9 @@ $code.=<<___;
___
&encrypt_1blk($ivec);
$code.=<<___;
- ld1 {@data[0].16b},[$inp]
+ ld1 {@data[0].4s},[$inp]
eor @data[0].16b,@data[0].16b,$ivec.16b
- st1 {@data[0].16b},[$outp]
+ st1 {@data[0].4s},[$outp]
b 100f
1: // last 2 blocks processing
dup @data[0].4s,$word0
--
2.37.3.windows.1
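
The patch above switches the byte-swap guard from `__ARMEB__` to `__AARCH64EB__` and changes the element arrangement of several loads and stores, so that the 32-bit words seen by the cipher are the same on little-endian and big-endian builds. As a portable C analogue of that goal (not the method the assembly uses), a word can be assembled from individual bytes so that the result does not depend on host byte order. The helpers `load_be32` and `load_block_be` are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Read a 32-bit big-endian word from memory, independent of host endianness. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Split one 16-byte block into four big-endian 32-bit words. */
static void load_block_be(const uint8_t in[16], uint32_t w[4])
{
    for (int i = 0; i < 4; i++)
        w[i] = load_be32(in + 4 * i);
}
```

Applied to the key bytes used in the test vectors, `01 23 45 67 ...`, the first word comes out as `0x01234567` on any host.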

@ -0,0 +1,67 @@
From 8746fff8f096fa35c7157199917100aa7b547d7a Mon Sep 17 00:00:00 2001
From: "fangming.fang" <fangming.fang@arm.com>
Date: Tue, 18 Jan 2022 02:58:08 +0000
Subject: [PATCH 03/13] Fix sm3ss1 translation issue in sm3-armv8.pl
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17542)
---
crypto/sm3/asm/sm3-armv8.pl | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/crypto/sm3/asm/sm3-armv8.pl b/crypto/sm3/asm/sm3-armv8.pl
index bb71b2eade..f0555fd3f2 100644
--- a/crypto/sm3/asm/sm3-armv8.pl
+++ b/crypto/sm3/asm/sm3-armv8.pl
@@ -109,7 +109,7 @@ ___
$code=<<___;
#include "arm_arch.h"
-.arch armv8.2-a+sm4
+.arch armv8.2-a
.text
___
@@ -222,8 +222,8 @@ my %sm3partopcode = (
"sm3partw1" => 0xce60C000,
"sm3partw2" => 0xce60C400);
-my %sm3sslopcode = (
- "sm3ssl" => 0xce400000);
+my %sm3ss1opcode = (
+ "sm3ss1" => 0xce400000);
my %sm3ttopcode = (
"sm3tt1a" => 0xce408000,
@@ -241,14 +241,13 @@ sub unsm3part {
$mnemonic,$arg;
}
-sub unsm3ssl {
+sub unsm3ss1 {
my ($mnemonic,$arg)=@_;
- $arg=~ m/[qv](\d+)[^,]*,\s*[qv](\d+)[^,]*,\s*[qv](\d+)[^,]*,
- \s*[qv](\d+)/o
+ $arg=~ m/[qv](\d+)[^,]*,\s*[qv](\d+)[^,]*,\s*[qv](\d+)[^,]*,\s*[qv](\d+)/o
&&
sprintf ".inst\t0x%08x\t//%s %s",
- $sm3sslopcode{$mnemonic}|$1|($2<<5)|($3<<16)|($4<<10),
+ $sm3ss1opcode{$mnemonic}|$1|($2<<5)|($3<<16)|($4<<10),
$mnemonic,$arg;
}
@@ -274,7 +273,7 @@ foreach(split("\n",$code)) {
s/\`([^\`]*)\`/eval($1)/ge;
s/\b(sm3partw[1-2])\s+([qv].*)/unsm3part($1,$2)/ge;
- s/\b(sm3ssl)\s+([qv].*)/unsm3ssl($1,$2)/ge;
+ s/\b(sm3ss1)\s+([qv].*)/unsm3ss1($1,$2)/ge;
s/\b(sm3tt[1-2][a-b])\s+([qv].*)/unsm3tt($1,$2)/ge;
print $_,"\n";
}
--
2.37.3.windows.1
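
The helpers touched by the patch above turn SM3 mnemonics into `.inst` words by OR-ing register numbers into a base opcode. Before the fix, the substitution regex looked for `sm3ssl` (letter l) while the generated code uses `sm3ss1` (digit 1), so the helper would not have matched that instruction. The sketch below restates the same bit arithmetic in C. The function names `encode_sm3partw1` and `encode_sm3ss1` are hypothetical; the base opcodes and shift amounts are copied from the Perl tables and `sprintf` expressions in the patches.

```c
#include <stdint.h>

/* sm3partw1 Vd, Vn, Vm: base opcode with Rd at bit 0, Rn at bit 5, Rm at bit 16. */
static uint32_t encode_sm3partw1(unsigned rd, unsigned rn, unsigned rm)
{
    return 0xce60c000u | rd | (rn << 5) | (rm << 16);
}

/* sm3ss1 Vd, Vn, Vm, Va: as above, with the fourth register at bit 10. */
static uint32_t encode_sm3ss1(unsigned rd, unsigned rn, unsigned rm, unsigned ra)
{
    return 0xce400000u | rd | (rn << 5) | (rm << 16) | (ra << 10);
}
```

As a cross-check against the source, `encode_sm3partw1(4, 0, 3)` reproduces the `0xce63c004` word that `_armv8_sm3_probe` emits for `sm3partw1 v4.4s, v0.4s, v3.4s`.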

@ -0,0 +1,73 @@
From 98da8a58f964e279decc1bbbe8f07d807de05f7f Mon Sep 17 00:00:00 2001
From: Daniel Hu <Daniel.Hu@arm.com>
Date: Wed, 2 Mar 2022 12:55:39 +0000
Subject: [PATCH 06/13] Further acceleration for SM4-GCM on ARM
This patch allows the SM4-GCM function to leverage the SM4
high-performance CTR crypto interface already implemented for ARM,
which is faster than the single-block cipher routine currently used
for GCM.
It does not accelerate the GHASH function of GCM,
which can be a future task. Even so, there is an immediate uplift in
performance (up to 4X).
Before this patch:
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes   16384 bytes
SM4-GCM     186432.92k    394234.05k    587916.46k    639365.12k    648486.91k    652924.25k
After the patch:
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes   16384 bytes
SM4-GCM     193924.87k    860940.35k   1696083.71k   2302548.31k   2580411.73k   2607398.91k
Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17814)
---
.../ciphers/cipher_sm4_gcm_hw.c | 25 ++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/providers/implementations/ciphers/cipher_sm4_gcm_hw.c b/providers/implementations/ciphers/cipher_sm4_gcm_hw.c
index c0c9b22bd3..b9633f83ed 100644
--- a/providers/implementations/ciphers/cipher_sm4_gcm_hw.c
+++ b/providers/implementations/ciphers/cipher_sm4_gcm_hw.c
@@ -42,11 +42,34 @@ static int sm4_gcm_initkey(PROV_GCM_CTX *ctx, const unsigned char *key,
return 1;
}
+static int hw_gcm_cipher_update(PROV_GCM_CTX *ctx, const unsigned char *in,
+ size_t len, unsigned char *out)
+{
+ if (ctx->enc) {
+ if (ctx->ctr != NULL) {
+ if (CRYPTO_gcm128_encrypt_ctr32(&ctx->gcm, in, out, len, ctx->ctr))
+ return 0;
+ } else {
+ if (CRYPTO_gcm128_encrypt(&ctx->gcm, in, out, len))
+ return 0;
+ }
+ } else {
+ if (ctx->ctr != NULL) {
+ if (CRYPTO_gcm128_decrypt_ctr32(&ctx->gcm, in, out, len, ctx->ctr))
+ return 0;
+ } else {
+ if (CRYPTO_gcm128_decrypt(&ctx->gcm, in, out, len))
+ return 0;
+ }
+ }
+ return 1;
+}
+
static const PROV_GCM_HW sm4_gcm = {
sm4_gcm_initkey,
ossl_gcm_setiv,
ossl_gcm_aad_update,
- ossl_gcm_cipher_update,
+ hw_gcm_cipher_update,
ossl_gcm_cipher_final,
ossl_gcm_one_shot
};
--
2.37.3.windows.1
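
The `hw_gcm_cipher_update` function in the patch above picks the `_ctr32` variants whenever a bulk CTR routine is available (`ctx->ctr != NULL`) and otherwise falls back to the per-block path. The sketch below shows only that dispatch pattern for the CTR part, with no GHASH. It is written under stated assumptions: the block transform is a toy and is NOT SM4, and `toy_block`, `ctr32_inc`, `ctr_per_block`, `ctr32_bulk_fn`, `toy_ctr32_bulk` and `ctr_update` are all hypothetical names rather than OpenSSL APIs.

```c
#include <stddef.h>
#include <stdint.h>

#define BLK 16

/* Toy keyed transform standing in for the block cipher; NOT SM4. */
static void toy_block(const uint8_t *key, const uint8_t *in, uint8_t *out)
{
    for (size_t i = 0; i < BLK; i++)
        out[i] = (uint8_t)((in[i] ^ key[i]) + i);
}

/* Increment the last 32 bits of the counter block, big-endian. */
static void ctr32_inc(uint8_t *ctr)
{
    for (int i = BLK - 1; i >= BLK - 4; i--)
        if (++ctr[i] != 0)
            break;
}

/* Slow path: one cipher call per block; also handles a partial last block. */
static void ctr_per_block(const uint8_t *key, uint8_t *ctr,
                          const uint8_t *in, uint8_t *out, size_t len)
{
    uint8_t ks[BLK];

    while (len > 0) {
        size_t n = len < BLK ? len : BLK;

        toy_block(key, ctr, ks);
        ctr32_inc(ctr);
        for (size_t i = 0; i < n; i++)
            out[i] = (uint8_t)(in[i] ^ ks[i]);
        in += n;
        out += n;
        len -= n;
    }
}

/* Hypothetical signature of a bulk routine that handles whole blocks only. */
typedef void (*ctr32_bulk_fn)(const uint8_t *key, uint8_t *ctr,
                              const uint8_t *in, uint8_t *out, size_t blocks);

/* Stand-in bulk routine; a real one would process several blocks at once. */
static void toy_ctr32_bulk(const uint8_t *key, uint8_t *ctr,
                           const uint8_t *in, uint8_t *out, size_t blocks)
{
    ctr_per_block(key, ctr, in, out, blocks * BLK);
}

/* Dispatcher: bulk routine for whole blocks if present, slow path for the rest. */
static int ctr_update(ctr32_bulk_fn bulk, const uint8_t *key, uint8_t *ctr,
                      const uint8_t *in, uint8_t *out, size_t len)
{
    if (bulk != NULL) {
        size_t blocks = len / BLK;

        bulk(key, ctr, in, out, blocks);
        in += blocks * BLK;
        out += blocks * BLK;
        len -= blocks * BLK;
    }
    ctr_per_block(key, ctr, in, out, len);
    return 1; /* 1 on success, following the convention of the patch */
}
```

Both paths are expected to produce identical output and leave the counter in the same state; only the number of cipher invocations per call differs, which is where the reported speedup would come from.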

@ -0,0 +1,457 @@
From 8a83d735057dde1f727eb0921446e4ca8b085267 Mon Sep 17 00:00:00 2001
From: "fangming.fang" <fangming.fang@arm.com>
Date: Fri, 24 Dec 2021 08:29:04 +0000
Subject: [PATCH 02/13] SM3 acceleration with SM3 hardware instruction on
aarch64
The SM3 hardware instructions are an optional feature of the crypto
extension for aarch64. This implementation accelerates SM3 using those
instructions. On platforms that do not support them, the original C
implementation is still used. Thanks to Alibaba for testing and reporting
the following perf numbers for Yitian-710:
Benchmark on T-Head Yitian-710 2.75GHz:
Before:
type         16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
sm3         49297.82k   121062.63k   223106.05k   283371.52k   307574.10k   309400.92k
After (33% - 74% faster):
type         16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
sm3         65640.01k   179121.79k   359854.59k   481448.96k   534055.59k   538274.47k
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17454)
---
crypto/arm64cpuid.pl | 8 +
crypto/arm_arch.h | 2 +
crypto/armcap.c | 10 ++
crypto/sm3/asm/sm3-armv8.pl | 282 ++++++++++++++++++++++++++++++++++++
crypto/sm3/build.info | 21 ++-
crypto/sm3/sm3_local.h | 16 +-
6 files changed, 336 insertions(+), 3 deletions(-)
create mode 100644 crypto/sm3/asm/sm3-armv8.pl
diff --git a/crypto/arm64cpuid.pl b/crypto/arm64cpuid.pl
index 11f0e50279..10d267b7ad 100755
--- a/crypto/arm64cpuid.pl
+++ b/crypto/arm64cpuid.pl
@@ -96,6 +96,14 @@ _armv8_cpuid_probe:
ret
.size _armv8_cpuid_probe,.-_armv8_cpuid_probe
+.globl _armv8_sm3_probe
+.type _armv8_sm3_probe,%function
+_armv8_sm3_probe:
+ AARCH64_VALID_CALL_TARGET
+ .long 0xce63c004 // sm3partw1 v4.4s, v0.4s, v3.4s
+ ret
+.size _armv8_sm3_probe,.-_armv8_sm3_probe
+
.globl OPENSSL_cleanse
.type OPENSSL_cleanse,%function
.align 5
diff --git a/crypto/arm_arch.h b/crypto/arm_arch.h
index a815a5c72b..c8b501f34c 100644
--- a/crypto/arm_arch.h
+++ b/crypto/arm_arch.h
@@ -83,6 +83,8 @@ extern unsigned int OPENSSL_armv8_rsa_neonized;
# define ARMV8_PMULL (1<<5)
# define ARMV8_SHA512 (1<<6)
# define ARMV8_CPUID (1<<7)
+# define ARMV8_RNG (1<<8)
+# define ARMV8_SM3 (1<<9)
/*
* MIDR_EL1 system register
diff --git a/crypto/armcap.c b/crypto/armcap.c
index c021330e32..365a48df45 100644
--- a/crypto/armcap.c
+++ b/crypto/armcap.c
@@ -52,6 +52,7 @@ void _armv8_sha1_probe(void);
void _armv8_sha256_probe(void);
void _armv8_pmull_probe(void);
# ifdef __aarch64__
+void _armv8_sm3_probe(void);
void _armv8_sha512_probe(void);
unsigned int _armv8_cpuid_probe(void);
# endif
@@ -137,6 +138,7 @@ static unsigned long getauxval(unsigned long key)
# define HWCAP_CE_SHA1 (1 << 5)
# define HWCAP_CE_SHA256 (1 << 6)
# define HWCAP_CPUID (1 << 11)
+# define HWCAP_CE_SM3 (1 << 18)
# define HWCAP_CE_SHA512 (1 << 21)
# endif
@@ -210,6 +212,9 @@ void OPENSSL_cpuid_setup(void)
if (hwcap & HWCAP_CPUID)
OPENSSL_armcap_P |= ARMV8_CPUID;
+
+ if (hwcap & HWCAP_CE_SM3)
+ OPENSSL_armcap_P |= ARMV8_SM3;
# endif
}
# endif
@@ -253,6 +258,11 @@ void OPENSSL_cpuid_setup(void)
_armv8_sha512_probe();
OPENSSL_armcap_P |= ARMV8_SHA512;
}
+
+ if (sigsetjmp(ill_jmp, 1) == 0) {
+ _armv8_sm3_probe();
+ OPENSSL_armcap_P |= ARMV8_SM3;
+ }
# endif
}
# endif
diff --git a/crypto/sm3/asm/sm3-armv8.pl b/crypto/sm3/asm/sm3-armv8.pl
new file mode 100644
index 0000000000..bb71b2eade
--- /dev/null
+++ b/crypto/sm3/asm/sm3-armv8.pl
@@ -0,0 +1,282 @@
+#! /usr/bin/env perl
+# Copyright 2021 The OpenSSL Project Authors. All Rights Reserved.
+#
+# Licensed under the Apache License 2.0 (the "License"). You may not use
+# this file except in compliance with the License. You can obtain a copy
+# in the file LICENSE in the source distribution or at
+# https://www.openssl.org/source/license.html
+#
+# This module implements support for Armv8 SM3 instructions
+
+# $output is the last argument if it looks like a file (it has an extension)
+# $flavour is the first argument if it doesn't look like a file
+$output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef;
+$flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef;
+
+$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+die "can't locate arm-xlate.pl";
+
+open OUT,"| \"$^X\" $xlate $flavour \"$output\""
+ or die "can't call $xlate: $!";
+*STDOUT=*OUT;
+
+# Message expanding:
+# Wj <- P1(W[j-16]^W[j-9]^(W[j-3]<<<15))^(W[j-13]<<<7)^W[j-6]
+# Input: s0, s1, s2, s3
+# s0 = w0 | w1 | w2 | w3
+# s1 = w4 | w5 | w6 | w7
+# s2 = w8 | w9 | w10 | w11
+# s3 = w12 | w13 | w14 | w15
+# Output: s4
+sub msg_exp () {
+my $s0 = shift;
+my $s1 = shift;
+my $s2 = shift;
+my $s3 = shift;
+my $s4 = shift;
+my $vtmp1 = shift;
+my $vtmp2 = shift;
+$code.=<<___;
+ // s4 = w7 | w8 | w9 | w10
+ ext $s4.16b, $s1.16b, $s2.16b, #12
+ // vtmp1 = w3 | w4 | w5 | w6
+ ext $vtmp1.16b, $s0.16b, $s1.16b, #12
+ // vtmp2 = w10 | w11 | w12 | w13
+ ext $vtmp2.16b, $s2.16b, $s3.16b, #8
+ sm3partw1 $s4.4s, $s0.4s, $s3.4s
+ sm3partw2 $s4.4s, $vtmp2.4s, $vtmp1.4s
+___
+}
+
+# A round of compresson function
+# Input:
+# ab - choose instruction among sm3tt1a, sm3tt1b, sm3tt2a, sm3tt2b
+# vstate0 - vstate1, store digest status(A - H)
+# vconst0 - vconst1, interleaved used to store Tj <<< j
+# vtmp - temporary register
+# vw - for sm3tt1ab, vw = s0 eor s1
+# s0 - for sm3tt2ab, just be s0
+# i, choose wj' or wj from vw
+sub round () {
+my $ab = shift;
+my $vstate0 = shift;
+my $vstate1 = shift;
+my $vconst0 = shift;
+my $vconst1 = shift;
+my $vtmp = shift;
+my $vw = shift;
+my $s0 = shift;
+my $i = shift;
+$code.=<<___;
+ sm3ss1 $vtmp.4s, $vstate0.4s, $vconst0.4s, $vstate1.4s
+ shl $vconst1.4s, $vconst0.4s, #1
+ sri $vconst1.4s, $vconst0.4s, #31
+ sm3tt1$ab $vstate0.4s, $vtmp.4s, $vw.4s[$i]
+ sm3tt2$ab $vstate1.4s, $vtmp.4s, $s0.4s[$i]
+___
+}
+
+sub qround () {
+my $ab = shift;
+my $vstate0 = shift;
+my $vstate1 = shift;
+my $vconst0 = shift;
+my $vconst1 = shift;
+my $vtmp1 = shift;
+my $vtmp2 = shift;
+my $s0 = shift;
+my $s1 = shift;
+my $s2 = shift;
+my $s3 = shift;
+my $s4 = shift;
+ if($s4) {
+ &msg_exp($s0, $s1, $s2, $s3, $s4, $vtmp1, $vtmp2);
+ }
+$code.=<<___;
+ eor $vtmp1.16b, $s0.16b, $s1.16b
+___
+ &round($ab, $vstate0, $vstate1, $vconst0, $vconst1, $vtmp2,
+ $vtmp1, $s0, 0);
+ &round($ab, $vstate0, $vstate1, $vconst1, $vconst0, $vtmp2,
+ $vtmp1, $s0, 1);
+ &round($ab, $vstate0, $vstate1, $vconst0, $vconst1, $vtmp2,
+ $vtmp1, $s0, 2);
+ &round($ab, $vstate0, $vstate1, $vconst1, $vconst0, $vtmp2,
+ $vtmp1, $s0, 3);
+}
+
+$code=<<___;
+#include "arm_arch.h"
+.arch armv8.2-a+sm4
+.text
+___
+
+{{{
+my ($pstate,$pdata,$num)=("x0","x1","w2");
+my ($state1,$state2)=("v5","v6");
+my ($sconst1, $sconst2)=("s16","s17");
+my ($vconst1, $vconst2)=("v16","v17");
+my ($s0,$s1,$s2,$s3,$s4)=map("v$_",(0..4));
+my ($bkstate1,$bkstate2)=("v18","v19");
+my ($vconst_tmp1,$vconst_tmp2)=("v20","v21");
+my ($vtmp1,$vtmp2)=("v22","v23");
+my $constaddr="x8";
+# void ossl_hwsm3_block_data_order(SM3_CTX *c, const void *p, size_t num)
+$code.=<<___;
+.globl ossl_hwsm3_block_data_order
+.type ossl_hwsm3_block_data_order,%function
+.align 5
+ossl_hwsm3_block_data_order:
+ AARCH64_VALID_CALL_TARGET
+ // load state
+ ld1 {$state1.4s-$state2.4s}, [$pstate]
+ rev64 $state1.4s, $state1.4s
+ rev64 $state2.4s, $state2.4s
+ ext $state1.16b, $state1.16b, $state1.16b, #8
+ ext $state2.16b, $state2.16b, $state2.16b, #8
+
+ adr $constaddr, .Tj
+ ldp $sconst1, $sconst2, [$constaddr]
+
+.Loop:
+ // load input
+ ld1 {$s0.16b-$s3.16b}, [$pdata], #64
+ sub $num, $num, #1
+
+ mov $bkstate1.16b, $state1.16b
+ mov $bkstate2.16b, $state2.16b
+
+#ifndef __ARMEB__
+ rev32 $s0.16b, $s0.16b
+ rev32 $s1.16b, $s1.16b
+ rev32 $s2.16b, $s2.16b
+ rev32 $s3.16b, $s3.16b
+#endif
+
+ ext $vconst_tmp1.16b, $vconst1.16b, $vconst1.16b, #4
+___
+ &qround("a",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s0,$s1,$s2,$s3,$s4);
+ &qround("a",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s1,$s2,$s3,$s4,$s0);
+ &qround("a",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s2,$s3,$s4,$s0,$s1);
+ &qround("a",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s3,$s4,$s0,$s1,$s2);
+
+$code.=<<___;
+ ext $vconst_tmp1.16b, $vconst2.16b, $vconst2.16b, #4
+___
+
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s4,$s0,$s1,$s2,$s3);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s0,$s1,$s2,$s3,$s4);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s1,$s2,$s3,$s4,$s0);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s2,$s3,$s4,$s0,$s1);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s3,$s4,$s0,$s1,$s2);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s4,$s0,$s1,$s2,$s3);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s0,$s1,$s2,$s3,$s4);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s1,$s2,$s3,$s4,$s0);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s2,$s3,$s4,$s0,$s1);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s3,$s4);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s4,$s0);
+ &qround("b",$state1,$state2,$vconst_tmp1,$vconst_tmp2,$vtmp1,$vtmp2,
+ $s0,$s1);
+
+$code.=<<___;
+ eor $state1.16b, $state1.16b, $bkstate1.16b
+ eor $state2.16b, $state2.16b, $bkstate2.16b
+
+ // any remained blocks?
+ cbnz $num, .Loop
+
+ // save state
+ rev64 $state1.4s, $state1.4s
+ rev64 $state2.4s, $state2.4s
+ ext $state1.16b, $state1.16b, $state1.16b, #8
+ ext $state2.16b, $state2.16b, $state2.16b, #8
+ st1 {$state1.4s-$state2.4s}, [$pstate]
+ ret
+.size ossl_hwsm3_block_data_order,.-ossl_hwsm3_block_data_order
+
+.align 3
+.Tj:
+.word 0x79cc4519, 0x9d8a7a87
+___
+}}}
+
+#########################################
+my %sm3partopcode = (
+ "sm3partw1" => 0xce60C000,
+ "sm3partw2" => 0xce60C400);
+
+my %sm3sslopcode = (
+ "sm3ssl" => 0xce400000);
+
+my %sm3ttopcode = (
+ "sm3tt1a" => 0xce408000,
+ "sm3tt1b" => 0xce408400,
+ "sm3tt2a" => 0xce408800,
+ "sm3tt2b" => 0xce408C00);
+
+sub unsm3part {
+ my ($mnemonic,$arg)=@_;
+
+ $arg=~ m/[qv](\d+)[^,]*,\s*[qv](\d+)[^,]*,\s*[qv](\d+)/o
+ &&
+ sprintf ".inst\t0x%08x\t//%s %s",
+ $sm3partopcode{$mnemonic}|$1|($2<<5)|($3<<16),
+ $mnemonic,$arg;
+}
+
+sub unsm3ssl {
+ my ($mnemonic,$arg)=@_;
+
+ $arg=~ m/[qv](\d+)[^,]*,\s*[qv](\d+)[^,]*,\s*[qv](\d+)[^,]*,
+ \s*[qv](\d+)/o
+ &&
+ sprintf ".inst\t0x%08x\t//%s %s",
+ $sm3sslopcode{$mnemonic}|$1|($2<<5)|($3<<16)|($4<<10),
+ $mnemonic,$arg;
+}
+
+sub unsm3tt {
+ my ($mnemonic,$arg)=@_;
+
+ $arg=~ m/[qv](\d+)[^,]*,\s*[qv](\d+)[^,]*,\s*[qv](\d+)[^,]*\[([0-3])\]/o
+ &&
+ sprintf ".inst\t0x%08x\t//%s %s",
+ $sm3ttopcode{$mnemonic}|$1|($2<<5)|($3<<16)|($4<<12),
+ $mnemonic,$arg;
+}
+
+open SELF,$0;
+while(<SELF>) {
+ next if (/^#!/);
+ last if (!s/^#/\/\// and !/^$/);
+ print;
+}
+close SELF;
+
+foreach(split("\n",$code)) {
+ s/\`([^\`]*)\`/eval($1)/ge;
+
+ s/\b(sm3partw[1-2])\s+([qv].*)/unsm3part($1,$2)/ge;
+ s/\b(sm3ssl)\s+([qv].*)/unsm3ssl($1,$2)/ge;
+ s/\b(sm3tt[1-2][a-b])\s+([qv].*)/unsm3tt($1,$2)/ge;
+ print $_,"\n";
+}
+
+close STDOUT or die "error closing STDOUT: $!";
diff --git a/crypto/sm3/build.info b/crypto/sm3/build.info
index eca68216f2..2fa54a4a8b 100644
--- a/crypto/sm3/build.info
+++ b/crypto/sm3/build.info
@@ -1,5 +1,22 @@
LIBS=../../libcrypto
IF[{- !$disabled{sm3} -}]
- SOURCE[../../libcrypto]=sm3.c legacy_sm3.c
-ENDIF
\ No newline at end of file
+ IF[{- !$disabled{asm} -}]
+ $SM3ASM_aarch64=sm3-armv8.S
+ $SM3DEF_aarch64=OPENSSL_SM3_ASM
+
+ # Now that we have defined all the arch specific variables, use the
+ # appropriate ones, and define the appropriate macros
+ IF[$SM3ASM_{- $target{asm_arch} -}]
+ $SM3ASM=$SM3ASM_{- $target{asm_arch} -}
+ $SM3DEF=$SM3DEF_{- $target{asm_arch} -}
+ ENDIF
+ ENDIF
+
+ SOURCE[../../libcrypto]=sm3.c legacy_sm3.c $SM3ASM
+ DEFINE[../../libcrypto]=$SM3DEF
+
+ GENERATE[sm3-armv8.S]=asm/sm3-armv8.pl
+ INCLUDE[sm3-armv8.o]=..
+ENDIF
+
diff --git a/crypto/sm3/sm3_local.h b/crypto/sm3/sm3_local.h
index 6daeb878a8..ac8a2bf768 100644
--- a/crypto/sm3/sm3_local.h
+++ b/crypto/sm3/sm3_local.h
@@ -32,7 +32,21 @@
ll=(c)->G; (void)HOST_l2c(ll, (s)); \
ll=(c)->H; (void)HOST_l2c(ll, (s)); \
} while (0)
-#define HASH_BLOCK_DATA_ORDER ossl_sm3_block_data_order
+
+#if defined(OPENSSL_SM3_ASM)
+# if defined(__aarch64__)
+# include "crypto/arm_arch.h"
+# define HWSM3_CAPABLE (OPENSSL_armcap_P & ARMV8_SM3)
+void ossl_hwsm3_block_data_order(SM3_CTX *c, const void *p, size_t num);
+# endif
+#endif
+
+#if defined(HWSM3_CAPABLE)
+# define HASH_BLOCK_DATA_ORDER (HWSM3_CAPABLE ? ossl_hwsm3_block_data_order \
+ : ossl_sm3_block_data_order)
+#else
+# define HASH_BLOCK_DATA_ORDER ossl_sm3_block_data_order
+#endif
void ossl_sm3_block_data_order(SM3_CTX *c, const void *p, size_t num);
void ossl_sm3_transform(SM3_CTX *c, const unsigned char *data);
--
2.37.3.windows.1
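
Note on the sm3_local.h hunk above: HASH_BLOCK_DATA_ORDER becomes an expression that picks a block function at run time from a CPU capability word. The following is a minimal sketch of that macro pattern only. Every `demo_` name and `DEMO_` macro is a hypothetical stand-in (for OPENSSL_armcap_P, ARMV8_SM3 and the two block functions), not an OpenSSL symbol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context that only counts which path handled the blocks. */
typedef struct {
    unsigned int calls_hw;
    unsigned int calls_sw;
} demo_ctx;

/* Stand-in for OPENSSL_armcap_P: a capability word filled in at startup. */
static unsigned int demo_cpu_caps = 0;
/* Stand-in for ARMV8_SM3: one bit in that word. */
#define DEMO_CAP_SM3 (1u << 0)

/* Stand-in for the assembly routine. */
static void demo_hw_block(demo_ctx *c, const void *p, size_t num)
{
    (void)p;
    c->calls_hw += (unsigned int)num;
}

/* Stand-in for the portable C routine. */
static void demo_sw_block(demo_ctx *c, const void *p, size_t num)
{
    (void)p;
    c->calls_sw += (unsigned int)num;
}

/* Same shape as HWSM3_CAPABLE / HASH_BLOCK_DATA_ORDER in the hunk:
 * the macro expands to a conditional expression over two functions. */
#define DEMO_HW_CAPABLE (demo_cpu_caps & DEMO_CAP_SM3)
#define DEMO_BLOCK_FN   (DEMO_HW_CAPABLE ? demo_hw_block : demo_sw_block)

/* A caller uses the macro as if it were a function name. */
static void demo_update(demo_ctx *c, const void *data, size_t blocks)
{
    DEMO_BLOCK_FN(c, data, blocks);
}
```

Because the capability test is evaluated on every call, the same binary can run on cores with and without the SM3 instructions; the cost is one load and one branch per call.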

File diff suppressed because it is too large (4 files)

View File

@ -0,0 +1,360 @@
From 2f1c0b5f1b585a307f21a70ef3ae652643c25f6d Mon Sep 17 00:00:00 2001
From: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Date: Wed, 1 Sep 2021 16:54:15 +0800
Subject: [PATCH 04/13] providers: Add SM4 GCM implementation
The GCM mode of the SM4 algorithm is specified by RFC8998.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Paul Yang <kaishen.yy@antfin.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16491)
---
providers/defltprov.c | 2 +
providers/implementations/ciphers/build.info | 4 +-
.../implementations/ciphers/cipher_sm4_ccm.c | 39 +++++++++++++++++
.../implementations/ciphers/cipher_sm4_ccm.h | 22 ++++++++++
.../ciphers/cipher_sm4_ccm_hw.c | 41 ++++++++++++++++++
.../implementations/ciphers/cipher_sm4_gcm.c | 40 +++++++++++++++++
.../implementations/ciphers/cipher_sm4_gcm.h | 22 ++++++++++
.../ciphers/cipher_sm4_gcm_hw.c | 43 +++++++++++++++++++
.../include/prov/implementations.h | 2 +
.../implementations/include/prov/names.h | 2 +
test/recipes/30-test_evp_data/evpciph_sm4.txt | 20 +++++++++
11 files changed, 236 insertions(+), 1 deletion(-)
create mode 100644 providers/implementations/ciphers/cipher_sm4_ccm.c
create mode 100644 providers/implementations/ciphers/cipher_sm4_ccm.h
create mode 100644 providers/implementations/ciphers/cipher_sm4_ccm_hw.c
create mode 100644 providers/implementations/ciphers/cipher_sm4_gcm.c
create mode 100644 providers/implementations/ciphers/cipher_sm4_gcm.h
create mode 100644 providers/implementations/ciphers/cipher_sm4_gcm_hw.c
diff --git a/providers/defltprov.c b/providers/defltprov.c
index ed3f4799e7..cc0b0c3b62 100644
--- a/providers/defltprov.c
+++ b/providers/defltprov.c
@@ -289,6 +289,8 @@ static const OSSL_ALGORITHM_CAPABLE deflt_ciphers[] = {
ALG(PROV_NAMES_DES_EDE_CFB, ossl_tdes_ede2_cfb_functions),
#endif /* OPENSSL_NO_DES */
#ifndef OPENSSL_NO_SM4
+ ALG(PROV_NAMES_SM4_GCM, ossl_sm4128gcm_functions),
+ ALG(PROV_NAMES_SM4_CCM, ossl_sm4128ccm_functions),
ALG(PROV_NAMES_SM4_ECB, ossl_sm4128ecb_functions),
ALG(PROV_NAMES_SM4_CBC, ossl_sm4128cbc_functions),
ALG(PROV_NAMES_SM4_CTR, ossl_sm4128ctr_functions),
diff --git a/providers/implementations/ciphers/build.info b/providers/implementations/ciphers/build.info
index e4c5f4f051..b5d9d4f6c1 100644
--- a/providers/implementations/ciphers/build.info
+++ b/providers/implementations/ciphers/build.info
@@ -105,7 +105,9 @@ ENDIF
IF[{- !$disabled{sm4} -}]
SOURCE[$SM4_GOAL]=\
- cipher_sm4.c cipher_sm4_hw.c
+ cipher_sm4.c cipher_sm4_hw.c \
+ cipher_sm4_gcm.c cipher_sm4_gcm_hw.c \
+ cipher_sm4_ccm.c cipher_sm4_ccm_hw.c
ENDIF
IF[{- !$disabled{ocb} -}]
diff --git a/providers/implementations/ciphers/cipher_sm4_ccm.c b/providers/implementations/ciphers/cipher_sm4_ccm.c
new file mode 100644
index 0000000000..f0295a5ca2
--- /dev/null
+++ b/providers/implementations/ciphers/cipher_sm4_ccm.c
@@ -0,0 +1,39 @@
+/*
+ * Copyright 2021 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+/* Dispatch functions for SM4 CCM mode */
+
+#include "cipher_sm4_ccm.h"
+#include "prov/implementations.h"
+#include "prov/providercommon.h"
+
+static OSSL_FUNC_cipher_freectx_fn sm4_ccm_freectx;
+
+static void *sm4_ccm_newctx(void *provctx, size_t keybits)
+{
+ PROV_SM4_CCM_CTX *ctx;
+
+ if (!ossl_prov_is_running())
+ return NULL;
+
+ ctx = OPENSSL_zalloc(sizeof(*ctx));
+ if (ctx != NULL)
+ ossl_ccm_initctx(&ctx->base, keybits, ossl_prov_sm4_hw_ccm(keybits));
+ return ctx;
+}
+
+static void sm4_ccm_freectx(void *vctx)
+{
+ PROV_SM4_CCM_CTX *ctx = (PROV_SM4_CCM_CTX *)vctx;
+
+ OPENSSL_clear_free(ctx, sizeof(*ctx));
+}
+
+/* sm4128ccm functions */
+IMPLEMENT_aead_cipher(sm4, ccm, CCM, AEAD_FLAGS, 128, 8, 96);
diff --git a/providers/implementations/ciphers/cipher_sm4_ccm.h b/providers/implementations/ciphers/cipher_sm4_ccm.h
new file mode 100644
index 0000000000..189e71e9e4
--- /dev/null
+++ b/providers/implementations/ciphers/cipher_sm4_ccm.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright 2021 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+#include "crypto/sm4.h"
+#include "prov/ciphercommon.h"
+#include "prov/ciphercommon_ccm.h"
+
+typedef struct prov_sm4_ccm_ctx_st {
+ PROV_CCM_CTX base; /* Must be first */
+ union {
+ OSSL_UNION_ALIGN;
+ SM4_KEY ks;
+ } ks; /* SM4 key schedule to use */
+} PROV_SM4_CCM_CTX;
+
+const PROV_CCM_HW *ossl_prov_sm4_hw_ccm(size_t keylen);
diff --git a/providers/implementations/ciphers/cipher_sm4_ccm_hw.c b/providers/implementations/ciphers/cipher_sm4_ccm_hw.c
new file mode 100644
index 0000000000..791daf3e46
--- /dev/null
+++ b/providers/implementations/ciphers/cipher_sm4_ccm_hw.c
@@ -0,0 +1,41 @@
+/*
+ * Copyright 2021 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+/*-
+ * Generic support for SM4 CCM.
+ */
+
+#include "cipher_sm4_ccm.h"
+
+static int ccm_sm4_initkey(PROV_CCM_CTX *ctx,
+ const unsigned char *key, size_t keylen)
+{
+ PROV_SM4_CCM_CTX *actx = (PROV_SM4_CCM_CTX *)ctx;
+
+ ossl_sm4_set_key(key, &actx->ks.ks);
+ CRYPTO_ccm128_init(&ctx->ccm_ctx, ctx->m, ctx->l, &actx->ks.ks,
+ (block128_f)ossl_sm4_encrypt);
+ ctx->str = NULL;
+ ctx->key_set = 1;
+ return 1;
+}
+
+static const PROV_CCM_HW ccm_sm4 = {
+ ccm_sm4_initkey,
+ ossl_ccm_generic_setiv,
+ ossl_ccm_generic_setaad,
+ ossl_ccm_generic_auth_encrypt,
+ ossl_ccm_generic_auth_decrypt,
+ ossl_ccm_generic_gettag
+};
+
+const PROV_CCM_HW *ossl_prov_sm4_hw_ccm(size_t keybits)
+{
+ return &ccm_sm4;
+}
diff --git a/providers/implementations/ciphers/cipher_sm4_gcm.c b/providers/implementations/ciphers/cipher_sm4_gcm.c
new file mode 100644
index 0000000000..7a936f00ee
--- /dev/null
+++ b/providers/implementations/ciphers/cipher_sm4_gcm.c
@@ -0,0 +1,40 @@
+/*
+ * Copyright 2021 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+/* Dispatch functions for SM4 GCM mode */
+
+#include "cipher_sm4_gcm.h"
+#include "prov/implementations.h"
+#include "prov/providercommon.h"
+
+static OSSL_FUNC_cipher_freectx_fn sm4_gcm_freectx;
+
+static void *sm4_gcm_newctx(void *provctx, size_t keybits)
+{
+ PROV_SM4_GCM_CTX *ctx;
+
+ if (!ossl_prov_is_running())
+ return NULL;
+
+ ctx = OPENSSL_zalloc(sizeof(*ctx));
+ if (ctx != NULL)
+ ossl_gcm_initctx(provctx, &ctx->base, keybits,
+ ossl_prov_sm4_hw_gcm(keybits));
+ return ctx;
+}
+
+static void sm4_gcm_freectx(void *vctx)
+{
+ PROV_SM4_GCM_CTX *ctx = (PROV_SM4_GCM_CTX *)vctx;
+
+ OPENSSL_clear_free(ctx, sizeof(*ctx));
+}
+
+/* ossl_sm4128gcm_functions */
+IMPLEMENT_aead_cipher(sm4, gcm, GCM, AEAD_FLAGS, 128, 8, 96);
diff --git a/providers/implementations/ciphers/cipher_sm4_gcm.h b/providers/implementations/ciphers/cipher_sm4_gcm.h
new file mode 100644
index 0000000000..2b6b5f3ece
--- /dev/null
+++ b/providers/implementations/ciphers/cipher_sm4_gcm.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright 2021 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+#include "crypto/sm4.h"
+#include "prov/ciphercommon.h"
+#include "prov/ciphercommon_gcm.h"
+
+typedef struct prov_sm4_gcm_ctx_st {
+ PROV_GCM_CTX base; /* must be first entry in struct */
+ union {
+ OSSL_UNION_ALIGN;
+ SM4_KEY ks;
+ } ks;
+} PROV_SM4_GCM_CTX;
+
+const PROV_GCM_HW *ossl_prov_sm4_hw_gcm(size_t keybits);
diff --git a/providers/implementations/ciphers/cipher_sm4_gcm_hw.c b/providers/implementations/ciphers/cipher_sm4_gcm_hw.c
new file mode 100644
index 0000000000..6bcd1ec406
--- /dev/null
+++ b/providers/implementations/ciphers/cipher_sm4_gcm_hw.c
@@ -0,0 +1,43 @@
+/*
+ * Copyright 2021 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+/*-
+ * Generic support for SM4 GCM.
+ */
+
+#include "cipher_sm4_gcm.h"
+
+static int sm4_gcm_initkey(PROV_GCM_CTX *ctx, const unsigned char *key,
+ size_t keylen)
+{
+ PROV_SM4_GCM_CTX *actx = (PROV_SM4_GCM_CTX *)ctx;
+ SM4_KEY *ks = &actx->ks.ks;
+
+ ctx->ks = ks;
+ ossl_sm4_set_key(key, ks);
+ CRYPTO_gcm128_init(&ctx->gcm, ks, (block128_f)ossl_sm4_encrypt);
+ ctx->ctr = (ctr128_f)NULL;
+ ctx->key_set = 1;
+
+ return 1;
+}
+
+static const PROV_GCM_HW sm4_gcm = {
+ sm4_gcm_initkey,
+ ossl_gcm_setiv,
+ ossl_gcm_aad_update,
+ ossl_gcm_cipher_update,
+ ossl_gcm_cipher_final,
+ ossl_gcm_one_shot
+};
+
+const PROV_GCM_HW *ossl_prov_sm4_hw_gcm(size_t keybits)
+{
+ return &sm4_gcm;
+}
diff --git a/providers/implementations/include/prov/implementations.h b/providers/implementations/include/prov/implementations.h
index 3f6dd7ee16..498eab4ad4 100644
--- a/providers/implementations/include/prov/implementations.h
+++ b/providers/implementations/include/prov/implementations.h
@@ -174,6 +174,8 @@ extern const OSSL_DISPATCH ossl_seed128ofb128_functions[];
extern const OSSL_DISPATCH ossl_seed128cfb128_functions[];
#endif /* OPENSSL_NO_SEED */
#ifndef OPENSSL_NO_SM4
+extern const OSSL_DISPATCH ossl_sm4128gcm_functions[];
+extern const OSSL_DISPATCH ossl_sm4128ccm_functions[];
extern const OSSL_DISPATCH ossl_sm4128ecb_functions[];
extern const OSSL_DISPATCH ossl_sm4128cbc_functions[];
extern const OSSL_DISPATCH ossl_sm4128ctr_functions[];
diff --git a/providers/implementations/include/prov/names.h b/providers/implementations/include/prov/names.h
index e0dbb69a9d..0fac23a850 100644
--- a/providers/implementations/include/prov/names.h
+++ b/providers/implementations/include/prov/names.h
@@ -162,6 +162,8 @@
#define PROV_NAMES_SM4_CTR "SM4-CTR:1.2.156.10197.1.104.7"
#define PROV_NAMES_SM4_OFB "SM4-OFB:SM4-OFB128:1.2.156.10197.1.104.3"
#define PROV_NAMES_SM4_CFB "SM4-CFB:SM4-CFB128:1.2.156.10197.1.104.4"
+#define PROV_NAMES_SM4_GCM "SM4-GCM:1.2.156.10197.1.104.8"
+#define PROV_NAMES_SM4_CCM "SM4-CCM:1.2.156.10197.1.104.9"
#define PROV_NAMES_ChaCha20 "ChaCha20"
#define PROV_NAMES_ChaCha20_Poly1305 "ChaCha20-Poly1305"
#define PROV_NAMES_CAST5_ECB "CAST5-ECB"
diff --git a/test/recipes/30-test_evp_data/evpciph_sm4.txt b/test/recipes/30-test_evp_data/evpciph_sm4.txt
index ec8a45bd3f..9fb16ca15c 100644
--- a/test/recipes/30-test_evp_data/evpciph_sm4.txt
+++ b/test/recipes/30-test_evp_data/evpciph_sm4.txt
@@ -36,3 +36,23 @@ Key = 0123456789ABCDEFFEDCBA9876543210
IV = 0123456789ABCDEFFEDCBA9876543210
Plaintext = AAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBCCCCCCCCCCCCCCCCDDDDDDDDDDDDDDDDEEEEEEEEEEEEEEEEFFFFFFFFFFFFFFFFEEEEEEEEEEEEEEEEAAAAAAAAAAAAAAAA
Ciphertext = C2B4759E78AC3CF43D0852F4E8D5F9FD7256E8A5FCB65A350EE00630912E44492A0B17E1B85B060D0FBA612D8A95831638B361FD5FFACD942F081485A83CA35D
+
+Title = SM4 GCM test vectors from RFC8998
+
+Cipher = SM4-GCM
+Key = 0123456789abcdeffedcba9876543210
+IV = 00001234567800000000abcd
+AAD = feedfacedeadbeeffeedfacedeadbeefabaddad2
+Tag = 83de3541e4c2b58177e065a9bf7b62ec
+Plaintext = aaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbccccccccccccccccddddddddddddddddeeeeeeeeeeeeeeeeffffffffffffffffeeeeeeeeeeeeeeeeaaaaaaaaaaaaaaaa
+Ciphertext = 17f399f08c67d5ee19d0dc9969c4bb7d5fd46fd3756489069157b282bb200735d82710ca5c22f0ccfa7cbf93d496ac15a56834cbcf98c397b4024a2691233b8d
+
+Title = SM4 CCM test vectors from RFC8998
+
+Cipher = SM4-CCM
+Key = 0123456789abcdeffedcba9876543210
+IV = 00001234567800000000abcd
+AAD = feedfacedeadbeeffeedfacedeadbeefabaddad2
+Tag = 16842d4fa186f56ab33256971fa110f4
+Plaintext = aaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbccccccccccccccccddddddddddddddddeeeeeeeeeeeeeeeeffffffffffffffffeeeeeeeeeeeeeeeeaaaaaaaaaaaaaaaa
+Ciphertext = 48af93501fa62adbcd414cce6034d895dda1bf8f132f042098661572e7483094fd12e518ce062c98acee28d95df4416bed31a2f04476c18bb40c84a74b97dc5b
--
2.37.3.windows.1
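
Note on the PROV_SM4_GCM_CTX and PROV_SM4_CCM_CTX layouts above: both place the generic mode context first so that the init-key callback, which only receives a pointer to the base, can cast back to the full SM4 context and reach the key schedule. The sketch below shows that pattern in isolation. All `demo_` names are hypothetical, the "key schedule" is a plain byte copy rather than SM4 key expansion, and the key-length check is an addition of the sketch that the patch does not perform at this point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for SM4_KEY; holds a raw copy instead of round keys. */
typedef struct {
    unsigned char rk[16];
} demo_key;

/* Stand-in for the generic mode context (PROV_GCM_CTX in the patch). */
typedef struct demo_base_st {
    const void *ks;   /* points at the cipher-specific key schedule */
    int key_set;
} demo_base;

/* Stand-in for PROV_SM4_GCM_CTX: the base must be the first member. */
typedef struct demo_sm4_ctx_st {
    demo_base base;
    demo_key ks;
} demo_sm4_ctx;

/* The callback only sees the base pointer. Because the base is the first
 * member, a pointer to it is also a valid pointer to the enclosing struct. */
static int demo_initkey(demo_base *ctx, const unsigned char *key,
                        size_t keylen)
{
    demo_sm4_ctx *actx = (demo_sm4_ctx *)ctx;

    if (keylen != sizeof(actx->ks.rk))   /* check added by this sketch */
        return 0;
    memcpy(actx->ks.rk, key, keylen);
    ctx->ks = &actx->ks;
    ctx->key_set = 1;
    return 1;
}
```

If the base member were moved away from offset 0, the cast in the callback would silently point at the wrong memory, which is why the headers carry the "must be first" comments.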

View File

@ -0,0 +1,763 @@
From 57c854480481bd6b0900984d17db17426c44aa40 Mon Sep 17 00:00:00 2001
From: Xu Yizhou <xuyizhou1@huawei.com>
Date: Fri, 25 Nov 2022 13:52:49 +0800
Subject: [PATCH 08/13] providers: Add SM4 XTS implementation
Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19619)
---
crypto/modes/build.info | 2 +-
crypto/modes/xts128gb.c | 199 +++++++++++++
include/crypto/modes.h | 6 +
include/openssl/core_names.h | 1 +
providers/defltprov.c | 1 +
providers/implementations/ciphers/build.info | 4 +-
.../implementations/ciphers/cipher_sm4_xts.c | 281 ++++++++++++++++++
.../implementations/ciphers/cipher_sm4_xts.h | 46 +++
.../ciphers/cipher_sm4_xts_hw.c | 89 ++++++
.../include/prov/implementations.h | 1 +
.../implementations/include/prov/names.h | 1 +
11 files changed, 629 insertions(+), 2 deletions(-)
create mode 100644 crypto/modes/xts128gb.c
create mode 100644 providers/implementations/ciphers/cipher_sm4_xts.c
create mode 100644 providers/implementations/ciphers/cipher_sm4_xts.h
create mode 100644 providers/implementations/ciphers/cipher_sm4_xts_hw.c
diff --git a/crypto/modes/build.info b/crypto/modes/build.info
index f3558fa1a4..0ee297ced8 100644
--- a/crypto/modes/build.info
+++ b/crypto/modes/build.info
@@ -49,7 +49,7 @@ IF[{- !$disabled{asm} -}]
ENDIF
$COMMON=cbc128.c ctr128.c cfb128.c ofb128.c gcm128.c ccm128.c xts128.c \
- wrap128.c $MODESASM
+ wrap128.c xts128gb.c $MODESASM
SOURCE[../../libcrypto]=$COMMON \
cts128.c ocb128.c siv128.c
SOURCE[../../providers/libfips.a]=$COMMON
diff --git a/crypto/modes/xts128gb.c b/crypto/modes/xts128gb.c
new file mode 100644
index 0000000000..021c0597e4
--- /dev/null
+++ b/crypto/modes/xts128gb.c
@@ -0,0 +1,199 @@
+/*
+ * Copyright 2022 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+#include <string.h>
+#include <openssl/crypto.h>
+#include "internal/endian.h"
+#include "crypto/modes.h"
+
+#ifndef STRICT_ALIGNMENT
+# ifdef __GNUC__
+typedef u64 u64_a1 __attribute((__aligned__(1)));
+# else
+typedef u64 u64_a1;
+# endif
+#endif
+
+int ossl_crypto_xts128gb_encrypt(const XTS128_CONTEXT *ctx,
+ const unsigned char iv[16],
+ const unsigned char *inp, unsigned char *out,
+ size_t len, int enc)
+{
+ DECLARE_IS_ENDIAN;
+ union {
+ u64 u[2];
+ u32 d[4];
+ u8 c[16];
+ } tweak, scratch;
+ unsigned int i;
+
+ if (len < 16)
+ return -1;
+
+ memcpy(tweak.c, iv, 16);
+
+ (*ctx->block2) (tweak.c, tweak.c, ctx->key2);
+
+ if (!enc && (len % 16))
+ len -= 16;
+
+ while (len >= 16) {
+#if defined(STRICT_ALIGNMENT)
+ memcpy(scratch.c, inp, 16);
+ scratch.u[0] ^= tweak.u[0];
+ scratch.u[1] ^= tweak.u[1];
+#else
+ scratch.u[0] = ((u64_a1 *)inp)[0] ^ tweak.u[0];
+ scratch.u[1] = ((u64_a1 *)inp)[1] ^ tweak.u[1];
+#endif
+ (*ctx->block1) (scratch.c, scratch.c, ctx->key1);
+#if defined(STRICT_ALIGNMENT)
+ scratch.u[0] ^= tweak.u[0];
+ scratch.u[1] ^= tweak.u[1];
+ memcpy(out, scratch.c, 16);
+#else
+ ((u64_a1 *)out)[0] = scratch.u[0] ^= tweak.u[0];
+ ((u64_a1 *)out)[1] = scratch.u[1] ^= tweak.u[1];
+#endif
+ inp += 16;
+ out += 16;
+ len -= 16;
+
+ if (len == 0)
+ return 0;
+
+ if (IS_LITTLE_ENDIAN) {
+ u8 res;
+ u64 hi, lo;
+#ifdef BSWAP8
+ hi = BSWAP8(tweak.u[0]);
+ lo = BSWAP8(tweak.u[1]);
+#else
+ u8 *p = tweak.c;
+
+ hi = (u64)GETU32(p) << 32 | GETU32(p + 4);
+ lo = (u64)GETU32(p + 8) << 32 | GETU32(p + 12);
+#endif
+ res = (u8)lo & 1;
+ tweak.u[0] = (lo >> 1) | (hi << 63);
+ tweak.u[1] = hi >> 1;
+ if (res)
+ tweak.c[15] ^= 0xe1;
+#ifdef BSWAP8
+ hi = BSWAP8(tweak.u[0]);
+ lo = BSWAP8(tweak.u[1]);
+#else
+ p = tweak.c;
+
+ hi = (u64)GETU32(p) << 32 | GETU32(p + 4);
+ lo = (u64)GETU32(p + 8) << 32 | GETU32(p + 12);
+#endif
+ tweak.u[0] = lo;
+ tweak.u[1] = hi;
+ } else {
+ u8 carry, res;
+ carry = 0;
+ for (i = 0; i < 16; ++i) {
+ res = (tweak.c[i] << 7) & 0x80;
+ tweak.c[i] = ((tweak.c[i] >> 1) + carry) & 0xff;
+ carry = res;
+ }
+ if (res)
+ tweak.c[0] ^= 0xe1;
+ }
+ }
+ if (enc) {
+ for (i = 0; i < len; ++i) {
+ u8 c = inp[i];
+ out[i] = scratch.c[i];
+ scratch.c[i] = c;
+ }
+ scratch.u[0] ^= tweak.u[0];
+ scratch.u[1] ^= tweak.u[1];
+ (*ctx->block1) (scratch.c, scratch.c, ctx->key1);
+ scratch.u[0] ^= tweak.u[0];
+ scratch.u[1] ^= tweak.u[1];
+ memcpy(out - 16, scratch.c, 16);
+ } else {
+ union {
+ u64 u[2];
+ u8 c[16];
+ } tweak1;
+
+ if (IS_LITTLE_ENDIAN) {
+ u8 res;
+ u64 hi, lo;
+#ifdef BSWAP8
+ hi = BSWAP8(tweak.u[0]);
+ lo = BSWAP8(tweak.u[1]);
+#else
+ u8 *p = tweak.c;
+
+ hi = (u64)GETU32(p) << 32 | GETU32(p + 4);
+ lo = (u64)GETU32(p + 8) << 32 | GETU32(p + 12);
+#endif
+ res = (u8)lo & 1;
+ tweak1.u[0] = (lo >> 1) | (hi << 63);
+ tweak1.u[1] = hi >> 1;
+ if (res)
+ tweak1.c[15] ^= 0xe1;
+#ifdef BSWAP8
+ hi = BSWAP8(tweak1.u[0]);
+ lo = BSWAP8(tweak1.u[1]);
+#else
+ p = tweak1.c;
+
+ hi = (u64)GETU32(p) << 32 | GETU32(p + 4);
+ lo = (u64)GETU32(p + 8) << 32 | GETU32(p + 12);
+#endif
+ tweak1.u[0] = lo;
+ tweak1.u[1] = hi;
+ } else {
+ u8 carry, res;
+ carry = 0;
+ for (i = 0; i < 16; ++i) {
+ res = (tweak.c[i] << 7) & 0x80;
+ tweak1.c[i] = ((tweak.c[i] >> 1) + carry) & 0xff;
+ carry = res;
+ }
+ if (res)
+ tweak1.c[0] ^= 0xe1;
+ }
+#if defined(STRICT_ALIGNMENT)
+ memcpy(scratch.c, inp, 16);
+ scratch.u[0] ^= tweak1.u[0];
+ scratch.u[1] ^= tweak1.u[1];
+#else
+ scratch.u[0] = ((u64_a1 *)inp)[0] ^ tweak1.u[0];
+ scratch.u[1] = ((u64_a1 *)inp)[1] ^ tweak1.u[1];
+#endif
+ (*ctx->block1) (scratch.c, scratch.c, ctx->key1);
+ scratch.u[0] ^= tweak1.u[0];
+ scratch.u[1] ^= tweak1.u[1];
+
+ for (i = 0; i < len; ++i) {
+ u8 c = inp[16 + i];
+ out[16 + i] = scratch.c[i];
+ scratch.c[i] = c;
+ }
+ scratch.u[0] ^= tweak.u[0];
+ scratch.u[1] ^= tweak.u[1];
+ (*ctx->block1) (scratch.c, scratch.c, ctx->key1);
+#if defined(STRICT_ALIGNMENT)
+ scratch.u[0] ^= tweak.u[0];
+ scratch.u[1] ^= tweak.u[1];
+ memcpy(out, scratch.c, 16);
+#else
+ ((u64_a1 *)out)[0] = scratch.u[0] ^ tweak.u[0];
+ ((u64_a1 *)out)[1] = scratch.u[1] ^ tweak.u[1];
+#endif
+ }
+
+ return 0;
+}
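
Note on the tweak update in xts128gb.c above: between blocks the tweak is shifted right by one bit across the 16 bytes (byte 0 first), and 0xe1 is folded into byte 0 when a bit falls off the end. The sketch below isolates that step, following the byte-wise branch of the function. For contrast it also includes the IEEE Std 1619 style update (shift left, fold 0x87 into byte 0); that second function is written from the usual description of IEEE XTS and is an assumption of this note, not code from the patch. Both function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* GB/T 17964-2021 style update, mirroring the byte-wise loop in
 * ossl_crypto_xts128gb_encrypt: shift right, carrying from byte i to i+1. */
static void demo_gb_tweak_next(unsigned char t[16])
{
    unsigned char carry = 0, res = 0;
    int i;

    for (i = 0; i < 16; ++i) {
        res = (unsigned char)((t[i] << 7) & 0x80);   /* bit leaving byte i */
        t[i] = (unsigned char)((t[i] >> 1) + carry);
        carry = res;
    }
    if (res)                  /* a bit fell off the end of byte 15 */
        t[0] ^= 0xe1;
}

/* IEEE Std 1619 style update (assumption, for comparison only):
 * shift left, carrying from byte i to i+1, and reduce with 0x87. */
static void demo_ieee_tweak_next(unsigned char t[16])
{
    unsigned char carry = 0, msb;
    int i;

    for (i = 0; i < 16; ++i) {
        msb = (unsigned char)(t[i] >> 7);            /* bit leaving byte i */
        t[i] = (unsigned char)((t[i] << 1) | carry);
        carry = msb;
    }
    if (carry)                /* a bit fell off the end of byte 15 */
        t[0] ^= 0x87;
}
```

The two updates produce different tweak sequences from the same starting value, so ciphertext produced under one convention will not decrypt under the other.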
diff --git a/include/crypto/modes.h b/include/crypto/modes.h
index 19f9d85959..475b77f925 100644
--- a/include/crypto/modes.h
+++ b/include/crypto/modes.h
@@ -148,6 +148,12 @@ struct xts128_context {
block128_f block1, block2;
};
+/* XTS mode for SM4 algorithm specified by GB/T 17964-2021 */
+int ossl_crypto_xts128gb_encrypt(const XTS128_CONTEXT *ctx,
+ const unsigned char iv[16],
+ const unsigned char *inp, unsigned char *out,
+ size_t len, int enc);
+
struct ccm128_context {
union {
u64 u[2];
diff --git a/include/openssl/core_names.h b/include/openssl/core_names.h
index 6bed5a8a67..a90971099d 100644
--- a/include/openssl/core_names.h
+++ b/include/openssl/core_names.h
@@ -97,6 +97,7 @@ extern "C" {
#define OSSL_CIPHER_PARAM_CTS_MODE "cts_mode" /* utf8_string */
/* For passing the AlgorithmIdentifier parameter in DER form */
#define OSSL_CIPHER_PARAM_ALGORITHM_ID_PARAMS "alg_id_param" /* octet_string */
+#define OSSL_CIPHER_PARAM_XTS_STANDARD "xts_standard" /* utf8_string */
#define OSSL_CIPHER_PARAM_TLS1_MULTIBLOCK_MAX_SEND_FRAGMENT \
"tls1multi_maxsndfrag" /* uint */
diff --git a/providers/defltprov.c b/providers/defltprov.c
index cc0b0c3b62..ab898d3f44 100644
--- a/providers/defltprov.c
+++ b/providers/defltprov.c
@@ -296,6 +296,7 @@ static const OSSL_ALGORITHM_CAPABLE deflt_ciphers[] = {
ALG(PROV_NAMES_SM4_CTR, ossl_sm4128ctr_functions),
ALG(PROV_NAMES_SM4_OFB, ossl_sm4128ofb128_functions),
ALG(PROV_NAMES_SM4_CFB, ossl_sm4128cfb128_functions),
+ ALG(PROV_NAMES_SM4_XTS, ossl_sm4128xts_functions),
#endif /* OPENSSL_NO_SM4 */
#ifndef OPENSSL_NO_CHACHA
ALG(PROV_NAMES_ChaCha20, ossl_chacha20_functions),
diff --git a/providers/implementations/ciphers/build.info b/providers/implementations/ciphers/build.info
index b5d9d4f6c1..9f6eacf5e3 100644
--- a/providers/implementations/ciphers/build.info
+++ b/providers/implementations/ciphers/build.info
@@ -107,7 +107,9 @@ IF[{- !$disabled{sm4} -}]
SOURCE[$SM4_GOAL]=\
cipher_sm4.c cipher_sm4_hw.c \
cipher_sm4_gcm.c cipher_sm4_gcm_hw.c \
- cipher_sm4_ccm.c cipher_sm4_ccm_hw.c
+ cipher_sm4_ccm.c cipher_sm4_ccm_hw.c \
+ cipher_sm4_xts.c cipher_sm4_xts_hw.c
+
ENDIF
IF[{- !$disabled{ocb} -}]
diff --git a/providers/implementations/ciphers/cipher_sm4_xts.c b/providers/implementations/ciphers/cipher_sm4_xts.c
new file mode 100644
index 0000000000..3c568d4d18
--- /dev/null
+++ b/providers/implementations/ciphers/cipher_sm4_xts.c
@@ -0,0 +1,281 @@
+
+/*
+ * Copyright 2022 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+/* Dispatch functions for SM4 XTS mode */
+
+#include <openssl/proverr.h>
+#include "cipher_sm4_xts.h"
+#include "prov/implementations.h"
+#include "prov/providercommon.h"
+
+#define SM4_XTS_FLAGS PROV_CIPHER_FLAG_CUSTOM_IV
+#define SM4_XTS_IV_BITS 128
+#define SM4_XTS_BLOCK_BITS 8
+
+/* forward declarations */
+static OSSL_FUNC_cipher_encrypt_init_fn sm4_xts_einit;
+static OSSL_FUNC_cipher_decrypt_init_fn sm4_xts_dinit;
+static OSSL_FUNC_cipher_update_fn sm4_xts_stream_update;
+static OSSL_FUNC_cipher_final_fn sm4_xts_stream_final;
+static OSSL_FUNC_cipher_cipher_fn sm4_xts_cipher;
+static OSSL_FUNC_cipher_freectx_fn sm4_xts_freectx;
+static OSSL_FUNC_cipher_dupctx_fn sm4_xts_dupctx;
+static OSSL_FUNC_cipher_set_ctx_params_fn sm4_xts_set_ctx_params;
+static OSSL_FUNC_cipher_settable_ctx_params_fn sm4_xts_settable_ctx_params;
+
+/*-
+ * Provider dispatch functions
+ */
+static int sm4_xts_init(void *vctx, const unsigned char *key, size_t keylen,
+ const unsigned char *iv, size_t ivlen,
+ const OSSL_PARAM params[], int enc)
+{
+ PROV_SM4_XTS_CTX *xctx = (PROV_SM4_XTS_CTX *)vctx;
+ PROV_CIPHER_CTX *ctx = &xctx->base;
+
+ if (!ossl_prov_is_running())
+ return 0;
+
+ ctx->enc = enc;
+
+ if (iv != NULL) {
+ if (!ossl_cipher_generic_initiv(vctx, iv, ivlen))
+ return 0;
+ }
+ if (key != NULL) {
+ if (keylen != ctx->keylen) {
+ ERR_raise(ERR_LIB_PROV, PROV_R_INVALID_KEY_LENGTH);
+ return 0;
+ }
+ if (!ctx->hw->init(ctx, key, keylen))
+ return 0;
+ }
+ return sm4_xts_set_ctx_params(xctx, params);
+}
+
+static int sm4_xts_einit(void *vctx, const unsigned char *key, size_t keylen,
+ const unsigned char *iv, size_t ivlen,
+ const OSSL_PARAM params[])
+{
+ return sm4_xts_init(vctx, key, keylen, iv, ivlen, params, 1);
+}
+
+static int sm4_xts_dinit(void *vctx, const unsigned char *key, size_t keylen,
+ const unsigned char *iv, size_t ivlen,
+ const OSSL_PARAM params[])
+{
+ return sm4_xts_init(vctx, key, keylen, iv, ivlen, params, 0);
+}
+
+static void *sm4_xts_newctx(void *provctx, unsigned int mode, uint64_t flags,
+ size_t kbits, size_t blkbits, size_t ivbits)
+{
+ PROV_SM4_XTS_CTX *ctx = OPENSSL_zalloc(sizeof(*ctx));
+
+ if (ctx != NULL) {
+ ossl_cipher_generic_initkey(&ctx->base, kbits, blkbits, ivbits, mode,
+ flags, ossl_prov_cipher_hw_sm4_xts(kbits),
+ NULL);
+ }
+ return ctx;
+}
+
+static void sm4_xts_freectx(void *vctx)
+{
+ PROV_SM4_XTS_CTX *ctx = (PROV_SM4_XTS_CTX *)vctx;
+
+ ossl_cipher_generic_reset_ctx((PROV_CIPHER_CTX *)vctx);
+ OPENSSL_clear_free(ctx, sizeof(*ctx));
+}
+
+static void *sm4_xts_dupctx(void *vctx)
+{
+ PROV_SM4_XTS_CTX *in = (PROV_SM4_XTS_CTX *)vctx;
+ PROV_SM4_XTS_CTX *ret = NULL;
+
+ if (!ossl_prov_is_running())
+ return NULL;
+
+ if (in->xts.key1 != NULL) {
+ if (in->xts.key1 != &in->ks1)
+ return NULL;
+ }
+ if (in->xts.key2 != NULL) {
+ if (in->xts.key2 != &in->ks2)
+ return NULL;
+ }
+ ret = OPENSSL_malloc(sizeof(*ret));
+ if (ret == NULL)
+ return NULL;
+ in->base.hw->copyctx(&ret->base, &in->base);
+ return ret;
+}
+
+static int sm4_xts_cipher(void *vctx, unsigned char *out, size_t *outl,
+ size_t outsize, const unsigned char *in, size_t inl)
+{
+ PROV_SM4_XTS_CTX *ctx = (PROV_SM4_XTS_CTX *)vctx;
+
+ if (!ossl_prov_is_running()
+ || ctx->xts.key1 == NULL
+ || ctx->xts.key2 == NULL
+ || !ctx->base.iv_set
+ || out == NULL
+ || in == NULL
+ || inl < SM4_BLOCK_SIZE)
+ return 0;
+
+ /*
+ * Impose a limit of 2^20 blocks per data unit as specified by
+ * IEEE Std 1619-2018. The earlier and obsolete IEEE Std 1619-2007
+ * indicated that this was a SHOULD NOT rather than a MUST NOT.
+ * NIST SP 800-38E mandates the same limit.
+ */
+ if (inl > XTS_MAX_BLOCKS_PER_DATA_UNIT * SM4_BLOCK_SIZE) {
+ ERR_raise(ERR_LIB_PROV, PROV_R_XTS_DATA_UNIT_IS_TOO_LARGE);
+ return 0;
+ }
+ if (ctx->xts_standard) {
+ if (ctx->stream != NULL)
+ (*ctx->stream)(in, out, inl, ctx->xts.key1, ctx->xts.key2,
+ ctx->base.iv);
+ else if (CRYPTO_xts128_encrypt(&ctx->xts, ctx->base.iv, in, out, inl,
+ ctx->base.enc))
+ return 0;
+ } else {
+ if (ctx->stream_gb != NULL)
+ (*ctx->stream_gb)(in, out, inl, ctx->xts.key1, ctx->xts.key2,
+ ctx->base.iv);
+ else if (ossl_crypto_xts128gb_encrypt(&ctx->xts, ctx->base.iv, in, out,
+ inl, ctx->base.enc))
+ return 0;
+ }
+ *outl = inl;
+ return 1;
+}
+
+static int sm4_xts_stream_update(void *vctx, unsigned char *out, size_t *outl,
+ size_t outsize, const unsigned char *in,
+ size_t inl)
+{
+ PROV_SM4_XTS_CTX *ctx = (PROV_SM4_XTS_CTX *)vctx;
+
+ if (outsize < inl) {
+ ERR_raise(ERR_LIB_PROV, PROV_R_OUTPUT_BUFFER_TOO_SMALL);
+ return 0;
+ }
+
+ if (!sm4_xts_cipher(ctx, out, outl, outsize, in, inl)) {
+ ERR_raise(ERR_LIB_PROV, PROV_R_CIPHER_OPERATION_FAILED);
+ return 0;
+ }
+
+ return 1;
+}
+
+static int sm4_xts_stream_final(void *vctx, unsigned char *out, size_t *outl,
+ size_t outsize)
+{
+ if (!ossl_prov_is_running())
+ return 0;
+ *outl = 0;
+ return 1;
+}
+
+static const OSSL_PARAM sm4_xts_known_settable_ctx_params[] = {
+ OSSL_PARAM_utf8_string(OSSL_CIPHER_PARAM_XTS_STANDARD, NULL, 0),
+ OSSL_PARAM_END
+};
+
+static const OSSL_PARAM *sm4_xts_settable_ctx_params(ossl_unused void *cctx,
+ ossl_unused void *provctx)
+{
+ return sm4_xts_known_settable_ctx_params;
+}
+
+static int sm4_xts_set_ctx_params(void *vxctx, const OSSL_PARAM params[])
+{
+ PROV_SM4_XTS_CTX *xctx = (PROV_SM4_XTS_CTX *)vxctx;
+ const OSSL_PARAM *p;
+
+ if (params == NULL)
+ return 1;
+
+ /*-
+ * Sets the XTS standard to use with SM4-XTS algorithm.
+ *
+ * Must be utf8 string "GB" or "IEEE",
+ * "GB" means the GB/T 17964-2021 standard
+ * "IEEE" means the IEEE Std 1619-2007 standard
+ */
+ p = OSSL_PARAM_locate_const(params, OSSL_CIPHER_PARAM_XTS_STANDARD);
+
+ if (p != NULL) {
+ const char *xts_standard = NULL;
+
+ if (p->data_type != OSSL_PARAM_UTF8_STRING)
+ return 0;
+
+ if (!OSSL_PARAM_get_utf8_string_ptr(p, &xts_standard)) {
+ ERR_raise(ERR_LIB_PROV, PROV_R_FAILED_TO_GET_PARAMETER);
+ return 0;
+ }
+ if (OPENSSL_strcasecmp(xts_standard, "GB") == 0) {
+ xctx->xts_standard = 0;
+ } else if (OPENSSL_strcasecmp(xts_standard, "IEEE") == 0) {
+ xctx->xts_standard = 1;
+ } else {
+ ERR_raise(ERR_LIB_PROV, PROV_R_FAILED_TO_SET_PARAMETER);
+ return 0;
+ }
+ }
+
+ return 1;
+}
+
+#define IMPLEMENT_cipher(lcmode, UCMODE, kbits, flags) \
+static OSSL_FUNC_cipher_get_params_fn sm4_##kbits##_##lcmode##_get_params; \
+static int sm4_##kbits##_##lcmode##_get_params(OSSL_PARAM params[]) \
+{ \
+ return ossl_cipher_generic_get_params(params, EVP_CIPH_##UCMODE##_MODE, \
+ flags, 2 * kbits, SM4_XTS_BLOCK_BITS,\
+ SM4_XTS_IV_BITS); \
+} \
+static OSSL_FUNC_cipher_newctx_fn sm4_##kbits##_xts_newctx; \
+static void *sm4_##kbits##_xts_newctx(void *provctx) \
+{ \
+ return sm4_xts_newctx(provctx, EVP_CIPH_##UCMODE##_MODE, flags, 2 * kbits, \
+ SM4_XTS_BLOCK_BITS, SM4_XTS_IV_BITS); \
+} \
+const OSSL_DISPATCH ossl_sm4##kbits##xts_functions[] = { \
+ { OSSL_FUNC_CIPHER_NEWCTX, (void (*)(void))sm4_##kbits##_xts_newctx }, \
+ { OSSL_FUNC_CIPHER_ENCRYPT_INIT, (void (*)(void))sm4_xts_einit }, \
+ { OSSL_FUNC_CIPHER_DECRYPT_INIT, (void (*)(void))sm4_xts_dinit }, \
+ { OSSL_FUNC_CIPHER_UPDATE, (void (*)(void))sm4_xts_stream_update }, \
+ { OSSL_FUNC_CIPHER_FINAL, (void (*)(void))sm4_xts_stream_final }, \
+ { OSSL_FUNC_CIPHER_CIPHER, (void (*)(void))sm4_xts_cipher }, \
+ { OSSL_FUNC_CIPHER_FREECTX, (void (*)(void))sm4_xts_freectx }, \
+ { OSSL_FUNC_CIPHER_DUPCTX, (void (*)(void))sm4_xts_dupctx }, \
+ { OSSL_FUNC_CIPHER_GET_PARAMS, \
+ (void (*)(void))sm4_##kbits##_##lcmode##_get_params }, \
+ { OSSL_FUNC_CIPHER_GETTABLE_PARAMS, \
+ (void (*)(void))ossl_cipher_generic_gettable_params }, \
+ { OSSL_FUNC_CIPHER_GET_CTX_PARAMS, \
+ (void (*)(void))ossl_cipher_generic_get_ctx_params }, \
+ { OSSL_FUNC_CIPHER_GETTABLE_CTX_PARAMS, \
+ (void (*)(void))ossl_cipher_generic_gettable_ctx_params }, \
+ { OSSL_FUNC_CIPHER_SET_CTX_PARAMS, \
+ (void (*)(void))sm4_xts_set_ctx_params }, \
+ { OSSL_FUNC_CIPHER_SETTABLE_CTX_PARAMS, \
+ (void (*)(void))sm4_xts_settable_ctx_params }, \
+ { 0, NULL } \
+}
+/* ossl_sm4128xts_functions */
+IMPLEMENT_cipher(xts, XTS, 128, SM4_XTS_FLAGS);
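
Note on sm4_xts_set_ctx_params above: the "xts_standard" parameter is compared case-insensitively against "GB" and "IEEE" and stored as 0 or 1; any other string is rejected. Since the context is allocated with OPENSSL_zalloc, the field starts at 0, so GB appears to be the default when the parameter is never set. The sketch below reproduces only the string-to-flag mapping with the standard C library. Both `demo_` functions are hypothetical and stand in for OPENSSL_strcasecmp and the parameter handling.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive equality for ASCII strings (stand-in for
 * OPENSSL_strcasecmp(a, b) == 0). */
static int demo_strcaseeq(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Maps the parameter string to the flag used in the patch:
 * 0 for GB/T 17964-2021, 1 for IEEE Std 1619-2007.
 * Returns 1 on success, 0 if the name is missing or not recognised;
 * *standard is left untouched on failure. */
static int demo_parse_xts_standard(const char *name, int *standard)
{
    if (name == NULL || standard == NULL)
        return 0;
    if (demo_strcaseeq(name, "GB")) {
        *standard = 0;
        return 1;
    }
    if (demo_strcaseeq(name, "IEEE")) {
        *standard = 1;
        return 1;
    }
    return 0;
}
```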
diff --git a/providers/implementations/ciphers/cipher_sm4_xts.h b/providers/implementations/ciphers/cipher_sm4_xts.h
new file mode 100644
index 0000000000..4c369183e2
--- /dev/null
+++ b/providers/implementations/ciphers/cipher_sm4_xts.h
@@ -0,0 +1,46 @@
+/*
+ * Copyright 2022 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+#include <crypto/sm4.h>
+#include "prov/ciphercommon.h"
+#include "crypto/sm4_platform.h"
+
+PROV_CIPHER_FUNC(void, xts_stream,
+ (const unsigned char *in, unsigned char *out, size_t len,
+ const SM4_KEY *key1, const SM4_KEY *key2,
+ const unsigned char iv[16]));
+
+typedef struct prov_sm4_xts_ctx_st {
+ /* Must be first */
+ PROV_CIPHER_CTX base;
+
+ /* SM4 key schedules to use */
+ union {
+ OSSL_UNION_ALIGN;
+ SM4_KEY ks;
+ } ks1, ks2;
+
+ /*-
+ * XTS standard to use with SM4-XTS algorithm
+ *
+ * Must be 0 or 1,
+ * 0 for XTS mode specified by GB/T 17964-2021
+ * 1 for XTS mode specified by IEEE Std 1619-2007
+ */
+ int xts_standard;
+
+ XTS128_CONTEXT xts;
+
+ /* Stream function for XTS mode specified by GB/T 17964-2021 */
+ OSSL_xts_stream_fn stream_gb;
+ /* Stream function for XTS mode specified by IEEE Std 1619-2007 */
+ OSSL_xts_stream_fn stream;
+} PROV_SM4_XTS_CTX;
+
+const PROV_CIPHER_HW *ossl_prov_cipher_hw_sm4_xts(size_t keybits);
diff --git a/providers/implementations/ciphers/cipher_sm4_xts_hw.c b/providers/implementations/ciphers/cipher_sm4_xts_hw.c
new file mode 100644
index 0000000000..403eb879b1
--- /dev/null
+++ b/providers/implementations/ciphers/cipher_sm4_xts_hw.c
@@ -0,0 +1,89 @@
+/*
+ * Copyright 2022 The OpenSSL Project Authors. All Rights Reserved.
+ *
+ * Licensed under the Apache License 2.0 (the "License"). You may not use
+ * this file except in compliance with the License. You can obtain a copy
+ * in the file LICENSE in the source distribution or at
+ * https://www.openssl.org/source/license.html
+ */
+
+#include "cipher_sm4_xts.h"
+
+#define XTS_SET_KEY_FN(fn_set_enc_key, fn_set_dec_key, \
+ fn_block_enc, fn_block_dec, \
+ fn_stream_enc, fn_stream_dec, \
+ fn_stream_gb_enc, fn_stream_gb_dec) { \
+ size_t bytes = keylen / 2; \
+ \
+ if (ctx->enc) { \
+ fn_set_enc_key(key, &xctx->ks1.ks); \
+ xctx->xts.block1 = (block128_f)fn_block_enc; \
+ } else { \
+ fn_set_dec_key(key, &xctx->ks1.ks); \
+ xctx->xts.block1 = (block128_f)fn_block_dec; \
+ } \
+ fn_set_enc_key(key + bytes, &xctx->ks2.ks); \
+ xctx->xts.block2 = (block128_f)fn_block_enc; \
+ xctx->xts.key1 = &xctx->ks1; \
+ xctx->xts.key2 = &xctx->ks2; \
+ xctx->stream = ctx->enc ? fn_stream_enc : fn_stream_dec; \
+ xctx->stream_gb = ctx->enc ? fn_stream_gb_enc : fn_stream_gb_dec; \
+}
+
+static int cipher_hw_sm4_xts_generic_initkey(PROV_CIPHER_CTX *ctx,
+ const unsigned char *key,
+ size_t keylen)
+{
+ PROV_SM4_XTS_CTX *xctx = (PROV_SM4_XTS_CTX *)ctx;
+ OSSL_xts_stream_fn stream_enc = NULL;
+ OSSL_xts_stream_fn stream_dec = NULL;
+ OSSL_xts_stream_fn stream_gb_enc = NULL;
+ OSSL_xts_stream_fn stream_gb_dec = NULL;
+#ifdef HWSM4_CAPABLE
+ if (HWSM4_CAPABLE) {
+ XTS_SET_KEY_FN(HWSM4_set_encrypt_key, HWSM4_set_decrypt_key,
+ HWSM4_encrypt, HWSM4_decrypt, stream_enc, stream_dec,
+ stream_gb_enc, stream_gb_dec);
+ return 1;
+ } else
+#endif /* HWSM4_CAPABLE */
+#ifdef VPSM4_CAPABLE
+ if (VPSM4_CAPABLE) {
+ XTS_SET_KEY_FN(vpsm4_set_encrypt_key, vpsm4_set_decrypt_key,
+ vpsm4_encrypt, vpsm4_decrypt, stream_enc, stream_dec,
+ stream_gb_enc, stream_gb_dec);
+ return 1;
+ } else
+#endif /* VPSM4_CAPABLE */
+ {
+ (void)0;
+ }
+ {
+ XTS_SET_KEY_FN(ossl_sm4_set_key, ossl_sm4_set_key, ossl_sm4_encrypt,
+ ossl_sm4_decrypt, stream_enc, stream_dec, stream_gb_enc,
+ stream_gb_dec);
+ }
+ return 1;
+}
+
+static void cipher_hw_sm4_xts_copyctx(PROV_CIPHER_CTX *dst,
+ const PROV_CIPHER_CTX *src)
+{
+ PROV_SM4_XTS_CTX *sctx = (PROV_SM4_XTS_CTX *)src;
+ PROV_SM4_XTS_CTX *dctx = (PROV_SM4_XTS_CTX *)dst;
+
+ *dctx = *sctx;
+ dctx->xts.key1 = &dctx->ks1.ks;
+ dctx->xts.key2 = &dctx->ks2.ks;
+}
+
+
+static const PROV_CIPHER_HW sm4_generic_xts = {
+ cipher_hw_sm4_xts_generic_initkey,
+ NULL,
+ cipher_hw_sm4_xts_copyctx
+};
+const PROV_CIPHER_HW *ossl_prov_cipher_hw_sm4_xts(size_t keybits)
+{
+ return &sm4_generic_xts;
+}
diff --git a/providers/implementations/include/prov/implementations.h b/providers/implementations/include/prov/implementations.h
index 498eab4ad4..cfa32ea3ca 100644
--- a/providers/implementations/include/prov/implementations.h
+++ b/providers/implementations/include/prov/implementations.h
@@ -181,6 +181,7 @@ extern const OSSL_DISPATCH ossl_sm4128cbc_functions[];
extern const OSSL_DISPATCH ossl_sm4128ctr_functions[];
extern const OSSL_DISPATCH ossl_sm4128ofb128_functions[];
extern const OSSL_DISPATCH ossl_sm4128cfb128_functions[];
+extern const OSSL_DISPATCH ossl_sm4128xts_functions[];
#endif /* OPENSSL_NO_SM4 */
#ifndef OPENSSL_NO_RC5
extern const OSSL_DISPATCH ossl_rc5128ecb_functions[];
diff --git a/providers/implementations/include/prov/names.h b/providers/implementations/include/prov/names.h
index 0fac23a850..5192f4f471 100644
--- a/providers/implementations/include/prov/names.h
+++ b/providers/implementations/include/prov/names.h
@@ -164,6 +164,7 @@
#define PROV_NAMES_SM4_CFB "SM4-CFB:SM4-CFB128:1.2.156.10197.1.104.4"
#define PROV_NAMES_SM4_GCM "SM4-GCM:1.2.156.10197.1.104.8"
#define PROV_NAMES_SM4_CCM "SM4-CCM:1.2.156.10197.1.104.9"
+#define PROV_NAMES_SM4_XTS "SM4-XTS:1.2.156.10197.1.104.10"
#define PROV_NAMES_ChaCha20 "ChaCha20"
#define PROV_NAMES_ChaCha20_Poly1305 "ChaCha20-Poly1305"
#define PROV_NAMES_CAST5_ECB "CAST5-ECB"
--
2.37.3.windows.1

View File

@ -2,7 +2,7 @@
Name: openssl
Epoch: 1
Version: 3.0.8
-Release: 1
+Release: 2
Summary: Cryptography and SSL/TLS Toolkit
License: OpenSSL and SSLeay
URL: https://www.openssl.org/
@ -10,6 +10,19 @@ Source0: https://www.openssl.org/source/%{name}-%{version}.tar.gz
Source1: Makefile.certificate
Patch1: openssl-3.0-build.patch
Patch2: Backport-aarch64-support-BTI-and-pointer-authentication-in-as.patch
+Patch3: Backport-SM3-acceleration-with-SM3-hardware-instruction-on-aa.patch
+Patch4: Backport-Fix-sm3ss1-translation-issue-in-sm3-armv8.pl.patch
+Patch5: Backport-providers-Add-SM4-GCM-implementation.patch
+Patch6: Backport-SM4-optimization-for-ARM-by-HW-instruction.patch
+Patch7: Backport-Further-acceleration-for-SM4-GCM-on-ARM.patch
+Patch8: Backport-SM4-optimization-for-ARM-by-ASIMD.patch
+Patch9: Backport-providers-Add-SM4-XTS-implementation.patch
+Patch10: Backport-Fix-SM4-CBC-regression-on-Armv8.patch
+Patch11: Backport-Fix-SM4-test-failures-on-big-endian-ARM-processors.patch
+Patch12: Backport-Apply-SM4-optimization-patch-to-Kunpeng-920.patch
+Patch13: Backport-SM4-AESE-optimization-for-ARMv8.patch
+Patch14: Backport-Fix-SM4-XTS-build-failure-on-Mac-mini-M1.patch
BuildRequires: gcc gcc-c++ perl make lksctp-tools-devel coreutils util-linux zlib-devel
Requires: coreutils %{name}-libs%{?_isa} = %{epoch}:%{version}-%{release}
@ -206,6 +219,10 @@ make test || :
%ldconfig_scriptlets libs
%changelog
+* Thu Mar 16 2023 Xu Yizhou <xuyizhou1@huawei.com> - 1:3.0.8-2
+- backport SM4 GCM/CCM/XTS implementation
+- backport SM3/SM4 optimization
* Tue Feb 7 2023 wangcheng <wangcheng156@huawei.com> - 1:3.0.8-1
- upgrade to 3.0.8 for fixing CVEs