From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by finch.gentoo.org (Postfix) with ESMTPS id 5D8A415800A for ; Sun, 13 Aug 2023 00:35:52 +0000 (UTC) Received: from pigeon.gentoo.org (localhost [127.0.0.1]) by pigeon.gentoo.org (Postfix) with SMTP id A0DDC2BC028; Sun, 13 Aug 2023 00:35:51 +0000 (UTC) Received: from smtp.gentoo.org (woodpecker.gentoo.org [140.211.166.183]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by pigeon.gentoo.org (Postfix) with ESMTPS id 823472BC028 for ; Sun, 13 Aug 2023 00:35:51 +0000 (UTC) Received: from oystercatcher.gentoo.org (oystercatcher.gentoo.org [148.251.78.52]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by smtp.gentoo.org (Postfix) with ESMTPS id 7EBAE335D80 for ; Sun, 13 Aug 2023 00:35:50 +0000 (UTC) Received: from localhost.localdomain (localhost [IPv6:::1]) by oystercatcher.gentoo.org (Postfix) with ESMTP id D2005F37 for ; Sun, 13 Aug 2023 00:35:48 +0000 (UTC) From: "Sam James" To: gentoo-commits@lists.gentoo.org Content-Transfer-Encoding: 8bit Content-type: text/plain; charset=UTF-8 Reply-To: gentoo-dev@lists.gentoo.org, "Sam James" Message-ID: <1691886039.7b28599cfed98fc831c16f1b528f15fd99011dae.sam@gentoo> Subject: [gentoo-commits] proj/gcc-patches:master commit in: 13.2.0/gentoo/ X-VCS-Repository: proj/gcc-patches X-VCS-Files: 13.2.0/gentoo/84_all_x86_PR110792-Early-clobber-issues-with-rot32di2.patch 13.2.0/gentoo/README.history X-VCS-Directories: 13.2.0/gentoo/ X-VCS-Committer: sam X-VCS-Committer-Name: Sam James X-VCS-Revision: 7b28599cfed98fc831c16f1b528f15fd99011dae X-VCS-Branch: master Date: Sun, 13 Aug 2023 00:35:48 +0000 (UTC) Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-Id: Gentoo Linux mail X-BeenThere: gentoo-commits@lists.gentoo.org X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply X-Archives-Salt: aae3d0c7-9419-4a74-939f-81c0bdd609d4 X-Archives-Hash: ce80f39ec5614dbb0e72b74ae3c49cbe commit: 7b28599cfed98fc831c16f1b528f15fd99011dae Author: Sam James gentoo org> AuthorDate: Sun Aug 13 00:20:39 2023 +0000 Commit: Sam James gentoo org> CommitDate: Sun Aug 13 00:20:39 2023 +0000 URL: https://gitweb.gentoo.org/proj/gcc-patches.git/commit/?id=7b28599c 13.2.0: add patch for Botan miscompilation Bug: https://github.com/randombit/botan/issues/3637 Bug: https://gcc.gnu.org/PR110792 Signed-off-by: Sam James gentoo.org> ...110792-Early-clobber-issues-with-rot32di2.patch | 186 +++++++++++++++++++++ 13.2.0/gentoo/README.history | 3 + 2 files changed, 189 insertions(+) diff --git a/13.2.0/gentoo/84_all_x86_PR110792-Early-clobber-issues-with-rot32di2.patch b/13.2.0/gentoo/84_all_x86_PR110792-Early-clobber-issues-with-rot32di2.patch new file mode 100644 index 0000000..e3c09cc --- /dev/null +++ b/13.2.0/gentoo/84_all_x86_PR110792-Early-clobber-issues-with-rot32di2.patch @@ -0,0 +1,186 @@ +https://gcc.gnu.org/PR110792 +https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=790c1f60a5662b16eb19eb4b81922995863c7571 +https://github.com/randombit/botan/issues/3637 + +From 85628c5653ff40963158a24c60eeec6a3b5a8e56 Mon Sep 17 00:00:00 2001 +From: Roger Sayle +Date: Thu, 3 Aug 2023 07:12:04 +0100 +Subject: [PATCH] PR target/110792: Early clobber issues with + rot32di2_doubleword on i386. + +This patch is a conservative fix for PR target/110792, a wrong-code +regression affecting doubleword rotations by BITS_PER_WORD, which +effectively swaps the highpart and lowpart words, when the source to be +rotated resides in memory. The issue is that if the register used to +hold the lowpart of the destination is mentioned in the address of +the memory operand, the current define_insn_and_split unintentionally +clobbers it before reading the highpart. + +Hence, for the testcase, the incorrectly generated code looks like: + + salq $4, %rdi // calculate address + movq WHIRL_S+8(%rdi), %rdi // accidentally clobber addr + movq WHIRL_S(%rdi), %rbp // load (wrong) lowpart + +Traditionally, the textbook way to fix this would be to add an +explicit early clobber to the instruction's constraints. + + (define_insn_and_split "32di2_doubleword" +- [(set (match_operand:DI 0 "register_operand" "=r,r,r") ++ [(set (match_operand:DI 0 "register_operand" "=r,r,&r") + (any_rotate:DI (match_operand:DI 1 "nonimmediate_operand" "0,r,o") + (const_int 32)))] + +but unfortunately this currently generates significantly worse code, +due to a strange choice of reloads (effectively memcpy), which ends up +looking like: + + salq $4, %rdi // calculate address + movdqa WHIRL_S(%rdi), %xmm0 // load the double word in SSE reg. + movaps %xmm0, -16(%rsp) // store the SSE reg back to the stack + movq -8(%rsp), %rdi // load highpart + movq -16(%rsp), %rbp // load lowpart + +Note that reload's "&" doesn't distinguish between the memory being +early clobbered, vs the registers used in an addressing mode being +early clobbered. + +The fix proposed in this patch is to remove the third alternative, that +allowed offsetable memory as an operand, forcing reload to place the +operand into a register before the rotation. This results in: + + salq $4, %rdi + movq WHIRL_S(%rdi), %rax + movq WHIRL_S+8(%rdi), %rdi + movq %rax, %rbp + +I believe there's a more advanced solution, by swapping the order of +the loads (if first destination register is mentioned in the address), +or inserting a lea insn (if both destination registers are mentioned +in the address), but this fix is a minimal "safe" solution, that +should hopefully be suitable for backporting. + +2023-08-03 Roger Sayle + +gcc/ChangeLog + PR target/110792 + * config/i386/i386.md (ti3): For rotations by 64 bits + place operand in a register before gen_64ti2_doubleword. + (di3): Likewise, for rotations by 32 bits, place + operand in a register before gen_32di2_doubleword. + (32di2_doubleword): Constrain operand to be in register. + (64ti2_doubleword): Likewise. + +gcc/testsuite/ChangeLog + PR target/110792 + * g++.target/i386/pr110792.C: New 32-bit C++ test case. + * gcc.target/i386/pr110792.c: New 64-bit C test case. + +(cherry picked from commit 790c1f60a5662b16eb19eb4b81922995863c7571) +--- + gcc/config/i386/i386.md | 18 ++++++++++++------ + gcc/testsuite/g++.target/i386/pr110792.C | 16 ++++++++++++++++ + gcc/testsuite/gcc.target/i386/pr110792.c | 18 ++++++++++++++++++ + 3 files changed, 46 insertions(+), 6 deletions(-) + create mode 100644 gcc/testsuite/g++.target/i386/pr110792.C + create mode 100644 gcc/testsuite/gcc.target/i386/pr110792.c + +diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md +index f3a3305..a71e837 100644 +--- a/gcc/config/i386/i386.md ++++ b/gcc/config/i386/i386.md +@@ -14359,7 +14359,10 @@ + emit_insn (gen_ix86_ti3_doubleword + (operands[0], operands[1], operands[2])); + else if (CONST_INT_P (operands[2]) && INTVAL (operands[2]) == 64) +- emit_insn (gen_64ti2_doubleword (operands[0], operands[1])); ++ { ++ operands[1] = force_reg (TImode, operands[1]); ++ emit_insn (gen_64ti2_doubleword (operands[0], operands[1])); ++ } + else + { + rtx amount = force_reg (QImode, operands[2]); +@@ -14394,7 +14397,10 @@ + emit_insn (gen_ix86_di3_doubleword + (operands[0], operands[1], operands[2])); + else if (CONST_INT_P (operands[2]) && INTVAL (operands[2]) == 32) +- emit_insn (gen_32di2_doubleword (operands[0], operands[1])); ++ { ++ operands[1] = force_reg (DImode, operands[1]); ++ emit_insn (gen_32di2_doubleword (operands[0], operands[1])); ++ } + else + FAIL; + +@@ -14562,8 +14568,8 @@ + }) + + (define_insn_and_split "32di2_doubleword" +- [(set (match_operand:DI 0 "register_operand" "=r,r,r") +- (any_rotate:DI (match_operand:DI 1 "nonimmediate_operand" "0,r,o") ++ [(set (match_operand:DI 0 "register_operand" "=r,r") ++ (any_rotate:DI (match_operand:DI 1 "register_operand" "0,r") + (const_int 32)))] + "!TARGET_64BIT" + "#" +@@ -14580,8 +14586,8 @@ + }) + + (define_insn_and_split "64ti2_doubleword" +- [(set (match_operand:TI 0 "register_operand" "=r,r,r") +- (any_rotate:TI (match_operand:TI 1 "nonimmediate_operand" "0,r,o") ++ [(set (match_operand:TI 0 "register_operand" "=r,r") ++ (any_rotate:TI (match_operand:TI 1 "register_operand" "0,r") + (const_int 64)))] + "TARGET_64BIT" + "#" +diff --git a/gcc/testsuite/g++.target/i386/pr110792.C b/gcc/testsuite/g++.target/i386/pr110792.C +new file mode 100644 +index 0000000..ce21a7a +--- /dev/null ++++ b/gcc/testsuite/g++.target/i386/pr110792.C +@@ -0,0 +1,16 @@ ++/* { dg-do compile { target ia32 } } */ ++/* { dg-options "-O2" } */ ++ ++template ++inline T rotr(T input) ++{ ++ return static_cast((input >> ROT) | (input << (8 * sizeof(T) - ROT))); ++} ++ ++unsigned long long WHIRL_S[256] = {0x18186018C07830D8}; ++unsigned long long whirl(unsigned char x0) ++{ ++ const unsigned long long s4 = WHIRL_S[x0&0xFF]; ++ return rotr<32>(s4); ++} ++/* { dg-final { scan-assembler-not "movl\tWHIRL_S\\+4\\(,%eax,8\\), %eax" } } */ +diff --git a/gcc/testsuite/gcc.target/i386/pr110792.c b/gcc/testsuite/gcc.target/i386/pr110792.c +new file mode 100644 +index 0000000..b65125c +--- /dev/null ++++ b/gcc/testsuite/gcc.target/i386/pr110792.c +@@ -0,0 +1,18 @@ ++/* { dg-do compile { target int128 } } */ ++/* { dg-options "-O2" } */ ++ ++static inline unsigned __int128 rotr(unsigned __int128 input) ++{ ++ return ((input >> 64) | (input << (64))); ++} ++ ++unsigned __int128 WHIRL_S[256] = {((__int128)0x18186018C07830D8) << 64 |0x18186018C07830D8}; ++unsigned __int128 whirl(unsigned char x0) ++{ ++ register int t __asm("rdi") = x0&0xFF; ++ const unsigned __int128 s4 = WHIRL_S[t]; ++ register unsigned __int128 tt __asm("rdi") = rotr(s4); ++ asm("":::"memory"); ++ return tt; ++} ++/* { dg-final { scan-assembler-not "movq\tWHIRL_S\\+8\\(%rdi\\), %rdi" } } */ +-- +2.41.0 + diff --git a/13.2.0/gentoo/README.history b/13.2.0/gentoo/README.history index 769413a..2f1fc73 100644 --- a/13.2.0/gentoo/README.history +++ b/13.2.0/gentoo/README.history @@ -1,3 +1,6 @@ +6 13 Aug 2023 + + 84_all_x86_PR110792-Early-clobber-issues-with-rot32di2.patch + 5 05 Aug 2023 - 82_all_arm64_PR110280_ICE_fold-const.patch