From: "Sam James"
To: gentoo-commits@lists.gentoo.org
Reply-To: gentoo-dev@lists.gentoo.org, "Sam James"
Message-ID: <1729624145.52de78302a3c40e11a16185917bf8bb4bccfd199.sam@gentoo>
Subject: [gentoo-commits] proj/gcc-patches:master commit in: 15.0.0/gentoo/
X-VCS-Repository: proj/gcc-patches
X-VCS-Files: 15.0.0/gentoo/72_all_PR117190.patch
X-VCS-Directories: 15.0.0/gentoo/
X-VCS-Committer: sam
X-VCS-Committer-Name: Sam James
X-VCS-Revision: 52de78302a3c40e11a16185917bf8bb4bccfd199
X-VCS-Branch: master
Date: Tue, 22 Oct 2024 19:09:30 +0000 (UTC)
List-Id: Gentoo Linux mail

commit:     52de78302a3c40e11a16185917bf8bb4bccfd199
Author:     Sam James <sam@gentoo.org>
AuthorDate: Tue Oct 22 19:09:05 2024 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: Tue Oct 22 19:09:05 2024 +0000
URL:        https://gitweb.gentoo.org/proj/gcc-patches.git/commit/?id=52de7830

15.0.0: add 72_all_PR117190.patch

This patch isn't yet merged but should be soon.

Signed-off-by: Sam James <sam@gentoo.org>

 15.0.0/gentoo/72_all_PR117190.patch | 179 ++++++++++++++++++++++++++++++++++++
 1 file changed, 179 insertions(+)

diff --git a/15.0.0/gentoo/72_all_PR117190.patch b/15.0.0/gentoo/72_all_PR117190.patch
new file mode 100644
index 0000000..497d788
--- /dev/null
+++ b/15.0.0/gentoo/72_all_PR117190.patch
@@ -0,0 +1,179 @@
+From 756a3f3aad7200052d9aee207717c9766dce8be1 Mon Sep 17 00:00:00 2001
+Message-ID: <756a3f3aad7200052d9aee207717c9766dce8be1.1729624110.git.sam@gentoo.org>
+From: Jakub Jelinek
+Date: Tue, 22 Oct 2024 20:03:35 +0200
+Subject: [PATCH] c: Better fix for speed up compilation of large char array
+ initializers when not using #embed [PR117190]
+
+On Wed, Oct 16, 2024 at 11:09:32PM +0200, Jakub Jelinek wrote:
+> Apparently my
+> c: Speed up compilation of large char array initializers when not using #embed
+> patch broke building glibc.
+>
+> The issue is that when using CPP_EMBED, we are guaranteed by the
+> preprocessor that there is CPP_NUMBER CPP_COMMA before it and
+> CPP_COMMA CPP_NUMBER after it (or CPP_COMMA CPP_EMBED), so RAW_DATA_CST
+> never ends up at the end of arrays of unknown length.
+> Now, the c_parser_initval optimization attempted to preserve that property
+> rather than changing everything that e.g. infers array number of elements
+> from the initializer etc. to deal with RAW_DATA_CST at the end, but
+> it didn't take into account the possibility that there could be
+> CPP_COMMA followed by CPP_CLOSE_BRACE (where the CPP_COMMA is redundant).
+>
+> As we are peeking already at 4 tokens in that code, peeking more would
+> require using raw tokens and that seems to be expensive doing it for
+> every pair of tokens due to vec_free done when we are out of raw tokens.
+
+Sorry for rushing the previous patch too much, turns out I was wrong,
+given that the c_parser_peek_nth_token numbering is 1 based, we can peek
+also with c_parser_peek_nth_token (parser, 4) and the loop actually peeked
+just at 3 tokens, not 4.
+
+So, I think it is better to revert the previous patch (but keep the new
+test) and instead peek the 4th non-raw token, which is what the following
+patch does.
+
+Additionally, PR117190 shows one further spot which missed the peek of
+the token after CPP_COMMA, in case it is incomplete array with exactly 65
+elements with redundant comma after it, which this patch handles too.
+
+Bootstrapped/regtested on x86_64-linux, i686-linux, powerpc64le-linux
+and powerpc64-linux, ok for trunk?
+
+2024-10-22 Jakub Jelinek
+
+        PR c/117190
+gcc/c/
+        * c-parser.cc (c_parser_initval): Revert 2024-10-17 changes.
+        Instead peek the 4th token and if it is not CPP_NUMBER,
+        handle it like 3rd token CPP_CLOSE_BRACE for orig_len == INT_MAX.
+        Also, check (2 + 2 * i)th raw token for the orig_len == INT_MAX
+        case and punt if it is not CPP_NUMBER.
+gcc/testsuite/
+        * c-c++-common/init-5.c: New test.
+---
+ gcc/c/c-parser.cc                   | 42 +++++++++++------------------
+ gcc/testsuite/c-c++-common/init-5.c | 19 +++++++++++++
+ 2 files changed, 35 insertions(+), 26 deletions(-)
+ create mode 100644 gcc/testsuite/c-c++-common/init-5.c
+
+diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
+index 090ab1cbc088..3f2d7ddc5c42 100644
+--- a/gcc/c/c-parser.cc
++++ b/gcc/c/c-parser.cc
+@@ -6529,7 +6529,6 @@ c_parser_initval (c_parser *parser, struct c_expr *after,
+           unsigned int i;
+           gcc_checking_assert (len >= 64);
+           location_t last_loc = UNKNOWN_LOCATION;
+-          location_t prev_loc = UNKNOWN_LOCATION;
+           for (i = 0; i < 64; ++i)
+             {
+               c_token *tok = c_parser_peek_nth_token_raw (parser, 1 + 2 * i);
+@@ -6545,7 +6544,6 @@ c_parser_initval (c_parser *parser, struct c_expr *after,
+               buf1[i] = (char) tree_to_uhwi (tok->value);
+               if (i == 0)
+                 loc = tok->location;
+-              prev_loc = last_loc;
+               last_loc = tok->location;
+             }
+           if (i < 64)
+@@ -6560,7 +6558,9 @@ c_parser_initval (c_parser *parser, struct c_expr *after,
+              (as guaranteed for CPP_EMBED).  */
+           if (tok->type == CPP_CLOSE_BRACE && len != INT_MAX)
+             len = i;
+-          else if (tok->type != CPP_COMMA)
++          else if (tok->type != CPP_COMMA
++                   || (c_parser_peek_nth_token_raw (parser, 2 + 2 * i)->type
++                       != CPP_NUMBER))
+             {
+               vals_to_ignore = i;
+               return;
+@@ -6569,7 +6569,6 @@ c_parser_initval (c_parser *parser, struct c_expr *after,
+           unsigned int max_len = 131072 - offsetof (struct tree_string, str) - 1;
+           unsigned int orig_len = len;
+           unsigned int off = 0, last = 0;
+-          unsigned char lastc = 0;
+           if (!wi::neg_p (wi::to_wide (val)) && wi::to_widest (val) <= UCHAR_MAX)
+             off = 1;
+           len = MIN (len, max_len - off);
+@@ -6599,25 +6598,23 @@ c_parser_initval (c_parser *parser, struct c_expr *after,
+               if (tok2->type != CPP_COMMA && tok2->type != CPP_CLOSE_BRACE)
+                 break;
+               buf2[i + off] = (char) tree_to_uhwi (tok->value);
+-              prev_loc = last_loc;
++              /* If orig_len is INT_MAX, this can be flexible array member and
++                 in that case we need to ensure another element which
++                 for CPP_EMBED is normally guaranteed after it.  Include
++                 that byte in the RAW_DATA_OWNER though, so it can be optimized
++                 later.  */
++              if (orig_len == INT_MAX
++                  && (tok2->type == CPP_CLOSE_BRACE
++                      || (c_parser_peek_nth_token (parser, 4)->type
++                          != CPP_NUMBER)))
++                {
++                  last = 1;
++                  break;
++                }
+               last_loc = tok->location;
+               c_parser_consume_token (parser);
+               c_parser_consume_token (parser);
+             }
+-          /* If orig_len is INT_MAX, this can be flexible array member and
+-             in that case we need to ensure another element which
+-             for CPP_EMBED is normally guaranteed after it.  Include
+-             that byte in the RAW_DATA_OWNER though, so it can be optimized
+-             later.  */
+-          if (orig_len == INT_MAX
+-              && (!c_parser_next_token_is (parser, CPP_COMMA)
+-                  || c_parser_peek_2nd_token (parser)->type != CPP_NUMBER))
+-            {
+-              --i;
+-              last = 1;
+-              std::swap (prev_loc, last_loc);
+-              lastc = (unsigned char) buf2[i + off];
+-            }
+           val = make_node (RAW_DATA_CST);
+           TREE_TYPE (val) = integer_type_node;
+           RAW_DATA_LENGTH (val) = i;
+@@ -6633,13 +6630,6 @@ c_parser_initval (c_parser *parser, struct c_expr *after,
+           init.original_type = integer_type_node;
+           init.m_decimal = 0;
+           process_init_element (loc, init, false, braced_init_obstack);
+-          if (last)
+-            {
+-              init.value = build_int_cst (integer_type_node, lastc);
+-              init.original_code = INTEGER_CST;
+-              set_c_expr_source_range (&init, prev_loc, prev_loc);
+-              process_init_element (prev_loc, init, false, braced_init_obstack);
+-            }
+         }
+     }
+ 
+diff --git a/gcc/testsuite/c-c++-common/init-5.c b/gcc/testsuite/c-c++-common/init-5.c
+new file mode 100644
+index 000000000000..61b6cdb97e2f
+--- /dev/null
++++ b/gcc/testsuite/c-c++-common/init-5.c
+@@ -0,0 +1,19 @@
++/* PR c/117190 */
++/* { dg-do run } */
++/* { dg-options "-O2" } */
++
++struct S { char d[]; } v = {
++{ 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
++  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
++  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
++  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
++  0, }
++};
++
++int
++main ()
++{
++  for (int i = 0; i < 65; ++i)
++    if (v.d[i] != (i == 0 ? 8 : 0))
++      __builtin_abort ();
++}
+-- 
+2.47.0
+