* [gentoo-commits] repo/gentoo:master commit in: dev-cpp/highway/files/, dev-cpp/highway/
From: Sam James @ 2023-03-07 11:44 UTC
To: gentoo-commits
commit: 1f2eceb5dfcce9899649870caac59d62055e7b82
Author: stefson <herrtimson <AT> yahoo <DOT> de>
AuthorDate: Tue Mar 7 11:01:52 2023 +0000
Commit: Sam James <sam <AT> gentoo <DOT> org>
CommitDate: Tue Mar 7 11:44:19 2023 +0000
URL: https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=1f2eceb5
dev-cpp/highway: fix compile without neon optimization on armv7
Closes: https://bugs.gentoo.org/869077
Signed-off-by: Steffen Kuhn <nielson2 <AT> yandex.com>
Closes: https://github.com/gentoo/gentoo/pull/29964
Signed-off-by: Sam James <sam <AT> gentoo.org>
...ile-for-armv7-targets-with-vfp4-and-lower.patch | 123 +++++++++++++++++++++
dev-cpp/highway/highway-1.0.1-r1.ebuild | 6 +-
dev-cpp/highway/highway-1.0.3.ebuild | 4 +
3 files changed, 132 insertions(+), 1 deletion(-)
diff --git a/dev-cpp/highway/files/0001-fix-compile-for-armv7-targets-with-vfp4-and-lower.patch b/dev-cpp/highway/files/0001-fix-compile-for-armv7-targets-with-vfp4-and-lower.patch
new file mode 100644
index 000000000000..ebf448cfbb24
--- /dev/null
+++ b/dev-cpp/highway/files/0001-fix-compile-for-armv7-targets-with-vfp4-and-lower.patch
@@ -0,0 +1,123 @@
+https://github.com/google/highway/commit/dc63f813c465f3bf95cb5b98f01aeed28b81173c
+https://github.com/google/highway/pull/1143
+
+https://github.com/google/highway/issues/834
+https://github.com/google/highway/issues/1032
+
+https://bugs.gentoo.org/869077
+
+From dc63f813c465f3bf95cb5b98f01aeed28b81173c Mon Sep 17 00:00:00 2001
+From: Julien Olivain <ju.o@free.fr>
+Date: Mon, 20 Feb 2023 23:22:28 +0100
+Subject: [PATCH] Fix compilation for armv7 targets with vfp < v4 and gcc >= 8
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+When using an armv7 gcc >= 8 toolchain (like [1]) with Highway
+configured with -DHWY_CMAKE_ARM7=OFF and HWY_ENABLE_CONTRIB=ON,
+compilation fails with error:
+
+ In file included from /build/highway-1.0.3/hwy/ops/arm_neon-inl.h:33,
+ from /build/highway-1.0.3/hwy/highway.h:358,
+ from /build/highway-1.0.3/hwy/contrib/sort/shared-inl.h:104,
+ from /build/highway-1.0.3/hwy/contrib/sort/traits128-inl.h:27,
+ from /build/highway-1.0.3/hwy/contrib/sort/vqsort_128d.cc:23,
+ from /build/highway-1.0.3/hwy/foreach_target.h:81,
+ from /build/highway-1.0.3/hwy/contrib/sort/vqsort_128d.cc:20:
+ /toolchain/lib/gcc/arm-buildroot-linux-gnueabihf/12.2.0/include/arm_neon.h: In function ‘void hwy::N_NEON::StoreU(Vec128<long long unsigned int, 2>, Full128<long long unsigned int>, uint64_t*)’:
+ /toolchain/lib/gcc/arm-buildroot-linux-gnueabihf/12.2.0/include/arm_neon.h:11052:1: error: inlining failed in call to ‘always_inline’ ‘void vst1q_u64(uint64_t*, uint64x2_t)’: target specific option mismatch
+ 11052 | vst1q_u64 (uint64_t * __a, uint64x2_t __b)
+ | ^~~~~~~~~
+ /build/highway-1.0.3/hwy/ops/arm_neon-inl.h:2786:12: note: called from here
+ 2786 | vst1q_u64(unaligned, v.raw);
+ | ~~~~~~~~~^~~~~~~~~~~~~~~~~~
+
+The same errors happen when configured with HWY_ENABLE_EXAMPLES=ON,
+or from client libraries like libjxl (at other places).
+
+The issue is that Highway Arm NEON ops have a dependency on the
+Advanced SIMD (Neon) v2 and the VFPv4 floating-point instructions.
+The SIMD (Neon) v1 and VFPv3 instructions are not supported.
+
+There were several attempts to fix variants of this issue.
+See #834 and #1032.
+
+HWY_NEON target is selected only if __ARM_NEON is defined. See:
+https://github.com/google/highway/blob/1.0.3/hwy/detect_targets.h#L251
+
+This test is not sufficient, since __ARM_NEON is predefined
+whenever Neon is enabled (neon-vfpv3, neon-vfpv4).
+
+The issue is that HWY_CMAKE_ARM7=ON implies VFPv4 / NEON SIMD v2.
+When setting HWY_CMAKE_ARM7=OFF, "neon-vfpv4" will not be forced,
+but the code is still using intrinsics assuming VFPv4. Gcc will fail
+with error because code cannot be generated for the selected
+architecture.
+
+This issue can be avoided by adding "-DHWY_DISABLED_TARGETS=HWY_NEON" in
+CXXFLAGS. The problem with this solution is that every client program will
+also need to do the same. This goes against the very purpose of
+"hwy/detect_targets.h".
+
+Technically, Armv7-a processors with VFPv4 can be detected using some
+ACLE (Arm C Language Extensions [2]) predefined macros:
+
+Basically, we want Highway to define HWY_NEON only when the target
+supports SIMDv2/VFPv4 or higher. An older target with vfpv3 only
+(e.g. Cortex-A8, A9, ...) would NOT define HWY_NEON, and therefore
+would fall back on the HWY_SCALAR implementation.
+
+However, not all compilers fully support ACLE, and there are
+several versions of it. So we cannot easily rely on macros like
+"__ARM_VFPV4__" (which clang predefines, but gcc does not).
+
+The alternative solution proposed in this patch is to declare the
+HWY_NEON target architecture as broken when we detect the target is
+Armv7-A, but mandatory features for vfpv4 (namely half-float, FMA)
+are missing. Half-floats are tested using the macro __ARM_NEON_FP,
+and the FMA with the macro __ARM_FEATURE_FMA. See ACLE [2]. The
+intent of declaring the target as broken, rather than selecting
+HWY_NEON only if vfpv4 features are detected, is to remain a bit
+conservative, since the detection is slightly inaccurate.
+
+For a given compiler/cflags, predefined macros for Arm/ACLE can be
+reviewed with commands like:
+
+ arm-linux-gnueabihf-gcc -mcpu=cortex-a9 -mfpu=neon-vfpv3 -Wp,-dM -E -c - < /dev/null | grep -Fi arm | sort
+ arm-linux-gnueabihf-gcc -mcpu=cortex-a7 -mfpu=neon-vfpv4 -Wp,-dM -E -c - < /dev/null | grep -Fi arm | sort
+ clang -target armv7a -mcpu=cortex-a9 -mfpu=neon-vfpv3 -mfloat-abi=hard -Wp,-dM -E -c - < /dev/null | grep -Fi arm | sort
+ clang -target armv7a -mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard -Wp,-dM -E -c - < /dev/null | grep -Fi arm | sort
+
+The different values of __ARM_NEON_FP can be seen, depending on which
+"-mfpu" is passed. Same for __ARM_FEATURE_FMA.
+
+[1] https://toolchains.bootlin.com/downloads/releases/toolchains/armv7-eabihf/tarballs/armv7-eabihf--glibc--bleeding-edge-2022.08-1.tar.bz2
+[2] https://github.com/ARM-software/acle/
+
+Signed-off-by: Julien Olivain <ju.o@free.fr>
+---
+ hwy/detect_targets.h | 10 ++++++++++
+ 1 file changed, 10 insertions(+)
+
+diff --git a/hwy/detect_targets.h b/hwy/detect_targets.h
+index 2beca95b..40ae7fe7 100644
+--- a/hwy/detect_targets.h
++++ b/hwy/detect_targets.h
+@@ -154,6 +154,16 @@
+ (defined(__BYTE_ORDER) && __BYTE_ORDER == __BIG_ENDIAN))
+ #define HWY_BROKEN_TARGETS (HWY_NEON)
+
++// armv7-a without a detected vfpv4 is not supported
++// (for example Cortex-A8, Cortex-A9)
++// vfpv4 always have neon half-float _and_ FMA.
++#elif HWY_ARCH_ARM_V7 && \
++ (__ARM_ARCH_PROFILE == 'A') && \
++ !defined(__ARM_VFPV4__) && \
++ !((__ARM_NEON_FP & 0x2 /* half-float */) && \
++ (__ARM_FEATURE_FMA == 1))
++#define HWY_BROKEN_TARGETS (HWY_NEON)
++
+ // SVE[2] require recent clang or gcc versions.
+ #elif (HWY_COMPILER_CLANG && HWY_COMPILER_CLANG < 1100) || \
+ (HWY_COMPILER_GCC_ACTUAL && HWY_COMPILER_GCC_ACTUAL < 1000)
diff --git a/dev-cpp/highway/highway-1.0.1-r1.ebuild b/dev-cpp/highway/highway-1.0.1-r1.ebuild
index e4a2c4b87d4b..b0a7900ce124 100644
--- a/dev-cpp/highway/highway-1.0.1-r1.ebuild
+++ b/dev-cpp/highway/highway-1.0.1-r1.ebuild
@@ -1,4 +1,4 @@
-# Copyright 2021-2022 Gentoo Authors
+# Copyright 2021-2023 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2
EAPI=8
@@ -24,6 +24,10 @@ DEPEND="test? ( dev-cpp/gtest[${MULTILIB_USEDEP}] )"
RESTRICT="!test? ( test )"
+PATCHES=(
+ "${FILESDIR}"/0001-fix-compile-for-armv7-targets-with-vfp4-and-lower.patch
+)
+
multilib_src_configure() {
local mycmakeargs=(
-DHWY_CMAKE_ARM7=$(usex cpu_flags_arm_neon)
diff --git a/dev-cpp/highway/highway-1.0.3.ebuild b/dev-cpp/highway/highway-1.0.3.ebuild
index af752cf34a06..c4d4ed034ee9 100644
--- a/dev-cpp/highway/highway-1.0.3.ebuild
+++ b/dev-cpp/highway/highway-1.0.3.ebuild
@@ -24,6 +24,10 @@ DEPEND="test? ( dev-cpp/gtest[${MULTILIB_USEDEP}] )"
RESTRICT="!test? ( test )"
+PATCHES=(
+ "${FILESDIR}"/0001-fix-compile-for-armv7-targets-with-vfp4-and-lower.patch
+)
+
multilib_src_configure() {
local mycmakeargs=(
-DHWY_CMAKE_ARM7=$(usex cpu_flags_arm_neon)
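
The condition added to hwy/detect_targets.h in the patch above can be restated as an ordinary predicate, which may make the logic easier to check. This is only a sketch: the struct ArmTargetInfo, the function NeonIsBroken and the two sample configurations are hypothetical names introduced here, and the sample __ARM_NEON_FP values are assumptions for illustration rather than measured compiler output. The real header evaluates these macros in the preprocessor.

```cpp
#include <cstdint>

// Hypothetical stand-in for the compiler-predefined macros that the patch
// tests. In hwy/detect_targets.h these come from the preprocessor.
struct ArmTargetInfo {
  bool is_armv7;          // HWY_ARCH_ARM_V7
  char arch_profile;      // __ARM_ARCH_PROFILE ('A', 'R' or 'M')
  bool has_vfpv4_macro;   // defined(__ARM_VFPV4__)
  std::uint32_t neon_fp;  // __ARM_NEON_FP; bit 0x2 means half-float
  bool has_fma;           // __ARM_FEATURE_FMA == 1
};

// Mirrors the #elif condition from the patch: HWY_NEON is treated as broken
// on Armv7-A unless vfpv4 is announced directly or both half-float and FMA
// are present.
constexpr bool NeonIsBroken(const ArmTargetInfo& t) {
  return t.is_armv7 && t.arch_profile == 'A' && !t.has_vfpv4_macro &&
         !(((t.neon_fp & 0x2u) != 0) && t.has_fma);
}

// Assumed values for illustration only: a vfpv3-like target without
// half-float or FMA, and a vfpv4-like target with both.
constexpr ArmTargetInfo kVfpv3Like{true, 'A', false, 0x4, false};
constexpr ArmTargetInfo kVfpv4Like{true, 'A', false, 0x6, true};
```

Under these assumed values the first configuration is marked broken, so Highway would fall back to HWY_SCALAR, while the second keeps HWY_NEON. The commands listed in the commit message remain the way to confirm what a given toolchain actually predefines.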
* [gentoo-commits] repo/gentoo:master commit in: dev-cpp/highway/files/, dev-cpp/highway/
From: Joonas Niilola @ 2024-02-09 14:10 UTC
To: gentoo-commits
commit: 75145e4759c8e12bf8994889ee69c24bacf1c4d5
Author: Daniel Novomeský <dnovomesky <AT> gmail <DOT> com>
AuthorDate: Thu Jan 25 11:55:42 2024 +0000
Commit: Joonas Niilola <juippis <AT> gentoo <DOT> org>
CommitDate: Fri Feb 9 14:09:49 2024 +0000
URL: https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=75145e47
dev-cpp/highway: revbump for UB case fix
Closes: https://bugs.gentoo.org/922793
Signed-off-by: Daniel Novomeský <dnovomesky <AT> gmail.com>
Closes: https://github.com/gentoo/gentoo/pull/35005
Signed-off-by: Joonas Niilola <juippis <AT> gentoo.org>
...ay-1.0.7-Fix_UB_case_with_signed_overflow.patch | 29 +++++++++++++++
dev-cpp/highway/highway-1.0.7-r1.ebuild | 41 ++++++++++++++++++++++
2 files changed, 70 insertions(+)
diff --git a/dev-cpp/highway/files/highway-1.0.7-Fix_UB_case_with_signed_overflow.patch b/dev-cpp/highway/files/highway-1.0.7-Fix_UB_case_with_signed_overflow.patch
new file mode 100644
index 000000000000..814d584e8b3a
--- /dev/null
+++ b/dev-cpp/highway/files/highway-1.0.7-Fix_UB_case_with_signed_overflow.patch
@@ -0,0 +1,29 @@
+https://github.com/google/highway/issues/1549
+https://github.com/google/highway/commit/45eea15b5488f3e7a15c2c94ac77bd9e99703203
+
+From 45eea15b5488f3e7a15c2c94ac77bd9e99703203 Mon Sep 17 00:00:00 2001
+From: Mathieu Malaterre <mathieu.malaterre@gmail.com>
+Date: Thu, 5 Oct 2023 08:00:38 +0200
+Subject: [PATCH] Fix UB case with signed overflow, prefer unsigned
+
+Fixes #1549
+
+Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110643
+Suggested-by: Andrew Pinski <pinskia@gcc.gnu.org>
+---
+ hwy/ops/arm_neon-inl.h | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/hwy/ops/arm_neon-inl.h b/hwy/ops/arm_neon-inl.h
+index 7ac7a10c62..97de46be2e 100644
+--- a/hwy/ops/arm_neon-inl.h
++++ b/hwy/ops/arm_neon-inl.h
+@@ -4592,7 +4592,7 @@ HWY_API Mask128<T, N> IsNaN(const Vec128<T, N> v) {
+ template <typename T, size_t N, HWY_IF_FLOAT(T)>
+ HWY_API Mask128<T, N> IsInf(const Vec128<T, N> v) {
+ const DFromV<decltype(v)> d;
+- const RebindToSigned<decltype(d)> di;
++ const RebindToUnsigned<decltype(d)> di;
+ const VFromD<decltype(di)> vi = BitCast(di, v);
+ // 'Shift left' to clear the sign bit, check for exponent=max and mantissa=0.
+ return RebindMask(d, Eq(Add(vi, vi), Set(di, hwy::MaxExponentTimes2<T>())));
diff --git a/dev-cpp/highway/highway-1.0.7-r1.ebuild b/dev-cpp/highway/highway-1.0.7-r1.ebuild
new file mode 100644
index 000000000000..98f940b6570c
--- /dev/null
+++ b/dev-cpp/highway/highway-1.0.7-r1.ebuild
@@ -0,0 +1,41 @@
+# Copyright 2021-2024 Gentoo Authors
+# Distributed under the terms of the GNU General Public License v2
+
+EAPI=8
+
+inherit cmake-multilib
+
+DESCRIPTION="Performance-portable, length-agnostic SIMD with runtime dispatch"
+HOMEPAGE="https://github.com/google/highway"
+
+if [[ "${PV}" == *9999* ]]; then
+ inherit git-r3
+ EGIT_REPO_URI="https://github.com/google/highway.git"
+else
+ SRC_URI="https://github.com/google/highway/archive/refs/tags/${PV}.tar.gz -> ${P}.tar.gz"
+ KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~loong ~ppc ~ppc64 ~riscv ~sparc ~x86"
+fi
+
+LICENSE="Apache-2.0"
+SLOT="0"
+IUSE="cpu_flags_arm_neon test"
+
+DEPEND="test? ( dev-cpp/gtest[${MULTILIB_USEDEP}] )"
+
+RESTRICT="!test? ( test )"
+
+PATCHES=(
+ "${FILESDIR}/${PN}-1.0.7-Fix_UB_case_with_signed_overflow.patch"
+)
+
+multilib_src_configure() {
+ local mycmakeargs=(
+ -DHWY_CMAKE_ARM7=$(usex cpu_flags_arm_neon)
+ -DBUILD_TESTING=$(usex test)
+ -DHWY_WARNINGS_ARE_ERRORS=OFF
+ )
+
+ use test && mycmakeargs+=( "-DHWY_SYSTEM_GTEST=ON" )
+
+ cmake_src_configure
+}
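
The IsInf change in the patch above swaps a signed lane type for an unsigned one before computing vi + vi. A scalar analogue for 32-bit floats may help show why that matters. This is a sketch under stated assumptions: IsInfBits and kMaxExponentTimes2F32 are hypothetical names introduced here, and the function works on the raw IEEE 754 binary32 bit pattern rather than on Highway vectors.

```cpp
#include <cstdint>

// Exponent field all ones and mantissa zero, shifted left by one bit.
// Assumed to correspond to hwy::MaxExponentTimes2<float>() in the patch.
constexpr std::uint32_t kMaxExponentTimes2F32 = 0xFF000000u;

// Scalar analogue of the patched IsInf, taking the raw bits of a float.
// Doubling an unsigned value wraps modulo 2^32, which discards the sign bit.
// The same doubling on a signed 32-bit value can overflow, which is
// undefined behaviour in C++.
constexpr bool IsInfBits(std::uint32_t bits) {
  return static_cast<std::uint32_t>(bits + bits) == kMaxExponentTimes2F32;
}
```

For example, the bit patterns 0x7F800000 (+infinity) and 0xFF800000 (-infinity) both double to 0xFF000000, while 0x7FC00000 (a quiet NaN) and 0x3F800000 (1.0f) do not.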
* [gentoo-commits] repo/gentoo:master commit in: dev-cpp/highway/files/, dev-cpp/highway/
From: Matt Turner @ 2022-04-06 6:26 UTC
To: gentoo-commits
commit: e6426d5f35e2fb9cc596fe69425c0338ca5b4496
Author: Paolo Pedroni <paolo.pedroni <AT> iol <DOT> it>
AuthorDate: Wed Mar 30 16:03:13 2022 +0000
Commit: Matt Turner <mattst88 <AT> gentoo <DOT> org>
CommitDate: Wed Apr 6 06:26:46 2022 +0000
URL: https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=e6426d5f
dev-cpp/highway: Fix detection of AVX512 for IceLake Client CPUs
Closes: https://bugs.gentoo.org/836373
Closes: https://github.com/gentoo/gentoo/pull/24819
Signed-off-by: Paolo Pedroni <paolo.pedroni <AT> iol.it>
Signed-off-by: Matt Turner <mattst88 <AT> gentoo.org>
...y-0.16.0-fix-AVX512-detection-on-IceLakeClient.patch | 17 +++++++++++++++++
.../{highway-0.16.0.ebuild => highway-0.16.0-r1.ebuild} | 4 ++++
2 files changed, 21 insertions(+)
diff --git a/dev-cpp/highway/files/highway-0.16.0-fix-AVX512-detection-on-IceLakeClient.patch b/dev-cpp/highway/files/highway-0.16.0-fix-AVX512-detection-on-IceLakeClient.patch
new file mode 100644
index 000000000000..de157925c6ef
--- /dev/null
+++ b/dev-cpp/highway/files/highway-0.16.0-fix-AVX512-detection-on-IceLakeClient.patch
@@ -0,0 +1,17 @@
+https://github.com/google/highway/commit/daf441c78191b3433410498d27a5bfdfdf93a142
+
+diff --git a/hwy/targets.cc b/hwy/targets.cc
+index 2a0ab4ef..7e7e2d79 100644
+--- a/hwy/targets.cc
++++ b/hwy/targets.cc
+@@ -328,8 +328,8 @@ uint32_t SupportedTargets() {
+ if (!IsBitSet(xcr0, 2)) {
+ bits &= ~uint32_t(HWY_AVX2 | HWY_AVX3 | HWY_AVX3_DL);
+ }
+- // ZMM + opmask
+- if ((xcr0 & 0x70) != 0x70) {
++ // opmask, ZMM lo/hi
++ if (!IsBitSet(xcr0, 5) || !IsBitSet(xcr0, 6) || !IsBitSet(xcr0, 7)) {
+ bits &= ~uint32_t(HWY_AVX3 | HWY_AVX3_DL);
+ }
+ }
diff --git a/dev-cpp/highway/highway-0.16.0.ebuild b/dev-cpp/highway/highway-0.16.0-r1.ebuild
similarity index 91%
rename from dev-cpp/highway/highway-0.16.0.ebuild
rename to dev-cpp/highway/highway-0.16.0-r1.ebuild
index 89b07a85a587..52fb0b16d961 100644
--- a/dev-cpp/highway/highway-0.16.0.ebuild
+++ b/dev-cpp/highway/highway-0.16.0-r1.ebuild
@@ -25,6 +25,10 @@ DEPEND="test? ( dev-cpp/gtest[${MULTILIB_USEDEP}] )"
RESTRICT="!test? ( test )"
+PATCHES=(
+ "${FILESDIR}"/${P}-fix-AVX512-detection-on-IceLakeClient.patch
+)
+
multilib_src_configure() {
local mycmakeargs=(
-DBUILD_TESTING=$(usex test)
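
The difference between the old and new XCR0 tests in the patch above comes down to which bits are examined: the mask 0x70 covers bits 4, 5 and 6, whereas the replacement checks bits 5, 6 and 7. The sketch below compares the two. The helper IsBitSet is a stand-in written for this example (the real one lives in Highway), OldAvx512Enabled and NewAvx512Enabled are hypothetical names, and the sample XCR0 values are chosen for illustration, not read from real hardware.

```cpp
#include <cstdint>

// Stand-in for Highway's IsBitSet helper.
constexpr bool IsBitSet(std::uint64_t reg, int index) {
  return (reg & (std::uint64_t{1} << index)) != 0;
}

// Check as written before the patch: 0x70 selects bits 4, 5 and 6, so it
// also requires bit 4, which is one of the MPX state bits.
constexpr bool OldAvx512Enabled(std::uint64_t xcr0) {
  return (xcr0 & 0x70) == 0x70;
}

// Check after the patch: bit 5 (opmask), bit 6 (upper halves of ZMM0-15)
// and bit 7 (ZMM16-31) must all be enabled by the OS.
constexpr bool NewAvx512Enabled(std::uint64_t xcr0) {
  return IsBitSet(xcr0, 5) && IsBitSet(xcr0, 6) && IsBitSet(xcr0, 7);
}
```

With an illustrative value of 0xE7 (x87, SSE, AVX and the three AVX-512 state bits, but no MPX bits), the old check rejects AVX-512 and the new one accepts it. With 0x7F (bit 7 clear), the old check accepts even though the ZMM16-31 state is not enabled, and the new one rejects.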