From: Yiyang Wu
To: gentoo-dev@lists.gentoo.org
Cc: Yiyang Wu
Subject: [gentoo-dev] [PATCH 1/2] rocm.eclass: remove xnack flag for broader compatibility
Date: Mon, 15 Apr 2024 00:21:49 +0800
Message-ID: <20240414162158.1777-2-xgreenlandforwyy@gmail.com>
In-Reply-To: <20240414162158.1777-1-xgreenlandforwyy@gmail.com>
References: <20240414162158.1777-1-xgreenlandforwyy@gmail.com>

Initially, rocm.eclass appended the xnack [1,2] feature flag to gfx9
GPUs, since ROCm upstream does this in
many of their math libraries, e.g. rocBLAS [3]. That list includes
gfx90a:xnack+, indicating that xnack is usable on the MI200 series, so
rocm.eclass appended :xnack+ to gfx90a. But it turns out that xnack- is
also common on the MI200 series, and restricting to xnack+ produces GPU
kernels that are incompatible with xnack- mode. In addition, the
community is exploring xnack on other gfx9 GPUs [4,5], which rocm.eclass
previously restricted to xnack-.

Without an xnack feature flag, GPU kernels are compiled in "xnack any"
mode and can run in either mode, potentially sacrificing some
performance [6,7], although there is no direct evidence of this. rocFFT
reports no performance penalty [8].

For the reasons above, do not append the xnack feature flag to
AMDGPU_TARGETS, so that the result is compatible with GPUs operating in
either xnack mode.

[1] https://wiki.gentoo.org/wiki/ROCm#XNACK_target_feature
[2] https://rocm.docs.amd.com/en/latest/conceptual/gpu-memory.html#xnack
[3] https://github.com/ROCm/rocBLAS/blob/release/rocm-rel-5.0/CMakeLists.txt#L201
[4] https://niconiconi.neocities.org/tech-notes/xnack-on-amd-gpus/
[5] https://arxiv.org/abs/2401.02680
[6] https://llvm.org/docs/AMDGPUUsage.html#target-features
[7] https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html#compiling-hip-kernels-for-specific-xnack-modes
[8] https://github.com/ROCm/rocFFT/commit/cd2689360ba3b3579d044d8925838ff307b4b4cf

Signed-off-by: Yiyang Wu
---
 eclass/rocm.eclass | 19 ++-----------------
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/eclass/rocm.eclass b/eclass/rocm.eclass
index 9804ecde97d0..e03e8bdd507a 100644
--- a/eclass/rocm.eclass
+++ b/eclass/rocm.eclass
@@ -1,4 +1,4 @@
-# Copyright 2022-2023 Gentoo Authors
+# Copyright 2022-2024 Gentoo Authors
 # Distributed under the terms of the GNU General Public License v2
 
 # @ECLASS: rocm.eclass
@@ -201,22 +201,7 @@ unset -f _rocm_set_globals
 # Append default target feature to GPU arch. See
 # https://llvm.org/docs/AMDGPUUsage.html#target-features
 get_amdgpu_flags() {
-	local amdgpu_target_flags
-	for gpu_target in ${AMDGPU_TARGETS}; do
-		local target_feature=
-		case ${gpu_target} in
-			gfx906|gfx908)
-				target_feature=:xnack-
-				;;
-			gfx90a)
-				target_feature=:xnack+
-				;;
-			*)
-				;;
-		esac
-		amdgpu_target_flags+="${gpu_target}${target_feature};"
-	done
-	echo "${amdgpu_target_flags}"
+	echo $(printf "%s;" ${AMDGPU_TARGETS[@]})
 }
 
 # @FUNCTION: check_amdgpu
-- 
2.41.0
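
For illustration, the effect of the change on the generated target string can be sketched as a stand-alone Bash script. This is not part of the patch: old_get_amdgpu_flags and new_get_amdgpu_flags are hypothetical names wrapping the removed and added lines of the diff, and the AMDGPU_TARGETS value is an assumed example.

```shell
#!/usr/bin/env bash
# Hypothetical stand-alone comparison; not part of rocm.eclass.

# Pre-patch behaviour, reconstructed from the removed lines of the diff.
old_get_amdgpu_flags() {
	local amdgpu_target_flags gpu_target target_feature
	for gpu_target in ${AMDGPU_TARGETS}; do
		target_feature=
		case ${gpu_target} in
			gfx906|gfx908) target_feature=:xnack- ;;
			gfx90a) target_feature=:xnack+ ;;
		esac
		amdgpu_target_flags+="${gpu_target}${target_feature};"
	done
	echo "${amdgpu_target_flags}"
}

# Post-patch behaviour, copied from the added line of the diff.
new_get_amdgpu_flags() {
	echo $(printf "%s;" ${AMDGPU_TARGETS[@]})
}

# Assumed example value, chosen only for illustration.
AMDGPU_TARGETS="gfx906 gfx90a gfx1030"

old_get_amdgpu_flags   # prints gfx906:xnack-;gfx90a:xnack+;gfx1030;
new_get_amdgpu_flags   # prints gfx906;gfx90a;gfx1030;
```

With the patch applied, every target is passed through without a feature suffix, which corresponds to the "xnack any" mode described in the commit message. One detail worth noting: with an empty AMDGPU_TARGETS, printf "%s;" still emits a single ";", whereas the old loop produced an empty string.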