From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yiyang Wu
To: gentoo-dev@lists.gentoo.org
Cc: Benda Xu, Michał Górny
Subject: [gentoo-dev] [PATCH v6 0/2] rocm.eclass: new eclass
Date: Thu, 1 Sep 2022 00:40:32 +0800
X-Mailer: git-send-email 2.34.1
List-Id: Gentoo Linux mail
Reply-to: gentoo-dev@lists.gentoo.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

v6 fixes several issues raised in the GitHub PR:
https://github.com/gentoo/gentoo/pull/26784

Changelog against v5:
1. Update outdated examples and comments
2. QA fixes
3. Rename rocm_src_{test,configure} to rocm-{test,configure} to avoid confusion
4. Simplify the rocm-test function
5. Change the reference used for matching AMDGPU targets to GPU products
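
For readers without a Portage environment at hand, changelog items 3 and 4 can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the eclass code: `sketch_rocm_test` is a hypothetical stand-in for the dispatch in the v6 rocm-test (the device permission checks are omitted), and `edob` and `cmake_src_test` are replaced by echo-only stubs so that the snippet runs in a plain shell.

```shell
#!/bin/sh
# Echo-only stubs (assumptions): the real Portage helpers run the command
# and the cmake test suite; here they only report what would happen.
edob() { echo "would run: $*"; }
cmake_src_test() { echo "cmake_src_test with MAKEOPTS=${MAKEOPTS}"; }

# Hypothetical stand-in for the v6 rocm-test dispatch.
sketch_rocm_test() {
	case ${1} in
		--cmake)
			# One job at a time, so only one test uses the GPU.
			MAKEOPTS="-j1" cmake_src_test
			;;
		*)
			# Any other argument is treated as a test binary in the
			# current directory.
			edob ./"${1}"
			;;
	esac
}

sketch_rocm_test --cmake
sketch_rocm_test rocblas-test
```

The point of the simplification is that the ebuild, not the eclass, now decides which directory to enter and which binary to run; the helper no longer probes build.ninja or ${ROCM_TESTS}.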

Yiyang Wu (2):
  rocm.eclass: new eclass
  profiles/desc: add amdgpu_targets.desc for USE_EXPAND

 eclass/rocm.eclass                | 284 ++++++++++++++++++++++++++++++
 profiles/base/make.defaults       |   2 +-
 profiles/desc/amdgpu_targets.desc |  17 ++
 3 files changed, 302 insertions(+), 1 deletion(-)
 create mode 100644 eclass/rocm.eclass
 create mode 100644 profiles/desc/amdgpu_targets.desc

Interdiff against v5:
diff --git a/eclass/rocm.eclass b/eclass/rocm.eclass
index 679b1af54e0a..1866d6b7cc94 100644
--- a/eclass/rocm.eclass
+++ b/eclass/rocm.eclass
@@ -18,13 +18,16 @@
 #
 # Most ROCm packages use cmake as build system, so this eclass does not export
 # phase functions which overwrites the phase functions in cmake.eclass. Ebuild
-# should explicitly call rocm_src_* in src_configure and src_test.
+# should explicitly call rocm-{configure,test} in src_configure and src_test.
 #
 # @EXAMPLE:
-# # Example for ROCm packages in https://github.com/ROCmSoftwarePlatform
 # @CODE
+# # Example ebuild for ROCm library in https://github.com/ROCmSoftwarePlatform
+# # which depends on rocBLAS
 # inherit cmake rocm
+# # ROCm libraries SRC_URI is usually in form of:
 # SRC_URI="https://github.com/ROCmSoftwarePlatform/${PN}/archive/rocm-${PV}.tar.gz -> ${P}.tar.gz"
+# S=${WORKDIR}/${PN}-rocm-${PV}
 # SLOT="0/$(ver_cut 1-2)"
 # IUSE="test"
 # REQUIRED_USE="${ROCM_REQUIRED_USE}"
@@ -35,17 +38,15 @@
 # 	sci-libs/rocBLAS:${SLOT}[${ROCM_USEDEP}]
 # "
 #
-# S=${WORKDIR}/${PN}-rocm-${PV}
-#
 # src_configure() {
 # 	local mycmakeargs=(
 # 		-DBUILD_CLIENTS_TESTS=$(usex test ON OFF)
 # 	)
-# 	rocm_src_configure
+# 	rocm-configure
 # }
 #
 # src_test() {
-# 	rocm_src_test
+# 	rocm-test
 # }
 # @CODE
 #
@@ -53,7 +54,7 @@
 # # rocBLAS, and use comma seperated ${HCC_AMDGPU_TARGET} to determine GPU
 # # architecture to compile. Requires ROCm version >5.
 # @CODE
-# ROCM_VERSION=5
+# ROCM_VERSION=5.1
 # inherit rocm
 # IUSE="rocm"
 # REQUIRED_USE="rocm? ( ${ROCM_REQUIRED_USE} )"
@@ -206,25 +207,31 @@ get_amdgpu_flags() {
 # @FUNCTION: check_rw_permission
 # @USAGE: check_rw_permission
 # @DESCRIPTION:
-# check read and write permissions on specific files.
-# allow using wildcard, for example check_rw_permission /dev/dri/render*
+# check read and write permissions on a specific file, die if no permission.
+# @EXAMPLE:
+# @CODE
+# check_rw_permission /dev/kfd
+# @CODE
 check_rw_permission() {
-	[[ -r $1 ]] && [[ -w $1 ]] || die \
-		"Portage do not have read or write permissions on $1! \n Make sure both are in render group and check the permissions."
+	if [[ ! -r $1 ]] || [[ ! -w $1 ]]; then
+		eerror "Portage does not have read or write permissions on $1!"
+		eerror "Make sure both are in render group and check the permissions."
+		die "No permissions on $1"
+	fi
 }

 # == phase functions ==

-# @FUNCTION: rocm_src_configure
+# @FUNCTION: rocm-configure
 # @DESCRIPTION:
-# configure rocm packages, and setting common cmake arguments
-rocm_src_configure() {
-	# allow acces to hardware
+# configure rocm packages, and setting common cmake arguments. Only for ROCm
+# libraries in https://github.com/ROCmSoftwarePlatform using cmake.
+rocm-configure() {
+	# avoid sandbox violation
 	addpredict /dev/kfd
 	addpredict /dev/dri/

 	mycmakeargs+=(
-		-DCMAKE_INSTALL_PREFIX="${EPREFIX}/usr"
 		-DAMDGPU_TARGETS="$(get_amdgpu_flags)"
 		-DCMAKE_SKIP_RPATH=TRUE
 	)
@@ -232,46 +239,45 @@ rocm_src_configure() {
 	CXX="hipcc" cmake_src_configure
 }

-# @FUNCTION: rocm_src_test
+# @FUNCTION: rocm-test
 # @DESCRIPTION:
-# Test whether valid GPU device is present. If so, find how to, and execute test.
-# ROCm packages can have to test mechanism:
+# Test whether valid GPU device is present. If so, execute test.
+# @EXAMPLE:
+# ROCm packages can have two test scenarios:
 # 1. cmake_src_test. MAKEOPTS="-j1" ensures only one test on GPU at a time;
-# 2. one single gtest binary called "${PN,,}"-test;
-# 3. Some package like rocFFT have alternative test like rocfft-selftest;
-# 4. Custome testing binaries like dev-libs/rccl. Use ${ROCM_TESTS} to specify.
-rocm_src_test() {
+# @CODE
+# LD_LIBRARY_PATH= rocm-test --cmake
+# @CODE
+# 2. one gtest binary called "${PN,,}"-test in ${BUILD_DIR}/clients/staging;
+# @CODE
+# cd "${BUILD_DIR}"/clients/staging || die
+# LD_LIBRARY_PATH= rocm-test "${PN,,}"-test
+# @CODE
+# Some packages like rocFFT have two test binaries like rocfft-selftest;
+# packages like dev-libs/rccl have test binary with custom names.
+# @CODE
+# cd "${BUILD_DIR}"/clients/staging || die
+# export LD_LIBRARY_PATH=
+# cd || die
+# rocm-test
+# rocm-test
+# @CODE
+rocm-test() {
 	# grant and check permissions on /dev/kfd and /dev/dri/render*
 	for device in /dev/kfd /dev/dri/render*; do
-		addwrite ${device}
-		check_rw_permission ${device}
+		addwrite "${device}"
+		check_rw_permission "${device}"
 	done

-	: ${LD_LIBRARY_PATH:="${BUILD_DIR}/clients:${BUILD_DIR}/src:${BUILD_DIR}/library:${BUILD_DIR}/library/src:${BUILD_DIR}/library/src/device"}
-	export LD_LIBRARY_PATH
-	if grep -q 'build test:' "${BUILD_DIR}"/build.ninja; then
-		MAKEOPTS="-j1" cmake_src_test
-	elif [[ -d ${BUILD_DIR}/clients/staging ]]; then
-		cd "${BUILD_DIR}/clients/staging" || die "Test directory not found!"
-		for test_program in "${PN,,}-"*test; do
-			if [[ -x ${test_program} ]]; then
-				edob ./${test_program}
-			else
-				die "The test program ${test_program} does not exist or cannot be excuted!"
-			fi
-		done
-	elif [[ -n ${ROCM_TESTS} ]]; then
-		for test_program in ${ROCM_TESTS}; do
-			cd "${BUILD_DIR}" || die
-			if [[ -x ${test_program} ]]; then
-				edob ./${test_program}
-			else
-				die "The test program ${test_program} does not exist or cannot be excuted!"
-			fi
-		done
-	else
-		die "There is no cmake tests, no \${ROCM_TESTS} executable provided, nor ${BUILD_DIR}/clients/staging where test program might be located."
-	fi
+	case ${1} in
+		--cmake)
+			# Avoid multi jobs running that may cause GPU error or CPU overload
+			MAKEOPTS="-j1" cmake_src_test
+			;;
+		*)
+			edob ./${1}
+			;;
+	esac
 }

 _ROCM_ECLASS=1
diff --git a/profiles/desc/amdgpu_targets.desc b/profiles/desc/amdgpu_targets.desc
index 8a3db2b56dab..df013d4f2c08 100644
--- a/profiles/desc/amdgpu_targets.desc
+++ b/profiles/desc/amdgpu_targets.desc
@@ -1,7 +1,9 @@
 # Copyright 1999-2022 Gentoo Authors.
 # Distributed under the terms of the GNU General Public License v2

-# Copied from https://www.coelacanth-dream.com/posts/2019/12/30/did-rid-product-matome-p2/#fn:67
+# Reference:
+# GPU name and Architecture codename: https://github.com/GPUOpen-Tools/device_info/blob/master/DeviceInfo.cpp
+# See also: https://www.coelacanth-dream.com/posts/2019/12/30/did-rid-product-matome-p2/#fn:67

 gfx803 - Fiji GPU, codename fiji, including Radeon R9 Nano/Fury/FuryX, Radeon Pro Duo, FirePro S9300x2, Radeon Instinct MI8
 gfx900 - Vega GPU, codename vega10, including Radeon Vega Frontier Edition, Radeon RX Vega 56/64, Radeon RX Vega 64 Liquid, Radeon Pro Vega 48/56/64/64X, Radeon Pro WX 8200/9100, Radeon Pro V320/V340/SSG, Radeon Instinct MI25
-- 
2.34.1