Date: Mon, 8 Aug 2022 23:15:59 +0800
From: wuyy
To: gentoo-soc
Subject: [gentoo-soc] Week 8 Report for Refining ROCm Packages in Gentoo

Hello all,

I'm back on GSoC after a two-week temporary leave. This week there are two major pieces of progress: dev-util/rocprofiler and rocm.eclass.

I have implemented all the functions I think are necessary for rocm.eclass. I have just sent the rocm.eclass draft to the gentoo-dev mailing list (also as a GitHub PR at [1]); please review it. In the following weeks, I will collect feedback and continue to polish it.

In summary, I have implemented the following functions, which are listed in my proposal:

- USE_EXPAND of amdgpu_targets_, and ROCM_USEDEP to keep the USE flags coherent among dependencies;
- rocm_src_configure, which contains the common configure arguments;
- rocm_src_test, which checks the permissions on /dev/kfd and /dev/dri/render*.

There are also some things listed in the proposal that I decided not to implement for now:

- rocm_src_prepare: although there are some similarities among ebuilds, src_prepare is highly customized for each ROCm component. Unifying them would take extra work.
- SRC_URI: currently SRC_URI is already specified in each ebuild. It does not hurt to keep the status quo.

Moreover, during implementation I found another feature necessary:

- rocm_src_test: correctly handles different scenarios. ROCm packages may have CMake tests, which can be run using cmake_src_test, or may only compile some test binaries that have to be executed from the command line. I made rocm_src_test detect the method automatically, so ROCm packages just have to call this function directly without doing anything else.

Actually, I never imagined rocm.eclass would eventually take this shape. Initially I just thought it would provide some utilities, mainly src_test and USE_EXPAND. But while implementing it I found that all these features require careful treatment. The comments (mainly examples) also take up half of the length. It ends up at 278 lines, which is medium-sized among current eclasses. Maybe it can be trimmed down further after polishing, because there could be awkward implementations or re-inventions in it.

Based on my draft rocm.eclass, I have prepared sci-libs/roc*-5.1.3, sci-libs/hip*-5.1.3 and dev-python/cupy making use of it. It feels great to simplify the ebuilds, and portage handles the USE_EXPAND and the dependencies just as expected. Once rocm.eclass gets in tree, I'll push those ROCm-5.1.3 ebuilds.
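
As a rough illustration of the two things rocm_src_test is described as doing above (checking permissions on the GPU device nodes, and choosing between CMake tests and standalone test binaries), here is a minimal Python sketch. It is not the eclass code, which is written in bash. The function names accessible_devices and detect_test_method are hypothetical, and using CTestTestfile.cmake as the marker for "this package has CMake tests" is an assumption; the eclass may detect this differently.

```python
import glob
import os
import tempfile


def accessible_devices(patterns=("/dev/kfd", "/dev/dri/render*")):
    """Map each matching device node to True if it is readable and writable."""
    nodes = [path for pattern in patterns for path in sorted(glob.glob(pattern))]
    return {path: os.access(path, os.R_OK | os.W_OK) for path in nodes}


def detect_test_method(build_dir):
    """Return "ctest" if the build tree registered CMake tests, else "binaries".

    Assumption: a CTestTestfile.cmake at the top of the build directory is
    used as the marker; the real eclass may use a different check.
    """
    marker = os.path.join(build_dir, "CTestTestfile.cmake")
    return "ctest" if os.path.isfile(marker) else "binaries"


if __name__ == "__main__":
    devices = accessible_devices()
    if not devices:
        print("no GPU device nodes found")
    for path, ok in devices.items():
        print(f"{path}: {'ok' if ok else 'permission denied'}")
    with tempfile.TemporaryDirectory() as build_dir:
        print(detect_test_method(build_dir))  # empty tree, so "binaries"
```

Both helpers take their paths as parameters, so they can be tried on a machine without a GPU by pointing them at ordinary files.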

Another thing to mention is that the ROCm-5.1.3 toolchain finally got merged [5], along with the fixed dev-util/rocprofiler-{4.3.0,5.0.2,5.1.3}. rocprofiler was actually buggy before: I thought I had committed the patch that strips the loading of libhsa-amd-aqlprofile.so (I even claimed so in the commit message), but it was never committed and got lost in history. So I reproduced the patch.

I also did some research on this proprietary library. By default, not loading it means tracing hsa/hip is not possible -- you only get basic information such as the name and time of each GPU kernel execution, but do not know the pipeline of kernel execution (which one has spawned which kernel). AQL should stand for the HSA Architected Queuing Language (HSA AQL), which is documented at https://llvm.org/docs/AMDGPUUsage.html#hsa-aql-queue. It does sound related to the pipeline of kernel dispatching. According to its description, libhsa-amd-aqlprofile.so is an extension API of AQL Profile. But in fact, patching the source code so that rocprofiler does not load libhsa-amd-aqlprofile.so does not break the tracing of hsa/hip. So I'm not sure why libhsa-amd-aqlprofile.so is needed, and I raised a question at [2]. The fix itself is completed in [3,4].

According to the renewed proposal (I was away for two weeks, so the plan has changed), I should collect feedback and refine rocm.eclass, and prepare dev-python/cupy and sci-libs/rocWMMA. I'll investigate ROCgdb, too. Also, rocm-device-libs is a major package because many users rely on it to provide OpenCL. I'll work on bumping its version, too. What's more, with hip-5.1.3 built against vanilla clang, ROCm support for Blender can land in ::gentoo.

[1] https://github.com/gentoo/gentoo/pull/26784
[2] https://github.com/RadeonOpenCompute/ROCm/issues/1781
[3] https://github.com/gentoo/gentoo/pull/26755
[4] https://github.com/gentoo/gentoo/pull/26771
[5] https://github.com/gentoo/gentoo/pull/26441

Best wishes,
--
Yiyang Wu