From: Michał Górny
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching
Date: Wed, 29 May 2019 15:32:37 +0200
Message-ID: <9d42e0cc818985fcc8141743bbd840b85a58557e.camel@gentoo.org>
In-Reply-To: <2d3636f5bd6a738f30a4ad2e697b1ddb@debian.org>
Organization: Gentoo
List-Id: Gentoo Linux mail

On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
> Different BLAS/LAPACK implementations are expected to be compatible
> to each other in both the API and ABI level. They can be used as
> drop-in replacement to the others. This sounds nice, but the difference
> in SONAME hampered the gentoo integration of well-optimized ones.

If SONAMEs are different, then they are not compatible by definition.

> Assume a Gentoo user compiled a pile of packages on top of the reference
> BLAS and LAPACK, namely these reverse dependencies are linked against
> libblas.so.3 and liblapack.so.3 . When the user discovered that
> OpenBLAS provides much better performance, they'll have to recompile
> the whole reverse dependency tree in order to take advantage from
> OpenBLAS, because the SONAME of OpenBLAS is libopenblas.so.0 . When the
> user wants to try MKL (libmkl_rt.so), they'll have to recompile the
> whole reverse dependency tree again.
>
> This is not friendly to our earth.
>
> Goal
> ----
>
> * When a program is linked against libblas.so or liblapack.so
>   provided by any BLAS/LAPACK provider, the eselect-based solution
>   will allow user to switch the underlying library without recompiling
>   anything.
>
> * When a program is linked against a specific implementation, e.g.
>   libmkl_rt.so, the solution doesn't break anything.
>
> Solution
> --------
>
> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> is good at dealing with drop-in replacements as well. My preliminary
> investigation suggests that eselect is enough for enabling BLAS/LAPACK
> runtime switching. Hence, the proposed solution is eselect-based:
>
> * Every BLAS/LAPACK implementation should provide generic library
>   and eselect candidate libraries at the same time.
>   Taking netlib, BLIS and OpenBLAS as examples:
>
>   reference:
>
>     usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
>       -- default BLAS provider
>       -- candidate of the eselect "blas" unit
>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect

/usr/lib64 is not supposed to be modified by eselect; it is the package
manager's area. Yes, I know a lot of modules still do that, but that's
no reason to make things worse when people are putting significant
effort into actually improving things.

>     usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
>       -- default LAPACK provider
>       -- candidate of the eselect "lapack" unit
>       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>
>   blis (doesn't provide LAPACK):
>
>     usr/lib64/libblis.so.2 (SONAME=libblis.so.2)
>       -- general purpose
>
>     usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
>       -- candidate of the eselect "blas" unit
>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>       -- compiled from the same set of object files as libblis.so.2
>
>   openblas:
>
>     usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
>       -- general purpose
>
>     usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
>       -- candidate of the eselect "blas" unit
>       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
>       -- compiled from the same set of object files as
>          libopenblas.so.0
>
>     usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
>       -- candidate of the eselect "lapack" unit
>       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
>       -- compiled from the same set of object files as
>          libopenblas.so.0
>
> This solution is similar to Debian's[3]. This solution achieves our
> goal, and it requires us to patch upstream build systems (same to
> Debian). Preliminary demonstration for this solution is available,
> see below.
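
The symlink switch described in the quoted layout can be illustrated
with a minimal sketch. It is not the eselect module from the proposal;
it only mimics the mechanism inside a temporary directory, with plain
text files standing in for the shared libraries. The function name
switch_provider and the directory layout under the temporary root are
assumptions made for illustration, and the sketch does nothing about
the concern that eselect should not write to /usr/lib64.

```python
import os
import tempfile


def switch_provider(libdir, unit, soname, provider):
    """Point libdir/<soname> at libdir/<unit>/<provider>/<soname>.

    Hypothetical helper: the link is created under a temporary name and
    then renamed over the old one, so the generic name never disappears.
    """
    target = os.path.join(unit, provider, soname)  # relative to libdir
    if not os.path.exists(os.path.join(libdir, target)):
        raise FileNotFoundError("no candidate %r under %r" % (target, libdir))
    link = os.path.join(libdir, soname)
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link)  # rename acts on the link itself, not its target
    return os.readlink(link)


def demo():
    """Create fake candidates, switch twice, and report what the link resolves to."""
    seen = []
    with tempfile.TemporaryDirectory() as libdir:
        for provider in ("reference", "blis", "openblas"):
            candidate_dir = os.path.join(libdir, "blas", provider)
            os.makedirs(candidate_dir)
            # A text file stands in for the real libblas.so.3 of each provider.
            with open(os.path.join(candidate_dir, "libblas.so.3"), "w") as fh:
                fh.write(provider)
        for provider in ("reference", "openblas"):
            switch_provider(libdir, "blas", "libblas.so.3", provider)
            with open(os.path.join(libdir, "libblas.so.3")) as fh:
                seen.append(fh.read())
    return seen


if __name__ == "__main__":
    print(demo())
```

Creating the link under a temporary name and renaming it over the old
one means the generic name resolves to some candidate at every moment,
which matters if programs are started while the switch is in progress.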

So basically the three walls of text say in a roundabout way that
you're going to introduce custom hacks to recompile libraries with
a different SONAME. Ok.

>
> Is this solution reliable?
> --------------------------
>
> * A similar solution has been used by Debian for many years.
> * Many projects call BLAS/LAPACK libraries through FFI, including Julia.
>   (See Julia's standard library: LinearAlgebra)
>
> Proposed Changes
> ----------------
>
> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
>    main repo. They use exactly the same source tarball. It's not quite
>    helpful to package these components in a fine-grained manner. A
>    single sci-libs/lapack package is enough.

Where's the gain in that?

> 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
>    harmful when "blas" and "cblas" point to different implementations.
>    That means "app-eselect/eselect-cblas" should be deprecated.
>
> 3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK providers
>    will be registered in their dependency information.
>
> Note, ebuilds for BLAS/LAPACK reverse dependencies are expected to work
> with these changes correctly without change. For example, my local
> numpy-1.16.1 compilation was successful without change.
>
> Preliminary Demonstration
> -------------------------
>
> The preliminary implementation is available in my personal overlay[4].
> A simple sanity test script `check-cpp.sh` is provided to illustrate
> the effectiveness of the proposed solution.
>
> The script `check-cpp.sh` compiles two C++ programs -- one calls general
> matrix-matrix multiplication from BLAS, while another one calls general
> singular value decomposition from LAPACK. Once compiled, this script
> will switch different BLAS/LAPACK implementations and run the C++
> programs without recompilation.
>
> The preliminary result is available here[5].
> (CPU=Power9, ARCH=ppc64le)
> From the experimental results, we find that
>
> For (512x512) single precision matrix multiplication:
> * reference BLAS takes ~360 ms
> * BLIS takes ~70 ms
> * OpenBLAS takes ~10 ms
>
> For (512x512) single precision singular value decomposition:
> * reference LAPACK takes ~1900 ms
> * BLIS (+reference LAPACK) takes ~1500 ms
> * OpenBLAS takes ~1100 ms
>
> The difference in computation speed illustrates the effectiveness of
> the proposed solution. Theoretically, any other package could take
> advantage from this solution without any recompilation as long as
> it's linked against a library with SONAME.

An actual ABI compliance test, e.g. done using abi-compliance-checker,
would be more interesting.

>
> Acknowledgement
> ---------------
> This is an on-going GSoC-2019 Project:
> https://summerofcode.withgoogle.com/projects/?sp-page=2#6268942782300160

It would probably have been better if the project had been discussed
before GSoC. I'm really against pushing a bad idea forward just because
someone set it for GSoC without discussing it first.

-- 
Best regards,
Michał Górny
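
As a much weaker stand-in for the ABI compliance test suggested above,
one can at least check that a candidate library exports every symbol
name that the reference library does. The sketch below parses the text
output of GNU nm invoked as `nm -D --defined-only`. The helper names
(parse_nm, missing_symbols, nm_dynamic) are hypothetical, and matching
names say nothing about argument types or calling conventions, which is
what a tool such as abi-compliance-checker is meant to examine.

```python
import subprocess


def parse_nm(output):
    """Return the set of symbol names found in `nm -D --defined-only` text."""
    symbols = set()
    for line in output.splitlines():
        parts = line.split()
        # Defined symbols are printed as: <address> <type letter> <name>
        if len(parts) == 3:
            # Drop a symbol-version suffix such as name@@VERSION, if present.
            symbols.add(parts[2].split("@")[0])
    return symbols


def missing_symbols(reference_nm, candidate_nm):
    """Names exported by the reference library but absent from the candidate."""
    return sorted(parse_nm(reference_nm) - parse_nm(candidate_nm))


def nm_dynamic(path):
    """Run GNU nm on a shared library and return its text output."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With two real libraries, the comparison would be
missing_symbols(nm_dynamic(reference_path), nm_dynamic(candidate_path)),
where both paths are supplied by the caller. An empty list only means
the names line up; it is a necessary condition for drop-in replacement,
not a sufficient one.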