Date: Tue, 4 Jul 2023 18:09:30 -0500
From: Oskari Pirhonen
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.)

On Tue, Jul 04, 2023 at 21:56:26 +0000, Robin H. Johnson wrote:
> On Tue, Jul 04, 2023 at 12:44:39PM +0200, Gerion Entrup wrote:
> > just to be curious about the whole discussion. I did not follow in the
> > deepest detail but what I got is:
> > - EGO_SUM blows up the Manifest file, since every little Go module needs
> >   to be respected. A lot of these Manifest files lead to a extremely
> >   increased Portage tree size. EGO_SUM is just one example (though the
> >   biggest one). Statically linked languages like Rust etc. have the same
> >   problem.
> > - The current solution is to prepackage all modules, put it somewhere on
> >   a webserver and just manifest that file. This make the Portage tree
> >   small in size again, but requires a webserver/mirror and is thus
> >   unfriendly for overlay devs.
> >
> > I'm not sure if it was mentioned before but has anyone considered hash
> > trees / Merkle trees for the manifest file? The idea would be to hash
> > the standard manifest file a second time if it gets too big and write
> > down that hash as new manifest file and leave EGO_SUM as is.
> This is out-of-tree/indirect Manifests, that I proposed here, more than
> a year ago:
> https://marc.info/?l=gentoo-dev&m=168280762310716&w=2
> https://marc.info/?l=gentoo-dev&m=165472088822215&w=2
>
> Developing it requires PMS work in addition to package manager
> development, because it introduces phases.
>
> - primary fetch of $SRC_URI per ebuild, including indirect Manifest
> - primary validation of distfiles
> - secondary fetch of $SRC_URI per indirect Manifest
> - secondary validation of additional distfiles
>
> A significantly impacted use case is "emerge -f", it now needs to run
> downloads twice.
>

I'm not sure double downloading is required. Consider a flow similar to
this:

1. distfiles are fetched as per the ebuild
2. distfiles are hashed into a temporary Manifest
3. temporary Manifest is hashed and compared with the hashes of the
   direct Manifest that are stored in the in-tree Manifest

A new Manifest format would be required in order to differentiate the
current ones from an indirect one. This may require PMS changes,
although I suspect amending GLEP 74 may be enough since the PMS seems to
just refer to the GLEP for a description of Manifests.

This would also rely either on a stable ordering of Manifest contents
when generating it, or on a separate file listing in the indirect
Manifest which corresponds to the order in the direct Manifest. For the
latter, it should also have separate entries for different package
versions so that every single distfile for every single version of said
package does not need to be fetched in order to build the direct
Manifest. I'm imagining something along these lines:

    INDIRECT true
    PACKAGE category/package-version distfile1 distfile2 ... ALGO1 hash1 ALGO2 hash2 ...
    PACKAGE ...

Here `ALGO1` and `hash1` correspond to the hash of the direct Manifest
containing the distfiles (and potentially other files if a repo does not
have thin-manifests enabled) and their hashes in the order specified
previously.

The indirect Manifest as described above would be large-ish for a
package that has lots of distfiles, but likely much smaller than if each
distfile had its set of hashes stored directly.

Please correct me if there's some detail I've overlooked.

- Oskari

> The rest of the posts also go into the matter of duplication within
> EGO_SUM & the indirect Manifests: limiting the growth requires some form
> of content-addressed layout.
>
> It's absolutely something we should get developed, but it's a lot of
> work.
>
> The indirect Manifests still provide a hosting challenge for overlays.
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
> E-Mail   : robbat2@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
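
For illustration only, steps 2 and 3 of the flow proposed in this mail
could be sketched in Python roughly as follows. Everything here is an
assumption made for the sake of the example: the function names are
hypothetical, the `DIST` line layout merely imitates GLEP 74 entries,
BLAKE2B and SHA512 stand in for `ALGO1` and `ALGO2`, and the distfiles
are held in memory instead of being fetched. It is not how Portage or
any existing package manager works.

```python
from __future__ import annotations

import hashlib


def dist_line(name: str, data: bytes) -> str:
    """Build one direct-Manifest line for an already fetched distfile.

    The layout imitates a GLEP 74 DIST entry (name, size, hash pairs).
    """
    blake2b = hashlib.blake2b(data).hexdigest()
    sha512 = hashlib.sha512(data).hexdigest()
    return f"DIST {name} {len(data)} BLAKE2B {blake2b} SHA512 {sha512}"


def temp_manifest(distfiles: dict[str, bytes], order: list[str]) -> bytes:
    """Step 2: hash the distfiles into a temporary direct Manifest.

    `order` is the file listing taken from the (hypothetical) PACKAGE
    entry of the indirect Manifest, so the result is reproducible.
    """
    lines = [dist_line(name, distfiles[name]) for name in order]
    return ("\n".join(lines) + "\n").encode()


def manifest_digests(manifest: bytes) -> dict[str, str]:
    """Hash the direct Manifest itself; these are the values the
    indirect Manifest would store as ALGO1 hash1 ALGO2 hash2."""
    return {
        "BLAKE2B": hashlib.blake2b(manifest).hexdigest(),
        "SHA512": hashlib.sha512(manifest).hexdigest(),
    }


def verify(distfiles: dict[str, bytes], order: list[str],
           expected: dict[str, str]) -> bool:
    """Step 3: compare the temporary Manifest's hashes with the ones
    recorded in the in-tree indirect Manifest."""
    return manifest_digests(temp_manifest(distfiles, order)) == expected


# Made-up distfiles standing in for the result of step 1 (the fetch).
files = {"distfile1": b"first tarball", "distfile2": b"second tarball"}
order = ["distfile1", "distfile2"]

# What the tree would record when the indirect Manifest is generated.
recorded = manifest_digests(temp_manifest(files, order))

ok = verify(files, order, recorded)
tampered = verify({**files, "distfile2": b"modified"}, order, recorded)
reordered = verify(files, list(reversed(order)), recorded)
```

In this sketch a single download pass is enough: any changed distfile
alters its `DIST` line and therefore the hash of the temporary Manifest.
The `reordered` case shows why a stable ordering (or an explicit file
listing per package version) matters, since the same files in a
different order produce a different Manifest hash.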