* [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) [not found] ` <87y1k33aoy.fsf@gentoo.org> @ 2023-06-30 8:15 ` Florian Schmaus 2023-06-30 8:22 ` Sam James 0 siblings, 1 reply; 23+ messages in thread From: Florian Schmaus @ 2023-06-30 8:15 UTC (permalink / raw To: gentoo-dev [-- Attachment #1.1.1: Type: text/plain, Size: 3981 bytes --] [in reply to a gentoo-project@ post, but it was asked to continue this on gentoo-dev@] On 28/06/2023 16.46, Sam James wrote: > Florian Schmaus <flow@gentoo.org> writes: >> On 17/06/2023 10.37, Arthur Zamarin wrote: >>> I also want to nominate people who I feel contribute a lot to Gentoo and >>> I have a lot of interaction with (ordered by name, not priority): >>> […] >>> flow >> >> I apologize for the late reply, and thank you for the nomination. I am >> honored and accept. >> >> As many of you know, I am spending a lot of time on the EGO_SUM >> situation, as it is one of the most critical issues to solve. >> >> I have used the last few days to carefully consider whether a seat on >> the council is more harmful or beneficial to my efforts regarding >> EGO_SUM. On the one hand, council work means I have less time to >> improve the EGO_SUM situation. On the other hand, a seat in the >> council increases the probability of positively influencing Gentoo's >> future, also regarding EGO_SUM. >> > > That's fine and it's great to see more people running! Excellent that we share this view. :) > But with regard to EGO_SUM: you didn't appear at the meeting where we discussed > your previous EGO_SUM proposal, Naively, as I am, I expected that the mailing list would be used for discussion and that the council meeting would be used chiefly for voting and intra-council discussion. 
And since the request to the council to vote on a concrete proposal was
preceded by a multiple-week, if not month-long, mailing list discussion,
I assumed that my presence in the council meeting was optional.

Had I known that my presence was required, or that my absence from the
meeting would be blamed on me afterward, I would have appeared if
possible.

> and questions remain unanswered on the
> ML (why not implement a check in pkgcheck similar to what is in Portage,
> for example)?

On 2023-05-30 [1], I proposed a limit in the range of 1.5 to 2 MiB for
the total package-directory size. I only care a little about the tool
that checks this limit, but pkgcheck is an obvious choice. I also
suggested that we review this policy once the number of Go packages has
doubled or two years after this policy was established (whichever comes
first).

But I fear you may be referring to another kind of check. You may be
talking about a check that forbids EGO_SUM in ::gentoo but allows it in
overlays. However, as stated before [2], this is not a viable approach.
One reason why it is not practicable is auditability.

> The blocker is not a council seat, it's about addressing people's
> concerns...

Unfortunately, it appears that I am terrible at convincing everyone that
the deprecation of EGO_SUM was a mistake. I tried to respond to every
concern. Often, the response included arguments based on factual data.
But in the end, I would only expect to convince some, as the EGO_SUM
question touches the subjective realm of style.

I know that the EGO_SUM situation and the resulting discussion grew huge
and left many understandably bored or confused, who then turned away.
But that is a pity because it is a relevant discussion for Gentoo's
long-term success.

The bottom line is that the EGO_SUM discussion yielded no evidence or
even a slight indication that EGO_SUM was deprecated based on technical
issues. Instead, it appears that EGO_SUM was deprecated because some
deemed it unaesthetic.
Understandably, EGO_SUM can be considered ugly. Compared to a
traditional Gentoo package, EGO_SUM-based ones are larger. The same is
true for Rust packages. However, looking at the bigger picture,
EGO_SUM's advantages outweigh its disadvantages.

- Flow

1: https://marc.info/?l=gentoo-dev&m=168546196902731
   <25308876-7ac4-8c90-8641-1034cc67c6b0@gentoo.org>
2: https://marc.info/?l=gentoo-dev&m=168569387514376
   <012fa74d-2910-ea90-6008-26cc23604d2f@gentoo.org>

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-06-30 8:15 ` [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) Florian Schmaus @ 2023-06-30 8:22 ` Sam James 2023-06-30 9:38 ` Tim Harder 2023-07-03 10:17 ` Florian Schmaus 0 siblings, 2 replies; 23+ messages in thread From: Sam James @ 2023-06-30 8:22 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 4861 bytes --] Florian Schmaus <flow@gentoo.org> writes: > [[PGP Signed Part:Undecided]] > [in reply to a gentoo-project@ post, but it was asked to continue this > on gentoo-dev@] > > On 28/06/2023 16.46, Sam James wrote: >> Florian Schmaus <flow@gentoo.org> writes: >>> On 17/06/2023 10.37, Arthur Zamarin wrote: >>>> I also want to nominate people who I feel contribute a lot to Gentoo and >>>> I have a lot of interaction with (ordered by name, not priority): >>>> […] >>>> flow >>> >>> I apologize for the late reply, and thank you for the nomination. I am >>> honored and accept. >>> >>> As many of you know, I am spending a lot of time on the EGO_SUM >>> situation, as it is one of the most critical issues to solve. >>> >>> I have used the last few days to carefully consider whether a seat on >>> the council is more harmful or beneficial to my efforts regarding >>> EGO_SUM. On the one hand, council work means I have less time to >>> improve the EGO_SUM situation. On the other hand, a seat in the >>> council increases the probability of positively influencing Gentoo's >>> future, also regarding EGO_SUM. >>> >> That's fine and it's great to see more people running! > > Excellent that we share this view. 
:) > > >> But with regard to EGO_SUM: you didn't appear at the meeting where we discussed >> your previous EGO_SUM proposal, > > Naively, as I am, I expected that the mailing list would be used for > discussion and that the council meeting would be used chiefly for > voting and intra-council discussion. And since the request to the > council to vote on a concrete proposal was preceded by a > multiple-week, if not month-long, mailing list discussion, I assumed > that my presence in the council meeting was optional. > > Had I known that my presence was required, or that the absence in the > meeting would be blamed on me afterward, I would have appeared if > possible. I'm not blaming you for anything. But you didn't speak in #gentoo-council before the meeting (a few days before IIRC) when we were discussing the problem, I pinged you during the meeting, and you didn't appear there afterwards. You also didn't seem to respond to the council decision (or non-decision) in that meeting either, unless I've missed it. It seems self-evident that discussion would happen in the meeting before voting...? What am I misunderstanding? We regularly discuss things before voting on them. Do you normally observe council meetings? I don't think what we did in this instance was at all unusual. (Also: there's the issue of whether or not the council should really be voting on overriding an eclass maintainer who would then be forced to keep something working they don't want to. mgorny raised that.) > > >> and questions remain unanswered on the >> ML (why not implement a check in pkgcheck similar to what is in Portage, >> for example)? > > On 2023-05-30 [1], I proposed a limit in the range of 2 to 1.5 MiB for > the total package-directory size. I only care a little about the tool > that checks this limit, but pkgcheck is an obvious choice. 
I also > suggested that we review this policy once the number of Go packages > has doubled or two years after this policy was established (whatever > comes first). > > But I fear you may be referring to another kind of check. You may be > talking about a check that forbids EGO_SUM in ::gentoo but allows it > overlays. My position on this has been consistent: a check is needed to statically determine when the environment size is too big. Copying the Portage check into pkgcheck (in terms of the metrics) would satisfy this. That is, regardless of raw size, I'm asking for a calculation based on the contents of EGO_SUM where, if exceeded, the package will not be installable on some systems. You didn't have an issue implementing this for Portage and I've mentioned this a bunch of times since, so I thought it was clear what I was hoping to see. I would also like (which is not what I was referring to here) some limit on the size, given that we already have a limit on the size of ${FILESDIR}, but this is less of a concern for me given it's bounded by the aforementioned environment size check. > > Intelligibly, EGO_SUM can be considered ugly. Compared to a > traditional Gentoo package, EGO_SUM-based ones are larger. The same is > true for Rust packages. However, looking at the bigger picture, > EGO_SUM's advantages outweigh its disadvantages. > Again, am on record as being fine with the general EGO_SUM approach, even if I wish we didn't need it, as I see it as inevitable for things like yarn, .NET, and of course Rust as we already have it. Just ideally not huge ones, and certainly not huge ones which then aren't even reliably installable because of environment size. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 377 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
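The static check Sam asks for above (a calculation over what an ebuild would export, flagged when it exceeds what some systems can execute) might be approximated as in the sketch below. It is only an illustration under stated assumptions: the two limits reflect commonly cited Linux behaviour (one argument or environment string capped at 32 pages, the total capped at a quarter of a default 8 MiB stack) and are not taken from Portage or pkgcheck, and `check_environment`, `entry_size`, and the sample `SRC_URI` value are hypothetical.

```python
# Sketch of a static environment-size check. All names are hypothetical and
# the limits are assumptions about typical Linux defaults, not Portage values.
PER_STRING_LIMIT = 32 * 4096           # assumed: 32 pages of 4 KiB per string
TOTAL_LIMIT = (8 * 1024 * 1024) // 4   # assumed: 1/4 of a default 8 MiB stack


def entry_size(name: str, value: str) -> int:
    """Bytes one exported variable occupies as 'NAME=value' plus a NUL byte."""
    return len(f"{name}={value}".encode("utf-8")) + 1


def check_environment(env, per_string_limit=PER_STRING_LIMIT,
                      total_limit=TOTAL_LIMIT):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    total = 0
    for name, value in env.items():
        size = entry_size(name, value)
        total += size
        if size > per_string_limit:
            problems.append(
                f"{name}: {size} bytes exceeds the per-string limit "
                f"of {per_string_limit}"
            )
    if total > total_limit:
        problems.append(
            f"total: {total} bytes exceeds the overall limit of {total_limit}"
        )
    return problems


if __name__ == "__main__":
    # Made-up input: 3000 fake module URIs joined into one exported variable.
    fake_uris = " ".join(
        f"mirror://goproxy/example.org/mod{i}/@v/v1.0.{i}.zip"
        for i in range(3000)
    )
    env = {"HOME": "/var/tmp/portage", "SRC_URI": fake_uris}
    for problem in check_environment(env):
        print(problem)
```

Because the function only needs the names and values that would be exported, a linter could run it at commit time without ever executing the ebuild, which is the property Sam describes.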
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-06-30 8:22 ` Sam James @ 2023-06-30 9:38 ` Tim Harder 2023-06-30 11:33 ` Eray Aslan 2023-07-03 10:17 ` Florian Schmaus 1 sibling, 1 reply; 23+ messages in thread From: Tim Harder @ 2023-06-30 9:38 UTC (permalink / raw To: gentoo-dev On 2023-06-30 Fri 02:22, Sam James wrote: > My position on this has been consistent: a check is needed to statically > determine when the environment size is too big. Copying the Portage > check into pkgcheck (in terms of the metrics) would satisfy this. > > That is, regardless of raw size, I'm asking for a calculation based on > the contents of EGO_SUM where, if exceeded, the package will not be > installable on some systems. You didn't have an issue implementing this > for Portage and I've mentioned this a bunch of times since, so I thought > it was clear what I was hoping to see. > > I would also like (which is not what I was referring to here) some > limit on the size, given that we already have a limit on the size of > ${FILESDIR}, but this is less of a concern for me given it's bounded > by the aforementioned environment size check. Why do we have to keep exporting the related variables that generally cause these size issues to the environment? I've asked as much on IRC multiple times (nearly every time this discussion has been brought up) and the answers I've gotten are some variation on "it's always been that way" or "not exporting them would break using commands as external programs" (e.g. calling via xargs). The first response isn't a great argument and the second response, while more valid, also feels less important than having a more minimalistic, exported environment that causes less issues like this one and others such as potentially affecting a package's build system in an unexpected fashion. See bug #721088 for the related discussion on environment variable exports. 
From my stance, the spec should state that the only variables to be
exported are ones already "semi-standard" and used outside of package
manager internals in the expected fashion, which probably only includes
HOME, TMPDIR, and maybe ROOT.

This would of course currently break packages that use `xargs` while
calling internal commands depending on some of those exported variables,
but from a cursory glance at the gentoo repo, there aren't many ebuilds
using that functionality and in general those that are could be written
in an easier-to-understand fashion without using xargs. It should also
be possible to proxy the required variables to those commands in various
fashions without using the environment if using commands externally is
extremely important to the few ebuild maintainers who make use of that
functionality.

In short, adding checks to portage and pkgcheck feels like an ill-suited
workaround that foists hacking around the error onto users or developers
due to a poor decision made decades ago on environment handling.

Tim
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.)
  2023-06-30  9:38 ` Tim Harder
@ 2023-06-30 11:33 ` Eray Aslan
  2023-07-03 10:17 ` Florian Schmaus
  0 siblings, 1 reply; 23+ messages in thread
From: Eray Aslan @ 2023-06-30 11:33 UTC (permalink / raw)
To: gentoo-dev

On Fri, Jun 30, 2023 at 03:38:11AM -0600, Tim Harder wrote:
> Why do we have to keep exporting the related variables that generally
> cause these size issues to the environment?

I really do not want to make a +1 response but this is an excellent
question that we need to answer before implementing EGO_SUM.

--
Eray
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.)
  2023-06-30 11:33 ` Eray Aslan
@ 2023-07-03 10:17 ` Florian Schmaus
  2023-07-04  7:13 ` Tim Harder
  0 siblings, 1 reply; 23+ messages in thread
From: Florian Schmaus @ 2023-07-03 10:17 UTC (permalink / raw)
To: gentoo-dev, Eray Aslan

On 30/06/2023 13.33, Eray Aslan wrote:
> On Fri, Jun 30, 2023 at 03:38:11AM -0600, Tim Harder wrote:
>> Why do we have to keep exporting the related variables that generally
>> cause these size issues to the environment?
>
> I really do not want to make a +1 response but this is an excellent
> question that we need to answer before implementing EGO_SUM.

Could you please discuss why you make the reintroduction of EGO_SUM
dependent on this question?

Portage will show you a warning message if the exported environment
approaches the kernel limit, and it will show a detailed error message
if executing an ebuild failed due to the limit being reached. There
seems to be no reason why you should not be able to allow EGO_SUM again
without first fixing, for example, https://bugs.gentoo.org/721088.

- Flow
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-03 10:17 ` Florian Schmaus @ 2023-07-04 7:13 ` Tim Harder 2023-07-04 10:44 ` Gerion Entrup 2023-07-06 6:09 ` Zoltan Puskas 0 siblings, 2 replies; 23+ messages in thread From: Tim Harder @ 2023-07-04 7:13 UTC (permalink / raw To: gentoo-dev On 2023-07-03 Mon 04:17, Florian Schmaus wrote: >On 30/06/2023 13.33, Eray Aslan wrote: >>On Fri, Jun 30, 2023 at 03:38:11AM -0600, Tim Harder wrote: >>>Why do we have to keep exporting the related variables that generally >>>cause these size issues to the environment? >> >>I really do not want to make a +1 response but this is an excellent >>question that we need to answer before implementing EGO_SUM. > >Could you please discuss why you make the reintroduction of EGO_SUM >dependent on this question? Just to be clear, I don't particularly care about EGO_SUM enough to gate its reintroduction (and don't have any leverage to do so anyway). I'm just tired of the circular discussions around env issues that all seem to avoid actual fixes, catering instead to functionality used by a vanishingly small subset of ebuilds in the main repo that compels a certain design mostly due to how portage functioned before EAPI 0. Other than that, supporting EGO_SUM (or any other language ecosystem trending towards distro-unfriendly releases) is fine as long as devs are cognizant how the related global-scope eclass design affects everyone running or working on the raw repo. I hope devs continue leveraging the relatively recent benchmark tooling (and perhaps more future support) to improve their work. Along those lines, it could be nice to see sample benchmark data in commit messages for large, global-scope eclass work just to reinforce that it was taken into account. Tim ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-04 7:13 ` Tim Harder @ 2023-07-04 10:44 ` Gerion Entrup 2023-07-04 21:56 ` Robin H. Johnson 2023-07-06 6:09 ` Zoltan Puskas 1 sibling, 1 reply; 23+ messages in thread From: Gerion Entrup @ 2023-07-04 10:44 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 3062 bytes --] Am Dienstag, 4. Juli 2023, 09:13:30 CEST schrieb Tim Harder: > On 2023-07-03 Mon 04:17, Florian Schmaus wrote: > >On 30/06/2023 13.33, Eray Aslan wrote: > >>On Fri, Jun 30, 2023 at 03:38:11AM -0600, Tim Harder wrote: > >>>Why do we have to keep exporting the related variables that generally > >>>cause these size issues to the environment? > >> > >>I really do not want to make a +1 response but this is an excellent > >>question that we need to answer before implementing EGO_SUM. > > > >Could you please discuss why you make the reintroduction of EGO_SUM > >dependent on this question? > > Just to be clear, I don't particularly care about EGO_SUM enough to gate > its reintroduction (and don't have any leverage to do so anyway). I'm > just tired of the circular discussions around env issues that all seem > to avoid actual fixes, catering instead to functionality used by a > vanishingly small subset of ebuilds in the main repo that compels a > certain design mostly due to how portage functioned before EAPI 0. > > Other than that, supporting EGO_SUM (or any other language ecosystem > trending towards distro-unfriendly releases) is fine as long as devs are > cognizant how the related global-scope eclass design affects everyone > running or working on the raw repo. I hope devs continue leveraging the > relatively recent benchmark tooling (and perhaps more future support) to > improve their work. 
Along those lines, it could be nice to see sample
> benchmark data in commit messages for large, global-scope eclass work
> just to reinforce that it was taken into account.
>
> Tim

Hi,

I am just curious about the whole discussion. I did not follow it in the
deepest detail, but what I got is:
- EGO_SUM blows up the Manifest file, since every little Go module needs
  to be accounted for. A lot of these Manifest files greatly increase
  the Portage tree size. EGO_SUM is just one example (though the biggest
  one). Statically linked languages like Rust etc. have the same
  problem.
- The current solution is to prepackage all modules, put the result
  somewhere on a webserver, and just manifest that file. This makes the
  Portage tree small in size again, but requires a webserver/mirror and
  is thus unfriendly for overlay devs.

I'm not sure if it was mentioned before, but has anyone considered hash
trees / Merkle trees for the Manifest file? The idea would be to hash
the standard Manifest file a second time if it gets too big, write down
that hash as the new Manifest file, and leave EGO_SUM as is. When
Portage tries to install the package, it can download all modules and
build the "normal" Manifest file as usual, but instead of directly
comparing it to the Manifest in the tree, it can hash it again and
compare that to the provided Manifest.

With this, Portage should have more or less the same guarantees about
the validity of the source code, but the Manifest file consists of just
two hashes again. What one would lose is the direct comparison of file
names (they are included in the "meta"-hash, though). Or am I missing
something?

Gerion
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-04 10:44 ` Gerion Entrup @ 2023-07-04 21:56 ` Robin H. Johnson 2023-07-04 23:09 ` Oskari Pirhonen 0 siblings, 1 reply; 23+ messages in thread From: Robin H. Johnson @ 2023-07-04 21:56 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 2185 bytes --] On Tue, Jul 04, 2023 at 12:44:39PM +0200, Gerion Entrup wrote: > just to be curious about the whole discussion. I did not follow in the > deepest detail but what I got is: > - EGO_SUM blows up the Manifest file, since every little Go module needs > to be respected. A lot of these Manifest files lead to a extremely > increased Portage tree size. EGO_SUM is just one example (though the > biggest one). Statically linked languages like Rust etc. have the same > problem. > - The current solution is to prepackage all modules, put it somewhere on > a webserver and just manifest that file. This make the Portage tree > small in size again, but requires a webserver/mirror and is thus > unfriendly for overlay devs. > > I'm not sure if it was mentioned before but has anyone considered hash > trees / Merkle trees for the manifest file? The idea would be to hash > the standard manifest file a second time if it gets too big and write > down that hash as new manifest file and leave EGO_SUM as is. This is out-of-tree/indirect Manifests, that I proposed here, more than a year ago: https://marc.info/?l=gentoo-dev&m=168280762310716&w=2 https://marc.info/?l=gentoo-dev&m=165472088822215&w=2 Developing it requires PMS work in addition to package manager development, because it introduces phases. - primary fetch of $SRC_URI per ebuild, including indirect Manifest - primary validation of distfiles - secondary fetch of $SRC_URI per indirect Manifest - secondary validation of additional distfiles A significantly impacted use case is "emerge -f", it now needs to run downloads twice. 
The rest of the posts also go into the matter of duplication within EGO_SUM & the indirect Manifests: limiting the growth requires some form of content-addressed layout. It's absolutely something we should get developed, but it's a lot of work. The indirect Manifests still provide a hosting challenge for overlays. -- Robin Hugh Johnson Gentoo Linux: Dev, Infra Lead, Foundation Treasurer E-Mail : robbat2@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 1113 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
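The four phases Robin lists above (primary fetch, primary validation, secondary fetch, secondary validation) can be illustrated with a small sketch. Everything here is hypothetical: the function names, the use of SHA512 alone, and in particular the indirect Manifest format of one `<uri> <sha512>` pair per line, which is invented for the example and is not the format from Robin's proposal.

```python
import hashlib
from typing import Callable, Dict


def sha512_hex(data: bytes) -> str:
    return hashlib.sha512(data).hexdigest()


def fetch_verified(uri: str, expected: str,
                   fetch: Callable[[str], bytes]) -> bytes:
    """Fetch one file and refuse it if its SHA512 does not match."""
    data = fetch(uri)
    if sha512_hex(data) != expected:
        raise ValueError(f"checksum mismatch for {uri}")
    return data


def two_phase_fetch(in_tree: Dict[str, str], indirect_uri: str,
                    fetch: Callable[[str], bytes]) -> Dict[str, bytes]:
    """in_tree maps URI -> SHA512 for everything the in-tree Manifest lists,
    including the indirect Manifest itself (found at indirect_uri)."""
    distfiles = {}
    # Phases 1 and 2: primary fetch and validation against the in-tree data.
    for uri, digest in in_tree.items():
        distfiles[uri] = fetch_verified(uri, digest, fetch)
    # Phases 3 and 4: secondary fetch and validation, driven by the
    # (already validated) indirect Manifest. Hypothetical format:
    # one "<uri> <sha512>" pair per line.
    for line in distfiles[indirect_uri].decode("utf-8").splitlines():
        if not line.strip():
            continue
        uri, digest = line.split()
        distfiles[uri] = fetch_verified(uri, digest, fetch)
    return distfiles
```

The second loop cannot start before the first has finished, because the list of secondary files is only known once the indirect Manifest has been downloaded and checked. That ordering is why a fetch-only run has to make two passes.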
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-04 21:56 ` Robin H. Johnson @ 2023-07-04 23:09 ` Oskari Pirhonen 2023-07-05 18:40 ` Gerion Entrup 0 siblings, 1 reply; 23+ messages in thread From: Oskari Pirhonen @ 2023-07-04 23:09 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 4030 bytes --] On Tue, Jul 04, 2023 at 21:56:26 +0000, Robin H. Johnson wrote: > On Tue, Jul 04, 2023 at 12:44:39PM +0200, Gerion Entrup wrote: > > just to be curious about the whole discussion. I did not follow in the > > deepest detail but what I got is: > > - EGO_SUM blows up the Manifest file, since every little Go module needs > > to be respected. A lot of these Manifest files lead to a extremely > > increased Portage tree size. EGO_SUM is just one example (though the > > biggest one). Statically linked languages like Rust etc. have the same > > problem. > > - The current solution is to prepackage all modules, put it somewhere on > > a webserver and just manifest that file. This make the Portage tree > > small in size again, but requires a webserver/mirror and is thus > > unfriendly for overlay devs. > > > > I'm not sure if it was mentioned before but has anyone considered hash > > trees / Merkle trees for the manifest file? The idea would be to hash > > the standard manifest file a second time if it gets too big and write > > down that hash as new manifest file and leave EGO_SUM as is. > This is out-of-tree/indirect Manifests, that I proposed here, more than > a year ago: > https://marc.info/?l=gentoo-dev&m=168280762310716&w=2 > https://marc.info/?l=gentoo-dev&m=165472088822215&w=2 > > Developing it requires PMS work in addition to package manager > development, because it introduces phases. 
>
> - primary fetch of $SRC_URI per ebuild, including indirect Manifest
> - primary validation of distfiles
> - secondary fetch of $SRC_URI per indirect Manifest
> - secondary validation of additional distfiles
>
> A significantly impacted use case is "emerge -f", it now needs to run
> downloads twice.
>

I'm not sure double downloading is required. Consider a flow similar to
this:

1. distfiles are fetched as per the ebuild
2. distfiles are hashed into a temporary Manifest
3. temporary Manifest is hashed and compared with the hashes stored in
   the in-tree Manifest for the direct Manifest

A new Manifest format would be required in order to differentiate the
current ones from an indirect one. This may require PMS changes,
although I suspect amending GLEP 74 may be enough since the PMS seems
to just refer to the GLEP for a description of Manifests.

This would also either rely on a stable ordering of Manifest contents
when generating it or having a separate file listing in the indirect
Manifest which corresponds to the order in the direct Manifest. For the
latter, it should also have separate entries for different package
versions so that every single distfile for every single version of said
package does not need to be fetched in order to build the direct
Manifest.

I'm imagining something along these lines:

    INDIRECT true
    PACKAGE category/package-version distfile1 distfile2 ... ALGO1 hash1 ALGO2 hash2 ...
    PACKAGE ...

Here `ALGO1` and `hash1` correspond to the hash of the direct Manifest
containing the distfiles (and potentially other files if a repo does not
have thin-manifests enabled) and their hashes in the order specified
previously.

The indirect Manifest as described above would be large-ish for a
package that has lots of distfiles, but likely much smaller than if each
distfile had its set of hashes stored directly.

Please correct me if there's some detail I've overlooked.
- Oskari > The rest of the posts also go into the matter of duplication within > EGO_SUM & the indirect Manifests: limiting the growth requires some form > of content-addressed layout. > > It's absolutely something we should get developed, but it's a lot of > work. > > The indirect Manifests still provide a hosting challenge for overlays. > > -- > Robin Hugh Johnson > Gentoo Linux: Dev, Infra Lead, Foundation Treasurer > E-Mail : robbat2@gentoo.org > GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 > GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
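The three-step flow in Oskari's message above (fetch, hash the distfiles into a temporary Manifest, hash that Manifest and compare) can be sketched as follows. This is a minimal illustration, not an implementation of any agreed format: it assumes `DIST` lines in the style of GLEP 74 with BLAKE2B and SHA512, sorts by file name to get the stable ordering the message calls for, and works on in-memory bytes instead of real downloads. The function names are hypothetical.

```python
import hashlib
from typing import Dict


def direct_manifest(distfiles: Dict[str, bytes]) -> str:
    """Step 2: build the temporary (direct) Manifest text.

    Entries are sorted by file name so that the same set of distfiles
    always yields byte-identical text.
    """
    lines = []
    for name in sorted(distfiles):
        data = distfiles[name]
        blake2b = hashlib.blake2b(data).hexdigest()
        sha512 = hashlib.sha512(data).hexdigest()
        lines.append(
            f"DIST {name} {len(data)} BLAKE2B {blake2b} SHA512 {sha512}"
        )
    return "\n".join(lines) + "\n"


def indirect_hashes(manifest_text: str) -> Dict[str, str]:
    """Step 3a: hash the direct Manifest itself."""
    raw = manifest_text.encode("utf-8")
    return {
        "BLAKE2B": hashlib.blake2b(raw).hexdigest(),
        "SHA512": hashlib.sha512(raw).hexdigest(),
    }


def verify(distfiles: Dict[str, bytes], expected: Dict[str, str]) -> bool:
    """Step 3b: compare against the hashes stored in the in-tree Manifest."""
    return indirect_hashes(direct_manifest(distfiles)) == expected
```

One consequence of this design is that a mismatch only says that something in the set differs, not which distfile. The package manager would have to fall back on other information to identify the offending file.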
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-04 23:09 ` Oskari Pirhonen @ 2023-07-05 18:40 ` Gerion Entrup 2023-07-05 19:32 ` Rich Freeman 2023-07-06 2:48 ` Oskari Pirhonen 0 siblings, 2 replies; 23+ messages in thread From: Gerion Entrup @ 2023-07-05 18:40 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 4639 bytes --] Am Mittwoch, 5. Juli 2023, 01:09:30 CEST schrieb Oskari Pirhonen: > On Tue, Jul 04, 2023 at 21:56:26 +0000, Robin H. Johnson wrote: > > On Tue, Jul 04, 2023 at 12:44:39PM +0200, Gerion Entrup wrote: > > > just to be curious about the whole discussion. I did not follow in the > > > deepest detail but what I got is: > > > - EGO_SUM blows up the Manifest file, since every little Go module needs > > > to be respected. A lot of these Manifest files lead to a extremely > > > increased Portage tree size. EGO_SUM is just one example (though the > > > biggest one). Statically linked languages like Rust etc. have the same > > > problem. > > > - The current solution is to prepackage all modules, put it somewhere on > > > a webserver and just manifest that file. This make the Portage tree > > > small in size again, but requires a webserver/mirror and is thus > > > unfriendly for overlay devs. > > > > > > I'm not sure if it was mentioned before but has anyone considered hash > > > trees / Merkle trees for the manifest file? The idea would be to hash > > > the standard manifest file a second time if it gets too big and write > > > down that hash as new manifest file and leave EGO_SUM as is. > > This is out-of-tree/indirect Manifests, that I proposed here, more than > > a year ago: > > https://marc.info/?l=gentoo-dev&m=168280762310716&w=2 > > https://marc.info/?l=gentoo-dev&m=165472088822215&w=2 > > > > Developing it requires PMS work in addition to package manager > > development, because it introduces phases. 
> > > > - primary fetch of $SRC_URI per ebuild, including indirect Manifest > > - primary validation of distfiles > > - secondary fetch of $SRC_URI per indirect Manifest > > - secondary validation of additional distfiles > > > > A significantly impacted use case is "emerge -f", it now needs to run > > downloads twice. > > > > I'm not sure double downloading is required. Consider a flow similar to > this: > > 1. distfiles are fetched as per the ebuild > 2. distfiles are hashed into a temporary Manifest > 3. temporary Manifest is hashed and compared with the hashes stored in > the in-tree Manifest for the direct Manifest This is exactly, what I meant. A webstorage is not needed. A second download process is also not needed. Just an additional Manifest format is needed for ebuilds with more than n distfiles. > A new Manifest format would be required in order to differentiate the > current ones from an indirect one. This may require PMS changes, > although I suspect ammending GLEP 74 may be enough since the PMS seems > to just refer to the GLEP for a description of Manifests. > > This would also either rely on a stable ordering of Manifest contents > when generating it or having a separate file listing in the indirect > Manifest which corresponds to the order in the direct Manifest. For the > latter, it should also have separate entries for different package > versions so that every single distfile for every single version of said > package does not need to be fetched in order to build the direct > Manifest. > > I'm imagining something along these lines: > > INDIRECT true > PACKAGE category/package-version distfile1 distfile2 ... ALGO1 hash1 ALGO2 hash2 ... > PACKAGE ... Maybe it is reasonable to skip the distfile names at all (or just provide a hash value of the concatenated file names). Then the manifest would just contain two/three hashes (for as many distfiles as the ebuild needs). 
Since these kinds of indirect Manifests should be rarer than the normal
ones, a slightly longer processing time should not have much impact, I
would say.

> Here `ALGO1` and `hash1` correspond to the hash of the direct Manifest
> containing the distfiles (and potentially other files if a repo does not
> have thin-manifests enabled) and their hashes in the order specified
> previously.
>
> The indirect Manifest as described above would be large-ish for a
> package that has lots of distfiles, but likely much smaller than if each
> distfile had its set of hashes stored directly.

Without storing the filenames, the Manifest file would have the same
small size for any number of distfiles needed.

Gerion

> Please correct me if there's some detail I've overlooked.
>
> - Oskari
>
> > The rest of the posts also go into the matter of duplication within
> > EGO_SUM & the indirect Manifests: limiting the growth requires some form
> > of content-addressed layout.
> >
> > It's absolutely something we should get developed, but it's a lot of
> > work.
> >
> > The indirect Manifests still provide a hosting challenge for overlays.
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-05 18:40 ` Gerion Entrup @ 2023-07-05 19:32 ` Rich Freeman 2023-07-06 2:48 ` Oskari Pirhonen 1 sibling, 0 replies; 23+ messages in thread From: Rich Freeman @ 2023-07-05 19:32 UTC (permalink / raw To: gentoo-dev On Wed, Jul 5, 2023 at 2:40 PM Gerion Entrup <gerion.entrup@flump.de> wrote: > > On Wednesday, 5 July 2023, 01:09:30 CEST, Oskari Pirhonen wrote: > > On Tue, Jul 04, 2023 at 21:56:26 +0000, Robin H. Johnson wrote: > > > > > > Developing it requires PMS work in addition to package manager > > > development, because it introduces phases. > > > > > > - primary fetch of $SRC_URI per ebuild, including indirect Manifest > > > - primary validation of distfiles > > > - secondary fetch of $SRC_URI per indirect Manifest > > > - secondary validation of additional distfiles > > > > > > A significantly impacted use case is "emerge -f", it now needs to run > > > downloads twice. > > > > I'm not sure double downloading is required. Consider a flow similar to > > this: > > > > 1. distfiles are fetched as per the ebuild > > 2. distfiles are hashed into a temporary Manifest > > 3. temporary Manifest is hashed and compared with the hashes stored in > > the in-tree Manifest for the direct Manifest > > This is exactly what I meant. A webstorage is not needed. A second > download process is also not needed. Just an additional Manifest format > is needed for ebuilds with more than n distfiles. > I suspect that Robin was proposing indirect manifests AND src uris, and not just indirect manifests. In any case, if he wasn't, then I'd suggest it would make sense to have that so that we don't need giant lists of src_uris or go sums or whatever in ebuilds. Sure, the manifests are even larger than the original file references, but those will still be long.
Plus if a file is used by 5 versions of an ebuild it will be present in the manifests once per hash function, but in the ebuilds 5 times. I agree though that if only the manifests are moved to a fetched file then you could fetch that on the first pass, though you'd still need the extra logic to parse it. I'm not sure it really is much of a difference to the effort involved. Aren't go sums already content hashes? It might make even more sense to create some kind of modular manifest verification logic in portage so that the same eclass that handles EGO_SUM could tell the package manager how to check the integrity of the files that are fetched. Well, assuming we trust whatever hash function they're using (I'm afraid to check - maybe this isn't such a great idea...). -- Rich ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-05 18:40 ` Gerion Entrup 2023-07-05 19:32 ` Rich Freeman @ 2023-07-06 2:48 ` Oskari Pirhonen 1 sibling, 0 replies; 23+ messages in thread From: Oskari Pirhonen @ 2023-07-06 2:48 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 6065 bytes --] On Wed, Jul 05, 2023 at 20:40:34 +0200, Gerion Entrup wrote: > On Wednesday, 5 July 2023, 01:09:30 CEST, Oskari Pirhonen wrote: > > On Tue, Jul 04, 2023 at 21:56:26 +0000, Robin H. Johnson wrote: > > > On Tue, Jul 04, 2023 at 12:44:39PM +0200, Gerion Entrup wrote: > > > > just to be curious about the whole discussion. I did not follow in the > > > > deepest detail but what I got is: > > > > - EGO_SUM blows up the Manifest file, since every little Go module needs > > > > to be respected. A lot of these Manifest files lead to an extremely > > > > increased Portage tree size. EGO_SUM is just one example (though the > > > > biggest one). Statically linked languages like Rust etc. have the same > > > > problem. > > > > - The current solution is to prepackage all modules, put them somewhere on > > > > a webserver and just manifest that file. This makes the Portage tree > > > > small in size again, but requires a webserver/mirror and is thus > > > > unfriendly for overlay devs. > > > > > > > > I'm not sure if it was mentioned before but has anyone considered hash > > > > trees / Merkle trees for the manifest file? The idea would be to hash > > > > the standard manifest file a second time if it gets too big and write > > > > down that hash as new manifest file and leave EGO_SUM as is.
> > > This is out-of-tree/indirect Manifests, that I proposed here, more than > > > a year ago: > > > https://marc.info/?l=gentoo-dev&m=168280762310716&w=2 > > > https://marc.info/?l=gentoo-dev&m=165472088822215&w=2 > > > > > > Developing it requires PMS work in addition to package manager > > > development, because it introduces phases. > > > > > > - primary fetch of $SRC_URI per ebuild, including indirect Manifest > > > - primary validation of distfiles > > > - secondary fetch of $SRC_URI per indirect Manifest > > > - secondary validation of additional distfiles > > > > > > A significantly impacted use case is "emerge -f", it now needs to run > > > downloads twice. > > > > > > > I'm not sure double downloading is required. Consider a flow similar to > > this: > > > > 1. distfiles are fetched as per the ebuild > > 2. distfiles are hashed into a temporary Manifest > > 3. temporary Manifest is hashed and compared with the hashes stored in > > the in-tree Manifest for the direct Manifest > > This is exactly, what I meant. A webstorage is not needed. A second > download process is also not needed. Just an additional Manifest format > is needed for ebuilds with more than n distfiles. > > > > A new Manifest format would be required in order to differentiate the > > current ones from an indirect one. This may require PMS changes, > > although I suspect ammending GLEP 74 may be enough since the PMS seems > > to just refer to the GLEP for a description of Manifests. > > > > This would also either rely on a stable ordering of Manifest contents > > when generating it or having a separate file listing in the indirect > > Manifest which corresponds to the order in the direct Manifest. For the > > latter, it should also have separate entries for different package > > versions so that every single distfile for every single version of said > > package does not need to be fetched in order to build the direct > > Manifest. 
> > > > I'm imagining something along these lines: > > > > INDIRECT true > > PACKAGE category/package-version distfile1 distfile2 ... ALGO1 hash1 ALGO2 hash2 ... > > PACKAGE ... > > Maybe it is reasonable to skip the distfile names altogether (or just > provide a hash value of the concatenated file names). Then the manifest > would just contain two/three hashes (for as many distfiles as the ebuild > needs). Since these kinds of indirect Manifests should be rarer than > the normal ones, a slightly longer processing time does not have much > impact I would say. > My reasoning behind having the list of files is so that the intermediate/direct Manifest can be accurately recreated. Consider the following (not-so-)hypothetical Manifest: DIST dist.tar.gz 84703 BLAKE2B ... SHA512 ... DIST dist.tar.gz.asc 228 BLAKE2B ... SHA512 ... EBUILD package-r1.ebuild 1535 BLAKE2B ... SHA512 ... EBUILD package.ebuild 1536 BLAKE2B ... SHA512 ... MISC metadata.xml 959 BLAKE2B ... SHA512 ... It is "well behaved" because pkgdev created it. My main concern is if $OTHER_TOOLING generates the Manifest in a different order, which would mean the Manifest may be correct, but you get a false negative since the hashes don't match what is in the in-tree indirect Manifest. Having the order specified in the indirect Manifest renders this moot because $OTHER_TOOLING would have to respect this in order to correctly handle indirect Manifests. Additionally, in repos without thin-manifests, the SRC_URI is not enough to build up the Manifest. This may or may not be an issue depending on whether a repo's metadata/layout.conf is parsed as part of the Manifest verification process. > > > > Here `ALGO1` and `hash1` correspond to the hash of the direct Manifest > > containing the distfiles (and potentially other files if a repo does not > > have thin-manifests enabled) and their hashes in the order specified > > previously.
> > > > The indirect Manifest as described above would be large-ish for a > > package that has lots of distfiles, but likely much smaller than if each > > distfile had its set of hashes stored directly. > > Without storing the filenames, the Manifest file would have the same > small size for any amount of distfiles needed. > Assuming layout.conf is parsed when the Manifest is verified (thus handling the thick Manifest case), the file list can be omitted if GLEP 74 is amended to specify an ordering on the entries. Side note: Portage itself does not seem to care about the ordering. I tested this by copying a package tree, moving some entries around, and running `ebuild /path/to/ebuild clean unpack`. - Oskari [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
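The flow discussed in this sub-thread (fetch the distfiles, hash them into a temporary direct Manifest, then compare a hash of that Manifest with what the in-tree indirect Manifest records) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the entry ordering (sorted by file name) is an assumption standing in for whatever ordering an amended GLEP 74 might specify, and nothing here reflects an existing Portage or pkgcore API.

```python
import hashlib


def direct_manifest(distfiles: dict[str, bytes]) -> str:
    """Build a temporary direct Manifest from already fetched distfiles.

    Assumption: entries are sorted by file name so that every tool
    produces byte-identical text for the same set of distfiles.
    """
    lines = []
    for name in sorted(distfiles):
        data = distfiles[name]
        blake2b = hashlib.blake2b(data).hexdigest()
        sha512 = hashlib.sha512(data).hexdigest()
        lines.append(f"DIST {name} {len(data)} BLAKE2B {blake2b} SHA512 {sha512}")
    return "\n".join(lines) + "\n"


def indirect_hashes(manifest_text: str) -> dict[str, str]:
    """Hash the direct Manifest itself; this is what would live in-tree."""
    raw = manifest_text.encode("utf-8")
    return {
        "BLAKE2B": hashlib.blake2b(raw).hexdigest(),
        "SHA512": hashlib.sha512(raw).hexdigest(),
    }


def verify(distfiles: dict[str, bytes], expected: dict[str, str]) -> bool:
    """Recreate the direct Manifest and compare it with the in-tree hashes."""
    return indirect_hashes(direct_manifest(distfiles)) == expected
```

Because the sketch fixes the ordering itself, a tool that lists the same files in a different order still verifies; without such a rule, the ordering concern raised above applies and the file list would have to be stored in the indirect Manifest.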
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-04 7:13 ` Tim Harder 2023-07-04 10:44 ` Gerion Entrup @ 2023-07-06 6:09 ` Zoltan Puskas 2023-07-06 19:46 ` [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open Hank Leininger 2023-07-08 20:49 ` [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) Sam James 1 sibling, 2 replies; 23+ messages in thread From: Zoltan Puskas @ 2023-07-06 6:09 UTC (permalink / raw To: gentoo-dev On Tue, Jul 04, 2023 at 01:13:30AM -0600, Tim Harder wrote: > On 2023-07-03 Mon 04:17, Florian Schmaus wrote: > >On 30/06/2023 13.33, Eray Aslan wrote: > >>On Fri, Jun 30, 2023 at 03:38:11AM -0600, Tim Harder wrote: > >>>Why do we have to keep exporting the related variables that generally > >>>cause these size issues to the environment? > >> > >>I really do not want to make a +1 response but this is an excellent > >>question that we need to answer before implementing EGO_SUM. > > > >Could you please discuss why you make the reintroduction of EGO_SUM > >dependent on this question? > > Just to be clear, I don't particularly care about EGO_SUM enough to gate > its reintroduction (and don't have any leverage to do so anyway). I'm > just tired of the circular discussions around env issues that all seem > to avoid actual fixes, catering instead to functionality used by a > vanishingly small subset of ebuilds in the main repo that compels a > certain design mostly due to how portage functioned before EAPI 0. > > Other than that, supporting EGO_SUM (or any other language ecosystem > trending towards distro-unfriendly releases) is fine as long as devs are > cognizant how the related global-scope eclass design affects everyone > running or working on the raw repo. 
I hope devs continue leveraging the > relatively recent benchmark tooling (and perhaps more future support) to > improve their work. Along those lines, it could be nice to see sample > benchmark data in commit messages for large, global-scope eclass work > just to reinforce that it was taken into account. > > Tim > I've been following the EGO_SUM thread for quite some time now. One other thing I did not see mentioned in favour of EGO_SUM so far: reproducibility. The problem with external tarballs is that they are gone once the ebuild is dropped from the tree. Should a user ever want to roll back to a previous version of an application, either by checking out on older version of the portage tree or copying said ebuild into their local overlay, they still cannot simply run an emerge on the it as they have to somehow recreate the tarball itself too. While upstream may not host everything forever, it's pretty much guaranteed to be available for much longer than Gentoo's custom tarball bundles of dependencies. Regarding space we are also likely making trade-off. By deprecating EGO_SUM we are saving space in the portage tree but in exchange inflating distfiles as it will start accumulating the same dependencies potentially multiple times since now the content is hidden in tarballs containing a combination of dependencies. This is essentially the source file version of "statically linking". Finally a personal opinion: I find dependency tarballs opaque. With EGO_SUM the ebuild defines all the upstream sources it needs to build the package as well as how to build it, but with the dependency tarball the sources are all hidden and makes verification all that much harder. Zoltan ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open 2023-07-06 6:09 ` Zoltan Puskas @ 2023-07-06 19:46 ` Hank Leininger 2023-07-08 20:49 ` [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) Sam James 1 sibling, 0 replies; 23+ messages in thread From: Hank Leininger @ 2023-07-06 19:46 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 2245 bytes --] On Thu, Jul 6, 2023 Zoltan Puskas wrote: > I've been following the EGO_SUM thread for quite some time now. One > other thing I did not see mentioned in favour of EGO_SUM so far: > reproducibility. > The problem with external tarballs is that they are gone once the > ebuild is dropped from the tree. Should a user ever want to roll back > to a previous version of an application, either by checking out an > older version of the portage tree or copying said ebuild into their > local overlay, they still cannot simply run an emerge on it as > they have to somehow recreate the tarball itself too. > While upstream may not host everything forever, it's pretty much > guaranteed to be available for much longer than Gentoo's custom > tarball bundles of dependencies. I see this brought up every once in a while in these EGO_SUM threads, but I think reproducible tarballs are a solved problem, or at least, the tools exist and we just need to decide how to best equip people with them. thesamesam/sam-gentoo-scripts has maint/bump-go which builds these tarballs smartly and reproducibly: - use --sort=name to order files inside in a consistent way - use consistent owner:group (portage:portage) - use consistent LC and TZ settings - set a standard timestamp (since 'go mod download' doesn't preserve upstream timestamps anyway, this loses no useful information) With that, multiple developers can independently generate a -deps tarball for a given Go package version with checksums that match.
The main distro tarball's checksums are verified against Manifest, and then within it are the list and checksums of the individual downloads which would be verified by go mod download (right?) and the resulting -deps files should also match Manifest entries. So a similar approach could be used in the case of expired ::gentoo versions being installed, or overlays using -deps files without a way to host them. Set things up so this can be done easily on demand or perhaps automatically as needed (maybe through a variation on pkg_nofetch in a Go eclass; that part is not obvious to me). Thanks, -- Hank Leininger <hlein@korelogic.com> 9606 3BF9 B593 4CBC E31A A384 6200 F6E3 781E 3DD7 [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
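The properties Hank lists for reproducible -deps tarballs (stable member order, fixed owner and group, fixed timestamp) can be illustrated with Python's standard tarfile module. This is a minimal sketch, not the maint/bump-go script: the function name, the numeric uid/gid of 250 and the zero timestamp are assumptions made for the example, and the archive is left uncompressed so that compressor metadata cannot influence the result.

```python
import hashlib
import io
import tarfile


def build_deps_tarball(files: dict[str, bytes], mtime: int = 0) -> bytes:
    """Pack files into an uncompressed tar archive with deterministic metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        # Sorted names play the role of tar's --sort=name.
        for name in sorted(files):
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = mtime            # one standard timestamp for all members
            info.mode = 0o644
            info.uid = info.gid = 250     # assumed numeric id for portage:portage
            info.uname = info.gname = "portage"
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def sha512_of(data: bytes) -> str:
    """Checksum as it could be recorded in a Manifest entry."""
    return hashlib.sha512(data).hexdigest()
```

Two developers who feed the same file contents into such a routine obtain the same checksum regardless of the order in which the files were downloaded, which is the property that lets the result match an existing Manifest entry.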
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-06 6:09 ` Zoltan Puskas 2023-07-06 19:46 ` [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open Hank Leininger @ 2023-07-08 20:49 ` Sam James 1 sibling, 0 replies; 23+ messages in thread From: Sam James @ 2023-07-08 20:49 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 2324 bytes --] Zoltan Puskas <zoltan@sinustrom.info> writes: > On Tue, Jul 04, 2023 at 01:13:30AM -0600, Tim Harder wrote: >> On 2023-07-03 Mon 04:17, Florian Schmaus wrote: >> >On 30/06/2023 13.33, Eray Aslan wrote: >> >>On Fri, Jun 30, 2023 at 03:38:11AM -0600, Tim Harder wrote: >> >>>Why do we have to keep exporting the related variables that generally >> >>>cause these size issues to the environment? >> >> >> >>I really do not want to make a +1 response but this is an excellent >> >>question that we need to answer before implementing EGO_SUM. >> > >> >Could you please discuss why you make the reintroduction of EGO_SUM >> >dependent on this question? >> >> Just to be clear, I don't particularly care about EGO_SUM enough to gate >> its reintroduction (and don't have any leverage to do so anyway). I'm >> just tired of the circular discussions around env issues that all seem >> to avoid actual fixes, catering instead to functionality used by a >> vanishingly small subset of ebuilds in the main repo that compels a >> certain design mostly due to how portage functioned before EAPI 0. >> >> Other than that, supporting EGO_SUM (or any other language ecosystem >> trending towards distro-unfriendly releases) is fine as long as devs are >> cognizant how the related global-scope eclass design affects everyone >> running or working on the raw repo. I hope devs continue leveraging the >> relatively recent benchmark tooling (and perhaps more future support) to >> improve their work. 
Along those lines, it could be nice to see sample >> benchmark data in commit messages for large, global-scope eclass work >> just to reinforce that it was taken into account. >> >> Tim >> > > I've been following the EGO_SUM thread for quite some time now. One other thing > I did not see mentioned in favour of EGO_SUM so far: reproducibility. > > The problem with external tarballs is that they are gone once the ebuild is > dropped from the tree. Should a user ever want to roll back to a previous > version of an application, either by checking out an older version of the > portage tree or copying said ebuild into their local overlay, they still cannot > simply run an emerge on it as they have to somehow recreate the tarball > itself too. I believe Hank's email covers this. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 377 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-06-30 8:22 ` Sam James 2023-06-30 9:38 ` Tim Harder @ 2023-07-03 10:17 ` Florian Schmaus 2023-07-03 11:12 ` [gentoo-dev] EGO_SUM Ulrich Mueller 2023-07-08 21:21 ` [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) Sam James 1 sibling, 2 replies; 23+ messages in thread From: Florian Schmaus @ 2023-07-03 10:17 UTC (permalink / raw To: gentoo-dev, Sam James [-- Attachment #1.1.1: Type: text/plain, Size: 3698 bytes --] On 30/06/2023 10.22, Sam James wrote: > Florian Schmaus <flow@gentoo.org> writes: >> [[PGP Signed Part:Undecided]] >> [in reply to a gentoo-project@ post, but it was asked to continue this >> on gentoo-dev@] >> On 28/06/2023 16.46, Sam James wrote: >>> and questions remain unanswered on the >>> ML (why not implement a check in pkgcheck similar to what is in Portage, >>> for example)? >> >> On 2023-05-30 [1], I proposed a limit in the range of 2 to 1.5 MiB for >> the total package-directory size. I only care a little about the tool >> that checks this limit, but pkgcheck is an obvious choice. I also >> suggested that we review this policy once the number of Go packages >> has doubled or two years after this policy was established (whatever >> comes first). >> >> But I fear you may be referring to another kind of check. You may be >> talking about a check that forbids EGO_SUM in ::gentoo but allows it >> overlays. > > My position on this has been consistent: > a check is needed to statically > determine when the environment size is too big. Copying the Portage > check into pkgcheck (in terms of the metrics) would satisfy this. It is not as easy as merely copying existing portage code into pkgcheck (unless I am missing something). 
I've talked to arthurzam, and there appears to be a .environment file created by pkgcheck, which we could use to approximate the exported environment. Another option would be to have pkgcheck count the EGO_SUM entries. The tree-sitter API for Bash, which pkgcheck already uses, seems to allow for that. But that would be different from the check in portage. Although, IMHO, counting EGO_SUM entries would be sufficient. > That is, regardless of raw size, I'm asking for a calculation based on > the contents of EGO_SUM where, if exceeded, the package will not be > installable on some systems. You didn't have an issue implementing this > for Portage and I've mentioned this a bunch of times since, so I thought > it was clear what I was hoping to see. So pkgcheck counting EGO_SUM entries would be sufficient for the purpose of having a static check that notices if the ebuild would likely run into the environment limit? To find a common compromise, I would possibly invest my time in developing such a test. Even though I do not deem such a check a strict prerequisite to reintroduce EGO_SUM. >> Intelligibly, EGO_SUM can be considered ugly. Compared to a >> traditional Gentoo package, EGO_SUM-based ones are larger. The same is >> true for Rust packages. However, looking at the bigger picture, >> EGO_SUM's advantages outweigh its disadvantages. >> > > Again, am on record as being fine with the general EGO_SUM approach, > even if I wish we didn't need it, as I see it as inevitable for things > like yarn, .NET, and of course Rust as we already have it. > > Just ideally not huge ones, and certainly not huge ones which then > aren't even reliably installable because of environment size. Talking about "reliably installable" makes it sound to me like there are cases where installing a EGO_SUM-based package sometimes works and sometimes not. 
But the kernel-limit is fixed and not even configurable, besides, of course patching the source (and in the absence of architectures with a page size below 4 KiB) [1]. Any developer testing whether or not an ebuild is installable would become immediately aware if the ebuild runs into the environment limit, or not. That said, static code checks are always preferable over dynamic ones. - Flow 1: https://elixir.bootlin.com/linux/v6.4.1/source/include/uapi/linux/binfmts.h#L15 [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 17273 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 618 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
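The idea of a static check that counts EGO_SUM entries can be sketched as below. This is a deliberately naive, regex-based Python illustration: pkgcheck is said in the thread to use the tree-sitter API for Bash, which this sketch does not use, the function names are hypothetical, and the default threshold of 2000 entries is an arbitrary placeholder rather than a proposed policy value.

```python
import re

# Matches a top-level EGO_SUM=( ... ) array whose closing parenthesis
# sits at the start of a line, as is conventional in ebuilds.
EGO_SUM_RE = re.compile(r"^EGO_SUM=\((.*?)^\)", re.MULTILINE | re.DOTALL)


def count_ego_sum_entries(ebuild_text: str) -> int:
    """Count the double-quoted entries inside the EGO_SUM array, if present."""
    match = EGO_SUM_RE.search(ebuild_text)
    if match is None:
        return 0
    return len(re.findall(r'"[^"]*"', match.group(1)))


def exceeds_limit(ebuild_text: str, limit: int = 2000) -> bool:
    """Report whether the ebuild has more EGO_SUM entries than the limit.

    Assumption: the limit is a placeholder to be tuned against ebuilds
    known to hit, or not hit, the environment limit.
    """
    return count_ego_sum_entries(ebuild_text) > limit
```

A real check would need a proper Bash parser to cope with comments, line continuations and conditionally assigned arrays; the regular expression only covers the conventional layout.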
* Re: [gentoo-dev] EGO_SUM 2023-07-03 10:17 ` Florian Schmaus @ 2023-07-03 11:12 ` Ulrich Mueller 2023-07-08 21:21 ` [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) Sam James 1 sibling, 0 replies; 23+ messages in thread From: Ulrich Mueller @ 2023-07-03 11:12 UTC (permalink / raw To: Florian Schmaus; +Cc: gentoo-dev, Sam James [-- Attachment #1: Type: text/plain, Size: 831 bytes --] >>>>> On Mon, 03 Jul 2023, Florian Schmaus wrote: > So pkgcheck counting EGO_SUM entries would be sufficient for the > purpose of having a static check that notices if the ebuild would > likely run into the environment limit? > To find a common compromise, I would possibly invest my time in > developing such a test. Even though I do not deem such a check a > strict prerequisite to reintroduce EGO_SUM. The so-called "environment limit" is 32 pages, i.e. normally 128 KiB. With the A variable anywhere near this, the size of the Manifest file would be close to 1 MiB. IMHO this is way too large to be used on a regular basis. I am aware that we have some packages with large Manifests (71 packages above 50 KiB, 6 packages above 200 KiB, out of 18812 packages in total), but these should really remain the exception. Ulrich [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 507 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [gentoo-dev] EGO_SUM (was: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-03 10:17 ` Florian Schmaus 2023-07-03 11:12 ` [gentoo-dev] EGO_SUM Ulrich Mueller @ 2023-07-08 21:21 ` Sam James 1 sibling, 0 replies; 23+ messages in thread From: Sam James @ 2023-07-08 21:21 UTC (permalink / raw To: Florian Schmaus; +Cc: gentoo-dev, Sam James [-- Attachment #1: Type: text/plain, Size: 4314 bytes --] Florian Schmaus <flow@gentoo.org> writes: > [[PGP Signed Part:Undecided]] > On 30/06/2023 10.22, Sam James wrote: >> Florian Schmaus <flow@gentoo.org> writes: >>> [[PGP Signed Part:Undecided]] >>> [in reply to a gentoo-project@ post, but it was asked to continue this >>> on gentoo-dev@] >>> On 28/06/2023 16.46, Sam James wrote: >>>> and questions remain unanswered on the >>>> ML (why not implement a check in pkgcheck similar to what is in Portage, >>>> for example)? >>> >>> On 2023-05-30 [1], I proposed a limit in the range of 2 to 1.5 MiB for >>> the total package-directory size. I only care a little about the tool >>> that checks this limit, but pkgcheck is an obvious choice. I also >>> suggested that we review this policy once the number of Go packages >>> has doubled or two years after this policy was established (whatever >>> comes first). >>> >>> But I fear you may be referring to another kind of check. You may be >>> talking about a check that forbids EGO_SUM in ::gentoo but allows it >>> overlays. >> My position on this has been consistent: > a check is needed to >> statically >> determine when the environment size is too big. Copying the Portage >> check into pkgcheck (in terms of the metrics) would satisfy this. > > It is not as easy as merely copying existing portage code into > pkgcheck (unless I am missing something). > That's why I said "in terms of the metrics". 
> I've talked to arthurzam, and there appears to be a .environment file > created by pkgcheck, which we could use to approximate the exported > environment. > > Another option would be to have pkgcheck count the EGO_SUM > entries. The tree-sitter API for Bash, which pkgcheck already uses, > seems to allow for that. But that would be different from the check in > portage. Although, IMHO, counting EGO_SUM entries would be sufficient. Right. > > >> That is, regardless of raw size, I'm asking for a calculation based on >> the contents of EGO_SUM where, if exceeded, the package will not be >> installable on some systems. You didn't have an issue implementing this >> for Portage and I've mentioned this a bunch of times since, so I thought >> it was clear what I was hoping to see. > > So pkgcheck counting EGO_SUM entries would be sufficient for the > purpose of having a static check that notices if the ebuild would > likely run into the environment limit? > If you check it actually fires in some of the old broken scenarios (see Bugzilla), then yes. But I'd be interested in your thoughts on radhermit's reply (please reply there). > To find a common compromise, I would possibly invest my time in > developing such a test. Even though I do not deem such a check a > strict prerequisite to reintroduce EGO_SUM. Yes, you've made clear you disagree. > > >>> Intelligibly, EGO_SUM can be considered ugly. Compared to a >>> traditional Gentoo package, EGO_SUM-based ones are larger. The same is >>> true for Rust packages. However, looking at the bigger picture, >>> EGO_SUM's advantages outweigh its disadvantages. >>> >> Again, am on record as being fine with the general EGO_SUM approach, >> even if I wish we didn't need it, as I see it as inevitable for things >> like yarn, .NET, and of course Rust as we already have it. >> Just ideally not huge ones, and certainly not huge ones which then >> aren't even reliably installable because of environment size. 
> > Talking about "reliably installable" makes it sound to me like there > are cases where installing an EGO_SUM-based package sometimes works and > sometimes not. But the kernel-limit is fixed and not even > configurable, besides, of course patching the source (and in the > absence of architectures with a page size below 4 KiB) [1]. > ulm's reply notes that this is a limitation in the Linux kernel, so I have no idea why musl tinderboxes seemed to disproportionately hit these issues and I assume one of us is either missing something or it was just a crazy fluke. > Any developer testing whether or not an ebuild is installable would > become immediately aware if the ebuild runs into the environment > limit, or not. > This clearly didn't happen with the previous examples (see what I said above too), as there were times when they installed for some people, but not in CI/tinderboxes. I don't know why and it merits investigation. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 377 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <cdf5ddb7-8f65-74cf-5594-3e3eec86c915@gentoo.org>]
[parent not found: <1913d3c2-5f54-acea-0ed3-930371ea1884@gentoo.org>]
[parent not found: <CAAr7Pr9+zq2NV=7zhj5e+4LWOmNavCrfMstNTqkthk5uxQVNtg@mail.gmail.com>]
* [gentoo-dev] Re: Flow's Manifesto and questions for nominees (was: Re: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) [not found] ` <CAAr7Pr9+zq2NV=7zhj5e+4LWOmNavCrfMstNTqkthk5uxQVNtg@mail.gmail.com> @ 2023-07-14 7:14 ` Florian Schmaus 2023-07-14 7:33 ` Sam James 2023-07-14 8:39 ` [gentoo-dev] Re: Flow's Manifesto and questions for nominees Ulrich Mueller 0 siblings, 2 replies; 23+ messages in thread From: Florian Schmaus @ 2023-07-14 7:14 UTC (permalink / raw To: Alec Warner, gentoo-dev; +Cc: gentoo-project [-- Attachment #1.1.1: Type: text/plain, Size: 7592 bytes --] Posted to gentoo-dev@ since we are now entering a technical discussion again. For those who did not follow gentoo-project@, the previous posts include: https://marc.info/?l=gentoo-project&m=168918875000738&w=2 https://marc.info/?l=gentoo-project&m=168881103930591&w=2 On 12/07/2023 21.28, Alec Warner wrote: > On Wed, Jul 12, 2023 at 12:07 PM Florian Schmaus <flow@gentoo.org> wrote: >> Apologies for not replying to everyone individually. >> >> I thank my fellow council candidates who took the time to reply to this >> sensitive and obviously controversial matter. I understand that not >> everyone feels comfortable taking a stance in this discussion. >> >> I asked the other council candidates about their opinion on EGO_SUM. >> Unfortunately, some replies included only a rather shallow answer. A few >> focused on criticism of my actions and how I approach the issue. Which >> is obviously fine. I read it all and have empathy for everyone who feels >> aggravated. You may or may not share the complaints. But let us focus on >> the actual matter for a moment. >> >> Even the voices raised for a restricted reintroduction of EGO_SUM just >> mention an abstract limit [1]. A concrete limit is not mentioned, >> although I asked for it and provided my idea including specific limits. 
>> Not knowing the concrete figures others have in mind makes it difficult >> to find a compromise. For example, a fellow council candidate postulated >> that it would be quicker for me to implement a limit-check in pkgcheck >> than discuss EGO_SUM. I wish that were the case. Unfortunately it is >> potentially not trivial to implement if we want such a check to be >> robust. But even worse, a specific limit must be known before >> implementing such a check. And we currently have none. > > I think my concern here is that I don't expect the Council to really > 'vote on a specific limit.' The limit is an implementation detail, it > can change, it shouldn't require a council vote to change. > > So my advice is "pick something reasonable that you think holds up to > scrutiny, and implement with that" and "expect the limit to change, > either because of the scrutiny, or because it might change in the > future" and implement your check accordingly (so e.g. the limit is > easily changeable.) Please find below why this may not be enough. >> But the real crux of an EGO_SUM reintroduction with a limit is the >> following. Either the limit is too restrictive, and most packages are >> affected by it and can not use EGO_SUM, which ultimately only >> corresponds to the current state. Or the limit only affects a fraction >> of the packages, so you should not bother having a limit. > > Again the idea is there is already a limit ( the aforementioned > environment limit ) and one of the goals is to have a QA check that > says your ebuild is approaching that limit so you can do something > productive about it, as well as to avoid ebuilds that are not > installable. So just implement that. If you need a number, I think > "90% of the env limit" is defensible (but again, any reasonable number > will do fine.) 
EGO_SUM affects two dimensions that could be limited/restricted:
A) the process environment, which may run into the Linux kernel environment limit on exec(3)
B) the size of the package directory, where EGO_SUM affects the size of ebuilds and the Manifest

I would be happy to put in any effort required to implement A) in pkgcheck, as I did for portage, if this check is the only thing that keeps us from reintroducing EGO_SUM.

Unfortunately, some argue that we need to limit B). Much of the effort I put into researching the EGO_SUM situation was analyzing how EGO_SUM's impact on package-directory size affects Gentoo. The result of the analysis strongly indicates that rather large package directories can be sustained by ::gentoo in the foreseeable future, especially since we are only talking about ~250 EGO_SUM packages currently, and a significant fraction of those packages will not have enormous package directories. I also suggested that the policy be reconsidered at least every two years or once the number of EGO_SUM packages has doubled (whichever comes first).

My investigation of the history of EGO_SUM's deprecation has not surfaced any technical issue that justified EGO_SUM's deprecation with regard to B). It appears that the desire to limit B) is driven not by technical issues but by esthetic preferences, which are highly subjective.

A), however, is a different beast. There is undeniably a kernel-enforced limit that we could hit due to an extremely large EGO_SUM (among other things). However, the only bug report I know of that runs into this kernel limit was with texlive (bug #719202). The low number of recorded bugs caused by the environment limit matches the fact that even the ebuild with the most EGO_SUM entries that I ever analyzed, app-containers/cri-o-1.23.1 (2022-02-16) with 2052 EGO_SUM entries, does *not* run into the environment limit.
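A check for dimension A) with the "90% of the env limit" threshold suggested above could look roughly like the following. This is only a minimal sketch, not the existing Portage check and not pkgcheck code: the function names are hypothetical, the limit is taken from `SC_ARG_MAX`, and the size is an approximation because the argument strings passed to exec also count against the same kernel limit.

```python
import os


def environment_size(env):
    """Approximate bytes needed for the environment strings on exec.

    Each entry is stored as "NAME=value" plus a terminating NUL byte.
    """
    return sum(
        len(os.fsencode(name)) + 1 + len(os.fsencode(value)) + 1
        for name, value in env.items()
    )


def kernel_arg_limit():
    """Combined limit for arguments and environment, as reported by sysconf."""
    return os.sysconf("SC_ARG_MAX")


def exceeds_threshold(env, limit, fraction=0.9):
    """True if the environment uses more than `fraction` of `limit` bytes.

    The 0.9 default mirrors the 90% figure from the discussion; it is an
    easily changeable parameter, not an agreed policy value.
    """
    return environment_size(env) > fraction * limit
```

Called as `exceeds_threshold(dict(os.environ), kernel_arg_limit())`, it reports whether the current environment is already close to the limit. Keeping the fraction a parameter matches the advice that the limit should be easy to change later.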
>> The deprecation of EGO_SUM was and is unnecessary, a security issue, and
>> was almost wholly *not* driven by technical problems. EGO_SUM should be
>> re-instated.
>>
>> I know that some think likewise. I also know that others disagree. The
>> latter group includes some prominent and visible Gentoo developers.
>> People to whom I am thankful for their work on Gentoo and to whom Gentoo
>> owes a lot. However, it is unclear what the majority of Gentoo
>> developers thinks. It could very well be that the consensus amongst
>> Gentoo developers agrees with some of my fellow council candidates and
>> would like to keep the current state. It would be great if we find that
>> out. If we had a mechanism to perform a non-binding opinion poll amongst
>> Gentoo developers, and if that poll turns out that the consensus is to
>> keep EGO_SUM deprecated, then I could save myself a lot of time and effort.
>
> I'm confused why you are asking about the 'consensus amongst
> developers' and then ask the council to vote.

If I knew that the majority of Gentoo developers is fine with the deprecation of EGO_SUM, then I would not put effort into re-instating EGO_SUM.

>> However, as of now, my conscience demands that I try to improve this
>> situation for the sake of our users. In a previous mail, I wrote that I
>> seek closure by asking the council to vote on that matter. And I will,
>> of course, accept any outcome of that vote.
>
> My impression of the situation is that:
> - Currently if asked, the council would likely vote no.
> - They have requested you implement a QA check with a limit, and if
> you did that, many swing voters would vote yes.
>
> My guidance from above is "implement the check with some reasonable
> limit" to unblock your swing voters, so they vote yes...
>
> We don't need everyone to vote on what the limit is ..it's just
> wasting time IMHO.

It is not about everyone voting on that matter.
It is about asking everyone for their opinion on that matter, in a non-binding opinion poll where multiple options can be ranked [1]. Chances are that this would surface the consensus amongst Gentoo developers, and ideally, the Council would take the result of the poll into consideration when voting on that matter.

- Flow

1: I think that it is probably trivial to re-purpose our current voting infrastructure to perform an opinion poll using the Condorcet method.
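To illustrate what a ranked opinion poll evaluated with the Condorcet method involves, here is a minimal sketch. It is not taken from Gentoo's voting infrastructure; the function name and the option labels in the usage note are hypothetical, and every ballot is assumed to rank all options. An option is the Condorcet winner if it beats each other option in a head-to-head comparison; if preferences form a cycle, there is no such winner.

```python
from itertools import combinations


def condorcet_winner(ballots, options):
    """Return the option that wins every pairwise contest, or None.

    `ballots` is a list of rankings; each ranking lists all options,
    most preferred first.
    """
    pairwise_wins = {option: 0 for option in options}
    for a, b in combinations(options, 2):
        # Count the ballots that rank a above b; the rest rank b above a.
        prefer_a = sum(1 for ballot in ballots if ballot.index(a) < ballot.index(b))
        prefer_b = len(ballots) - prefer_a
        if prefer_a > prefer_b:
            pairwise_wins[a] += 1
        elif prefer_b > prefer_a:
            pairwise_wins[b] += 1
    for option in options:
        if pairwise_wins[option] == len(options) - 1:
            return option
    return None
```

With three hypothetical options such as "reinstate", "reinstate-with-limit" and "keep-deprecated", the compromise option can win even when it is the first choice of only a minority, provided it is preferred over each alternative by a majority.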
* [gentoo-dev] Re: Flow's Manifesto and questions for nominees (was: Re: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-14 7:14 ` [gentoo-dev] Re: Flow's Manifesto and questions for nominees (was: " Florian Schmaus @ 2023-07-14 7:33 ` Sam James 2023-07-14 8:19 ` Sam James 2023-07-14 9:07 ` Florian Schmaus 2023-07-14 8:39 ` [gentoo-dev] Re: Flow's Manifesto and questions for nominees Ulrich Mueller 1 sibling, 2 replies; 23+ messages in thread From: Sam James @ 2023-07-14 7:33 UTC (permalink / raw To: gentoo-project; +Cc: Alec Warner, gentoo-dev Florian Schmaus <flow@gentoo.org> writes: > [[PGP Signed Part:Undecided]] > Posted to gentoo-dev@ since we are now entering a technical discussion > again. > > For those who did not follow gentoo-project@, the previous posts include: > > https://marc.info/?l=gentoo-project&m=168918875000738&w=2 > https://marc.info/?l=gentoo-project&m=168881103930591&w=2 > > On 12/07/2023 21.28, Alec Warner wrote: >> On Wed, Jul 12, 2023 at 12:07 PM Florian Schmaus <flow@gentoo.org> wrote: >>> Apologies for not replying to everyone individually. >>> >>> I thank my fellow council candidates who took the time to reply to this >>> sensitive and obviously controversial matter. I understand that not >>> everyone feels comfortable taking a stance in this discussion. >>> >>> I asked the other council candidates about their opinion on EGO_SUM. >>> Unfortunately, some replies included only a rather shallow answer. A few >>> focused on criticism of my actions and how I approach the issue. Which >>> is obviously fine. I read it all and have empathy for everyone who feels >>> aggravated. You may or may not share the complaints. But let us focus on >>> the actual matter for a moment. >>> >>> Even the voices raised for a restricted reintroduction of EGO_SUM just >>> mention an abstract limit [1]. 
A concrete limit is not mentioned,
>>> although I asked for it and provided my idea including specific limits.
>>> Not knowing the concrete figures others have in mind makes it difficult
>>> to find a compromise. For example, a fellow council candidate postulated
>>> that it would be quicker for me to implement a limit-check in pkgcheck
>>> than discuss EGO_SUM. I wish that were the case. Unfortunately it is

I think this misrepresents my point. All I said was that a bound should be added matching what's in Portage right now.

Please in future respond to me directly if you're going to claim something about what I've said.

> [...]
> EGO_SUM affects two dimensions that could be limited/restricted:
> A) the process environment, which may run into the Linux kernel
> environment limit on exec(3)
> B) the size of the package directory, where EGO_SUM affects the size of
> ebuilds and the Manifest
>
> [...]
>
> A), however, is a different beast. There is undeniably a
> kernel-enforced limit that we could hit due to an extremely large
> EGO_SUM (among other things). However, the only bug report I know that
> runs into this kernel limit was with texlive (bug #719202). The low
> number of recorded bugs caused by the environment limit matches with
> the fact that even the ebuild with the most EGO_SUM entries that I
> ever analyzed, app-containers/cri-o-1.23.1 (2022-02-16) with 2052
> EGO_SUM entries, does *not* run into the environment limit.
>

I thought I'd given you a list before, but maybe it was someone else.
Anyway, a non-exhaustive list (I remember maybe two more but I got bored):
* https://bugs.gentoo.org/829545 ("app-admin/vault-1.9.1 - find: The environment is too large for exec().")
* https://bugs.gentoo.org/829684 ("app-metrics/prometheus-2.31.1 - find: The environment is too large for exec().")
* https://bugs.gentoo.org/830187 (you're CC'd on this) ("go lang ebuild: SRC_URI too long that it causes "Argument list too long" error")
* https://bugs.gentoo.org/831265 ("sys-cluster/minikube-1.24.0 - find: The environment is too large for exec().")
* a0be89b772474e3336d3de699d71482aa89d2444 ("app-emulation/nerdctl: drop 0.14.0")

Other related bugs (as it's useful as a summary of where we are):
* https://bugs.gentoo.org/540146 ("sys-apps/portage: limit no of exported variables in EAPI 6")
* https://bugs.gentoo.org/720180 ("sys-apps/portage: add support to delay export of "A" variable until last moment")
* https://bugs.gentoo.org/721088 ("[Future EAPI] Don't export A")
* https://bugs.gentoo.org/833567 ("[Future EAPI] src_fetch_extra phase the runs after src_unpack")

I am not aware of a bug (yet?) for radhermit's suggestion wrt external helpers which is related but different to exporting fewer variables.

thanks,
sam
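The failure mode behind several of the bugs listed above ("The environment is too large for exec()", "Argument list too long") can be reproduced outside of Portage. The following is a minimal sketch under stated assumptions: the helper name is hypothetical, the variable name "A" merely stands in for a very large exported variable, and the 8 MiB size in the usage note is chosen so that it should exceed the exec limits on common Linux configurations.

```python
import errno
import os
import subprocess
import sys


def exec_with_variable(size_bytes):
    """Start a trivial child process with one exported variable of size_bytes.

    Returns None if the child could be executed, otherwise the errno
    reported when exec failed.
    """
    env = dict(os.environ)
    env["A"] = "x" * size_bytes  # stand-in for a huge exported variable
    try:
        subprocess.run([sys.executable, "-c", "pass"], env=env)
    except OSError as exc:
        return exc.errno
    return None
```

On Linux, `exec_with_variable(8 * 1024 * 1024)` is expected to return `errno.E2BIG`, which is the error that tools such as find report as "Argument list too long", while a small value lets the child run normally.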
* [gentoo-dev] Re: Flow's Manifesto and questions for nominees (was: Re: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-14 7:33 ` Sam James @ 2023-07-14 8:19 ` Sam James 2023-07-14 9:07 ` Florian Schmaus 1 sibling, 0 replies; 23+ messages in thread From: Sam James @ 2023-07-14 8:19 UTC (permalink / raw To: Sam James; +Cc: gentoo-project, Alec Warner, gentoo-dev [-- Attachment #1: Type: text/plain, Size: 4933 bytes --] Sam James <sam@gentoo.org> writes: > Florian Schmaus <flow@gentoo.org> writes: > >> [[PGP Signed Part:Undecided]] >> Posted to gentoo-dev@ since we are now entering a technical discussion >> again. >> >> For those who did not follow gentoo-project@, the previous posts include: >> >> https://marc.info/?l=gentoo-project&m=168918875000738&w=2 >> https://marc.info/?l=gentoo-project&m=168881103930591&w=2 >> >> On 12/07/2023 21.28, Alec Warner wrote: >>> On Wed, Jul 12, 2023 at 12:07 PM Florian Schmaus <flow@gentoo.org> wrote: >>>> Apologies for not replying to everyone individually. >>>> >>>> I thank my fellow council candidates who took the time to reply to this >>>> sensitive and obviously controversial matter. I understand that not >>>> everyone feels comfortable taking a stance in this discussion. >>>> >>>> I asked the other council candidates about their opinion on EGO_SUM. >>>> Unfortunately, some replies included only a rather shallow answer. A few >>>> focused on criticism of my actions and how I approach the issue. Which >>>> is obviously fine. I read it all and have empathy for everyone who feels >>>> aggravated. You may or may not share the complaints. But let us focus on >>>> the actual matter for a moment. >>>> >>>> Even the voices raised for a restricted reintroduction of EGO_SUM just >>>> mention an abstract limit [1]. A concrete limit is not mentioned, >>>> although I asked for it and provided my idea including specific limits. 
>>>> Not knowing the concrete figures others have in mind makes it difficult >>>> to find a compromise. For example, a fellow council candidate postulated >>>> that it would be quicker for me to implement a limit-check in pkgcheck >>>> than discuss EGO_SUM. I wish that were the case. Unfortunately it is > > I think this misrepresents my point. All I said was that a bound should > be added matching what's in Portage right now. > > Please in future respond to me directly if you're going to claim something about what I've said. > >> [...] >> EGO_SUM affects two dimensions that could be limited/restricted: >> A) the process environment, which may run into the Linux kernel >> environment limit on exec(3) >> B) the size of the package directory, where EGO_SUM affects the size of >> ebuilds and the Manifest >> >> [...] >> >> A), however, is a different beast. There is undeniably a >> kernel-enforced limit that we could hit due to an extremely large >> EGO_SUM (among other things). However, the only bug report I know that >> runs into this kernel limit was with texlive (bug #719202). The low >> number of recorded bugs caused by the environment limit matches with >> the fact that even the ebuild with the most EGO_SUM entries that I >> ever analyzed, app-containers/cri-o-1.23.1 (2022-02-16) with 2052 >> EGO_SUM entries, does *not* run into the environment limit. >> > > I thought I'd gave you a list before, but maybe it was someone else. 
>
> Anyway, a non-exhaustive list (I remember maybe two more but I got bored):
> * https://bugs.gentoo.org/829545 ("app-admin/vault-1.9.1 - find: The environment is too large for exec().")
> * https://bugs.gentoo.org/829684 ("app-metrics/prometheus-2.31.1 - find: The environment is too large for exec().")
> * https://bugs.gentoo.org/830187 (you're CC'd on this) ("go lang ebuild: SRC_URI too long that it causes "Argument list too long" error")
> * https://bugs.gentoo.org/831265 ("sys-cluster/minikube-1.24.0 - find: The environment is too large for exec().")
> * a0be89b772474e3336d3de699d71482aa89d2444 ("app-emulation/nerdctl: drop 0.14.0")
>

Sorry, as I said this, I came across some more. These are the ones I was thinking of:
* https://bugs.gentoo.org/830266 ("app-admin/filebeat-7.16.2 fails to compile: Assertion failed: bc_ctl.arg_max >= LINE_MAX (xargs.c: main: 511)")
* https://bugs.gentoo.org/832964 ("sys-cluster/kops-1.21.0 fails to compile: Assertion failed: bc_ctl.arg_max >= LINE_MAX (xargs.c: main: 511)")
* https://bugs.gentoo.org/833961 ("net-p2p/go-ipfs-0.11.0 - Assertion failed: bc_ctl.arg_max >= LINE_MAX (xargs.c: main: 511)")
* https://bugs.gentoo.org/835712 ("dev-util/packer-1.7.9 fails to compile: Assertion failed: bc_ctl.arg_max >= LINE_MAX (xargs.c: main: 511)")

> Other related bugs (as it's useful as a summary of where we are):
> * https://bugs.gentoo.org/540146 ("sys-apps/portage: limit no of exported variables in EAPI 6")
> * https://bugs.gentoo.org/720180 ("sys-apps/portage: add support to delay export of "A" variable until last moment")
> * https://bugs.gentoo.org/721088 ("[Future EAPI] Don't export A")
> * https://bugs.gentoo.org/833567 ("[Future EAPI] src_fetch_extra phase the runs after src_unpack")
>
> I am not aware of a bug (yet?) for radhermit's suggestion wrt external
> helpers which is related but different to exporting fewer variables.
>
> thanks,
> sam
* [gentoo-dev] Re: Flow's Manifesto and questions for nominees (was: Re: [gentoo-project] Gentoo Council Election 202306 ... Nominations Open in Just Over 24 Hours.) 2023-07-14 7:33 ` Sam James 2023-07-14 8:19 ` Sam James @ 2023-07-14 9:07 ` Florian Schmaus 1 sibling, 0 replies; 23+ messages in thread From: Florian Schmaus @ 2023-07-14 9:07 UTC (permalink / raw To: gentoo-dev; +Cc: Sam James [-- Attachment #1.1.1: Type: text/plain, Size: 5220 bytes --] On 14/07/2023 09.33, Sam James wrote: > > Florian Schmaus <flow@gentoo.org> writes: > >> [[PGP Signed Part:Undecided]] >> Posted to gentoo-dev@ since we are now entering a technical discussion >> again. >> >> For those who did not follow gentoo-project@, the previous posts include: >> >> https://marc.info/?l=gentoo-project&m=168918875000738&w=2 >> https://marc.info/?l=gentoo-project&m=168881103930591&w=2 >> >> On 12/07/2023 21.28, Alec Warner wrote: >>> On Wed, Jul 12, 2023 at 12:07 PM Florian Schmaus <flow@gentoo.org> wrote: >>>> Apologies for not replying to everyone individually. >>>> >>>> I thank my fellow council candidates who took the time to reply to this >>>> sensitive and obviously controversial matter. I understand that not >>>> everyone feels comfortable taking a stance in this discussion. >>>> >>>> I asked the other council candidates about their opinion on EGO_SUM. >>>> Unfortunately, some replies included only a rather shallow answer. A few >>>> focused on criticism of my actions and how I approach the issue. Which >>>> is obviously fine. I read it all and have empathy for everyone who feels >>>> aggravated. You may or may not share the complaints. But let us focus on >>>> the actual matter for a moment. >>>> >>>> Even the voices raised for a restricted reintroduction of EGO_SUM just >>>> mention an abstract limit [1]. A concrete limit is not mentioned, >>>> although I asked for it and provided my idea including specific limits. 
>>>> Not knowing the concrete figures others have in mind makes it difficult >>>> to find a compromise. For example, a fellow council candidate postulated >>>> that it would be quicker for me to implement a limit-check in pkgcheck >>>> than discuss EGO_SUM. I wish that were the case. Unfortunately it is > > I think this misrepresents my point. All I said was that a bound should > be added matching what's in Portage right now. > > Please in future respond to me directly if you're going to claim something about what I've said. > >> [...] >> EGO_SUM affects two dimensions that could be limited/restricted: >> A) the process environment, which may run into the Linux kernel >> environment limit on exec(3) >> B) the size of the package directory, where EGO_SUM affects the size of >> ebuilds and the Manifest >> >> [...] >> >> A), however, is a different beast. There is undeniably a >> kernel-enforced limit that we could hit due to an extremely large >> EGO_SUM (among other things). However, the only bug report I know that >> runs into this kernel limit was with texlive (bug #719202). The low >> number of recorded bugs caused by the environment limit matches with >> the fact that even the ebuild with the most EGO_SUM entries that I >> ever analyzed, app-containers/cri-o-1.23.1 (2022-02-16) with 2052 >> EGO_SUM entries, does *not* run into the environment limit. >> > > I thought I'd gave you a list before, but maybe it was someone else. 
>
> Anyway, a non-exhaustive list (I remember maybe two more but I got bored):
> * https://bugs.gentoo.org/829545 ("app-admin/vault-1.9.1 - find: The environment is too large for exec().")
> * https://bugs.gentoo.org/829684 ("app-metrics/prometheus-2.31.1 - find: The environment is too large for exec().")
> * https://bugs.gentoo.org/830187 (you're CC'd on this) ("go lang ebuild: SRC_URI too long that it causes "Argument list too long" error")
> * https://bugs.gentoo.org/831265 ("sys-cluster/minikube-1.24.0 - find: The environment is too large for exec().")
> * a0be89b772474e3336d3de699d71482aa89d2444 ("app-emulation/nerdctl: drop 0.14.0")

Thanks for providing this valuable information, Sam. I was indeed not aware of those bugs. They all seem to have been fixed before 2022-02-16, the date of the ::gentoo tree I mostly analyzed (which was selected because it was just before EGO_SUM was deprecated).

Limiting the process environment to 90% of the kernel-enforced limit, as antarus also suggested (potentially by approximating the EGO_SUM entries), would probably have prevented those bugs. As I previously wrote, I would be happy to work on a pkgcheck check for that, if the limit is only about the kernel's process environment limit (A).

However, this still leaves us with some who also seem to demand a limit with regard to the package-directory size (B).

> Other related bugs (as it's useful as a summary of where we are):
> * https://bugs.gentoo.org/540146 ("sys-apps/portage: limit no of exported variables in EAPI 6")
> * https://bugs.gentoo.org/720180 ("sys-apps/portage: add support to delay export of "A" variable until last moment")
> * https://bugs.gentoo.org/721088 ("[Future EAPI] Don't export A")
> * https://bugs.gentoo.org/833567 ("[Future EAPI] src_fetch_extra phase the runs after src_unpack")
>
> I am not aware of a bug (yet?) for radhermit's suggestion wrt external
> helpers which is related but different to exporting fewer variables.
Improving, that is, reducing, what portage exports to child processes of the ebuild is sensible. But it is only indirectly related to EGO_SUM and not a strict prerequisite for re-introducing EGO_SUM.

- Flow
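For dimension B), the package-directory size, a limit check is mostly a matter of summing file sizes. The sketch below is only an illustration, not pkgcheck code: the function names are hypothetical, the default of 1.5 MiB is taken from the range proposed earlier in this thread and is not an agreed value, and for simplicity every two-level directory below the repository root is treated as a package directory.

```python
from pathlib import Path

# Hypothetical default, based on the 1.5 MiB figure proposed in the thread.
DEFAULT_LIMIT_BYTES = int(1.5 * 1024 * 1024)


def package_dir_size(pkg_dir):
    """Total size in bytes of all regular files below pkg_dir."""
    return sum(p.stat().st_size for p in Path(pkg_dir).rglob("*") if p.is_file())


def oversized_packages(repo_root, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Yield ("category/package", size) for directories above the limit.

    Simplification: a real check would only visit actual categories
    rather than every top-level directory of the repository.
    """
    for pkg_dir in sorted(Path(repo_root).glob("*/*")):
        if not pkg_dir.is_dir():
            continue
        size = package_dir_size(pkg_dir)
        if size > limit_bytes:
            yield f"{pkg_dir.parent.name}/{pkg_dir.name}", size
```

Because the limit is a plain parameter, it can be revisited without touching the check itself, for example when the policy is reviewed.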
* Re: [gentoo-dev] Re: Flow's Manifesto and questions for nominees 2023-07-14 7:14 ` [gentoo-dev] Re: Flow's Manifesto and questions for nominees (was: " Florian Schmaus 2023-07-14 7:33 ` Sam James @ 2023-07-14 8:39 ` Ulrich Mueller 1 sibling, 0 replies; 23+ messages in thread From: Ulrich Mueller @ 2023-07-14 8:39 UTC (permalink / raw To: Florian Schmaus; +Cc: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 302 bytes --] >>>>> On Fri, 14 Jul 2023, Florian Schmaus wrote: > Posted to gentoo-dev@ since we are now entering a technical discussion > again. Please avoid crossposting, because that doesn't work well. (For example, the posting will have different Reply-To headers in gentoo-project and in gentoo-dev.) Ulrich [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 507 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread