On 30/05/2023 18.52, Florian Schmaus wrote:
>
> I am thankful that the council considered my request to vote on the
> topic. However, the council decided not to vote on this in its last
> session and to return the issue to the mailing lists.
>
> Some see the requirement of some limitations as necessity it comes to
> reinstating EGO_SUM. Unfortunately, I could not see specific numbers
> mentioned since June 2022 in the three EGO_SUM threads [1, 2, 3] I am
> aware of.
>
> To prevent harm from Gentoo, we should reach an agreement that everyone
> can live with. To achieve a consensus, and since I can not rule out that
> I missed a post that includes specific numbers, please share your ideas
> on how EGO_SUM could be reinstated in ::gentoo by replying to this mail.

I still want to ask: why should it be enabled in ::gentoo? I'm trying
to understand the reasoning. If you are talking about overlays, then I
agree that it should be allowed there, but I don't see any benefit to
its existence in ::gentoo. My reason for the difference: Gentoo
developers have access to ~devspace, where vendor tarballs can be
hosted.

Currently the best solution *per package* is to talk to upstream and
ask them to add a CI workflow which creates a source tarball that
includes the `vendor` directory. This is the best way, and I'm doing
that with the upstreams of several Go packages in ::gentoo. But I know
the disadvantage: you have to talk to upstream, explain why, and get it
added to their release process. It is the best long-run solution, but
it takes more effort.

> Having EGO_SUM would significantly increase the security of Gentoo's
> users (amongst other benefits).

While technically correct, we return to the same "confidence" (trust)
issue with the dev: a dev can add malicious code to the ebuild itself.
Yes, hiding malicious code inside a vendor tarball is easier, and
robbat2 demonstrated that it works. How can we solve it? One weird idea
I have is to build the vendor tarball out of multiple tarballs, one per
package, and to include a hash for each of them inside the vendor
tarball.
I think you can compare the hashes stored in the `go.sum` file in the
source code with the ones from the tarball (this claim needs
verification). As a result, I think we could verify it offline.

> Personally, I do not see that we currently need any form of limitation
> to reinstate EGO_SUM. I substantiated this with data based on a two-year
> history analysis of gentoo.git. The summary is that the
> - size increase of ::gentoo is unproblematic for users
> - additional sync delta of ::gentoo is unproblematic for users
> - higher rate of gentoo.git's increase is unproblematic for developers
> when we reinstate EGO_SUM in ::gentoo.

Why "unproblematic"? Where I live I have quite a high RTT, meaning each
download takes a long time to start before it fetches at a good speed.
Fetching a lot of small files is really bad for me (even from a mirror
in the same country, sigh).

Big deltas hit the git packs hard and put higher load on a lot of
places. Thinking about the infra side, I remember stories of the issues
when go.pkg was doing a full `git clone` (not a shallow copy) of the
whole gentoo.git repository. Now imagine we let in the huge and
frequent deltas of Go modules; imagine how fast we would get to a huge
full repository. Yes, we now blacklist this stupid failure of go.pkg,
but it might happen with another service. Full git clones aren't that
rare.

Also note that Go packages tend to update frequently (because of all
the bundling and security issues). The reason you don't see a lot of
updates in ::gentoo is that many of these packages are under less
active developers (not to offend anyone, it is fine to skip bumps where
that is reasonable; it's not my place to criticize!).

Also, please remember the issue of scale. Look at the number of
packages under dev-python; there are a lot of tools written in Go too.

> Therefore, we could (and IMHO should) simply un-deprecate EGO_SUM.
> However, I would review this decision once the number of Go packages has
> doubled or in two years (whatever comes first).
>
> Many share the concerns of an EGO_SUM-less world. I know that some seek
> a compromise by reinstating EGO_SUM with some limitations. The ::gentoo
> repository is able to handle packages (at least) up to the range of 2 to
> 1.5 MiB total package-directory size. Therefore I propose a limit in
> that range.

My solution is as follows:

1. Undeprecate EGO_SUM in the eclass.
2. Forbid its usage in ::gentoo (enforced by pkgcheck at error level,
   so it fails CI and we can see the misuse). Overlays are allowed to
   use it.
3. The maintainer starts talking to upstream about adding a release
   workflow that creates a vendored source tarball, in the hope that it
   succeeds. "Start early, to future profit". I see this flow as
   similar to "always try to upstream patches".
4. Until upstream adds it, use vendor tarballs in ::gentoo.

I also think many devs agree with this solution, but I can't speak for
them, so I'd be happy if devs could at least reply briefly with their
agreement or disagreement.

> - Flow
>
>
> 1: https://www.mail-archive.com/gentoo-dev@lists.gentoo.org/msg95175.html
> 2: https://www.mail-archive.com/gentoo-dev@lists.gentoo.org/msg95279.html
> 3: https://www.mail-archive.com/gentoo-dev@lists.gentoo.org/msg97310.html

I must say this conversation around EGO_SUM makes me a little sad: it
is taking a long time, and too often it feels like it derails in bad
(I mean less helpful) directions. I think we should go the way Flow
did and suggest concrete action items (something easier for the
Council / all devs to vote on).

Also, sorry that this mail jumps around a little. It is quite hard for
me to write long mails in English, so if some paragraphs are less
coherent, I'll be happy to explain them more :)

--
Arthur Zamarin
arthurzam@gentoo.org
Gentoo Linux developer (Python, pkgcore stack, Arch Teams, GURU)