From: Zoltan Puskas <zoltan@sinustrom.info>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Proposal to undeprecate EGO_SUM
Date: Mon, 27 Jun 2022 01:43:19 +0200
Message-ID: <1a712a66f55e241ce6b6084eb19e1f34@sinustrom.info>
In-Reply-To: <20220613074411.341909-1-flow@gentoo.org>

Hi,

Yesterday I was working on adding a Go-based ebuild to Gentoo and got a
warning from Portage saying that EGO_SUM is deprecated and should be
avoided. Since I remembered there was an intense discussion about this
on the ML, I went back and re-read the threads before writing this
piece. I'd like to provide my perspective as a user, a proxied
maintainer, and an overlay owner. I also run a private mirror on my LAN
to serve my hosts in order to reduce load on external mirrors.

Before diving in I think it's worth reading mgorny's blog post "The 
modern packager’s security nightmare"[1] as it's relevant to the 
discussion, and something I deeply agree with.

With all that being said, I feel that the tarball idea is a bad one, for
many reasons.

From a security point of view, I understand that we still have to trust
maintainers not to do funky stuff, but I think this issue goes beyond
that.

First of all, one of the advantages of Gentoo is that it gets its source
code from upstream (yes, I'm aware of mirrors acting as a cache layer),
which means that poisoning source code has to be done at the upstream
level. That effectively means hacking GitHub, PyPI, or some standalone
project's Gitea/cgit/GitLab/etc. instance, all of which are sources
that either get more scrutiny or have a limited blast radius.

Additionally, if an upstream dependency has a security issue, it's
easier to scan all EGO_SUM content, find the packages that potentially
depend on the broken dependency, and force a re-pinning and rebuild. The
tarball magic hides this completely and makes searching very expensive.
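
To illustrate how cheap such a scan can be, here is a minimal Python
sketch. It assumes a local checkout of the repository and that every
ebuild declares EGO_SUM as a plain bash array of "module version" and
"module version/go.mod" strings. The names parse_ego_sum and
find_affected are made up for this example and are not part of any
existing Gentoo tool.

```python
import re
from pathlib import Path

# Body of an EGO_SUM=( ... ) array (assumes no ")" inside the array).
EGO_SUM_RE = re.compile(r'EGO_SUM=\((.*?)\)', re.DOTALL)
# One quoted entry: "module version" or "module version/go.mod".
ENTRY_RE = re.compile(r'"([^\s"]+)\s+([^\s"]+?)(?:/go\.mod)?"')


def parse_ego_sum(ebuild_text):
    """Return the set of (module, version) pairs listed in EGO_SUM."""
    match = EGO_SUM_RE.search(ebuild_text)
    if match is None:
        return set()
    return set(ENTRY_RE.findall(match.group(1)))


def find_affected(repo_root, module, bad_versions):
    """Yield ebuilds whose EGO_SUM pins `module` at one of `bad_versions`."""
    for ebuild in sorted(Path(repo_root).rglob("*.ebuild")):
        text = ebuild.read_text(encoding="utf-8", errors="replace")
        pins = parse_ego_sum(text)
        if any(m == module and v in bad_versions for m, v in pins):
            yield ebuild
```

Something like find_affected("/var/db/repos/gentoo", "github.com/foo/bar",
{"v1.2.3"}) would then list the ebuilds that need a re-pin (the path and
module name are placeholders). With vendor tarballs the same question
requires downloading and unpacking every tarball first.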

In fact using these vendor tarballs is the equivalent of "static 
linking" in the packaging space. Why are we introducing the same issue 
in the repository space? This kills the reusability of already 
downloaded dependencies and bloats storage requirements. This is 
especially bad on laptops, where SSD free space might be limited, in 
case the user does not nuke their distfiles after each upgrade.

Considering that Btrfs (and possibly other filesystems) supports
on-the-fly compression, the physical cost of a few inflated ebuilds and
Manifests is actually much smaller than the logical size would indicate.
Compare that to the huge incompressible tarballs that we now need to
store.

As a proxied maintainer or overlay owner, hosting these huge tarballs
also becomes a problem (i.e. we need some public space with potentially
gigabytes of free space and enough bandwidth to push that to users).
Pushing toward vendor tarballs creates an extra expense on every level
(Gentoo infra, mirrors, proxy maintainers, overlay owners, users).

If bloating Portage is a big issue and we frown upon Go stuff anyway (or
only a few users need these packages), why not consider moving all Go
packages into an officially supported, Go-only overlay? I understand
that this would not solve the kernel buffer issue where we run out of
environment variable space, but it would debloat the main Portage tree.

The tarballs also break reproducibility. With EGO_SUM I can check out an
older version of the Portage tree (well, to some extent) and rebuild
packages, since the dependencies' upstreams are very likely to still
host old versions of their source. With the tarballs this breaks,
because as soon as an ebuild is dropped from mainline Portage its vendor
tarball goes with it. Unlike with EGO_SUM, there is no way for the user
to roll a package back a few weeks (e.g. if the new version has bugs).

In fact, I feel this goes against the spirit of Portage too: instead of
"just describing" how to obtain sources and build them, it now depends
on essentially ephemeral blobs that happen to be externalized from the
Portage tree itself. I'm aware that we have ebuilds that pull in patches
and other stuff from dev space already, but we shouldn't make this even
worse.

Finally, with EGO_SUM we had a nice tool, get-ego-vendor, which produced
the EGO_SUM for maintainers and made maintenance easier. However, I
haven't found any new guidance yet on how to maintain Go packages with
the new tarball method (e.g. what needs to go into the vendor tarball,
what changes are needed in ebuilds). Overall this further complicates
ebuild development and verification of PRs.

In summary, IMHO the EGO_SUM way of handling Go packages has more
benefits than drawbacks compared to the vendor tarballs.

Cheers,
Zoltan

[1] https://blogs.gentoo.org/mgorny/2021/02/19/the-modern-packagers-security-nightmare/

