From: Florian Schmaus
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] EGO_SUM
Date: Fri, 9 Jun 2023 12:07:50 +0200
Message-ID: <770cb2e8-9475-5137-4979-50396e4996a9@gentoo.org>
References: <49ce8700-6c96-9360-51cf-2a989f666752@gentoo.org>

On 01/06/2023 21.55, William Hubbs wrote:
>> The EGO_SUM alternatives
>> - do not have the same level of trust and therefore have a negative
>>   impact on security (a dubious tarball someone put somewhere, especially
>>   when proxy-maint)
>
> For this, I would argue that vetting the tarball falls to the developer
> who is
> proxying. If you don't trust the proxy maintainer you
> are pushing for, it is easy to make a dependency tarball yourself and
> add it to your dev space.
>
>> - are not easily verifiable
>
> I don't have a response to this other than to say that go does its
> own verification of modules with the dependency tarballs that it can't
> do with vendor tarballs.

Yes, go has "go mod verify", which was added to the go-module eclass
after I asked on 2022-10-21 in #gentoo-dev whether the eclass verifies
the dependency tarball. robbat2 was kind enough to provide a proof of
concept of the security issue I was pointing out, which is available at
https://gist.github.com/robbat2/82f4c208b6674e707081eda689096d55. This
demonstration of the issue triggered
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=733b4944c1a061269f96219cc96530f89d8f439e,
which made the go-module.eclass run "go mod verify".

Unfortunately, a malicious contributor can trivially sidestep this
verification step, rendering it ineffective.

First, neither portage [1] nor PMS forbids a later (source) archive
from overriding an existing file. This looseness allows, for example,
the (non-upstream) dependency tarball to override (upstream's) go.sum.

Secondly, a dependency tarball could create the vendor/ directory,
preventing the condition under which the go-module.eclass runs
"go mod verify".

Both approaches allow the dependency tarball to inject malicious code.
With the first approach, "go mod verify" completes successfully; with
the second, "go mod verify" is simply not invoked. The verification, as
is, is ineffective.

>> Last but not least, we have the same situation in the Rust ecosystem,
>> but we allow the EGO_SUM "equivalent" there.
>
> I'm not sure it is quite the same because Rust projects tend to have
> much smaller numbers of dependencies.

I am curious to know of any specific reason why Rust projects generally
get by with fewer dependencies.
This impression may be deceptive, caused by the fact that the Go
ecosystem hosts several projects with a large number of dependencies.
If you look at the analysis [2], you find that the top 10 Go packages
by EGO_SUM entry count include cri-o, prometheus, k3s, and k3d, among
others. If someone rewrote any of those in Rust, they would probably
end up with the same number of dependencies.

> Another thing to consider is that using EGO_SUM adds a significant
> amount of processing to the go-module eclass.
> I was advised recently that this isn't a good idea since bash is
> slow, so I am considering moving most of that processing into
> get-ego-vendor by having it generate the contents of SRC_URI directly
> instead of using the eclass code to do that.

Was this analyzed and quantified? Is this hurting us? The cache
regeneration of an ebuild tree is an embarrassingly parallel operation,
so this would need to be exponentially complex [3] to be of any
significance. It may be possible to tune the existing EGO_SUM handling.

We should keep EGO_SUM if viable, as it maps directly to Go's go.sum
and makes developing Go ebuilds as frictionless as possible.

- Flow

1: https://github.com/gentoo/portage/pull/1030
2: https://dev.gentoo.org/~flow/gentoo-tree-analysis-results/2023-05-17T100838-gentoo-at-2022-02-16-60dc7a03ff2f/post-processed-ego-sum.txt
3: something similar to what was recently found in the latex ebuilds,
   see https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=6ee282f0645dcfccf1836b9cc7ae55556629eb8b
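
To illustrate the two bypasses described in this mail (a dependency tarball overriding an upstream file such as go.sum, or creating a vendor/ directory), here is a minimal Python sketch of a check that flags both cases. It is not part of portage or the go-module.eclass; the names `member_paths`, `suspicious_members`, and `make_tar`, as well as the rule "any path component named vendor", are assumptions made purely for illustration.

```python
import io
import tarfile
from pathlib import PurePosixPath


def member_paths(archive: tarfile.TarFile) -> set[str]:
    """Return the normalized paths of all non-directory members."""
    return {
        str(PurePosixPath(member.name))
        for member in archive.getmembers()
        if not member.isdir()
    }


def suspicious_members(
    upstream: tarfile.TarFile, deps: tarfile.TarFile
) -> dict[str, list[str]]:
    """Flag dependency-archive members matching either bypass.

    "overrides": files that exist in both archives, so unpacking the
                 dependency archive second would replace upstream's copy
                 (for example go.sum).
    "vendor":    files below a directory named vendor (hypothetical rule),
                 whose presence could suppress "go mod verify".
    """
    upstream_files = member_paths(upstream)
    dep_files = member_paths(deps)
    return {
        "overrides": sorted(upstream_files & dep_files),
        "vendor": sorted(
            path for path in dep_files if "vendor" in PurePosixPath(path).parts
        ),
    }


def make_tar(files: dict[str, bytes]) -> tarfile.TarFile:
    """Build an in-memory tar archive (helper for the demo below)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return tarfile.open(fileobj=buffer, mode="r")


if __name__ == "__main__":
    upstream = make_tar({"foo-1.0/go.sum": b"upstream", "foo-1.0/main.go": b""})
    deps = make_tar({"foo-1.0/go.sum": b"replaced", "foo-1.0/vendor/x/x.go": b""})
    print(suspicious_members(upstream, deps))
```

A check like this only inspects member names. It says nothing about whether the remaining contents of a dependency tarball are trustworthy, which is the broader concern raised above.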