From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by finch.gentoo.org (Postfix) with ESMTPS id 2D7B4158041
	for ; Sat, 30 Mar 2024 23:49:31 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id 5CDC8E2A91;
	Sat, 30 Mar 2024 23:49:26 +0000 (UTC)
Received: from james.steelbluetech.co.uk (james.steelbluetech.co.uk [78.40.151.100])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
	(No client certificate requested)
	by pigeon.gentoo.org (Postfix) with ESMTPS id C88E2E2A8B
	for ; Sat, 30 Mar 2024 23:49:24 +0000 (UTC)
Received: from ukinbox.ecrypt.net (hq2.ehuk.net [10.0.10.2])
	by james.steelbluetech.co.uk (Postfix) with ESMTP id 2D743BFC18
	for ; Sat, 30 Mar 2024 23:49:23 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.10.3 james.steelbluetech.co.uk 2D743BFC18
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ehuk.net;
	s=default; t=1711842563;
	bh=21GiBF8I0gF0CO1Ljqj3g9BQx7W8AYwSrKQj7MHeku8=;
	h=In-Reply-To:References:Date:Subject:From:To:Reply-To:From;
	b=GP7BVr+JpQHJCBqkKmSjHOYeb96i2ATZqOBab5/vrlLloJ737JVBz6nZZHmdw0+vr
	 MkwcMVKwa8/xAvGqIjWHQgXkUFdHhjk1ggS5FwHxRgYLXTbhk8ZkbCNYsCXU07llSq
	 K/Z7Qa7TwedRpLRbm+bY1wj/Nij3YwUJ/p/o1rIm4heOuoBWy9GKcEmrUiEnL6iIpK
	 RzGF1w0bWAGKQ2vuO8V3FtAmdrnMb37AIqhqbz7Ue7mE6tzL/lkouiG0eJMgdp23tK
	 VY4AznND8Pqi/5WAzIQFsa8FLM86blQE9FIKEuEHAt00/1y/UROKOROGOKgnEgM0Z8
	 rrpksb/fFSLGA==
Message-ID: <9e80705f804c6f7209240f8876a31c14.squirrel@ukinbox.ecrypt.net>
In-Reply-To:
References: <20240329204315.3b29449b@Akita>
	<1671d927-55d5-6f01-2b54-b33981406945@gmail.com>
Date: Sat, 30 Mar 2024 23:49:23 -0000
Subject: Re: [gentoo-dev] Current unavoidable use of xz utils in Gentoo
From: "Eddie Chapman"
To: gentoo-dev@lists.gentoo.org
User-Agent: SquirrelMail/1.5.2 [SVN]
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: 8bit
X-Scanned-By: MIMEDefang
X-Archives-Salt: 975632c5-c1ae-4bc3-a8af-1d963c4b6ad9
X-Archives-Hash: 5205ad41e3b5f5decc72971cc09d8435

Eddie Chapman wrote:
> Michał Górny wrote:
>
>> On Sat, 2024-03-30 at 14:57 +0000, Eddie Chapman wrote:
>>
>>
>>> Note, I'm not advocating ripping xz-utils out of tree, all I'm saying
>>> is wouldn't it be nice if there were at least 2 alternatives to
>>> choose from? That doesn't have to be disruptive in any way, people who
>>> wish to continue using and trusting xz-utils should be able to
>>> continue to do so without any friction whatsoever.
>>
>> So, you're basically saying we should go out of our way, recompress all
>> distfiles using two alternative compression formats, increase mirror
>> load four times and add a lot of complexity to ebuilds, right?
>>
>> --
>> Best regards,
>> Michał Górny
>>
> Yes that's a very good point, that was something I was wondering in
> weighing up both sides, what the costs would be practically, as I don't
> know the realities of running Gentoo infrastructure. And maybe the costs
> is just too high of a price to pay.
>
> I wonder if increased use of git repos rather than distributed tarballs
> could be part of a solution to those issues, although that could put quite
> a storage burden on every user. Unless they were all shallow git pulls
> and the user could optionally choose to tar up the git directory after
> clone with compression. But yes granted then there is even more ebuild
> complexity.
>

I've been thinking a little about how Gentoo could work without
compression/decompression of distfiles, as an optional feature, without
any impact on the existing world order and with no increased stress on
Gentoo infra. I was wondering how palatable the following idea might be
to others ...

The basis of the idea is to add a feature to Portage which would let a
person optionally indicate in make.conf that whenever a path in SRC_URI
resolves to a file with a compression extension (.gz, .bz2, .xz, etc.),
Portage should attempt to fetch it without the compression extension.

So as an example, let's take sys-apps/pciutils, which currently has:

SRC_URI="https://mj.ucw.cz/download/linux/pci/${P}.tar.gz"

The feature would tell Portage to simply translate this to:

SRC_URI="https://mj.ucw.cz/download/linux/pci/${P}.tar"

So perhaps it could be a flag that goes in FEATURES= called something
like "strip_dist_comp", or maybe someone has a better idea about the
name.

Now, of course, I'm not proposing that Gentoo infra keeps uncompressed
versions of distfiles. So by default Portage would encounter a 404 error
when it tries to fetch the uncompressed file from Gentoo mirrors.

However, this feature would pave the way for a person to configure
Portage to fetch distfiles from their own server as well as from Gentoo
mirrors. That person could then keep their own uncompressed versions of
distfiles on their server, for however many and whichever distfiles they
might wish to keep there, as the compressed version would get fetched
from a Gentoo mirror if the uncompressed version is not there. Such a
person would have to obtain or create their own uncompressed distfile
independently.

A caveat of this solution is that one would have to disable checksum
verification (and gpg checks?) for this to work, as of course there
would be no checksum for the uncompressed version in the Manifest, and
Gentoo infra certainly should not be expected to uncompress each
distfile just to generate an extra checksum for the Manifest. In fact
I'd consider that undesirable, as anyone paranoid enough to want to do
this would not trust such a checksum anyway, since it would be a
checksum of a file that has been compressed at source and then
decompressed on Gentoo infra, potentially introducing vulnerabilities.

However, the lack of a checksum is not a problem for someone who wants
to keep distfiles on their own server, as such a person can also be
responsible themselves for first verifying whatever they put on there,
and for keeping said server secured from tampering.

This seems to me to be something that would probably be relatively
straightforward to implement within Portage, maybe with just a few lines
around the Python code that fetches the SRC_URI, and with zero extra
work or resources required from Gentoo infra.

I'd consider it a feature for anyone who wants to eliminate a whole
potential class of vulnerabilities that may or may not be present,
either now or in future, in compression tools. Surely that would be a
nice feature to have for some folk?
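
To make the URI translation a bit more concrete, here is a rough,
standalone Python sketch of the idea. It is not Portage code and does
not use any Portage API: the function names, the suffix list and the way
the "strip_dist_comp" flag is passed in are all my own assumptions, and
it deliberately ignores single-suffix forms such as .tgz.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of suffixes to treat as "compression extensions".
COMPRESSION_SUFFIXES = (".gz", ".bz2", ".xz", ".zst", ".lz", ".lzma")


def strip_compression_suffix(uri):
    """Return uri with one trailing compression suffix removed from its path.

    URIs whose path does not end in a known suffix are returned unchanged.
    """
    parts = urlsplit(uri)
    for suffix in COMPRESSION_SUFFIXES:
        if parts.path.endswith(suffix):
            stripped_path = parts.path[: -len(suffix)]
            return urlunsplit(parts._replace(path=stripped_path))
    return uri


def fetch_candidates(uri, features):
    """List the URIs to try, in order, for one SRC_URI entry.

    With the (hypothetical) "strip_dist_comp" flag set, the uncompressed
    name is tried first and the original compressed name is the fallback,
    so a 404 on the uncompressed file just falls through to the normal fetch.
    """
    candidates = []
    if "strip_dist_comp" in features:
        stripped = strip_compression_suffix(uri)
        if stripped != uri:
            candidates.append(stripped)
    candidates.append(uri)
    return candidates
```

Calling fetch_candidates() on a pciutils-style .tar.gz URI with the flag
set would yield the .tar name followed by the original .tar.gz name.
Checksum handling is left out on purpose, since as noted above the
Manifest would have no entry for the uncompressed file.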