From: Michał Górny
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format
Date: Wed, 21 Nov 2018 12:20:32 +0100
Message-ID: <1542799232.30154.7.camel@gentoo.org>
In-Reply-To: <20181121104554.GB28829@gentoo.org>
References: <1542453700.31427.2.camel@gentoo.org> <20181118091644.GA880@gentoo.org> <1542533931.1293.23.camel@gentoo.org> <20181118110048.GB880@gentoo.org> <1542792798.16894.17.camel@gentoo.org> <20181121104554.GB28829@gentoo.org>
Organization: Gentoo

On Wed, 2018-11-21 at 11:45 +0100, Fabian Groffen wrote:
> > > > > > 5. **Metadata is not compressed.** This is not a significant problem,
> > > > > > it is just listed for completeness.
> > > > > >
> > > > > >
> > > > > > Goals for a new container format
> > > > > > --------------------------------
> > > > > >
> > > > > > The following goals have been set for a replacement format:
> > > > > >
> > > > > > 1. **The packages must remain contained in a single file.** As a matter
> > > > > > of user convenience, it should be possible to transfer binary
> > > > > > packages without having to use multiple files, and to install them
> > > > > > from any location.
> > > > > >
> > > > > > 2. **The file format must be entirely based on common file formats,
> > > > > > respecting best practices, with as little customization as necessary
> > > > > > to satisfy the requirements.** In particular, it is unacceptable
> > > > > > to create new binary formats.
> > > > >
> > > > > I take this as your personal opinion. I don't quite get why it is
> > > > > unacceptable to create a new binary format though. In particular when
> > > > > you're looking for efficiency, such format could serve your purposes.
> > > > > As long as it's clearly defined, I don't see the problem with a binary
> > > > > format either.
> > > > > Could you add why it is you think binary formats are unacceptable here?
> > > >
> > > > Because custom binary formats require specialized tooling, and are
> > > > a royal PITA when the user wants to do something that the author of
> > > > specialized tooling just happened not to think worthwhile, or when
> > > > the tooling is not available for some reason. And before you ask really
> > > > silly questions, yes, I did fight binary packages over hex editor
> > > > at some point.
> > >
> > > Which I still don't understand, to be frank. I think even Portage
> > > exposes python APIs to get to the data.
> >
> > Compare the time needed to make a trivial (but unforeseen) change
> > on a format that's transparent vs a format that requires you to learn
> > its spec and/or API, write a program and debug it.
>
> I was under the impression you could unpack a tbz2 into data and xpak,
> then unpack both, modify the contents with an editor or whatever, and
> then pack the whole stuff back into a tbz2 again. This can be done
> worst case scenario by emerge -k , modifying the vdb and quickpkg
> afterwards.

In the described example, the whole necessity of modifying the binary
package arises from it being broken, therefore unsuitable for
'emerge -k'.

> I know that with portage-utils you can do this easily with the qtbz2 and
> qxpak commands. No need to do anything with a hex editor, or know
> anything about how it's done.

Actually, you need to:

a. know that portage-utils has the appropriate tools (it's non-obvious),

b. know how to use portage-utils.

This is non-obvious. It took me a while to figure out that I need to
use qtbz2 before using qxpak (why would it work only on split data when
the format is explicitly written to be used on top of compressed
archive?!).

> Obvious advantage of your approach is that you don't need q* tools, but
> can use tar instead. The editing is as trivial though. In your case
> you need a special procedure to reconstruct the binpkg should you want
> to keep your special properties (label, order) which equates to q* tools
> somewhat.

Except you don't need to keep them. The spec is quite explicit that
they're optimizations and that the package must work even if they're
lost as part of an editing exercise.

>
> > > > The most trivial case is an attempted recovery of a broken system.
> > > > If you don't have Portage working and don't have portage-utils
> > > > installed, do you really prefer a custom format which will require you
> > > > to fetch and compile special tools? Or is one that can be processed
> > > > with tools you're quite likely to have on every system, like tar?
> > >
> > > Well, I think the idea behind the original binpkg format was to use tar
> > > directly on the files in emergency scenarios like these...
> > > The assumption was bzip2 decompressor and tar being available.
> > > I think it is an example of how you add something, while still allowing
> > > to fallback on existing tools.
> >
> > Except progress in compressors has made it work less and less reliably.
> > It's mostly an example how to be *clever*. However, being clever
> > usually doesn't pay off in the long term, compared to doing things *in a
> > simple way*.
>
> We agree it is hackish, and we agree we can do without. You simply
> exaggerate the problem, IMO, which mostly isn't there, because it works
> fine today. It can also be solved today using shell tools.
>
> % head -c `grep -abo 'XPAKPACK' $EPREFIX/usr/portage/packages/sys-apps/sed-4.5.tbz2 | sed 's/:.*$//'` $EPREFIX/usr/portage/packages/sys-apps/sed-4.5.tbz2 | tar -jxf -
>
> results in no warnings/errors from bzip about trailing garbage, possible
> thanks to the spec being smart enough about this.

Well, you aren't going to call that simple, are you? Plus, I think your
solution would fail if bzip2 output just happened to contain 'XPAKPACK'
string. Not saying it's likely to happen but relying on fixed strings
not happening accidentally is not good design.
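
To make the fixed-string concern concrete, here is a minimal Python
sketch. It is not code from Portage or portage-utils; the function names
split_tbz2 and naive_split are hypothetical. It assumes the commonly
documented tbz2 layout: tarball, then the xpak segment ("XPAKPACK" ...
"XPAKSTOP"), then a 4-byte big-endian xpak length, then "STOP". The
first function finds the split point from that trailer, the second is a
simplified stand-in for a first-match search on the magic string, and
the demo data is synthetic.

```python
import struct
from typing import Tuple


def split_tbz2(blob: bytes) -> Tuple[bytes, bytes]:
    """Split a tbz2 image into (tarball, xpak) using the trailer.

    Assumed layout: <tarball><xpak><4-byte big-endian len(xpak)>"STOP".
    """
    if len(blob) < 8 or blob[-4:] != b"STOP":
        raise ValueError("missing STOP trailer")
    (xpak_len,) = struct.unpack(">I", blob[-8:-4])
    xpak_start = len(blob) - 8 - xpak_len
    if xpak_start < 0:
        raise ValueError("xpak length exceeds file size")
    xpak = blob[xpak_start:-8]
    if not (xpak.startswith(b"XPAKPACK") and xpak.endswith(b"XPAKSTOP")):
        raise ValueError("xpak segment markers not found")
    return blob[:xpak_start], xpak


def naive_split(blob: bytes) -> bytes:
    """Cut at the first occurrence of the magic string (fragile)."""
    return blob[: blob.index(b"XPAKPACK")]


# Synthetic data: a fake "tarball" that happens to contain the magic string,
# followed by an empty xpak segment (zero-length index and data).
tarball = b"compressed-bytes XPAKPACK more-compressed-bytes"
xpak = b"XPAKPACK" + struct.pack(">II", 0, 0) + b"XPAKSTOP"
blob = tarball + xpak + struct.pack(">I", len(xpak)) + b"STOP"

print(split_tbz2(blob)[0] == tarball)  # trailer-based split
print(naive_split(blob) == tarball)    # first-match split
```

On this synthetic input the trailer-based split recovers the tarball
intact, while the first-match split truncates it at the accidental
occurrence of the magic string.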
--
Best regards,
Michał Górny