From: "Michał Górny"
To: gentoo-commits@lists.gentoo.org
Reply-To: gentoo-dev@lists.gentoo.org, "Michał Górny"
Subject: [gentoo-commits] data/glep:master commit in: /
Date: Wed, 21 Sep 2022 17:31:58 +0000 (UTC)
Message-ID: <1663781501.06c577a0e72864859fbb2fb1cb7b7e8d60a78d79.mgorny@gentoo>
X-VCS-Repository: data/glep
X-VCS-Files: glep-0074.rst
X-VCS-Branch: master

commit:     06c577a0e72864859fbb2fb1cb7b7e8d60a78d79
Author:     Michał Górny gentoo org>
AuthorDate: Sun Sep 11 11:54:56 2022 +0000
Commit:     Michał Górny gentoo org>
CommitDate: Wed Sep 21 17:31:41 2022 +0000
URL:        https://gitweb.gentoo.org/data/glep.git/commit/?id=06c577a0

glep-0074: Specify compressed file formats

Signed-off-by: Michał Górny gentoo.org>

 glep-0074.rst | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 71 insertions(+), 10 deletions(-)

diff --git a/glep-0074.rst b/glep-0074.rst
index 3d7bbbd..7f53302 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -27,7 +27,8 @@ Changes
 =======
 
 v1.3
-  Formally specified the current set of hash algorithms supported.
+  Formally specified the current set of hash algorithms and compressed
+  Manifest formats supported.
 
 v1.2
   Specified the newline convention used for Manifests.
@@ -432,9 +433,8 @@ compression and this specification.
 
 The compressed Manifest files are required to be suffixed for their
 compression algorithm. This suffix should be used to recognize
-the compression and decompress Manifests transparently. The exact list
-of algorithms and their corresponding suffixes are outside the scope
-of this specification.
+the compression and decompress Manifests transparently. The supported
+formats are specified in `compressed file formats`_ section.
 
 The top-level Manifest file must not be compressed. Since the OpenPGP
 signature covers the uncompressed text and is compressed itself,
@@ -455,6 +455,46 @@ uncompressed content and the specification is free to choose either
 of the files using the same base name.
 
 
+Compressed file formats
+-----------------------
+
+.. table:: Table 2. Defined compressed file formats
+    :widths: auto
+
+    =========== ====== ==================== ===========
+    Tool name   Suffix Specification        Notes
+    =========== ====== ==================== ===========
+    bzip2       .bz2   (none known)
+    gzip        .gz    RFC 1952 [#RFC1952]_ Recommended
+    lz4         .lz4   (none known)
+    lzip        .lz    RFC draft [#LZIP]_
+    lzma        .lzma  (none known)         Deprecated
+    lzop        .lzo   (none known)
+    xz          .xz    xz [#XZ]_
+    zstd        .zst   RFC 8878 [#RFC8878]_
+    =========== ====== ==================== ===========
+
+Any new formats must be added to this specification prior to being used
+for Manifest files. Adding a new compressed file format is considered
+a backwards-compatible change to the GLEP. It is recommended that new
+formats use their reference (most common) file suffixes.
+
+An implementation can implement an arbitrary subset of the listed
+formats. For best interoperability, it should implement at least
+the recommended formats. Using deprecated formats should be avoided.
+
+If multiple Manifest variants coexist using different compressed file
+formats, the implementation may choose to use an arbitrary subset
+of them. However, all of them must be verified against the hashes stored
+in the containing Manifest. Should they be decompressed, the resulting
+contents must be identical.
+
+If the compressed file format is unsupported and a variant using
+a supported format coexists, the other variant should be used. However,
+at least one supported variant must exist for the verification
+to succeed.
+
+
 Combining multiple Manifest trees (informational)
 -------------------------------------------------
 
@@ -1033,12 +1073,19 @@ into a compressed sub-Manifest in the top directory
 (e.g. ``Manifest.sub.gz``), and including a ``MANIFEST`` entry for this file
 in a signed, uncompressed top-level Manifest.
 
-The existence of additional entries for uncompressed Manifest checksums
-was debated. However, plain entries for the uncompressed file would
-be confusing if only the compressed file existed, and conflicting
-if both uncompressed and compressed variants existed. Furthermore,
-it has been pointed out that ``DIST`` entries do not have
-an uncompressed variant either.
+The existence of additional entries for checksums of Manifest contents
+after uncompressing was debated. However, plain entries for
+the uncompressed file would be confusing if only the compressed file
+existed. Furthermore, it has been pointed out that ``DIST`` entries
+do not have an uncompressed variant either.
+
+The specification permits coexistence of multiple variants of the same
+Manifest file using different compression for historical compatibility.
+However, there does not seem to be any real benefit from including
+a compressed Manifest file if the uncompressed variant needs to exist
+anyway. Providing different compressed variants could technically
+improve interoperability, though the same result could probably
+be achieved by using a more commonly supported format (e.g. gzip).
 
 
 Performance considerations
@@ -1171,6 +1218,20 @@ References
    (archived at 2017-11-29)
    (https://web.archive.org/web/20171129084214/http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
 
+.. [#RFC1952] RFC 1952: GZIP file format specification version 4.3
+   (https://www.rfc-editor.org/rfc/rfc1952)
+
+.. [#LZIP] RFC draft: Lzip Compressed Format and the 'application/lzip'
+   Media Type
+   (https://datatracker.ietf.org/doc/html/draft-diaz-lzip)
+
+.. [#XZ] The .xz File Format
+   (https://tukaani.org/xz/xz-file-format.txt)
+
+.. [#RFC8878] RFC 8878: Zstandard Compression and the 'application/zstd'
+   Media Type
+   (https://www.rfc-editor.org/rfc/rfc8878)
+
 .. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
    (https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
 
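
For illustration only, the suffix-based recognition and variant selection that the new "Compressed file formats" section describes could be sketched in Python roughly as follows. The suffix table is copied from Table 2 in the diff. The function names (``detect_format``, ``pick_variant``, ``read_manifest``) and the choice to support only the formats Python's standard library can decompress are assumptions of this sketch, not part of the GLEP or of any existing implementation. The sketch does not verify hashes, which the specification requires for every coexisting variant.

```python
import bz2
import gzip
import lzma
from pathlib import PurePosixPath

# Suffix -> tool name, copied from Table 2 of the changed specification.
SUFFIX_TO_FORMAT = {
    ".bz2": "bzip2",
    ".gz": "gzip",
    ".lz4": "lz4",
    ".lz": "lzip",
    ".lzma": "lzma",
    ".lzo": "lzop",
    ".xz": "xz",
    ".zst": "zstd",
}

# Hypothetical subset of formats this sketch supports: only those that the
# Python standard library can decompress. The specification allows an
# implementation to support an arbitrary subset of the listed formats.
DECOMPRESSORS = {
    "bzip2": bz2.decompress,
    "gzip": gzip.decompress,
    "lzma": lambda data: lzma.decompress(data, format=lzma.FORMAT_ALONE),
    "xz": lambda data: lzma.decompress(data, format=lzma.FORMAT_XZ),
}


def detect_format(filename):
    """Return the tool name implied by the file suffix, or None if the
    name carries no known compression suffix."""
    return SUFFIX_TO_FORMAT.get(PurePosixPath(filename).suffix)


def pick_variant(filenames):
    """Return the first variant that is uncompressed or uses a format
    this sketch supports; raise if no such variant exists."""
    for name in filenames:
        fmt = detect_format(name)
        if fmt is None or fmt in DECOMPRESSORS:
            return name
    raise ValueError("no Manifest variant in a supported format")


def read_manifest(filename, raw):
    """Return the uncompressed Manifest bytes for the given raw content."""
    fmt = detect_format(filename)
    if fmt is None:
        return raw
    return DECOMPRESSORS[fmt](raw)
```

With this sketch, ``pick_variant(["Manifest.zst", "Manifest.gz"])`` skips the zstd variant (unsupported here) and falls back to the gzip one, mirroring the rule that a coexisting variant in a supported format should be used instead.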