public inbox for gentoo-dev@lists.gentoo.org
From: "Michał Górny" <mgorny@gentoo.org>
To: gentoo-dev@lists.gentoo.org
Cc: "Michał Górny" <mgorny@gentoo.org>
Subject: [gentoo-dev] [PATCH v2 3/3] glep-0074: Specify compressed file formats
Date: Mon, 12 Sep 2022 15:38:50 +0200
Message-ID: <20220912133850.1202010-4-mgorny@gentoo.org>
In-Reply-To: <20220912133850.1202010-1-mgorny@gentoo.org>

Signed-off-by: Michał Górny <mgorny@gentoo.org>
---
 glep-0074.rst | 81 ++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 71 insertions(+), 10 deletions(-)

diff --git a/glep-0074.rst b/glep-0074.rst
index 9a0e92b..6523272 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -27,7 +27,8 @@ Changes
 =======
 
 v1.3
-  Formally specified the current set of hash algorithms supported.
+  Formally specified the current set of hash algorithms and compressed
+  Manifest formats supported.
 
 v1.2
   Specified the newline convention used for Manifests.
@@ -432,9 +433,8 @@ compression and this specification.
 
 The compressed Manifest files are required to be suffixed for their
 compression algorithm. This suffix should be used to recognize
-the compression and decompress Manifests transparently. The exact list
-of algorithms and their corresponding suffixes are outside the scope
-of this specification.
+the compression and decompress Manifests transparently. The supported
+formats are specified in the `compressed file formats`_ section.
 
 The top-level Manifest file must not be compressed. Since the OpenPGP
 signature covers the uncompressed text and is compressed itself,
@@ -455,6 +455,46 @@ uncompressed content and the specification is free to choose either
 of the files using the same base name.
 
 
+Compressed file formats
+-----------------------
+
+.. table:: Table 2. Defined compressed file formats
+   :widths: auto
+
+   ===========  ======  ====================  ===========
+   Tool name    Suffix  Specification         Notes
+   ===========  ======  ====================  ===========
+   bzip2        .bz2    (none known)
+   gzip         .gz     RFC 1952 [#RFC1952]_  Recommended
+   lz4          .lz4    (none known)
+   lzip         .lz     RFC draft [#LZIP]_
+   lzma         .lzma   (none known)          Deprecated
+   lzop         .lzo    (none known)
+   xz           .xz     xz [#XZ]_
+   zstd         .zst    RFC 8878 [#RFC8878]_
+   ===========  ======  ====================  ===========
+
+Any new formats must be added to this specification prior to being used
+for Manifest files. Adding a new compressed file format is considered
+a backwards-compatible change to the GLEP. It is recommended that new
+formats use their reference (most common) file suffixes.
+
+An implementation can support an arbitrary subset of the listed
+formats. For best interoperability, it should implement at least
+the recommended formats. Using deprecated formats should be avoided.
+
+If multiple Manifest variants coexist using different compressed file
+formats, the implementation may choose to use an arbitrary subset
+of them. However, all of them must be verified against the hashes stored
+in the containing Manifest. Should they be decompressed, the resulting
+contents must be identical.
+
+If the compressed file format is unsupported and a variant using
+a supported format coexists, the other variant should be used. However,
+at least one supported variant must exist for the verification
+to succeed.
+
+
 Combining multiple Manifest trees (informational)
 -------------------------------------------------
 
@@ -996,12 +1036,19 @@ into a compressed sub-Manifest in the top directory (e.g.
 ``Manifest.sub.gz``), and including a ``MANIFEST`` entry for this file
 in a signed, uncompressed top-level Manifest.
 
-The existence of additional entries for uncompressed Manifest checksums
-was debated. However, plain entries for the uncompressed file would
-be confusing if only the compressed file existed, and conflicting
-if both uncompressed and compressed variants existed. Furthermore,
-it has been pointed out that ``DIST`` entries do not have
-an uncompressed variant either.
+The existence of additional entries for checksums of Manifest contents
+after uncompressing was debated. However, plain entries for
+the uncompressed file would be confusing if only the compressed file
+existed. Furthermore, it has been pointed out that ``DIST`` entries
+do not have an uncompressed variant either.
+
+The specification permits coexistence of multiple variants of the same
+Manifest file using different compression for historical compatibility.
+However, there does not seem to be any real benefit from including
+a compressed Manifest file if the uncompressed variant needs to exist
+anyway. Providing different compressed variants could technically
+improve interoperability, though the same result could probably
+be achieved by using a more commonly supported format (e.g. gzip).
 
 
 Performance considerations
@@ -1134,6 +1181,20 @@ References
    (archived at 2017-11-29)
    (https://web.archive.org/web/20171129084214/http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
 
+.. [#RFC1952] RFC 1952: GZIP file format specification version 4.3
+   (https://www.rfc-editor.org/rfc/rfc1952)
+
+.. [#LZIP] RFC draft: Lzip Compressed Format and the 'application/lzip'
+   Media Type
+   (https://datatracker.ietf.org/doc/html/draft-diaz-lzip)
+
+.. [#XZ] The .xz File Format
+   (https://tukaani.org/xz/xz-file-format.txt)
+
+.. [#RFC8878] RFC 8878: Zstandard Compression and the 'application/zstd'
+   Media Type
+   (https://www.rfc-editor.org/rfc/rfc8878)
+
 .. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
    (https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
 
-- 
2.37.3
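
For illustration only, and not part of the patch: the handling of
coexisting compressed variants described in the new "Compressed file
formats" section could be sketched roughly as below. This is a minimal
sketch under stated assumptions, not how gemato or any other existing
tool does it. The names DECOMPRESSORS, DEFINED_SUFFIXES, suffix_of and
load_variants are hypothetical, as are the two input mappings; only
SHA512 is checked, and only the formats the Python standard library can
decompress are treated as supported.

```python
import bz2
import gzip
import hashlib
import lzma

# Suffixes from Table 2 that the Python standard library can decompress.
# lz4, lzip, lzop and zstd are left out of this sketch, which the GLEP
# permits ("an arbitrary subset of the listed formats").
DECOMPRESSORS = {
    ".bz2": bz2.decompress,
    ".gz": gzip.decompress,
    ".lzma": lzma.decompress,  # deprecated format; auto-detected by lzma
    ".xz": lzma.decompress,
}

# Every suffix defined in Table 2, whether supported here or not.
DEFINED_SUFFIXES = (".bz2", ".gz", ".lz4", ".lz", ".lzma", ".lzo", ".xz", ".zst")


def suffix_of(name):
    """Return the compression suffix of a file name, or '' if there is none."""
    for suffix in DEFINED_SUFFIXES:
        if name.endswith(suffix):
            return suffix
    return ""


def load_variants(variants, expected_sha512):
    """Verify every coexisting variant, then return the Manifest contents.

    variants: hypothetical mapping of file name -> raw bytes on disk.
    expected_sha512: hypothetical mapping of file name -> hex digest taken
    from the containing Manifest (a real entry also carries the file size
    and possibly further hashes).
    """
    # All variants must match their recorded hashes, including those in
    # formats this implementation cannot decompress.
    for name, raw in variants.items():
        if hashlib.sha512(raw).hexdigest() != expected_sha512[name]:
            raise ValueError(f"hash mismatch: {name}")

    contents = set()
    for name, raw in variants.items():
        suffix = suffix_of(name)
        if suffix == "":
            contents.add(raw)  # uncompressed variant
        elif suffix in DECOMPRESSORS:
            contents.add(DECOMPRESSORS[suffix](raw))
        # Unsupported formats are skipped in favour of the other variants.

    if not contents:
        raise ValueError("no variant in a supported format")
    if len(contents) > 1:
        raise ValueError("variants decompress to different contents")
    return contents.pop()
```

Called with, say, a Manifest.gz and a Manifest.zst sharing a base name,
the sketch checks both files against their hashes, reads the contents
from the gzip variant, and fails only if no variant is in a supported
format or if two readable variants disagree after decompression.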
