* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-10-29 19:05 Michał Górny
From: Michał Górny @ 2017-10-29 19:05 UTC
To: gentoo-commits
commit: 1c6b710facff36be31f3f53c10fdbc1b70f52e5d
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sat Oct 28 11:49:39 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Sun Oct 29 19:04:41 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=1c6b710f
glep-0074: Update based on feedback from Robin H. Johnson
glep-0074.rst | 66 ++++++++++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 52 insertions(+), 14 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index e9f8bad..425381f 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,7 +8,7 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-10-26
+Last-Modified: 2017-10-29
Post-History: 2017-10-26
Content-Type: text/x-rst
Requires: 59, 61
@@ -49,7 +49,7 @@ This specification is designed with the following goals in mind:
1. It should provide means to ensure the authenticity of the complete
repository, including preventing the injection of additional files.
-2. Alike the original Manifest2, the files should be split into two
+2. Like the original Manifest2, the files should be split into two
groups — files whose authenticity is critical, and those whose
mismatch may be accepted in non-strict mode. The same classification
should apply both to files listed in Manifests, and to stray files
@@ -115,11 +115,11 @@ The file entries (except for ``IGNORE``) can be specified for regular
files only. Symbolic links are followed when opening files. It is
an error to specify an entry for a different file type.
-All the files covered by a Manifest tree must reside on the same
-filesystem. It is an error to specify entries applying to files
-on another filesystem. If subdirectories of the Manifest tree reside
-on a different filesystem, they must be explicitly excluded
-via ``IGNORE``.
+All the local (non-``DIST``) files covered by a Manifest tree must
+reside on the same filesystem. It is an error to specify entries
+applying to files on another filesystem. If subdirectories
+of the Manifest tree reside on a different filesystem, they must
+be explicitly excluded via ``IGNORE``.
File verification
@@ -156,7 +156,8 @@ The Manifest files can specify the following tags:
combined date and time in UTC timezone, i.e. using the following
``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``. Optionally used
in the top-level Manifest file. The package manager can use it
- to detect an outdated repository checkout.
+ to detect an outdated repository checkout as described in `Timestamp
+ verification`_.
``MANIFEST <path> <size> <checksums>…``
Specifies a sub-Manifest. The sub-Manifest must be verified like
@@ -209,6 +210,28 @@ allowed at the package directory level:
to ``files/`` subdirectory.
+Timestamp verification
+----------------------
+
+The Manifest file can contain a ``TIMESTAMP`` entry to account
+for attacks against tree update distribution. If such an entry
+is present, it should be updated every time at least one
+of the Manifests changes. Every unique timestamp value must correspond
+to a single tree state.
+
+During the verification process, the client should compare the timestamp
+against the update time obtained from a local clock or a trusted time
+source. If the comparison result indicates that the Manifest at the time
+of receiving was already significantly outdated, the client should
+either fail the verification or require manual confirmation from the user.
+
+Furthermore, the Manifest provider may employ additional methods
+of distributing the timestamps of recently generated Manifests
+using a secure channel from a trusted source for exact comparison.
+The exact details of such a solution are outside the scope of this
+specification.
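As an illustration of the comparison described above, here is a minimal sketch in Python. The format string comes from the ``TIMESTAMP`` tag definition; the names ``parse_timestamp``, ``is_outdated`` and the ``max_age`` threshold are assumptions made for this example, since the specification does not define what "significantly outdated" means.

```python
from datetime import datetime, timedelta, timezone

# Format string quoted from the TIMESTAMP tag definition.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_timestamp(value):
    """Parse a TIMESTAMP value into a timezone-aware UTC datetime."""
    naive = datetime.strptime(value, TIMESTAMP_FORMAT)
    return naive.replace(tzinfo=timezone.utc)


def is_outdated(manifest_ts, received_at, max_age=timedelta(days=1)):
    """Return True if the Manifest was older than max_age when received.

    max_age is a hypothetical policy knob; the specification leaves
    the threshold to the client.
    """
    return received_at - parse_timestamp(manifest_ts) > max_age
```

A client would presumably pass the time of the sync as ``received_at`` and then either fail or ask the user for confirmation when the function returns ``True``.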
+
+
Algorithm for full-tree verification
------------------------------------
@@ -218,8 +241,9 @@ can be used:
1. Collect all files present in the repository into *present* set.
2. Start at the top-level Manifest file. Verify its OpenPGP signature.
- Optionally verify the ``TIMESTAMP`` entry if present. Remove
- the top-level Manifest from the *present* set.
+ Optionally verify the ``TIMESTAMP`` entry if present as specified
+ in `timestamp verification`_. Remove the top-level Manifest
+ from the *present* set.
3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
files according to `file verification`_ section, and include their
@@ -232,7 +256,11 @@ can be used:
5. Collect all files covered by ``DATA``, ``MISC``, ``OPTIONAL``,
``EBUILD`` and ``AUX`` entries into the *covered* set.
-6. Verify all the files in the union of the *present* and *covered*
+6. Verify the entries in the *covered* set for incompatible duplicates
+ and collisions with ignored files as explained in `Manifest file
+ locations and nesting`_.
+
+7. Verify all the files in the union of the *present* and *covered*
sets, according to `file verification`_ section.
@@ -489,8 +517,15 @@ The top-level Manifest optionally allows using a ``TIMESTAMP`` tag
to include a generation timestamp in the Manifest. A similar feature
was originally proposed in GLEP 58 [#GLEP58]_.
-The timestamp can be used to detect delay or replay attacks against
-Gentoo mirrors.
+A malicious third party may use the principles of exclusion and replay
+to deny an update to clients, while at the same time recording
+the identity of clients to attack. The timestamp field can be used
+to detect that.
+
+In order to provide a more complete protection, the Gentoo
+Infrastructure should provide an ability to obtain the timestamps
+of all Manifests from a recent timeframe over a secure channel
+from a trusted source for comparison.
Strictly speaking, this is already provided by the various
``metadata/timestamp.*`` files provided already by Gentoo which are also
@@ -662,7 +697,10 @@ ensured:
the deprecated ``EBUILD`` tag (rather than ``DATA``),
- the Manifest files inside the package directory can be signed
- to provide authenticity verification.
+ to provide authenticity verification,
+
+- if the Manifest files inside the package directory are compressed,
+ an uncompressed file of identical content must coexist.
Once the backwards compatibility is no longer a concern, the above
no longer needs to hold and the deprecated tags can be removed.
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-10-29 19:05 Michał Górny
From: Michał Górny @ 2017-10-29 19:05 UTC
To: gentoo-commits
commit: ae28a67b2402e3b37535234441bc97670ba535c4
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sun Oct 22 13:19:20 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Fri Oct 27 21:20:21 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=ae28a67b
glep-0074: Full-tree verification using Manifest files
glep-0074.rst | 749 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 749 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
new file mode 100644
index 0000000..e9f8bad
--- /dev/null
+++ b/glep-0074.rst
@@ -0,0 +1,749 @@
+---
+GLEP: 74
+Title: Full-tree verification using Manifest files
+Author: Michał Górny <mgorny@gentoo.org>,
+ Robin Hugh Johnson <robbat2@gentoo.org>,
+ Ulrich Müller <ulm@gentoo.org>
+Type: Standards Track
+Status: Draft
+Version: 1
+Created: 2017-10-21
+Last-Modified: 2017-10-26
+Post-History: 2017-10-26
+Content-Type: text/x-rst
+Requires: 59, 61
+Replaces: 44, 58, 60
+---
+
+Abstract
+========
+
+This GLEP extends the Manifest file format to cover full-tree file
+integrity and authenticity checks. The format aims to be future-proof,
+efficient and provide means of backwards compatibility.
+
+
+Motivation
+==========
+
+The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current
+means of verifying the integrity of distfiles and package files
+in Gentoo. Combined with OpenPGP signatures, they provide means to
+ensure the authenticity of the covered files. However, as noted
+in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree
+authenticity verification as they do not cover any files outside
+the package directory. In particular, they provide multiple ways
+for a third party to inject malicious code into the ebuild environment.
+
+Historically, the topic of providing authenticity coverage for the whole
+repository has been mentioned multiple times. The most noteworthy efforts
+are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008.
+They were accepted by the Council in 2010 but have never been
+implemented. When potential implementation work started in 2017, a new
+discussion about the specification arose. It prompted the creation
+of a competing GLEP that would provide a redesigned alternative to
+the old GLEPs.
+
+This specification is designed with the following goals in mind:
+
+1. It should provide means to ensure the authenticity of the complete
+ repository, including preventing the injection of additional files.
+
+2. Alike the original Manifest2, the files should be split into two
+ groups — files whose authenticity is critical, and those whose
+ mismatch may be accepted in non-strict mode. The same classification
+ should apply both to files listed in Manifests, and to stray files
+ present only in the repository.
+
+3. The format should be universal enough to work both for the Gentoo
+ repository and third-party repositories of different characteristics.
+
+4. The Manifest files should be verifiable stand-alone, that is without
+ knowing any details about the underlying repository format.
+
+
+Specification
+=============
+
+Manifest file format
+--------------------
+
+This specification reuses and extends the Manifest file format defined
+in GLEP 44 [#GLEP44]_. For this purpose, the *file type* field is
+repurposed as a generic *tag* that could also indicate additional
+(non-checksum) metadata. Appropriately, those tags can be followed by
+other space-separated values.
+
+Unless specified otherwise, the paths used in the Manifest files
+are relative to the directory containing the Manifest file. The paths
+must not reference the parent directory (``..``).
+
+
+Manifest file locations and nesting
+-----------------------------------
+
+The ``Manifest`` file located in the root directory of the repository
+is called top-level Manifest, and it is used to perform the full-tree
+verification. In order to verify the authenticity, it must be signed
+using OpenPGP, using the armored cleartext format.
+
+The top-level Manifest may reference sub-Manifests contained
+in subdirectories of the repository. The sub-Manifests are traditionally
+named ``Manifest``; however, the implementation must support arbitrary
+names, including the possibility of multiple (split) Manifests
+for a single directory. The sub-Manifest can only cover the files inside
+the directory tree where it resides.
+
+The sub-Manifest can also be signed using OpenPGP armored cleartext
+format. However, the signature verification can be omitted if it is
+covered by a signed top-level Manifest.
+
+The Manifest files can also specify ``IGNORE`` entries to skip Manifest
+verification of subdirectories and/or files. Files and directories
+starting with a dot are always implicitly ignored. All files that
+are not ignored must be covered by at least one of the Manifests.
+
+A single file may be matched by multiple identical or equivalent
+Manifest entries, if and only if the entries have the same semantics,
+specify the same size and the checksums common to both entries match.
+It is an error for a single file to be matched by multiple entries
+of different semantics, file size or checksum values. It is an error
+to specify another entry for a file matching ``IGNORE``, or one of its
+subdirectories.
+
+The file entries (except for ``IGNORE``) can be specified for regular
+files only. Symbolic links are followed when opening files. It is
+an error to specify an entry for a different file type.
+
+All the files covered by a Manifest tree must reside on the same
+filesystem. It is an error to specify entries applying to files
+on another filesystem. If subdirectories of the Manifest tree reside
+on a different filesystem, they must be explicitly excluded
+via ``IGNORE``.
+
+
+File verification
+-----------------
+
+When verifying a file against the Manifest, the following rules are
+used:
+
+- if a file listed in Manifest is not present, then the verification
+ for the file fails,
+
+- if a file listed in Manifest is present but has a different size
+ or one of the checksums does not match, the verification fails,
+
+- if a file is present but not listed in Manifest, the verification
+ fails,
+
+- otherwise, the verification succeeds.
+
+Unless specified otherwise, the package manager must not allow using
+any files for which the verification failed. The package manager may
+reject any package or even the whole repository if it may refer to files
+for which the verification failed.
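The first two rules above can be sketched for a single file as follows. This is only an illustration, not the reference implementation: ``verify_file`` is a hypothetical helper, and it assumes the checksums are given as a mapping from Python ``hashlib`` algorithm names to hexadecimal digests. The third rule (a file present but not listed) has to be handled by the caller, which knows the full set of Manifest entries.

```python
import hashlib
import os


def verify_file(path, size, checksums):
    """Check one file against a Manifest entry (hypothetical helper).

    checksums maps hashlib algorithm names (e.g. "sha256") to the
    expected hexadecimal digests.
    """
    # Rule 1: a listed file that is missing fails verification.
    # os.path.isfile() follows symbolic links, as the spec requires.
    if not os.path.isfile(path):
        return False
    # Rule 2a: the size must match.
    if os.path.getsize(path) != size:
        return False
    # Rule 2b: every listed checksum must match. All hashes are
    # computed in a single pass over the file.
    hashers = {name: hashlib.new(name) for name in checksums}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return all(
        hashers[name].hexdigest() == expected.lower()
        for name, expected in checksums.items()
    )
```

Checking the size before hashing is a cheap way to reject most altered files without reading them.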
+
+
+New Manifest tags
+-----------------
+
+The Manifest files can specify the following tags:
+
+``TIMESTAMP <iso8601>``
+ Specifies a timestamp of when the Manifest file was last updated.
+ The timestamp must be a valid second-precision ISO8601 extended format
+ combined date and time in UTC timezone, i.e. using the following
+ ``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``. Optionally used
+ in the top-level Manifest file. The package manager can use it
+ to detect an outdated repository checkout.
+
+``MANIFEST <path> <size> <checksums>…``
+ Specifies a sub-Manifest. The sub-Manifest must be verified like
+ a regular file. If the verification succeeds, the entries from
+ the sub-Manifest are included for verification as described
+ in `Manifest file locations and nesting`_.
+
+``IGNORE <path>``
+ Ignores a subdirectory or file from Manifest checks. If the specified
+ path is present, it and its contents are omitted from the Manifest
+ verification (always pass).
+
+``DATA <path> <size> <checksums>…``
+ Specifies a file subject to obligatory Manifest verification.
+ The file is required to pass verification. Used for all files directly
+ affecting package manager operation (ebuilds, eclasses, profiles).
+
+``MISC <path> <size> <checksums>…``
+ Specifies a file subject to non-obligatory Manifest verification.
+ The package manager may ignore a verification failure if operating
+ in non-strict mode. Used for files that do not affect the installed
+ packages (``metadata.xml``, ``use.desc``).
+
+``OPTIONAL <path>``
+ Specifies a file that would be subject to non-obligatory Manifest
+ verification if it existed. The package manager may ignore a stray
+ file matching this entry if operating in non-strict mode. Used for
+ paths that would match ``MISC`` if they existed.
+
+``DIST <filename> <size> <checksums>…``
+ Specifies a distfile entry used to verify files fetched as part
+ of ``SRC_URI``. The filename must match the filename used to store
+ the fetched file as specified in the PMS [#PMS-FETCH]_. The package
+ manager must reject the fetched file if it fails verification.
+ ``DIST`` entries apply to all packages below the Manifest file
+ specifying them.
+
+
+Deprecated Manifest tags
+------------------------
+
+For backwards compatibility, the following tags are additionally
+allowed at the package directory level:
+
+``EBUILD <filename> <size> <checksums>…``
+ Equivalent to the ``DATA`` type.
+
+``AUX <filename> <size> <checksums>…``
+ Equivalent to the ``DATA`` type, except that the filename is relative
+ to ``files/`` subdirectory.
+
+
+Algorithm for full-tree verification
+------------------------------------
+
+In order to perform full-tree verification, the following algorithm
+can be used:
+
+1. Collect all files present in the repository into *present* set.
+
+2. Start at the top-level Manifest file. Verify its OpenPGP signature.
+ Optionally verify the ``TIMESTAMP`` entry if present. Remove
+ the top-level Manifest from the *present* set.
+
+3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
+ files according to `file verification`_ section, and include their
+ entries in the current Manifest entry list (using paths relative
+ to directories containing the Manifests).
+
+4. Process all ``IGNORE`` entries. Remove any paths matching them
+ from the *present* set.
+
+5. Collect all files covered by ``DATA``, ``MISC``, ``OPTIONAL``,
+ ``EBUILD`` and ``AUX`` entries into the *covered* set.
+
+6. Verify all the files in the union of the *present* and *covered*
+ sets, according to `file verification`_ section.
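The set handling in the steps above can be sketched as follows. This is a simplified illustration under stated assumptions: paths are repository-relative strings using ``/`` as separator, signature checking and Manifest parsing are done elsewhere, and ``plan_verification`` is a hypothetical name. The implicit dotfile rule from `Manifest file locations and nesting` is folded into the ignore check.

```python
def plan_verification(present, ignored, covered, top_manifest="Manifest"):
    """Return (to_verify, stray, missing) sets of relative paths.

    present: paths found in the repository (step 1)
    ignored: paths named by IGNORE entries (step 4)
    covered: paths named by DATA, MISC, OPTIONAL, EBUILD and AUX
             entries (step 5)
    """
    prefixes = [entry.rstrip("/") for entry in ignored]

    def is_ignored(path):
        # Files and directories starting with a dot are implicitly ignored.
        if any(part.startswith(".") for part in path.split("/")):
            return True
        # An IGNORE entry covers the path itself and everything below it.
        return any(
            path == prefix or path.startswith(prefix + "/")
            for prefix in prefixes
        )

    # Steps 2 and 4: drop the top-level Manifest and ignored paths.
    remaining = {
        path for path in present
        if path != top_manifest and not is_ignored(path)
    }
    covered = set(covered)
    # Final step: everything in the union has to be verified.
    to_verify = remaining | covered
    stray = remaining - covered      # present but not listed
    missing = covered - remaining    # listed but not present
    return to_verify, stray, missing
```

Under the file verification rules, every path in ``stray`` or ``missing`` fails verification, unless a ``MISC`` or ``OPTIONAL`` entry allows it to be tolerated in non-strict mode.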
+
+
+Algorithm for finding parent Manifests
+--------------------------------------
+
+In order to find the top-level Manifest from the current directory
+the following algorithm can be used:
+
+1. Store the current directory as *original* and the device ID
+ of the containing filesystem (``st_dev``) as *startdev*.
+
+2. If the device ID of the containing filesystem (``st_dev``)
+ of the current directory is different than *startdev*, stop.
+
+3. If the current directory contains a ``Manifest`` file:
+
+ a. If an ``IGNORE`` entry in the ``Manifest`` file covers
+ the *original* directory (or one of the parent directories), stop.
+
+ b. Otherwise, store the current directory as *last_found*.
+
+4. If the current directory is the root system directory (``/``), stop.
+
+5. Otherwise, enter the parent directory and jump to step 2.
+
+Once the algorithm stops, *last_found* will contain the relevant
+top-level Manifest. If *last_found* is null, then the directory tree
+does not contain any valid top-level Manifest candidates and one should
+be created in the *original* directory.
+
+Once the top-level Manifest is found, its ``MANIFEST`` entries should
+be used to find any sub-Manifests below the top-level Manifest,
+up to and including the *original* directory. Note that those
+sub-Manifests can use different filenames than ``Manifest``.
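The upward walk can be sketched as below. This is an illustration only: ``find_top_level_manifest`` is a hypothetical name, the ``ignores`` callback stands in for real ``IGNORE`` parsing (it receives the Manifest path and the *original* directory relative to that Manifest), and compressed variants such as ``Manifest.gz`` are not considered.

```python
import os


def find_top_level_manifest(start, ignores=lambda manifest, rel: False):
    """Return the directory holding the top-level Manifest, or None."""
    # Step 1: remember the starting directory and its device ID.
    original = os.path.abspath(start)
    startdev = os.stat(original).st_dev
    current = original
    last_found = None
    while True:
        # Step 2: never cross a filesystem boundary.
        if os.stat(current).st_dev != startdev:
            break
        # Step 3: a Manifest here is a candidate unless it ignores us.
        manifest = os.path.join(current, "Manifest")
        if os.path.isfile(manifest):
            rel = os.path.relpath(original, current)
            if ignores(manifest, rel):
                break
            last_found = current
        # Step 4: stop at the root directory.
        parent = os.path.dirname(current)
        if parent == current:
            break
        # Step 5: move up and repeat from step 2.
        current = parent
    return last_found
```

A ``None`` result corresponds to the case where no valid candidate exists and a new top-level Manifest should be created in the original directory.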
+
+
+Checksum algorithms
+-------------------
+
+This section is informational only. Specifying the exact set
+of supported algorithms is outside the scope of this specification.
+
+The algorithm names reserved at the time of writing are:
+
+- ``MD5`` [#MD5]_,
+- ``RMD160`` — RIPEMD-160 [#RIPEMD160]_,
+- ``SHA1`` [#SHS]_,
+- ``SHA256`` and ``SHA512`` — SHA-2 family of hashes [#SHS]_,
+- ``WHIRLPOOL`` [#WHIRLPOOL]_,
+- ``BLAKE2B`` and ``BLAKE2S`` — BLAKE2 family of hashes [#BLAKE2]_,
+- ``SHA3_256`` and ``SHA3_512`` — SHA-3 family of hashes [#SHA3]_,
+- ``STREEBOG256`` and ``STREEBOG512`` — Streebog family of hashes
+ [#STREEBOG]_.
+
+The method of introducing new hashes is defined by GLEP 59 [#GLEP59]_.
+It is recommended that any new hashes are named after the Python
+``hashlib`` module algorithm names, transformed into uppercase.
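For names that follow this recommendation, the mapping back to ``hashlib`` is a plain lowercase conversion. The sketch below assumes such a name; ``manifest_hash`` is a hypothetical helper. Legacy names such as ``RMD160`` predate the convention and would need an explicit mapping, which is not shown here.

```python
import hashlib


def manifest_hash(name, data):
    """Hash data with the algorithm behind a Manifest hash name.

    Only valid for names derived from hashlib by uppercasing,
    e.g. SHA3_256 -> sha3_256 or BLAKE2B -> blake2b.
    """
    return hashlib.new(name.lower(), data).hexdigest()
```

``hashlib.new()`` raises ``ValueError`` for a name the local Python build does not support, so a caller gets an explicit error for unknown algorithms.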
+
+
+Manifest compression
+--------------------
+
+The topic of Manifest file compression is covered by GLEP 61 [#GLEP61]_.
+This section merely addresses interoperability issues between Manifest
+compression and this specification.
+
+The compressed Manifest files are required to be suffixed for their
+compression algorithm. This suffix should be used to recognize
+the compression and decompress Manifests transparently. The exact list
+of algorithms and their corresponding suffixes are outside the scope
+of this specification.
+
+Whenever this specification refers to top-level Manifest file,
+the implementation should account for compressed variants of this file
+with appropriate suffixes (e.g. ``Manifest.gz``).
+
+Whenever this specification refers to sub-Manifests, they can use any
+names but are also required to use a specific compression suffix.
+The ``MANIFEST`` entries are required to specify the full name including
+compression suffix, and the verification is performed on the compressed
+file.
+
+The specification permits uncompressed Manifests to exist alongside
+their compressed counterparts, and multiple compressed formats
+to coexist. If that is the case, the files must have the same
+uncompressed content and the implementation is free to choose either
+of the files using the same base name.
+
+
+Rationale
+=========
+
+Stand-alone format
+------------------
+
+The first question that needed to be asked before proceeding with
+the design was whether the Manifest file format was supposed to be
+stand-alone, or tightly bound to the repository format.
+
+The stand-alone format has been selected because of its three
+advantages:
+
+1. It is more future-proof. If an incompatible change to the repository
+ format is introduced, only developers need to upgrade the tools
+ they use to generate the Manifests. The tools used to verify
+ the updated Manifests will continue to work.
+
+2. It is more flexible and universal. With a dedicated tool,
+ the Manifest files can be used to sign and verify arbitrary file
+ sets.
+
+3. It keeps the verification tool simpler. In particular, we can easily
+ write an independent verification tool that could work on any
+ distribution without needing to depend on a package manager
+ implementation or rewrite parts of it.
+
+Designing a stand-alone format requires that the Manifest carries enough
+information to perform the verification following all the rules specific
+to the Gentoo repository.
+
+
+Tree design
+-----------
+
+The second important point of the design was determining whether
+the Manifest files should be structured hierarchically, or independent.
+Both options have their advantages.
+
+In the hierarchical model, each sub-Manifest file is covered by a higher
+level Manifest. As a result, only the top-level Manifest has to be
+OpenPGP-signed, and subsequent Manifests need only be verified by
+the checksum stored in the parent Manifest. This has the following
+implications:
+
+- Verifying any set of files in the repository requires using checksums
+ from the most relevant Manifests and the parent Manifests.
+
+- The OpenPGP signature of the top-level Manifest needs to be verified
+ only once per process.
+
+- Altering any set of files requires updating the relevant Manifests,
+ and their parent Manifests up to the top-level Manifest, and signing
+ the last one.
+
+- As a result, the top-level Manifest changes on every commit,
+ and various middle-level Manifests change (and need to be transferred)
+ frequently.
+
+In the independent model, each sub-Manifest file is independent
+of the parent Manifests. As a result, each of them needs to be signed
+and verified independently. However, the parent Manifests still need
+to list sub-Manifests (albeit without verification data) in order
+to detect removal or replacement of subdirectories. This has
+the following implications:
+
+- Verifying any set of files in the repository requires using checksums
+ and verifying signatures of the most relevant Manifest files.
+
+- Altering any set of files requires updating the relevant Manifests
+ and signing them again.
+
+- Parent Manifests are updated only when Manifests are added or removed
+ from subdirectories. As a result, they change infrequently.
+
+While both models have their advantages, the hierarchical model was
+selected because it minimizes the number of OpenPGP operations,
+which are comparatively costly.
+
+
+Tree layout restrictions
+------------------------
+
+The algorithm is meant to work primarily with ebuild repositories which
+normally contain only files and directories. Directories provide
+no useful metadata for verification, and specifying special entries
+for additional file types is purposeless. Therefore, the specification
+is restricted to dealing with regular files.
+
+The Gentoo repository does not use symbolic links. Some Gentoo
+repositories do, however. To provide a simple solution for dealing with
+symlinks without having to take care to implement special handling for
+them, the common behavior of implicitly resolving them is used.
+Therefore, symbolic links to files are stored as if they were regular
+files, and symbolic links to directories are followed as if they were
+regular directories.
+
+Dotfiles are implicitly ignored as that is a common notion used
+in software written for POSIX systems. All other filenames require
+explicit ``IGNORE`` lines.
+
+The algorithm is restricted to work on a single filesystem. This is
+mostly relevant when scanning for top-level Manifest — we do not want
+to cross filesystem boundaries then. However, to ensure consistent
+bidirectional behavior we also need to ban them when operating down
+the tree.
+
+The directories and files on different filesystems need to be ignored
+explicitly as implicitly skipping them would cause confusion.
+In particular, tools might then claim that a file does not exist when
+it clearly does because it was skipped due to filesystem boundaries.
+
+
+File verification model
+-----------------------
+
+The verification model aims to provide full coverage against different
+forms of attack. In particular, three different kinds of manipulation
+are considered:
+
+1. Alteration of the file content.
+
+2. Removal of a file.
+
+3. Addition of a new file.
+
+In order to protect against all three, the system requires that all
+files in the repository are listed in Manifests and verified against
+them.
+
+As a special case, ignores are allowed to account for directories
+that are not part of the repository but were traditionally placed inside
+it. Those directories were ``distfiles``, ``local`` and ``packages``. It
+could also be used to ignore VCS directories such as ``CVS``.
+
+
+Non-obligatory Manifest verification
+------------------------------------
+
+While this specification recommends all tools to use strict verification
+by default, it allows declaring some files as non-obligatory like
+the original Manifest2 format did. This could be used on files that do
+not affect the normal package manager operation.
+
+It aims to account for two use cases:
+
+1. Stripping down files that are not strictly required to install
+ packages from repository checkouts.
+
+2. Accounting for automatically generated files that might be updated
+ by standard tooling.
+
+The traditional ``MISC`` type is amended with a complementary
+``OPTIONAL`` tag to account for files that are not provided
+in the specific repository. It aims to ensure that the same path would
+be non-fatal when provided by the repository but fatal when created
+by the user tooling.
+
+
+Timestamp field
+---------------
+
+The top-level Manifest optionally allows using a ``TIMESTAMP`` tag
+to include a generation timestamp in the Manifest. A similar feature
+was originally proposed in GLEP 58 [#GLEP58]_.
+
+The timestamp can be used to detect delay or replay attacks against
+Gentoo mirrors.
+
+Strictly speaking, this is already provided by the various
+``metadata/timestamp.*`` files provided already by Gentoo which are also
+covered by the Manifest. However, including the value in the Manifest
+itself has little cost and provides the ability to perform
+the verification stand-alone.
+
+
+New vs deprecated tags
+----------------------
+
+Out of the four types defined by Manifest2, two are reused and two are
+marked deprecated.
+
+The ``DIST`` and ``MISC`` tags are reused since they can be relatively
+clearly mapped into the new concept.
+
+The ``EBUILD`` tag could potentially be reused for generic file
+verification data. However, it would be confusing if all the different
+data files were marked as ``EBUILD``. Therefore, an equivalent ``DATA``
+type was introduced as a replacement.
+
+The ``AUX`` tag is deprecated as it is redundant to ``DATA``, and has
+the limiting property of implicit ``files/`` path prefix.
+
+
+Finding top-level Manifest
+--------------------------
+
+The development of a reference implementation for this GLEP has brought
+the following problem: how to find all the relevant Manifests when
+the Manifest tool is run inside a subdirectory of the repository?
+
+One of the options would be to provide a bi-directional linking
+of Manifests via a ``PARENT`` tag. However, that would not solve
+the problem when a new Manifest file is being created.
+
+Instead, an algorithm for iterating over parent directories is proposed.
+Since there is no obligatory explicit indicator for the top-level
+Manifest, the algorithm assumes that the top-level Manifest
+is the highest ``Manifest`` in the directory hierarchy that can cover
+the current directory. This generally makes sense since the Manifest
+files are required to provide coverage for all subdirectories, so all
+Manifests starting from that one need to be updated.
+
+If independent Manifest trees are nested in the directory structure,
+then an ``IGNORE`` entry needs to be used to separate them.
+
+Since sub-Manifests can use any filenames, the Manifest finding
+algorithm must not short-cut the procedure by storing all ``Manifest``
+files along the parent directories. Instead, it needs to retrace
+the relevant sub-Manifest files along ``MANIFEST`` entries
+in the top-level Manifest.
+
+
+Injecting ChangeLogs into the checkout
+--------------------------------------
+
+One of the problems considered in the new Manifest format was that
+of injecting historical and autogenerated ChangeLogs into the repository.
+Normally we are not including those files to reduce the checkout size.
+However, some users have shown interest in them and Infra is working
+on providing them via an additional rsync module.
+
+If such files were injected into the repository, they would cause strict
+verification failures of Manifests. To account for this, Infra could
+provide either ``OPTIONAL`` entries for the Manifest files to allow them
+in non-strict verification mode, or ``IGNORE`` entries to allow them
+in the strict mode.
+
+
+Splitting distfile checksums from file checksums
+------------------------------------------------
+
+Another problem with the current Manifest format is that the checksums
+for fetched files are combined with checksums for local files
+in a single file inside the package directory. It has been specifically
+pointed out that:
+
+- since distfiles are sometimes reused across different packages,
+ the repeating checksums are redundant,
+
+- mirror admins were interested in the possibility of verifying all
+ the distfiles with a single tool.
+
+This specification does not provide a clean solution to this problem.
+It technically permits moving ``DIST`` entries to higher-level Manifests
+but the usefulness of such a solution is doubtful.
+
+However, for the second problem we will probably deliver a dedicated
+tool working with this Manifest format.
+
+
+Hash algorithms
+---------------
+
+While maintaining a consistent supported hash set is important
+for interoperability, it is not a good fit for the generic layout of this
+GLEP. Furthermore, it would require updating the GLEP in the future
+every time the used algorithms change.
+
+Instead, the specification focuses on listing the currently used
+algorithm names for interoperability, and sets a recommendation
+for consistent naming of algorithms in the future. The Python
+``hashlib`` module is used as a reference since it is used
+as the provider of hash functions for most of the Python software,
+including Portage and PkgCore.
+
+The basic rules for changing hash algorithms are defined in GLEP 59
+[#GLEP59]_. The implementations can focus only on those algorithms
+that are actually used or planned on being used. It may be feasible
+to devise a new GLEP that specifies the currently used hashes (or update
+GLEP 59 accordingly).
+
+
+Manifest compression
+--------------------
+
+The support for Manifest compression is introduced with minimal changes
+to the file format. The ``MANIFEST`` entries are required to provide
+the real (compressed) file path for compatibility with other file
+entries and to avoid confusion.
+
+The existence of additional entries for uncompressed Manifest checksums
+was debated. However, plain entries for the uncompressed file would
+be confusing if only the compressed file existed, and conflicting if both
+uncompressed and compressed variants existed. Furthermore, it has been
+pointed out that ``DIST`` entries do not have an uncompressed variant
+either.
+
+
+Performance considerations
+--------------------------
+
+Performing a full-tree verification on every sync raises some
+performance concerns for end-user systems. The initial testing has shown
+that a cold-cache verification on a btrfs file system can take around
+4 minutes, with the process being mostly I/O bound. On the other hand,
+it can be expected that the verification will be performed directly
+after syncing, taking advantage of warm filesystem cache.
+
+To improve speed on I/O and/or CPU-constrained systems even further,
+the algorithms can be easily extended to perform incremental
+verification. Given that rsync does not preserve mtimes by default,
+the tool can take advantage of mtime and Manifest comparisons to recheck
+only the parts of the repository that have changed.
+
+Furthermore, the package manager implementations can restrict checking
+only to the parts of the repository that are actually being used.
+
+
+Backwards Compatibility
+=======================
+
+This GLEP provides optional means of preserving backwards compatibility.
+To preserve the backwards compatibility, the following needs to be
+ensured:
+
+- all files within the package directory must be covered by the ``Manifest``
+ file inside that package directory,
+
+- all distfiles used by the package must be covered by the ``Manifest``
+ file inside the package directory,
+
+- all files inside the ``files/`` subdirectory of a package directory
+ need to use the deprecated ``AUX`` tag (rather than ``DATA``),
+
+- all ``.ebuild`` files inside the package directory need to use
+ the deprecated ``EBUILD`` tag (rather than ``DATA``),
+
+- the Manifest files inside the package directory can be signed
+ to provide authenticity verification.
+
+Once the backwards compatibility is no longer a concern, the above
+no longer needs to hold and the deprecated tags can be removed.
+
+
+Reference Implementation
+========================
+
+The reference implementation for this GLEP is being developed
+as the gemato project [#GEMATO]_.
+
+
+Credits
+=======
+
+Thanks to all the people whose contributions were invaluable
+to the creation of this GLEP. This includes but is not limited to:
+
+- Robin Hugh Johnson,
+- Ulrich Müller.
+
+Additionally, thanks to Robin Hugh Johnson for the original
+MetaManifest GLEP series which served both as inspiration and source
+of many concepts used in this GLEP. Recursively, also thanks to all
+the people who contributed to the original GLEPs.
+
+
+References
+==========
+
+.. [#GLEP44] GLEP 44: Manifest2 format
+ (https://www.gentoo.org/glep/glep-0044.html)
+
+.. [#GLEP57] GLEP 57: Security of distribution of Gentoo software
+ - Overview
+ (https://www.gentoo.org/glep/glep-0057.html)
+
+.. [#GLEP58] GLEP 58: Security of distribution of Gentoo software
+ - Infrastructure to User distribution - MetaManifest
+ (https://www.gentoo.org/glep/glep-0058.html)
+
+.. [#GLEP59] GLEP 59: Manifest2 hash policies and security implications
+ (https://www.gentoo.org/glep/glep-0059.html)
+
+.. [#GLEP60] GLEP 60: Manifest2 filetypes
+ (https://www.gentoo.org/glep/glep-0060.html)
+
+.. [#GLEP61] GLEP 61: Manifest2 compression
+ (https://www.gentoo.org/glep/glep-0061.html)
+
+.. [#PMS-FETCH] Package Manager Specification: Dependency Specification
+ Format - SRC_URI
+ (https://projects.gentoo.org/pms/6/pms.html#x1-940008.2.10)
+
+.. [#MD5] RFC1321: The MD5 Message-Digest Algorithm
+ (https://www.ietf.org/rfc/rfc1321.txt)
+
+.. [#RIPEMD160] The hash function RIPEMD-160
+ (https://homes.esat.kuleuven.be/~bosselae/ripemd160.html)
+
+.. [#SHS] FIPS PUB 180-4: Secure Hash Standard (SHS)
+ (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf)
+
+.. [#WHIRLPOOL] The WHIRLPOOL Hash Function
+ (http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
+
+.. [#BLAKE2] BLAKE2 — fast secure hashing
+ (https://blake2.net/)
+
+.. [#SHA3] FIPS PUB 202: SHA-3 Standard: Permutation-Based Hash
+ and Extendable-Output Functions
+ (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf)
+
+.. [#STREEBOG] GOST R 34.11-2012: Streebog Hash Function
+ (https://www.streebog.net/)
+
+.. [#GEMATO] gemato: Gentoo Manifest Tool
+ (https://github.com/mgorny/gemato/)
+
+Copyright
+=========
+This work is licensed under the Creative Commons Attribution-ShareAlike 3.0
+Unported License. To view a copy of this license, visit
+http://creativecommons.org/licenses/by-sa/3.0/.
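The "Hash algorithms" rationale in the diff above names Python's ``hashlib`` as the reference for hash naming. A minimal sketch of how the size and checksum fields of a Manifest entry could be computed with it follows. The mapping ``HASHLIB_NAMES`` and both function names are hypothetical, chosen for illustration only; this is not the gemato implementation, and only the two hash names used in the GLEP's example files are covered.

```python
import hashlib

# Hypothetical mapping from Manifest hash names to hashlib names.
# Only the two names used in the GLEP's example files are listed here.
HASHLIB_NAMES = {"SHA256": "sha256", "SHA512": "sha512"}


def manifest_checksums(data, algorithms=("SHA256", "SHA512")):
    """Return (size, {NAME: hexdigest}) for a bytes object."""
    digests = {}
    for name in algorithms:
        hasher = hashlib.new(HASHLIB_NAMES[name])
        hasher.update(data)
        digests[name] = hasher.hexdigest()
    return len(data), digests


def format_entry(tag, path, data):
    """Format one entry line: <tag> <path> <size> <checksums>..."""
    size, digests = manifest_checksums(data)
    fields = [tag, path, str(size)]
    for name, value in digests.items():
        fields += [name, value]
    return " ".join(fields)
```

Calling ``format_entry("DATA", "files/gcc.patch", content)`` with the file content as bytes would yield a line in the same shape as the ``DATA`` entries in the GLEP's example Manifest.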
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-10-30 16:52 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-10-30 16:52 UTC (permalink / raw
To: gentoo-commits
commit: e953eaff6de4207cf6135d85db8016a4d9a6fe2f
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:29:41 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Oct 30 16:45:22 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=e953eaff
glep-0074: Add two example files for reference
glep-0074.rst | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index a37ad34..65f32c3 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -369,6 +369,34 @@ uncompressed content and the specification is free to choose either
of the files using the same base name.
+An example Manifest file (informational)
+----------------------------------------
+
+An example top-level Manifest file for the Gentoo repository would have
+the following content::
+
+ TIMESTAMP 2017-10-30T10:11:12Z
+ IGNORE distfiles
+ IGNORE local
+ IGNORE lost+found
+ IGNORE packages
+ MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
+ ...
+ MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
+ ...
+
+An example modern Manifest (disregarding backwards compatibility)
+for a package directory would have the following content::
+
+ DATA SphinxTrain-0.9.1-r1.ebuild 932 SHA256 3d3b.. SHA512 be4d..
+ DATA SphinxTrain-1.0.8.ebuild 912 SHA256 f681.. SHA512 0749..
+ DATA files/gcc.patch 816 SHA256 b56e.. SHA512 2468..
+ DATA files/gcc34.patch 333 SHA256 c107.. SHA512 9919..
+ DIST SphinxTrain-0.9.1-beta.tar.gz 469617 SHA256 c1a4.. SHA512 1b33..
+ DIST sphinxtrain-1.0.8.tar.gz 8925803 SHA256 548e.. SHA512 465d..
+ MISC metadata.xml 664 SHA256 97c6.. SHA512 1175..
+
+
Rationale
=========
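The example Manifest entries added in the hunk above can be read with a small line parser. The sketch below is illustrative only: the ``Entry`` tuple and ``parse_manifest_line`` are hypothetical names, the tag lists are taken from the tags that appear in this thread, and the parser assumes that fields are separated by whitespace and that paths contain no whitespace.

```python
from collections import namedtuple

# Hypothetical container: "value" holds the path, or the timestamp string
# for TIMESTAMP entries.
Entry = namedtuple("Entry", "tag value size checksums")

# Tags that carry a single argument and no size or checksums.
SINGLE_ARG_TAGS = ("TIMESTAMP", "IGNORE", "OPTIONAL")
# Tags of the form: <tag> <path> <size> <checksums>...
CHECKSUM_TAGS = ("MANIFEST", "DATA", "MISC", "DIST", "EBUILD", "AUX")


def parse_manifest_line(line):
    """Parse one Manifest line into an Entry, or None for a blank line."""
    fields = line.split()
    if not fields:
        return None
    tag = fields[0]
    if tag in SINGLE_ARG_TAGS:
        if len(fields) != 2:
            raise ValueError("expected exactly one argument: %r" % line)
        return Entry(tag, fields[1], None, {})
    if tag in CHECKSUM_TAGS:
        # tag + path + size + (name, value) pairs gives an odd field count
        if len(fields) < 3 or len(fields) % 2 != 1:
            raise ValueError("malformed entry: %r" % line)
        checksums = dict(zip(fields[3::2], fields[4::2]))
        return Entry(tag, fields[1], int(fields[2]), checksums)
    raise ValueError("unknown tag: %r" % tag)
```

Applied to the ``files/gcc.patch`` line from the example, the parser returns size 816 and a dictionary holding the two abbreviated checksum strings exactly as written.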
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-10-30 16:52 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-10-30 16:52 UTC (permalink / raw
To: gentoo-commits
commit: bbabc4dd646d142ae37a5e22f3acd3f8706b449f
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:27:51 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Oct 30 16:27:51 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=bbabc4dd
glep-0074: Split 'Directory tree coverage' section out
glep-0074.rst | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index 1147e62..49fe0ca 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -98,6 +98,10 @@ The sub-Manifest can also be signed using OpenPGP armored cleartext
format. However, the signature verification can be omitted if it is
covered by a signed top-level Manifest.
+
+Directory tree coverage
+-----------------------
+
The Manifest files can also specify ``IGNORE`` entries to skip Manifest
verification of subdirectories and/or files. The package manager can
support injecting ignore paths to account for additional files created,
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-10-30 16:52 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-10-30 16:52 UTC (permalink / raw
To: gentoo-commits
commit: f98cabc0c30dc18f5b602865eb8e84abf429ba8d
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:28:34 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Oct 30 16:29:26 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=f98cabc0
glep-0074: Reorganize to have tag references after basic algos
Reorganize so that file & timestamp verification come first, then tag
references, then specialized algos and other informational sections.
Rename 'new Manifest tags' to 'modern ...' since some of them are old.
glep-0074.rst | 48 ++++++++++++++++++++++++------------------------
1 file changed, 24 insertions(+), 24 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index d476ff3..a37ad34 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -163,8 +163,30 @@ reject any package or even the whole repository if it may refer to files
for which the verification failed.
-New Manifest tags
------------------
+Timestamp verification
+----------------------
+
+The Manifest file can contain a ``TIMESTAMP`` entry to account
+for attacks against tree update distribution. If such an entry
+is present, it should be updated every time at least one
+of the Manifests changes. Every unique timestamp value must correspond
+to a single tree state.
+
+During the verification process, the client should compare the timestamp
+against the update time obtained from a local clock or a trusted time
+source. If the comparison result indicates that the Manifest at the time
+of receiving was already significantly outdated, the client should
+either fail the verification or require manual confirmation from user.
+
+Furthermore, the Manifest provider may employ additional methods
+of distributing the timestamps of recently generated Manifests
+using a secure channel from a trusted source for exact comparison.
+The exact details of such a solution are outside the scope of this
+specification.
+
+
+Modern Manifest tags
+--------------------
The Manifest files can specify the following tags:
@@ -228,28 +250,6 @@ allowed at the package directory level:
to ``files/`` subdirectory.
-Timestamp verification
-----------------------
-
-The Manifest file can contain a ``TIMESTAMP`` entry to account
-for attacks against tree update distribution. If such an entry
-is present, it should be updated every time at least one
-of the Manifests changes. Every unique timestamp value must correspond
-to a single tree state.
-
-During the verification process, the client should compare the timestamp
-against the update time obtained from a local clock or a trusted time
-source. If the comparison result indicates that the Manifest at the time
-of receiving was already significantly outdated, the client should
-either fail the verification or require manual confirmation from user.
-
-Furthermore, the Manifest provider may employ additional methods
-of distributing the timestamps of recently generated Manifests
-using a secure channel from a trusted source for exact comparison.
-The exact details of such a solution are outside the scope of this
-specification.
-
-
Algorithm for full-tree verification
------------------------------------
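The timestamp comparison described in the "Timestamp verification" section of the diff above could look roughly like this. The format string is the one the GLEP gives for ``TIMESTAMP``; the function names and the 24-hour default for ``max_age`` are assumptions made for this sketch, since the specification does not define what "significantly outdated" means.

```python
from datetime import datetime, timedelta, timezone

# strftime()/strptime() format given by the GLEP for TIMESTAMP values.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_timestamp(value):
    """Parse a TIMESTAMP value into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(value, TIMESTAMP_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def is_outdated(value, now, max_age=timedelta(hours=24)):
    """Return True if the Manifest is older than max_age at time `now`.

    `now` must be timezone-aware. The 24-hour default is an arbitrary
    placeholder, not a value taken from the specification.
    """
    return now - parse_timestamp(value) > max_age
```

A client would pass the time of the sync, taken from the local clock or a trusted time source, as ``now`` and then either fail verification or ask the user for confirmation when the function returns ``True``.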
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-10-30 16:52 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-10-30 16:52 UTC (permalink / raw
To: gentoo-commits
commit: 62819e23f8aefb261879cb12cd8ff0aea7befeb0
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:45:28 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Oct 30 16:45:28 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=62819e23
glep-0074: Clarify OPTIONAL desc
glep-0074.rst | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 65f32c3..b7b5a8c 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -222,10 +222,11 @@ The Manifest files can specify the following tags:
packages (``metadata.xml``, ``use.desc``).
``OPTIONAL <path>``
- Specifies a file that would be subject to non-obligatory Manifest
- verification if it existed. The package may ignore a stray file
- matching this entry if operating in non-strict mode. Used for paths
- that would match ``MISC`` if they existed.
+ Specifies a file that does not exist in the distribution but if it
+ did, it would be marked as ``MISC``. In the strict mode, the file
+ must not exist for the verification to pass. The package manager
+ may ignore a stray file matching this entry if operating in non-strict
+ mode.
``DIST <filename> <size> <checksums>…``
Specifies a distfile entry used to verify files fetched as part
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-10-30 16:52 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-10-30 16:52 UTC (permalink / raw
To: gentoo-commits
commit: 56b06b01676c486facf372d639e9fba0a694defd
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:28:16 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Oct 30 16:28:16 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=56b06b01
glep-0074: Rewrite the file verificaton to cover OPTIONAL
glep-0074.rst | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 49fe0ca..d476ff3 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -135,16 +135,27 @@ File verification
When verifying a file against the Manifest, the following rules are
used:
-- if a file listed in Manifest is not present, then the verification
- for the file fails,
+1. If the file is covered directly or indirectly by an entry
+ of the ``IGNORE`` type, the verification always succeeds.
-- if a file listed in Manifest is present but has a different size
- or one of the checksums does not match, the verification fails,
+2. If the file is covered by an entry of the ``MANIFEST``, ``DATA``,
+ ``MISC``, ``EBUILD`` or ``AUX`` type:
-- if a file is present but not listed in Manifest, the verification
- fails,
+ a. if the file is not present, then the verification fails,
-- otherwise, the verification succeeds.
+ b. if the file is present but has a different size or one
+ of the checksums does not match, the verification fails,
+
+ c. otherwise, the verification succeeds.
+
+3. If the file is covered by an entry of the ``OPTIONAL`` type:
+
+ a. if the file is present, then the verification fails,
+
+ b. otherwise, the verification succeeds.
+
+4. If the file is present but not listed in Manifest, the verification
+ fails.
Unless specified otherwise, the package manager must not allow using
any files for which the verification failed. The package manager may
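The four file verification rules introduced in the hunk above can be sketched as a single decision function. The name ``verify_file`` and its parameters are hypothetical; the caller is assumed to have already determined which entry type covers the file (``None`` if it is not listed) and whether the size and checksums match. Treating an absent, unlisted file as a pass is an assumption of this sketch, since the rules only address unlisted files that are present.

```python
# Entry types that require the file to exist and match (rule 2).
CHECKED_TYPES = {"MANIFEST", "DATA", "MISC", "EBUILD", "AUX"}


def verify_file(entry_type, present, size_ok=False, checksums_ok=False):
    """Return True if verification succeeds for one file.

    entry_type is the type of the covering Manifest entry, or None
    if the file is not listed in any Manifest.
    """
    if entry_type == "IGNORE":
        # Rule 1: ignored files always pass.
        return True
    if entry_type in CHECKED_TYPES:
        # Rule 2: the file must exist, and its size and checksums must match.
        return present and size_ok and checksums_ok
    if entry_type == "OPTIONAL":
        # Rule 3: the file must not exist.
        return not present
    if entry_type is None:
        # Rule 4: a present but unlisted file fails. An absent, unlisted
        # file has nothing to verify (assumption of this sketch).
        return not present
    raise ValueError("unknown entry type: %r" % entry_type)
```

The non-strict handling of ``MISC`` and ``OPTIONAL`` mismatches is left out; a package manager would layer that policy on top of this result.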
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-10-30 16:52 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-10-30 16:52 UTC (permalink / raw
To: gentoo-commits
commit: fe62b50b708262fca2d7d40b017abe97c04a6109
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:27:31 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Oct 30 16:27:31 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=fe62b50b
glep-0074: Apply more suggestions from Robin
glep-0074.rst | 40 +++++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 425381f..1147e62 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,7 +8,7 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-10-29
+Last-Modified: 2017-10-30
Post-History: 2017-10-26
Content-Type: text/x-rst
Requires: 59, 61
@@ -99,9 +99,12 @@ format. However, the signature verification can be omitted if it is
covered by a signed top-level Manifest.
The Manifest files can also specify ``IGNORE`` entries to skip Manifest
-verification of subdirectories and/or files. Files and directories
-starting with a dot are always implicitly ignored. All files that
-are not ignored must be covered by at least one of the Manifests.
+verification of subdirectories and/or files. The package manager can
+support injecting ignore paths to account for additional files created,
+modified or removed by the user's processes that would not be ignored
+by existing rules. Files and directories starting with a dot are always
+implicitly ignored. All files that are not ignored must be covered
+by at least one of the Manifests.
A single file may be matched by multiple identical or equivalent
Manifest entries, if and only if the entries have the same semantics,
@@ -517,21 +520,25 @@ The top-level Manifests optionally allows using a ``TIMESTAMP`` tag
to include a generation timestamp in the Manifest. A similar feature
was originally proposed in GLEP 58 [#GLEP58]_.
-A malicious third-party may use the principles of exclusion and replay
-to deny an update to clients, while at the same time recording
-the identity of clients to attack. The timestamp field can be used
-to detect that.
+A malicious third-party may use the principles of exclusion or replay
+[#C08]_ to deny an update to clients, while at the same time recording
+the identity of clients to attack. The timestamp field can be used to
+detect that.
In order to provide a more complete protection, the Gentoo
Infrastructure should provide an ability to obtain the timestamps
of all Manifests from a recent timeframe over a secure channel
from a trusted source for comparison.
-Strictly speaking, this is already provided by the various
-``metadata/timestamp.*`` files provided already by Gentoo which are also
-covered by the Manifest. However, including the value in the Manifest
-itself has a little cost and provides the ability to perform
-the verification stand-alone.
+Strictly speaking, this information is already provided by the various
+``metadata/timestamp*`` files that are already present. However,
+including the value in the Manifest itself has little cost
+and provides the ability to perform the verification stand-alone.
+
+Furthermore, some of the timestamp files are added very late
+in the distribution process, past the Manifest generation phase. Those
+files will most likely receive ``IGNORE`` entries and therefore
+not be suitable for safe use.
New vs deprecated tags
@@ -699,8 +706,8 @@ ensured:
- the Manifest files inside the package directory can be signed
to provide authenticity verification,
-- if the Manifest files inside the package directory are compressed,
- a uncompressed file of identical content must coexist.
+- an uncompressed Manifest file must exist in the package directory,
+ and a compressed Manifest of identical content may be present.
Once the backwards compatibility is no longer a concern, the above
no longer needs to hold and the deprecated tags can be removed.
@@ -777,6 +784,9 @@ References
.. [#STREEBOG] GOST R 34.11-2012: Streebog Hash Function
(https://www.streebog.net/)
+.. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
+ (https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
+
.. [#GEMATO] gemato: Gentoo Manifest Tool
(https://github.com/mgorny/gemato/)
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-11-02 19:09 UTC (permalink / raw
To: gentoo-commits
commit: f32fa7c4795adceb63bbf4c2876569fd7318c559
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:29:41 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=f32fa7c4
glep-0074: Add two example files for reference
glep-0074.rst | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index a37ad34..65f32c3 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -369,6 +369,34 @@ uncompressed content and the specification is free to choose either
of the files using the same base name.
+An example Manifest file (informational)
+----------------------------------------
+
+An example top-level Manifest file for the Gentoo repository would have
+the following content::
+
+ TIMESTAMP 2017-10-30T10:11:12Z
+ IGNORE distfiles
+ IGNORE local
+ IGNORE lost+found
+ IGNORE packages
+ MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
+ ...
+ MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
+ ...
+
+An example modern Manifest (disregarding backwards compatibility)
+for a package directory would have the following content::
+
+ DATA SphinxTrain-0.9.1-r1.ebuild 932 SHA256 3d3b.. SHA512 be4d..
+ DATA SphinxTrain-1.0.8.ebuild 912 SHA256 f681.. SHA512 0749..
+ DATA files/gcc.patch 816 SHA256 b56e.. SHA512 2468..
+ DATA files/gcc34.patch 333 SHA256 c107.. SHA512 9919..
+ DIST SphinxTrain-0.9.1-beta.tar.gz 469617 SHA256 c1a4.. SHA512 1b33..
+ DIST sphinxtrain-1.0.8.tar.gz 8925803 SHA256 548e.. SHA512 465d..
+ MISC metadata.xml 664 SHA256 97c6.. SHA512 1175..
+
+
Rationale
=========
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-11-02 19:09 UTC (permalink / raw
To: gentoo-commits
commit: 686977b6deeea9b366d8a20c701916902633c990
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sat Oct 28 11:49:39 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=686977b6
glep-0074: Update based on feedback from Robin H. Johnson
glep-0074.rst | 66 ++++++++++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 52 insertions(+), 14 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index e9f8bad..425381f 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,7 +8,7 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-10-26
+Last-Modified: 2017-10-29
Post-History: 2017-10-26
Content-Type: text/x-rst
Requires: 59, 61
@@ -49,7 +49,7 @@ This specification is designed with the following goals in mind:
1. It should provide means to ensure the authenticity of the complete
repository, including preventing the injection of additional files.
-2. Alike the original Manifest2, the files should be split into two
+2. Like the original Manifest2, the files should be split into two
groups — files whose authenticity is critical, and those whose
mismatch may be accepted in non-strict mode. The same classification
should apply both to files listed in Manifests, and to stray files
@@ -115,11 +115,11 @@ The file entries (except for ``IGNORE``) can be specified for regular
files only. Symbolic links are followed when opening files. It is
an error to specify an entry for a different file type.
-All the files covered by a Manifest tree must reside on the same
-filesystem. It is an error to specify entries applying to files
-on another filesystem. If subdirectories of the Manifest tree reside
-on a different filesystem, they must be explicitly excluded
-via ``IGNORE``.
+All the local (non-``DIST``) files covered by a Manifest tree must
+reside on the same filesystem. It is an error to specify entries
+applying to files on another filesystem. If subdirectories
+of the Manifest tree reside on a different filesystem, they must
+be explicitly excluded via ``IGNORE``.
File verification
@@ -156,7 +156,8 @@ The Manifest files can specify the following tags:
combined date and time in UTC timezone, i.e. using the following
``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``. Optionally used
in the top-level Manifest file. The package manager can use it
- to detect an outdated repository checkout.
+ to detect an outdated repository checkout as described in `Timestamp
+ verification`_.
``MANIFEST <path> <size> <checksums>…``
Specifies a sub-Manifest. The sub-Manifest must be verified like
@@ -209,6 +210,28 @@ allowed at the package directory level:
to ``files/`` subdirectory.
+Timestamp verification
+----------------------
+
+The Manifest file can contain a ``TIMESTAMP`` entry to account
+for attacks against tree update distribution. If such an entry
+is present, it should be updated every time at least one
+of the Manifests changes. Every unique timestamp value must correspond
+to a single tree state.
+
+During the verification process, the client should compare the timestamp
+against the update time obtained from a local clock or a trusted time
+source. If the comparison result indicates that the Manifest at the time
+of receiving was already significantly outdated, the client should
+either fail the verification or require manual confirmation from user.
+
+Furthermore, the Manifest provider may employ additional methods
+of distributing the timestamps of recently generated Manifests
+using a secure channel from a trusted source for exact comparison.
+The exact details of such a solution are outside the scope of this
+specification.
+
+
Algorithm for full-tree verification
------------------------------------
@@ -218,8 +241,9 @@ can be used:
1. Collect all files present in the repository into *present* set.
2. Start at the top-level Manifest file. Verify its OpenPGP signature.
- Optionally verify the ``TIMESTAMP`` entry if present. Remove
- the top-level Manifest from the *present* set.
+ Optionally verify the ``TIMESTAMP`` entry if present as specified
+ in `Timestamp verification`_. Remove the top-level Manifest
+ from the *present* set.
3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
files according to `file verification`_ section, and include their
@@ -232,7 +256,11 @@ can be used:
5. Collect all files covered by ``DATA``, ``MISC``, ``OPTIONAL``,
``EBUILD`` and ``AUX`` entries into the *covered* set.
-6. Verify all the files in the union of the *present* and *covered*
+6. Verify the entries in *covered* set for incompatible duplicates
+ and collisions with ignored files as explained in `Manifest file
+ locations and nesting`_.
+
+7. Verify all the files in the union of the *present* and *covered*
sets, according to `file verification`_ section.
@@ -489,8 +517,15 @@ The top-level Manifests optionally allows using a ``TIMESTAMP`` tag
to include a generation timestamp in the Manifest. A similar feature
was originally proposed in GLEP 58 [#GLEP58]_.
-The timestamp can be used to detect delay or replay attacks against
-Gentoo mirrors.
+A malicious third-party may use the principles of exclusion and replay
+to deny an update to clients, while at the same time recording
+the identity of clients to attack. The timestamp field can be used
+to detect that.
+
+In order to provide a more complete protection, the Gentoo
+Infrastructure should provide an ability to obtain the timestamps
+of all Manifests from a recent timeframe over a secure channel
+from a trusted source for comparison.
Strictly speaking, this is already provided by the various
``metadata/timestamp.*`` files provided already by Gentoo which are also
@@ -662,7 +697,10 @@ ensured:
the deprecated ``EBUILD`` tag (rather than ``DATA``),
- the Manifest files inside the package directory can be signed
- to provide authenticity verification.
+ to provide authenticity verification,
+
+- if the Manifest files inside the package directory are compressed,
+ a uncompressed file of identical content must coexist.
Once the backwards compatibility is no longer a concern, the above
no longer needs to hold and the deprecated tags can be removed.
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-11-02 19:09 UTC (permalink / raw
To: gentoo-commits
commit: e8cfa4be127433e13ce962086cd5871176b78e78
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:28:34 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=e8cfa4be
glep-0074: Reorganize to have tag references after basic algos
Reorganize so that file & timestamp verification come first, then tag
references, then specialized algos and other informational sections.
Rename 'new Manifest tags' to 'modern ...' since some of them are old.
glep-0074.rst | 48 ++++++++++++++++++++++++------------------------
1 file changed, 24 insertions(+), 24 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index d476ff3..a37ad34 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -163,8 +163,30 @@ reject any package or even the whole repository if it may refer to files
for which the verification failed.
-New Manifest tags
------------------
+Timestamp verification
+----------------------
+
+The Manifest file can contain a ``TIMESTAMP`` entry to account
+for attacks against tree update distribution. If such an entry
+is present, it should be updated every time at least one
+of the Manifests changes. Every unique timestamp value must correspond
+to a single tree state.
+
+During the verification process, the client should compare the timestamp
+against the update time obtained from a local clock or a trusted time
+source. If the comparison result indicates that the Manifest at the time
+of receiving was already significantly outdated, the client should
+either fail the verification or require manual confirmation from user.
+
+Furthermore, the Manifest provider may employ additional methods
+of distributing the timestamps of recently generated Manifests
+using a secure channel from a trusted source for exact comparison.
+The exact details of such a solution are outside the scope of this
+specification.
+
+
+Modern Manifest tags
+--------------------
The Manifest files can specify the following tags:
@@ -228,28 +250,6 @@ allowed at the package directory level:
to ``files/`` subdirectory.
-Timestamp verification
-----------------------
-
-The Manifest file can contain a ``TIMESTAMP`` entry to account
-for attacks against tree update distribution. If such an entry
-is present, it should be updated every time at least one
-of the Manifests changes. Every unique timestamp value must correspond
-to a single tree state.
-
-During the verification process, the client should compare the timestamp
-against the update time obtained from a local clock or a trusted time
-source. If the comparison result indicates that the Manifest at the time
-of receiving was already significantly outdated, the client should
-either fail the verification or require manual confirmation from user.
-
-Furthermore, the Manifest provider may employ additional methods
-of distributing the timestamps of recently generated Manifests
-using a secure channel from a trusted source for exact comparison.
-The exact details of such a solution are outside the scope of this
-specification.
-
-
Algorithm for full-tree verification
------------------------------------
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-11-02 19:09 UTC (permalink / raw
To: gentoo-commits
commit: 922e3beea71b97557eac92f56df35774fbdc3ced
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:27:51 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=922e3bee
glep-0074: Split 'Directory tree coverage' section out
glep-0074.rst | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index 1147e62..49fe0ca 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -98,6 +98,10 @@ The sub-Manifest can also be signed using OpenPGP armored cleartext
format. However, the signature verification can be omitted if it is
covered by a signed top-level Manifest.
+
+Directory tree coverage
+-----------------------
+
The Manifest files can also specify ``IGNORE`` entries to skip Manifest
verification of subdirectories and/or files. The package manager can
support injecting ignore paths to account for additional files created,
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
From: Michał Górny @ 2017-11-02 19:09 UTC
To: gentoo-commits
commit: 051ba47beace1bce0f53d2a5a4e9b3fc5d92e990
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:45:28 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=051ba47b
glep-0074: Clarify OPTIONAL desc
glep-0074.rst | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 65f32c3..b7b5a8c 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -222,10 +222,11 @@ The Manifest files can specify the following tags:
packages (``metadata.xml``, ``use.desc``).
``OPTIONAL <path>``
- Specifies a file that would be subject to non-obligatory Manifest
- verification if it existed. The package may ignore a stray file
- matching this entry if operating in non-strict mode. Used for paths
- that would match ``MISC`` if they existed.
+ Specifies a file that does not exist in the distribution but if it
+ did, it would be marked as ``MISC``. In the strict mode, the file
+ must not exist for the verification to pass. The package manager
+ may ignore a stray file matching this entry if operating in non-strict
+ mode.
``DIST <filename> <size> <checksums>…``
Specifies a distfile entry used to verify files fetched as part
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
From: Michał Górny @ 2017-11-02 19:09 UTC
To: gentoo-commits
commit: 34ff419e94eb315ff5f5fdb33f1a974f03162399
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:28:16 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=34ff419e
glep-0074: Rewrite the file verification to cover OPTIONAL
glep-0074.rst | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 49fe0ca..d476ff3 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -135,16 +135,27 @@ File verification
When verifying a file against the Manifest, the following rules are
used:
-- if a file listed in Manifest is not present, then the verification
- for the file fails,
+1. If the file is covered directly or indirectly by an entry
+ of the ``IGNORE`` type, the verification always succeeds.
-- if a file listed in Manifest is present but has a different size
- or one of the checksums does not match, the verification fails,
+2. If the file is covered by an entry of the ``MANIFEST``, ``DATA``,
+ ``MISC``, ``EBUILD`` or ``AUX`` type:
-- if a file is present but not listed in Manifest, the verification
- fails,
+ a. if the file is not present, then the verification fails,
-- otherwise, the verification succeeds.
+ b. if the file is present but has a different size or one
+ of the checksums does not match, the verification fails,
+
+ c. otherwise, the verification succeeds.
+
+3. If the file is covered by an entry of the ``OPTIONAL`` type:
+
+ a. if the file is present, then the verification fails,
+
+ b. otherwise, the verification succeeds.
+
+4. If the file is present but not listed in Manifest, the verification
+ fails.
Unless specified otherwise, the package manager must not allow using
any files for which the verification failed. The package manager may
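
The four rules introduced by this hunk can be sketched as a single function. This is a hedged illustration of the rules as they stand in this commit (including ``OPTIONAL``), not the reference implementation: the `Entry` class, `covered_by_ignore` and `verify_file` are hypothetical names, entries are assumed to be keyed by a path already resolved relative to the repository root, and checksum names are assumed to map onto Python ``hashlib`` names by lowercasing (which holds for e.g. ``SHA512`` but not for every reserved name).

```python
import hashlib
import os
from dataclasses import dataclass, field

# Tags whose entries carry a size and checksums (rule 2).
CHECKSUM_TAGS = {"MANIFEST", "DATA", "MISC", "EBUILD", "AUX"}


@dataclass
class Entry:
    """Hypothetical in-memory form of one Manifest line."""
    tag: str
    path: str
    size: int = 0
    checksums: dict = field(default_factory=dict)  # e.g. {"SHA512": "<hex>"}


def covered_by_ignore(path, ignore_paths):
    """True if path equals an IGNORE path or lies below it."""
    return any(path == p or path.startswith(p.rstrip("/") + "/")
               for p in ignore_paths)


def verify_file(root, path, entries, ignore_paths=()):
    """Return True if `path` (relative to `root`) passes verification."""
    if covered_by_ignore(path, ignore_paths):      # rule 1
        return True
    full = os.path.join(root, path)
    entry = entries.get(path)
    if entry is None:                              # rule 4: stray file
        return not os.path.lexists(full)
    if entry.tag == "OPTIONAL":                    # rule 3
        return not os.path.lexists(full)
    if entry.tag in CHECKSUM_TAGS:                 # rule 2
        if not os.path.isfile(full):               # 2a (symlinks followed)
            return False
        with open(full, "rb") as f:
            data = f.read()                        # whole file, for brevity
        if len(data) != entry.size:                # 2b: size mismatch
            return False
        for name, expected in entry.checksums.items():
            if hashlib.new(name.lower(), data).hexdigest() != expected:
                return False                       # 2b: checksum mismatch
        return True                                # 2c
    return False                                   # unknown tag: reject
```

Returning a plain boolean keeps the sketch short; a real tool would more likely report which rule failed so that ``MISC`` failures can be downgraded in non-strict mode.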
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
From: Michał Górny @ 2017-11-02 19:09 UTC
To: gentoo-commits
commit: 59600009f66c0443faa4725003c9e8badc63de51
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:27:31 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=59600009
glep-0074: Apply more suggestions from Robin
glep-0074.rst | 40 +++++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 425381f..1147e62 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,7 +8,7 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-10-29
+Last-Modified: 2017-10-30
Post-History: 2017-10-26
Content-Type: text/x-rst
Requires: 59, 61
@@ -99,9 +99,12 @@ format. However, the signature verification can be omitted if it is
covered by a signed top-level Manifest.
The Manifest files can also specify ``IGNORE`` entries to skip Manifest
-verification of subdirectories and/or files. Files and directories
-starting with a dot are always implicitly ignored. All files that
-are not ignored must be covered by at least one of the Manifests.
+verification of subdirectories and/or files. The package manager can
+support injecting ignore paths to account for additional files created,
+modified or removed by user's processes that would not be ignored
+by existing rules. Files and directories starting with a dot are always
+implicitly ignored. All files that are not ignored must be covered
+by at least one of the Manifests.
A single file may be matched by multiple identical or equivalent
Manifest entries, if and only if the entries have the same semantics,
@@ -517,21 +520,25 @@ The top-level Manifests optionally allows using a ``TIMESTAMP`` tag
to include a generation timestamp in the Manifest. A similar feature
was originally proposed in GLEP 58 [#GLEP58]_.
-A malicious third-party may use the principles of exclusion and replay
-to deny an update to clients, while at the same time recording
-the identity of clients to attack. The timestamp field can be used
-to detect that.
+A malicious third-party may use the principles of exclusion or replay
+[#C08]_ to deny an update to clients, while at the same time recording
+the identity of clients to attack. The timestamp field can be used to
+detect that.
In order to provide a more complete protection, the Gentoo
Infrastructure should provide an ability to obtain the timestamps
of all Manifests from a recent timeframe over a secure channel
from a trusted source for comparison.
-Strictly speaking, this is already provided by the various
-``metadata/timestamp.*`` files provided already by Gentoo which are also
-covered by the Manifest. However, including the value in the Manifest
-itself has a little cost and provides the ability to perform
-the verification stand-alone.
+Strictly speaking, this information is already provided by the various
+``metadata/timestamp*`` files that are already present. However,
+including the value in the Manifest itself has little cost
+and provides the ability to perform the verification stand-alone.
+
+Furthermore, some of the timestamp files are added very late
+in the distribution process, past the Manifest generation phase. Those
+files will most likely receive ``IGNORE`` entries and therefore
+not be suitable for safe use.
New vs deprecated tags
@@ -699,8 +706,8 @@ ensured:
- the Manifest files inside the package directory can be signed
to provide authenticity verification,
-- if the Manifest files inside the package directory are compressed,
- a uncompressed file of identical content must coexist.
+- an uncompressed Manifest file must exist in the package directory,
+ and a compressed Manifest of identical content may be present.
Once the backwards compatibility is no longer a concern, the above
no longer needs to hold and the deprecated tags can be removed.
@@ -777,6 +784,9 @@ References
.. [#STREEBOG] GOST R 34.11-2012: Streebog Hash Function
(https://www.streebog.net/)
+.. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
+ (https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
+
.. [#GEMATO] gemato: Gentoo Manifest Tool
(https://github.com/mgorny/gemato/)
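
The ignore rules touched by this commit (explicit ``IGNORE`` entries, ignore paths injected by the package manager, and the implicit rule for dotfiles) could be combined along these lines. This is a minimal sketch under assumptions: `is_ignored` and its parameters are hypothetical names, and all paths are assumed to be POSIX-style and relative to the directory of the top-level Manifest.

```python
from pathlib import PurePosixPath


def is_ignored(path, manifest_ignores, injected_ignores=()):
    """Return True if `path` is exempt from Manifest verification.

    manifest_ignores: paths taken from IGNORE entries.
    injected_ignores: extra paths supplied by the package manager for
        files created, modified or removed by the user's processes.
    """
    parts = PurePosixPath(path).parts
    # Files and directories starting with a dot are implicitly ignored,
    # which also covers everything below a dot-directory.
    if any(part.startswith(".") for part in parts):
        return True
    for ignored in list(manifest_ignores) + list(injected_ignores):
        ign_parts = PurePosixPath(ignored).parts
        # Compare whole path components so that "distfiles" does not
        # accidentally match "distfilesx".
        if ign_parts and parts[:len(ign_parts)] == ign_parts:
            return True
    return False
```

Matching on path components rather than on string prefixes is a design choice of this sketch; it ensures an ignored directory covers its contents without matching sibling names that merely share a prefix.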
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
From: Michał Górny @ 2017-11-02 19:09 UTC
To: gentoo-commits
commit: 75bbb25e8db29a3d09b7da57f3b9fd9bddd79e11
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 2 18:19:35 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=75bbb25e
glep-0074: Remove OPTIONAL
glep-0074.rst | 29 ++++-------------------------
1 file changed, 4 insertions(+), 25 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index b7b5a8c..f256451 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -148,13 +148,7 @@ used:
c. otherwise, the verification succeeds.
-3. If the file is covered by an entry of the ``OPTIONAL`` type:
-
- a. if the file is present, then the verification fails,
-
- b. otherwise, the verification succeeds.
-
-4. If the file is present but not listed in Manifest, the verification
+3. If the file is present but not listed in Manifest, the verification
fails.
Unless specified otherwise, the package manager must not allow using
@@ -221,13 +215,6 @@ The Manifest files can specify the following tags:
in non-strict mode. Used for files that do not affect the installed
packages (``metadata.xml``, ``use.desc``).
-``OPTIONAL <path>``
- Specifies a file that does not exist in the distribution but if it
- did, it would be marked as ``MISC``. In the strict mode, the file
- must not exist for the verification to pass. The package manager
- may ignore a stray file matching this entry if operating in non-strict
- mode.
-
``DIST <filename> <size> <checksums>…``
Specifies a distfile entry used to verify files fetched as part
of ``SRC_URI``. The filename must match the filename used to store
@@ -272,8 +259,8 @@ can be used:
4. Process all ``IGNORE`` entries. Remove any paths matching them
from the *present* set.
-5. Collect all files covered by ``DATA``, ``MISC``, ``OPTIONAL``,
- ``EBUILD`` and ``AUX`` entries into the *covered* set.
+5. Collect all files covered by ``DATA``, ``MISC``, ``EBUILD``
+ and ``AUX`` entries into the *covered* set.
6. Verify the entries in *covered* set for incompatible duplicates
and collisions with ignored files as explained in `Manifest file
@@ -550,12 +537,6 @@ It aims to account for two use cases:
2. Accounting for automatically generated files that might be updated
by standard tooling.
-The traditional ``MISC`` type is amended with a complementary
-``OPTIONAL`` tag to account for files that are not provided
-in the specific repository. It aims to ensure that the same path would
-be non-fatal when provided by the repository but fatal when created
-by the user tooling.
-
Timestamp field
---------------
@@ -643,9 +624,7 @@ on providing them via an additional rsync module.
If such files were injected into the repository, they would cause strict
verification failures of Manifests. To account for this, Infra could
-provide either ``OPTIONAL`` entries for the Manifest files to allow them
-in non-strict verification mode, or ``IGNORE`` entries to allow them
-in the strict mode.
+provide ``IGNORE`` entries to allow them to exist.
Splitting distfile checksums from file checksums
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
From: Michał Górny @ 2017-11-02 19:09 UTC
To: gentoo-commits
commit: 3d0bdf3bc2fb72de1f80cf42a57dd4002cff59f7
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Fri Oct 13 15:07:01 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Fri Oct 27 17:44:21 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=3d0bdf3b
glep-0001: Clearly indicate that 'Replaces' is multi-value
Use the plural 'glep numbers' form for the Replaces header to indicate
it may have multiple values. This is already allowed by the text
of the GLEP.
Bug: https://bugs.gentoo.org/577760
glep-0001.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/glep-0001.rst b/glep-0001.rst
index 6b9fb8a..addfa21 100644
--- a/glep-0001.rst
+++ b/glep-0001.rst
@@ -260,7 +260,7 @@ All other headers are required.
Post-History: <dates of postings to mailing lists>
Content-Type: <text/x-rst>
* Requires: <glep numbers>
- * Replaces: <glep number>
+ * Replaces: <glep numbers>
* Replaced-By: <glep number>
---
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
From: Michał Górny @ 2017-11-02 19:09 UTC
To: gentoo-commits
commit: e5f475eebfbab4a7d0a090bfddbdea1e26c82d75
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sun Oct 22 13:19:20 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=e5f475ee
glep-0074: Full-tree verification using Manifest files
glep-0074.rst | 749 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 749 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
new file mode 100644
index 0000000..e9f8bad
--- /dev/null
+++ b/glep-0074.rst
@@ -0,0 +1,749 @@
+---
+GLEP: 74
+Title: Full-tree verification using Manifest files
+Author: Michał Górny <mgorny@gentoo.org>,
+ Robin Hugh Johnson <robbat2@gentoo.org>,
+ Ulrich Müller <ulm@gentoo.org>
+Type: Standards Track
+Status: Draft
+Version: 1
+Created: 2017-10-21
+Last-Modified: 2017-10-26
+Post-History: 2017-10-26
+Content-Type: text/x-rst
+Requires: 59, 61
+Replaces: 44, 58, 60
+---
+
+Abstract
+========
+
+This GLEP extends the Manifest file format to cover full-tree file
+integrity and authenticity checks. The format aims to be future-proof,
+efficient and provide means of backwards compatibility.
+
+
+Motivation
+==========
+
+The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current
+means of verifying the integrity of distfiles and package files
+in Gentoo. Combined with OpenPGP signatures, they provide means to
+ensure the authenticity of the covered files. However, as noted
+in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree
+authenticity verification as they do not cover any files outside
+the package directory. In particular, they provide multiple ways
+for a third party to inject malicious code into the ebuild environment.
+
+Historically, the topic of providing authenticity coverage for the whole
+repository has been mentioned multiple times. The most noteworthy efforts
+are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008.
+They were accepted by the Council in 2010 but have never been
+implemented. When potential implementation work started in 2017, a new
+discussion about the specification arose. It prompted the creation
+of a competing GLEP that would provide a redesigned alternative to
+the old GLEPs.
+
+This specification is designed with the following goals in mind:
+
+1. It should provide means to ensure the authenticity of the complete
+ repository, including preventing the injection of additional files.
+
+2. Alike the original Manifest2, the files should be split into two
+ groups — files whose authenticity is critical, and those whose
+ mismatch may be accepted in non-strict mode. The same classification
+ should apply both to files listed in Manifests, and to stray files
+ present only in the repository.
+
+3. The format should be universal enough to work both for the Gentoo
+ repository and third-party repositories of different characteristics.
+
+4. The Manifest files should be verifiable stand-alone, that is without
+ knowing any details about the underlying repository format.
+
+
+Specification
+=============
+
+Manifest file format
+--------------------
+
+This specification reuses and extends the Manifest file format defined
+in GLEP 44 [#GLEP44]_. For the purpose of it, the *file type* field is
+repurposed as a generic *tag* that could also indicate additional
+(non-checksum) metadata. Appropriately, those tags can be followed by
+other space-separated values.
+
+Unless specified otherwise, the paths used in the Manifest files
+are relative to the directory containing the Manifest file. The paths
+must not reference the parent directory (``..``).
+
+
+Manifest file locations and nesting
+-----------------------------------
+
+The ``Manifest`` file located in the root directory of the repository
+is called top-level Manifest, and it is used to perform the full-tree
+verification. In order to verify the authenticity, it must be signed
+using OpenPGP, using the armored cleartext format.
+
+The top-level Manifest may reference sub-Manifests contained
+in subdirectories of the repository. The sub-Manifests are traditionally
+named ``Manifest``; however, the implementation must support arbitrary
+names, including the possibility of multiple (split) Manifests
+for a single directory. The sub-Manifest can only cover the files inside
+the directory tree where it resides.
+
+The sub-Manifest can also be signed using OpenPGP armored cleartext
+format. However, the signature verification can be omitted if it is
+covered by a signed top-level Manifest.
+
+The Manifest files can also specify ``IGNORE`` entries to skip Manifest
+verification of subdirectories and/or files. Files and directories
+starting with a dot are always implicitly ignored. All files that
+are not ignored must be covered by at least one of the Manifests.
+
+A single file may be matched by multiple identical or equivalent
+Manifest entries, if and only if the entries have the same semantics,
+specify the same size and the checksums common to both entries match.
+It is an error for a single file to be matched by multiple entries
+of different semantics, file size or checksum values. It is an error
+to specify another entry for a file matching ``IGNORE``, or one of its
+subdirectories.
+
+The file entries (except for ``IGNORE``) can be specified for regular
+files only. Symbolic links are followed when opening files. It is
+an error to specify an entry for a different file type.
+
+All the files covered by a Manifest tree must reside on the same
+filesystem. It is an error to specify entries applying to files
+on another filesystem. If subdirectories of the Manifest tree reside
+on a different filesystem, they must be explicitly excluded
+via ``IGNORE``.
+
+
+File verification
+-----------------
+
+When verifying a file against the Manifest, the following rules are
+used:
+
+- if a file listed in Manifest is not present, then the verification
+ for the file fails,
+
+- if a file listed in Manifest is present but has a different size
+ or one of the checksums does not match, the verification fails,
+
+- if a file is present but not listed in Manifest, the verification
+ fails,
+
+- otherwise, the verification succeeds.
+
+Unless specified otherwise, the package manager must not allow using
+any files for which the verification failed. The package manager may
+reject any package or even the whole repository if it may refer to files
+for which the verification failed.
+
+
+New Manifest tags
+-----------------
+
+The Manifest files can specify the following tags:
+
+``TIMESTAMP <iso8601>``
+ Specifies a timestamp of when the Manifest file was last updated.
+ The timestamp must be a valid second-precision ISO8601 extended format
+ combined date and time in UTC timezone, i.e. using the following
+ ``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``. Optionally used
+ in the top-level Manifest file. The package manager can use it
+ to detect an outdated repository checkout.
+
+``MANIFEST <path> <size> <checksums>…``
+ Specifies a sub-Manifest. The sub-Manifest must be verified like
+ a regular file. If the verification succeeds, the entries from
+ the sub-Manifest are included for verification as described
+ in `Manifest file locations and nesting`_.
+
+``IGNORE <path>``
+ Ignores a subdirectory or file from Manifest checks. If the specified
+ path is present, it and its contents are omitted from the Manifest
+ verification (always pass).
+
+``DATA <path> <size> <checksums>…``
+ Specifies a file subject to obligatory Manifest verification.
+ The file is required to pass verification. Used for all files directly
+ affecting package manager operation (ebuilds, eclasses, profiles).
+
+``MISC <path> <size> <checksums>…``
+ Specifies a file subject to non-obligatory Manifest verification.
+ The package manager may ignore a verification failure if operating
+ in non-strict mode. Used for files that do not affect the installed
+ packages (``metadata.xml``, ``use.desc``).
+
+``OPTIONAL <path>``
+ Specifies a file that would be subject to non-obligatory Manifest
+ verification if it existed. The package may ignore a stray file
+ matching this entry if operating in non-strict mode. Used for paths
+ that would match ``MISC`` if they existed.
+
+``DIST <filename> <size> <checksums>…``
+ Specifies a distfile entry used to verify files fetched as part
+ of ``SRC_URI``. The filename must match the filename used to store
+ the fetched file as specified in the PMS [#PMS-FETCH]_. The package
+ manager must reject the fetched file if it fails verification.
+ ``DIST`` entries apply to all packages below the Manifest file
+ specifying them.
+
+
+Deprecated Manifest tags
+------------------------
+
+For backwards compatibility, the following tags are additionally
+allowed at the package directory level:
+
+``EBUILD <filename> <size> <checksums>…``
+ Equivalent to the ``DATA`` type.
+
+``AUX <filename> <size> <checksums>…``
+ Equivalent to the ``DATA`` type, except that the filename is relative
+ to ``files/`` subdirectory.
+
+
+Algorithm for full-tree verification
+------------------------------------
+
+In order to perform full-tree verification, the following algorithm
+can be used:
+
+1. Collect all files present in the repository into *present* set.
+
+2. Start at the top-level Manifest file. Verify its OpenPGP signature.
+ Optionally verify the ``TIMESTAMP`` entry if present. Remove
+ the top-level Manifest from the *present* set.
+
+3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
+ files according to `file verification`_ section, and include their
+ entries in the current Manifest entry list (using paths relative
+ to directories containing the Manifests).
+
+4. Process all ``IGNORE`` entries. Remove any paths matching them
+ from the *present* set.
+
+5. Collect all files covered by ``DATA``, ``MISC``, ``OPTIONAL``,
+ ``EBUILD`` and ``AUX`` entries into the *covered* set.
+
+6. Verify all the files in the union of the *present* and *covered*
+ sets, according to `file verification`_ section.
+
+
+Algorithm for finding parent Manifests
+--------------------------------------
+
+In order to find the top-level Manifest from the current directory
+the following algorithm can be used:
+
+1. Store the current directory as *original* and the device ID
+ of the containing filesystem (``st_dev``) as *startdev*,
+
+2. If the device ID of the containing filesystem (``st_dev``)
+ of the current directory is different than *startdev*, stop.
+
+3. If the current directory contains a ``Manifest`` file:
+
+ a. If an ``IGNORE`` entry in the ``Manifest`` file covers
+ the *original* directory (or one of the parent directories), stop.
+
+ b. Otherwise, store the current directory as *last_found*.
+
+4. If the current directory is the root system directory (``/``), stop.
+
+5. Otherwise, enter the parent directory and jump to step 2.
+
+Once the algorithm stops, *last_found* will contain the relevant
+top-level Manifest. If *last_found* is null, then the directory tree
+does not contain any valid top-level Manifest candidates and one should
+be created in the *original* directory.
+
+Once the top-level Manifest is found, its ``MANIFEST`` entries should
+be used to find any sub-Manifests below the top-level Manifest,
+up to and including the *original* directory. Note that those
+sub-Manifests can use different filenames than ``Manifest``.
+
+
+Checksum algorithms
+-------------------
+
+This section is informational only. Specifying the exact set
+of supported algorithms is outside the scope of this specification.
+
+The algorithm names reserved at the time of writing are:
+
+- ``MD5`` [#MD5]_,
+- ``RMD160`` — RIPEMD-160 [#RIPEMD160]_,
+- ``SHA1`` [#SHS]_,
+- ``SHA256`` and ``SHA512`` — SHA-2 family of hashes [#SHS]_,
+- ``WHIRLPOOL`` [#WHIRLPOOL]_,
+- ``BLAKE2B`` and ``BLAKE2S`` — BLAKE2 family of hashes [#BLAKE2]_,
+- ``SHA3_256`` and ``SHA3_512`` — SHA-3 family of hashes [#SHA3]_,
+- ``STREEBOG256`` and ``STREEBOG512`` — Streebog family of hashes
+ [#STREEBOG]_.
+
+The method of introducing new hashes is defined by GLEP 59 [#GLEP59]_.
+It is recommended that any new hashes are named after the Python
+``hashlib`` module algorithm names, transformed into uppercase.
+
+
+Manifest compression
+--------------------
+
+The topic of Manifest file compression is covered by GLEP 61 [#GLEP61]_.
+This section merely addresses interoperability issues between Manifest
+compression and this specification.
+
+The compressed Manifest files are required to carry a suffix identifying
+their compression algorithm. This suffix should be used to recognize
+the compression and decompress Manifests transparently. The exact list
+of algorithms and their corresponding suffixes are outside the scope
+of this specification.
+
+Whenever this specification refers to top-level Manifest file,
+the implementation should account for compressed variants of this file
+with appropriate suffixes (e.g. ``Manifest.gz``).
+
+Whenever this specification refers to sub-Manifests, they can use any
+names but are also required to use a specific compression suffix.
+The ``MANIFEST`` entries are required to specify the full name including
+compression suffix, and the verification is performed on the compressed
+file.
+
+The specification permits uncompressed Manifests to exist alongside
+their compressed counterparts, and multiple compressed formats
+to coexist. If that is the case, the files must have the same
+uncompressed content and the implementation is free to choose either
+of the files using the same base name.
+
+
+Rationale
+=========
+
+Stand-alone format
+------------------
+
+The first question that needed to be asked before proceeding with
+the design was whether the Manifest file format was supposed to be
+stand-alone, or tightly bound to the repository format.
+
+The stand-alone format has been selected because of its three
+advantages:
+
+1. It is more future-proof. If an incompatible change to the repository
+ format is introduced, only developers need to upgrade the tools
+ they use to generate the Manifests. The tools used to verify
+ the updated Manifests will continue to work.
+
+2. It is more flexible and universal. With a dedicated tool,
+ the Manifest files can be used to sign and verify arbitrary file
+ sets.
+
+3. It keeps the verification tool simpler. In particular, we can easily
+ write an independent verification tool that could work on any
+ distribution without needing to depend on a package manager
+ implementation or rewrite parts of it.
+
+Designing a stand-alone format requires that the Manifest carries enough
+information to perform the verification following all the rules specific
+to the Gentoo repository.
+
+
+Tree design
+-----------
+
+The second important point of the design was determining whether
+the Manifest files should be structured hierarchically, or independent.
+Both options have their advantages.
+
+In the hierarchical model, each sub-Manifest file is covered by a higher
+level Manifest. As a result, only the top-level Manifest has to be
+OpenPGP-signed, and subsequent Manifests need to be only verified by
+checksum stored in the parent Manifest. This has the following
+implications:
+
+- Verifying any set of files in the repository requires using checksums
+ from the most relevant Manifests and the parent Manifests.
+
+- The OpenPGP signature of the top-level Manifest needs to be verified
+ only once per process.
+
+- Altering any set of files requires updating the relevant Manifests,
+ and their parent Manifests up to the top-level Manifest, and signing
+ the last one.
+
+- As a result, the top-level Manifest changes on every commit,
+ and various middle-level Manifests change (and need to be transferred)
+ frequently.
+
+In the independent model, each sub-Manifest file is independent
+of the parent Manifests. As a result, each of them needs to be signed
+and verified independently. However, the parent Manifests still need
+to list sub-Manifests (albeit without verification data) in order
+to detect removal or replacement of subdirectories. This has
+the following implications:
+
+- Verifying any set of files in the repository requires using checksums
+ and verifying signatures of the most relevant Manifest files.
+
+- Altering any set of files requires updating the relevant Manifests
+ and signing them again.
+
+- Parent Manifests are updated only when Manifests are added or removed
+ from subdirectories. As a result, they change infrequently.
+
+While both models have their advantages, the hierarchical model was
+selected because it reduces the number of OpenPGP operations,
+which are comparatively costly, to the minimum.
+
+
+Tree layout restrictions
+------------------------
+
+The algorithm is meant to work primarily with ebuild repositories which
+normally contain only files and directories. Directories provide
+no useful metadata for verification, and specifying special entries
+for additional file types is purposeless. Therefore, the specification
+is restricted to dealing with regular files.
+
+The Gentoo repository does not use symbolic links. Some Gentoo
+repositories do, however. To provide a simple solution for dealing with
+symlinks without having to take care to implement special handling for
+them, the common behavior of implicitly resolving them is used.
+Therefore, symbolic links to files are stored as if they were regular
+files, and symbolic links to directories are followed as if they were
+regular directories.
+
+Dotfiles are implicitly ignored as that is a common notion used
+in software written for POSIX systems. All other filenames require
+explicit ``IGNORE`` lines.
+
+The algorithm is restricted to work on a single filesystem. This is
+mostly relevant when scanning for top-level Manifest — we do not want
+to cross filesystem boundaries then. However, to ensure consistent
+bidirectional behavior we need to also ban them when operating downwards
+the tree.
+
+The directories and files on different filesystems needs to be ignored
+explicitly as implicitly skipping them would cause confusion.
+In particular, tools might then claim that a file does not exist when
+it clearly does because it was skipped due to filesystem boundaries.
+
+
+File verification model
+-----------------------
+
+The verification model aims to provide full coverage against different
+forms of attack. In particular, three different kinds of manipulation
+are considered:
+
+1. Alteration of the file content.
+
+2. Removal of a file.
+
+3. Addition of a new file.
+
+In order to protect against all three, the system requires that all
+files in the repository are listed in Manifests and verified against
+them.
+
+As a special case, ignores are allowed to account for directories
+that are not part of the repository but were traditionally placed inside
+it. Those directories were ``distfiles``, ``local`` and ``packages``. It
+could be also used to ignore VCS directories such as ``CVS``.
+
+
+Non-obligatory Manifest verification
+------------------------------------
+
+While this specification recommends all tools to use strict verification
+by default, it allows declaring some files as non-obligatory like
+the original Manifest2 format did. This could be used on files that do
+not affect the normal package manager operation.
+
+It aims to account for two use cases:
+
+1. Stripping down files that are not strictly required to install
+ packages from repository checkouts.
+
+2. Accounting for automatically generated files that might be updated
+ by standard tooling.
+
+The traditional ``MISC`` type is amended with a complementary
+``OPTIONAL`` tag to account for files that are not provided
+in the specific repository. It aims to ensure that the same path would
+be non-fatal when provided by the repository but fatal when created
+by the user tooling.
+
+
+Timestamp field
+---------------
+
+The top-level Manifest optionally allows using a ``TIMESTAMP`` tag
+to include a generation timestamp in the Manifest. A similar feature
+was originally proposed in GLEP 58 [#GLEP58]_.
+
+The timestamp can be used to detect delay or replay attacks against
+Gentoo mirrors.
+
+Strictly speaking, this is already provided by the various
+``metadata/timestamp.*`` files shipped by Gentoo, which are also
+covered by the Manifest. However, including the value in the Manifest
+itself has little cost and provides the ability to perform
+the verification stand-alone.
+
+
+New vs deprecated tags
+----------------------
+
+Out of the four types defined by Manifest2, two are reused and two are
+marked deprecated.
+
+The ``DIST`` and ``MISC`` tags are reused since they can be relatively
+clearly marked into the new concept.
+
+The ``EBUILD`` tag could potentially be reused for generic file
+verification data. However, it would be confusing if all the different
+data files were marked as ``EBUILD``. Therefore, an equivalent ``DATA``
+type was introduced as a replacement.
+
+The ``AUX`` tag is deprecated as it is redundant to ``DATA``, and has
+the limiting property of implicit ``files/`` path prefix.
+
+
+Finding top-level Manifest
+--------------------------
+
+The development of a reference implementation for this GLEP has brought
+the following problem: how to find all the relevant Manifests when
+the Manifest tool is run inside a subdirectory of the repository?
+
+One of the options would be to provide a bi-directional linking
+of Manifests via a ``PARENT`` tag. However, that would not solve
+the problem when a new Manifest file is being created.
+
+Instead, an algorithm for iterating over parent directories is proposed.
+Since there is no obligatory explicit indicator for the top-level
+Manifest, the algorithm assumes that the top-level Manifest
+is the highest ``Manifest`` in the directory hierarchy that can cover
+the current directory. This generally makes sense since the Manifest
+files are required to provide coverage for all subdirectories, so all
+Manifests starting from that one need to be updated.
+
+If independent Manifest trees are nested in the directory structure,
+then an ``IGNORE`` entry needs to be used to separate them.
+
+Since sub-Manifests can use any filenames, the Manifest finding
+algorithm must not short-cut the procedure by storing all ``Manifest``
+files along the parent directories. Instead, it needs to retrace
+the relevant sub-Manifest files along ``MANIFEST`` entries
+in the top-level Manifest.
+
+
+Injecting ChangeLogs into the checkout
+--------------------------------------
+
+One of the problems considered in the new Manifest format was that
+of injecting historical and autogenerated ChangeLogs into the repository.
+Normally we are not including those files to reduce the checkout size.
+However, some users have shown interest in them and Infra is working
+on providing them via an additional rsync module.
+
+If such files were injected into the repository, they would cause strict
+verification failures of Manifests. To account for this, Infra could
+provide either ``OPTIONAL`` entries for the Manifest files to allow them
+in non-strict verification mode, or ``IGNORE`` entries to allow them
+in the strict mode.
+
+
+Splitting distfile checksums from file checksums
+------------------------------------------------
+
+Another problem with the current Manifest format is that the checksums
+for fetched files are combined with checksums for local files
+in a single file inside the package directory. It has been specifically
+pointed out that:
+
+- since distfiles are sometimes reused across different packages,
+ the repeating checksums are redundant,
+
+- mirror admins were interested in the possibility of verifying all
+ the distfiles with a single tool.
+
+This specification does not provide a clean solution to this problem.
+It technically permits moving ``DIST`` entries to higher-level Manifests
+but the usefulness of such a solution is doubtful.
+
+However, for the second problem we will probably deliver a dedicated
+tool working with this Manifest format.
+
+
+Hash algorithms
+---------------
+
+While maintaining a consistent supported hash set is important
+for interoperability, it is not a good fit for the generic layout of this
+GLEP. Furthermore, it would require updating the GLEP in the future
+every time the used algorithms change.
+
+Instead, the specification focuses on listing the currently used
+algorithm names for interoperability, and sets a recommendation
+for consistent naming of algorithms in the future. The Python
+``hashlib`` module is used as a reference since it is used
+as the provider of hash functions for most of the Python software,
+including Portage and PkgCore.
+
+The basic rules for changing hash algorithms are defined in GLEP 59
+[#GLEP59]_. The implementations can focus only on those algorithms
+that are actually used or planned on being used. It may be feasible
+to devise a new GLEP that specifies the currently used hashes (or update
+GLEP 59 accordingly).
+
+
+Manifest compression
+--------------------
+
+The support for Manifest compression is introduced with minimal changes
+to the file format. The ``MANIFEST`` entries are required to provide
+the real (compressed) file path for compatibility with other file
+entries and to avoid confusion.
+
+The existence of additional entries for uncompressed Manifest checksums
+was debated. However, plain entries for the uncompressed file would
+be confusing if only compressed file existed, and conflicting if both
+uncompressed and compressed variants existed. Furthermore, it has been
+pointed out that ``DIST`` entries do not have uncompressed variant
+either.
+
+
+Performance considerations
+--------------------------
+
+Performing a full-tree verification on every sync raises some
+performance concerns for end-user systems. The initial testing has shown
+that a cold-cache verification on a btrfs file system can take around
+4 minutes, with the process being mostly I/O bound. On the other hand,
+it can be expected that the verification will be performed directly
+after syncing, taking advantage of warm filesystem cache.
+
+To improve speed on I/O and/or CPU-constrained systems even further,
+the algorithms can be easily extended to perform incremental
+verification. Given that rsync does not preserve mtimes by default,
+the tool can take advantage of mtime and Manifest comparisons to recheck
+only the parts of the repository that have changed.
+
+Furthermore, the package manager implementations can restrict checking
+only to the parts of the repository that are actually being used.
+
+
+Backwards Compatibility
+=======================
+
+This GLEP provides optional means of preserving backwards compatibility.
+To preserve the backwards compatibility, the following needs to be
+ensured:
+
+- all files within the package directory must be covered by ``Manifest``
+ file inside that package directory,
+
+- all distfiles used by the package must be covered by ``Manifest``
+ file inside the package directory,
+
+- all files inside the ``files/`` subdirectory of a package directory
+ need to be use the deprecated ``AUX`` tag (rather than ``DATA``),
+
+- all ``.ebuild`` files inside the package directory need to use
+ the deprecated ``EBUILD`` tag (rather than ``DATA``),
+
+- the Manifest files inside the package directory can be signed
+ to provide authenticity verification.
+
+Once the backwards compatibility is no longer a concern, the above
+no longer needs to hold and the deprecated tags can be removed.
+
+
+Reference Implementation
+========================
+
+The reference implementation for this GLEP is being developed
+as the gemato project [#GEMATO]_.
+
+
+Credits
+=======
+
+Thanks to all the people whose contributions were invaluable
+to the creation of this GLEP. This includes but is not limited to:
+
+- Robin Hugh Johnson,
+- Ulrich Müller.
+
+Additionally, thanks to Robin Hugh Johnson for the original
+MetaManifest GLEP series which served both as inspiration and source
+of many concepts used in this GLEP. Recursively, also thanks to all
+the people who contributed to the original GLEPs.
+
+
+References
+==========
+
+.. [#GLEP44] GLEP 44: Manifest2 format
+ (https://www.gentoo.org/glep/glep-0044.html)
+
+.. [#GLEP57] GLEP 57: Security of distribution of Gentoo software
+ - Overview
+ (https://www.gentoo.org/glep/glep-0057.html)
+
+.. [#GLEP58] GLEP 58: Security of distribution of Gentoo software
+ - Infrastructure to User distribution - MetaManifest
+ (https://www.gentoo.org/glep/glep-0058.html)
+
+.. [#GLEP59] GLEP 59: Manifest2 hash policies and security implications
+ (https://www.gentoo.org/glep/glep-0059.html)
+
+.. [#GLEP60] GLEP 60: Manifest2 filetypes
+ (https://www.gentoo.org/glep/glep-0060.html)
+
+.. [#GLEP61] GLEP 61: Manifest2 compression
+ (https://www.gentoo.org/glep/glep-0061.html)
+
+.. [#PMS-FETCH] Package Manager Specification: Dependency Specification
+ Format - SRC_URI
+ (https://projects.gentoo.org/pms/6/pms.html#x1-940008.2.10)
+
+.. [#MD5] RFC1321: The MD5 Message-Digest Algorithm
+ (https://www.ietf.org/rfc/rfc1321.txt)
+
+.. [#RIPEMD160] The hash function RIPEMD-160
+ (https://homes.esat.kuleuven.be/~bosselae/ripemd160.html)
+
+.. [#SHS] FIPS PUB 180-4: Secure Hash Standard (SHS)
+ (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf)
+
+.. [#WHIRLPOOL] The WHIRLPOOL Hash Function
+ (http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
+
+.. [#BLAKE2] BLAKE2 — fast secure hashing
+ (https://blake2.net/)
+
+.. [#SHA3] FIPS PUB 202: SHA-3 Standard: Permutation-Based Hash
+ and Extendable-Output Functions
+ (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf)
+
+.. [#STREEBOG] GOST R 34.11-2012: Streebog Hash Function
+ (https://www.streebog.net/)
+
+.. [#GEMATO] gemato: Gentoo Manifest Tool
+ (https://github.com/mgorny/gemato/)
+
+Copyright
+=========
+This work is licensed under the Creative Commons Attribution-ShareAlike 3.0
+Unported License. To view a copy of this license, visit
+http://creativecommons.org/licenses/by-sa/3.0/.
^ permalink raw reply related [flat|nested] 61+ messages in thread
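Illustration (not part of the patch above): the parent-directory search described under "Finding top-level Manifest" could be sketched in Python roughly as follows. The function name find_top_level_manifest is hypothetical. The sketch assumes that only files literally named Manifest are considered (compressed variants are not), that IGNORE entries separating nested Manifest trees are not evaluated, and that a filesystem boundary can be detected by comparing st_dev values.

```python
import os


def find_top_level_manifest(start_dir):
    """Return the highest 'Manifest' path at or above start_dir, or None.

    Hypothetical helper: walks towards the filesystem root, remembers the
    most recent (that is, highest) Manifest seen, and stops as soon as a
    parent directory lives on a different filesystem.
    """
    current = os.path.realpath(start_dir)
    start_dev = os.stat(current).st_dev
    found = None
    while True:
        # Do not cross filesystem boundaries while searching upwards.
        if os.stat(current).st_dev != start_dev:
            break
        candidate = os.path.join(current, "Manifest")
        if os.path.isfile(candidate):
            found = candidate
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the root directory.
            break
        current = parent
    return found
```

A real tool would then have to retrace the MANIFEST entries downwards from the file found here, since sub-Manifests may use arbitrary filenames.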
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
From: Michał Górny @ 2017-11-02 19:09 UTC (permalink / raw
To: gentoo-commits
commit: c20fe701388d8394c0e957177355eb139559c84e
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Fri Oct 13 15:09:38 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Fri Oct 27 17:44:21 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=c20fe701
glep-0002: Indicate that the 'Replaces' header is multi-value
Update the description of the 'Replaces' header to account for
replacement of multiple GLEPs. This possibility is already permitted
by GLEP 1; however, GLEP 2 seems to be out of date.
Closes: https://bugs.gentoo.org/577760
glep-0002.rst | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/glep-0002.rst b/glep-0002.rst
index c73beec..be11dba 100644
--- a/glep-0002.rst
+++ b/glep-0002.rst
@@ -116,11 +116,11 @@ directions below.
yours depends on. Don't add this header if your dependent feature is
described in a Final GLEP.
-- Add a Replaces header if your GLEP obsoletes an earlier GLEP. The
- value of this header is the number of the GLEP that your new GLEP is
- replacing. Only add this header if the older GLEP is in "final" form, i.e.
- is either Accepted, Final, or Rejected. You aren't replacing an older open
- GLEP if you're submitting a competing idea.
+- Add a Replaces header if your GLEP obsoletes one or more earlier GLEPs.
+ The value of this header is the numbers of the GLEPs that your new GLEP is
+ replacing, separated by commas. Only add this header if the older GLEP is
+ in "final" form, i.e. is either Accepted, Final, or Rejected. You aren't
+ replacing an older open GLEP if you're submitting a competing idea.
- Now write your Abstract, Rationale, and other content for your GLEP,
replacing all of this gobbledygook with your own text. Be sure to adhere to
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
From: Michał Górny @ 2017-11-02 19:09 UTC (permalink / raw
To: gentoo-commits
commit: 6e82a8a1bbc6008f9997f86be6ce27715b1c486c
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 2 19:08:12 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=6e82a8a1
glep-0074: Further cleanup
glep-0074.rst | 73 ++++++++++++++++++++++++++++++++++-------------------------
1 file changed, 42 insertions(+), 31 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index eee863a..e4d6a80 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -96,13 +96,17 @@ covered by a signed top-level Manifest.
Directory tree coverage
-----------------------
-The Manifest files can also specify ``IGNORE`` entries to skip Manifest
-verification of subdirectories and/or files. The package manager can
-support injecting ignore paths to account for additional files created,
-modified or removed by user's processes that would not be ignored
-by existing rules. Files and directories starting with a dot are always
-implicitly ignored. All files that are not ignored must be covered
-by at least one of the Manifests.
+The specification provides three ways of skipping Manifest verification
+of specific files and directories (recursively):
+
+1. explicit ``IGNORE`` entries in Manifest files,
+
+2. injected ignore paths via package manager configuration,
+
+3. using names starting with a dot (``.``) which are always skipped.
+
+All files that are not ignored must be covered by at least one
+of the Manifests.
A single file may be matched by multiple identical or equivalent
Manifest entries, if and only if the entries have the same semantics,
@@ -113,14 +117,17 @@ to specify another entry for a file matching ``IGNORE``, or one of its
subdirectories.
The file entries (except for ``IGNORE``) can be specified for regular
-files only. Symbolic links are followed when opening files. It is
-an error to specify an entry for a different file type.
+files only. Symbolic links are followed when opening files
+and traversing directories. It is an error to specify an entry for
+a different file type. If the tree contains files of other types
+that are not otherwise ignored, they need to be covered by an explicit
+``IGNORE``.
All the local (non-``DIST``) files covered by a Manifest tree must
reside on the same filesystem. It is an error to specify entries
applying to files on another filesystem. If subdirectories
-of the Manifest tree reside on a different filesystem, they must
-be explicitly excluded via ``IGNORE``.
+that are not otherwise ignored reside on a different filesystem, they
+must be explicitly excluded via ``IGNORE``.
File verification
@@ -196,7 +203,8 @@ The Manifest files can specify the following tags:
``IGNORE <path>``
Ignores a subdirectory or file from Manifest checks. If the specified
path is present, it and its contents are omitted from the Manifest
- verification (always pass).
+ verification (always pass). *Path* must be a plain file or directory
+ path without a trailing slash, and must not contain wildcards.
``DATA <path> <size> <checksums>…``
Specifies a regular file subject to Manifest verification. The file
@@ -362,9 +370,9 @@ the following content::
IGNORE lost+found
IGNORE packages
MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
- ...
+ …
MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
- ...
+ …
An example modern Manifest (disregarding backwards compatibility)
for a package directory would have the following content::
@@ -476,8 +484,12 @@ files, and symbolic links to directories are followed as if they were
regular directories.
Dotfiles are implicitly ignored as that is a common notion used
-in software written for POSIX systems. All other filenames require
-explicit ``IGNORE`` lines.
+in software written for POSIX systems. All other common filenames
+require explicit ``IGNORE`` lines.
+
+An ability to inject additional ignore entries is provided to account
+for site configuration affecting the repository tree — placing
+additional files in it, skipping some of the categories from syncing.
The algorithm is restricted to work on a single filesystem. This is
mostly relevant when scanning for top-level Manifest — we do not want
@@ -485,7 +497,7 @@ to cross filesystem boundaries then. However, to ensure consistent
bidirectional behavior we need to also ban them when operating downwards
the tree.
-The directories and files on different filesystems needs to be ignored
+The directories and files on different filesystems need to be ignored
explicitly as implicitly skipping them would cause confusion.
In particular, tools might then claim that a file does not exist when
it clearly does because it was skipped due to filesystem boundaries.
@@ -736,26 +748,25 @@ Backwards Compatibility
=======================
This GLEP provides optional means of preserving backwards compatibility.
-To preserve the backwards compatibility, the following needs to be
-ensured:
+To preserve the backwards compatibility, the following needs to hold
+for the ``Manifest`` file in every package directory:
+
+- all files must be covered by the single ``Manifest`` file,
-- all files within the package directory must be covered by ``Manifest``
- file inside that package directory,
+- all distfiles used by the package must be included,
-- all distfiles used by the package must be covered by ``Manifest``
- file inside the package directory,
+- all files inside the ``files/`` subdirectory need to use
+ the ``AUX`` tag (rather than ``DATA``),
-- all files inside the ``files/`` subdirectory of a package directory
- need to be use the deprecated ``AUX`` tag (rather than ``DATA``),
+- all ``.ebuild`` files need to use the ``EBUILD`` tag,
-- all ``.ebuild`` files inside the package directory need to use
- the deprecated ``EBUILD`` tag (rather than ``DATA``),
+- the ``metadata.xml`` and ``ChangeLog`` files need to use
+ the ``MISC`` tag,
-- the Manifest files inside the package directory can be signed
- to provide authenticity verification,
+- the Manifest can be signed to provide authenticity verification,
-- an uncompressed Manifest file must exist in the package directory,
- and a compressed Manifest of identical content may be present.
+- an uncompressed Manifest must always exist, and a compressed Manifest
+ of identical content may be present.
Once the backwards compatibility is no longer a concern, the above
no longer needs to hold and the deprecated tags can be removed.
^ permalink raw reply related [flat|nested] 61+ messages in thread
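Illustration (not part of the patch above): the three ways of skipping verification listed under "Directory tree coverage", together with the new restrictions on IGNORE paths, might be checked along these lines. All names are hypothetical. The sketch assumes that paths are relative to the directory containing the Manifest and use "/" as separator, that injected ignore paths behave exactly like IGNORE entries, and that "wildcards" means the usual glob characters.

```python
def validate_ignore_path(path):
    """Reject IGNORE paths with a trailing slash or glob-style wildcards."""
    if path.endswith("/"):
        raise ValueError("IGNORE path must not have a trailing slash")
    # Assumption: these are the characters meant by 'wildcards'.
    if any(ch in path for ch in "*?["):
        raise ValueError("IGNORE path must not contain wildcards")
    return path


def is_ignored(path, ignore_paths):
    """Return True if path is skipped from Manifest verification.

    ignore_paths combines explicit IGNORE entries and paths injected
    via package manager configuration.
    """
    # Names starting with a dot are always skipped, recursively.
    if any(part.startswith(".") for part in path.split("/")):
        return True
    for ignored in ignore_paths:
        # An ignored directory covers everything below it.
        if path == ignored or path.startswith(ignored + "/"):
            return True
    return False
```

For example, with ignore_paths set to ["distfiles", "packages"], the path distfiles/foo.tar.gz is skipped while distfiles-old/foo.tar.gz is not, because matching is done on whole path components rather than on string prefixes.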
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-02 19:09 Michał Górny
From: Michał Górny @ 2017-11-02 19:09 UTC (permalink / raw
To: gentoo-commits
commit: a744fc8a88d48186b0c68c8282cd6d6d47212285
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 2 18:43:14 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 2 19:09:17 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=a744fc8a
glep-0074: Deprecate MISC and remove non-strict behavior
glep-0074.rst | 93 +++++++++++++++++++++++++++++++++++++----------------------
1 file changed, 59 insertions(+), 34 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index f256451..eee863a 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -49,16 +49,10 @@ This specification is designed with the following goals in mind:
1. It should provide means to ensure the authenticity of the complete
repository, including preventing the injection of additional files.
-2. Like the original Manifest2, the files should be split into two
- groups — files whose authenticity is critical, and those whose
- mismatch may be accepted in non-strict mode. The same classification
- should apply both to files listed in Manifests, and to stray files
- present only in the repository.
-
-3. The format should be universal enough to work both for the Gentoo
+2. The format should be universal enough to work both for the Gentoo
repository and third-party repositories of different characteristics.
-4. The Manifest files should be verifiable stand-alone, that is without
+3. The Manifest files should be verifiable stand-alone, that is without
knowing any details about the underlying repository format.
@@ -205,15 +199,9 @@ The Manifest files can specify the following tags:
verification (always pass).
``DATA <path> <size> <checksums>…``
- Specifies a file subject to obligatory Manifest verification.
- The file is required to pass verification. Used for all files directly
- affecting package manager operation (ebuilds, eclasses, profiles).
-
-``MISC <path> <size> <checksums>…``
- Specifies a file subject to non-obligatory Manifest verification.
- The package manager may ignore a verification failure if operating
- in non-strict mode. Used for files that do not affect the installed
- packages (``metadata.xml``, ``use.desc``).
+ Specifies a regular file subject to Manifest verification. The file
+ is required to pass verification. Used for all files that do not match
+ any other type.
``DIST <filename> <size> <checksums>…``
Specifies a distfile entry used to verify files fetched as part
@@ -233,6 +221,11 @@ allowed at the package directory level:
``EBUILD <filename> <size> <checksums>…``
Equivalent to the ``DATA`` type.
+``MISC <path> <size> <checksums>…``
+ Equivalent to the ``DATA`` type. Historically indicated that
+ the package manager may ignore a verification failure if operating
+ in non-strict mode. However, that behavior is deprecated.
+
``AUX <filename> <size> <checksums>…``
Equivalent to the ``DATA`` type, except that the filename is relative
to ``files/`` subdirectory.
@@ -378,11 +371,11 @@ for a package directory would have the following content::
DATA SphinxTrain-0.9.1-r1.ebuild 932 SHA256 3d3b.. SHA512 be4d..
DATA SphinxTrain-1.0.8.ebuild 912 SHA256 f681.. SHA512 0749..
+ DATA metadata.xml 664 SHA256 97c6.. SHA512 1175..
DATA files/gcc.patch 816 SHA256 b56e.. SHA512 2468..
DATA files/gcc34.patch 333 SHA256 c107.. SHA512 9919..
DIST SphinxTrain-0.9.1-beta.tar.gz 469617 SHA256 c1a4.. SHA512 1b33..
DIST sphinxtrain-1.0.8.tar.gz 8925803 SHA256 548e.. SHA512 465d..
- MISC metadata.xml 664 SHA256 97c6.. SHA512 1175..
Rationale
@@ -521,21 +514,48 @@ it. Those directories were ``distfiles``, ``local`` and ``packages``. It
could be also used to ignore VCS directories such as ``CVS``.
-Non-obligatory Manifest verification
-------------------------------------
+Non-strict Manifest verification
+--------------------------------
-While this specification recommends all tools to use strict verification
-by default, it allows declaring some files as non-obligatory like
-the original Manifest2 format did. This could be used on files that do
-not affect the normal package manager operation.
+Originally the Manifest2 format provided a special ``MISC`` tag that
+was used for ``metadata.xml`` and ``ChangeLog`` files. This tag
+indicated that the Manifest verification failures could be ignored for
+those files unless the package manager was working in strict mode.
-It aims to account for two use cases:
+The first versions of this specification continued the use of this tag.
+However, after a long debate it was decided to deprecate it along with
+the non-strict behavior, and require all files to strictly match.
-1. Stripping down files that are not strictly required to install
- packages from repository checkouts.
+Two arguments were mentioned for the usefulness of a ``MISC`` type:
-2. Accounting for automatically generated files that might be updated
- by standard tooling.
+1. being able to reduce the checkout size by stripping unnecessary
+ files out, and
+
+2. being able to update automatically generated files locally
+ without causing unnecessary verification failures.
+
+However, the usefulness of ``MISC`` in both cases is doubtful.
+
+The cases for stripping unnecessary files mostly focused around space
+savings. For this purpose, stripping ``metadata.xml`` and similar files
+has little value. It is much more common for users to strip whole
+categories which can not be handled via the ``MISC`` type, and needs
+a dedicated package manager mechanism. The same mechanism can also
+handle files that used the ``MISC`` type.
+
+The cases for autogenerated files involve such cache files
+as ``use.local.desc``. However, we can not include ``md5-cache`` there
+due to security concerns which results in inconsistent cache handling.
+Furthermore, the tools were historically modified to provide stable
+output which means that their content can not change without
+a non-``MISC`` content being changed first. This practically defeats
+the purpose of using ``MISC``.
+
+Finally, the non-strict mode could be used as a means of attack.
+The allowance of missing or modified documentation files could be used
+to spread misinformation, resulting in bad decisions made by the user.
+A modified file could also be used e.g. to exploit vulnerabilities
+of an XML parser.
Timestamp field
@@ -569,17 +589,22 @@ be not suitable to safe use.
New vs deprecated tags
----------------------
-Out of the four types defined by Manifest2, two are reused and two are
-marked deprecated.
+Out of the four types defined by Manifest2, only one is reused
+and the remaining three are replaced by a single, universal ``DATA``
+type.
-The ``DIST`` and ``MISC`` tags are reused since they can be relatively
-clearly marked into the new concept.
+The ``DIST`` tag is reused since the specification does not change
+anything with regard to distfile handling.
The ``EBUILD`` tag could potentially be reused for generic file
verification data. However, it would be confusing if all the different
data files were marked as ``EBUILD``. Therefore, an equivalent ``DATA``
type was introduced as a replacement.
+The ``MISC`` tag and the relevant non-strict mode have been removed
+as being of little value, as detailed in the `Non-strict Manifest
+verification`_ section.
+
The ``AUX`` tag is deprecated as it is redundant to ``DATA``, and has
the limiting property of implicit ``files/`` path prefix.
@@ -622,7 +647,7 @@ Normally we are not including those files to reduce the checkout size.
However, some users have shown interest in them and Infra is working
on providing them via an additional rsync module.
-If such files were injected into the repository, they would cause strict
+If such files were injected into the repository, they would cause
verification failures of Manifests. To account for this, Infra could
provide ``IGNORE`` entries to allow them to exist.
^ permalink raw reply related [flat|nested] 61+ messages in thread
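Illustration (not part of the patch above): with MISC reduced to an alias of DATA, every file entry is verified the same way, by size and by each listed checksum. A minimal Python sketch follows. The function names and the HASHLIB_NAMES table are hypothetical, only SHA256 and SHA512 are mapped, and paths are assumed to contain no whitespace.

```python
import hashlib

# Assumed mapping from Manifest hash names to hashlib algorithm names.
HASHLIB_NAMES = {"SHA256": "sha256", "SHA512": "sha512"}


def parse_entry(line):
    """Split '<TAG> <path> <size> <HASH> <hex> ...' into its parts."""
    fields = line.split()
    tag, path, size = fields[0], fields[1], int(fields[2])
    checksums = dict(zip(fields[3::2], fields[4::2]))
    return tag, path, size, checksums


def normalize(tag, path):
    """Map the deprecated tags onto the equivalent DATA entry."""
    if tag == "AUX":
        # AUX filenames are relative to the files/ subdirectory.
        return "DATA", "files/" + path
    if tag in ("EBUILD", "MISC"):
        return "DATA", path
    return tag, path


def verify_data(data, size, checksums):
    """Return True only if the size and every listed checksum match."""
    if len(data) != size:
        return False
    for name, expected in checksums.items():
        digest = hashlib.new(HASHLIB_NAMES[name], data).hexdigest()
        if digest != expected:
            return False
    return True
```

Under this sketch a mismatch is always a failure; there is no code path corresponding to the removed non-strict mode.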
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-05 21:11 Michał Górny
From: Michał Górny @ 2017-11-05 21:11 UTC (permalink / raw
To: gentoo-commits
commit: 378f8dbc158620489965f1cf5bd6abe30a5f93c6
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sun Nov 5 21:11:03 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Sun Nov 5 21:11:03 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=378f8dbc
glep-0074: More suggestions from Robin H. Johnson
glep-0074.rst | 61 ++++++++++++++++++++++++++++++++++-------------------------
1 file changed, 35 insertions(+), 26 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index e4d6a80..aae8fcf 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -125,9 +125,10 @@ that are not otherwise ignored, they need to be covered by an explicit
All the local (non-``DIST``) files covered by a Manifest tree must
reside on the same filesystem. It is an error to specify entries
-applying to files on another filesystem. If subdirectories
-that are not otherwise ignored reside on a different filesystem, they
-must be explicitly excluded via ``IGNORE``.
+applying to files on another filesystem. If files or directories that
+are not otherwise ignored reside on a different filesystem, or symbolic
+links point to targets on a different filesystem, they must
+be explicitly excluded via ``IGNORE``.
File verification
@@ -194,7 +195,7 @@ The Manifest files can specify the following tags:
to detect an outdated repository checkout as described in `Timestamp
verification`_.
-``MANIFEST <path> <size> <checksums>…``
+``MANIFEST <path> <size> <checksums>...``
Specifies a sub-Manifest. The sub-Manifest must be verified like
a regular file. If the verification succeeds, the entries from
the sub-Manifest are included for verification as described
@@ -206,12 +207,12 @@ The Manifest files can specify the following tags:
verification (always pass). *Path* must be a plain file or directory
path without a trailing slash, and must not contain wildcards.
-``DATA <path> <size> <checksums>…``
+``DATA <path> <size> <checksums>...``
Specifies a regular file subject to Manifest verification. The file
is required to pass verification. Used for all files that do not match
any other type.
-``DIST <filename> <size> <checksums>…``
+``DIST <filename> <size> <checksums>...``
Specifies a distfile entry used to verify files fetched as part
of ``SRC_URI``. The filename must match the filename used to store
the fetched file as specified in the PMS [#PMS-FETCH]_. The package
@@ -226,15 +227,15 @@ Deprecated Manifest tags
For backwards compatibility, the following tags are additionally
allowed at the package directory level:
-``EBUILD <filename> <size> <checksums>…``
+``EBUILD <filename> <size> <checksums>...``
Equivalent to the ``DATA`` type.
-``MISC <path> <size> <checksums>…``
+``MISC <path> <size> <checksums>...``
Equivalent to the ``DATA`` type. Historically indicated that
the package manager may ignore a verification failure if operating
in non-strict mode. However, that behavior is deprecated.
-``AUX <filename> <size> <checksums>…``
+``AUX <filename> <size> <checksums>...``
Equivalent to the ``DATA`` type, except that the filename is relative
to ``files/`` subdirectory.
@@ -314,13 +315,13 @@ of supported algorithms is outside the scope of this specification.
The algorithm names reserved at the time of writing are:
- ``MD5`` [#MD5]_,
-- ``RMD160`` — RIPEMD-160 [#RIPEMD160]_,
+- ``RMD160`` -- RIPEMD-160 [#RIPEMD160]_,
- ``SHA1`` [#SHS]_,
-- ``SHA256`` and ``SHA512`` — SHA-2 family of hashes [#SHS]_,
+- ``SHA256`` and ``SHA512`` -- SHA-2 family of hashes [#SHS]_,
- ``WHIRLPOOL`` [#WHIRLPOOL]_,
-- ``BLAKE2B`` and ``BLAKE2S`` — BLAKE2 family of hashes [#BLAKE2]_,
-- ``SHA3_256`` and ``SHA3_512`` — SHA-3 family of hashes [#SHA3]_,
-- ``STREEBOG256`` and ``STREEBOG512`` — Streebog family of hashes
+- ``BLAKE2B`` and ``BLAKE2S`` -- BLAKE2 family of hashes [#BLAKE2]_,
+- ``SHA3_256`` and ``SHA3_512`` -- SHA-3 family of hashes [#SHA3]_,
+- ``STREEBOG256`` and ``STREEBOG512`` -- Streebog family of hashes
[#STREEBOG]_.
The method of introducing new hashes is defined by GLEP 59 [#GLEP59]_.
@@ -370,9 +371,9 @@ the following content::
IGNORE lost+found
IGNORE packages
MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
- …
+ ...
MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
- …
+ ...
An example modern Manifest (disregarding backwards compatibility)
for a package directory would have the following content::
@@ -484,15 +485,17 @@ files, and symbolic links to directories are followed as if they were
regular directories.
Dotfiles are implicitly ignored as that is a common notion used
-in software written for POSIX systems. All other common filenames
-require explicit ``IGNORE`` lines.
+in software written for POSIX systems. All other filenames require
+explicit ``IGNORE`` lines.
An ability to inject additional ignore entries is provided to account
-for site configuration affecting the repository tree — placing
+for site configuration affecting the repository tree -- placing
additional files in it, skipping some of the categories from syncing.
+This configuration can extend beyond the limits of this GLEP,
+e.g. by allowing wildcards or regular expressions.
The algorithm is restricted to work on a single filesystem. This is
-mostly relevant when scanning for top-level Manifest — we do not want
+mostly relevant when scanning for top-level Manifest -- we do not want
to cross filesystem boundaries then. However, to ensure consistent
bidirectional behavior we need to also ban them when operating downwards
the tree.
@@ -551,9 +554,11 @@ However, the usefulness of ``MISC`` in both cases is doubtful.
The cases for stripping unnecessary files mostly focused around space
savings. For this purpose, stripping ``metadata.xml`` and similar files
has little value. It is much more common for users to strip whole
-categories which can not be handled via the ``MISC`` type, and needs
-a dedicated package manager mechanism. The same mechanism can also
-handle files that used the ``MISC`` type.
+packages or categories. The ``MISC`` type is not suitable for that,
+and so a dedicated package manager mechanism needs to be developed
+instead; possibly combining it with rsync exclusion list. The same
+mechanism can also handle files that historically used the ``MISC``
+type.
The cases for autogenerated files involve such cache files
as ``use.local.desc``. However, we can not include ``md5-cache`` there
@@ -673,8 +678,8 @@ in a single file inside the package directory. It has been specifically
pointed out that:
- since distfiles are sometimes reused across different packages,
- the repeating checksums are redundant,
-
+ the repeating checksums are redundant [#DIST]_.
+
- mirror admins were interested in the possibility of verifying all
the distfiles with a single tool.
@@ -833,7 +838,7 @@ References
.. [#WHIRLPOOL] The WHIRLPOOL Hash Function
(http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
-.. [#BLAKE2] BLAKE2 — fast secure hashing
+.. [#BLAKE2] BLAKE2 -- fast secure hashing
(https://blake2.net/)
.. [#SHA3] FIPS PUB 202: SHA-3 Standard: Permutation-Based Hash
@@ -846,6 +851,10 @@ References
.. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
(https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
+.. [#DIST] According to Robin H. Johnson, 8.4% of all DIST entries
+ at the time of writing are duplicate, representing a 2 MiB
+ out of 25 MiB of DIST entries altogether.
+
.. [#GEMATO] gemato: Gentoo Manifest Tool
(https://github.com/mgorny/gemato/)
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-06 21:54 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-11-06 21:54 UTC
To: gentoo-commits
commit: 83c93bcc64889ebcf3289979ffedd39fc892f72e
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sun Nov 5 21:11:03 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 6 21:51:49 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=83c93bcc
glep-0074: More suggestions from Robin H. Johnson
glep-0074.rst | 64 ++++++++++++++++++++++++++++++++++-------------------------
1 file changed, 37 insertions(+), 27 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index e4d6a80..86b2361 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,7 +8,7 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-10-30
+Last-Modified: 2017-11-06
Post-History: 2017-10-26
Content-Type: text/x-rst
Requires: 59, 61
@@ -125,9 +125,10 @@ that are not otherwise ignored, they need to be covered by an explicit
All the local (non-``DIST``) files covered by a Manifest tree must
reside on the same filesystem. It is an error to specify entries
-applying to files on another filesystem. If subdirectories
-that are not otherwise ignored reside on a different filesystem, they
-must be explicitly excluded via ``IGNORE``.
+applying to files on another filesystem. If files or directories that
+are not otherwise ignored reside on a different filesystem, or symbolic
+links point to targets on a different filesystem, they must
+be explicitly excluded via ``IGNORE``.
File verification
@@ -194,7 +195,7 @@ The Manifest files can specify the following tags:
to detect an outdated repository checkout as described in `Timestamp
verification`_.
-``MANIFEST <path> <size> <checksums>…``
+``MANIFEST <path> <size> <checksums>...``
Specifies a sub-Manifest. The sub-Manifest must be verified like
a regular file. If the verification succeeds, the entries from
the sub-Manifest are included for verification as described
@@ -206,12 +207,12 @@ The Manifest files can specify the following tags:
verification (always pass). *Path* must be a plain file or directory
path without a trailing slash, and must not contain wildcards.
-``DATA <path> <size> <checksums>…``
+``DATA <path> <size> <checksums>...``
Specifies a regular file subject to Manifest verification. The file
is required to pass verification. Used for all files that do not match
any other type.
-``DIST <filename> <size> <checksums>…``
+``DIST <filename> <size> <checksums>...``
Specifies a distfile entry used to verify files fetched as part
of ``SRC_URI``. The filename must match the filename used to store
the fetched file as specified in the PMS [#PMS-FETCH]_. The package
@@ -226,15 +227,15 @@ Deprecated Manifest tags
For backwards compatibility, the following tags are additionally
allowed at the package directory level:
-``EBUILD <filename> <size> <checksums>…``
+``EBUILD <filename> <size> <checksums>...``
Equivalent to the ``DATA`` type.
-``MISC <path> <size> <checksums>…``
+``MISC <path> <size> <checksums>...``
Equivalent to the ``DATA`` type. Historically indicated that
the package manager may ignore a verification failure if operating
in non-strict mode. However, that behavior is deprecated.
-``AUX <filename> <size> <checksums>…``
+``AUX <filename> <size> <checksums>...``
Equivalent to the ``DATA`` type, except that the filename is relative
to ``files/`` subdirectory.
@@ -314,13 +315,13 @@ of supported algorithms is outside the scope of this specification.
The algorithm names reserved at the time of writing are:
- ``MD5`` [#MD5]_,
-- ``RMD160`` — RIPEMD-160 [#RIPEMD160]_,
+- ``RMD160`` -- RIPEMD-160 [#RIPEMD160]_,
- ``SHA1`` [#SHS]_,
-- ``SHA256`` and ``SHA512`` — SHA-2 family of hashes [#SHS]_,
+- ``SHA256`` and ``SHA512`` -- SHA-2 family of hashes [#SHS]_,
- ``WHIRLPOOL`` [#WHIRLPOOL]_,
-- ``BLAKE2B`` and ``BLAKE2S`` — BLAKE2 family of hashes [#BLAKE2]_,
-- ``SHA3_256`` and ``SHA3_512`` — SHA-3 family of hashes [#SHA3]_,
-- ``STREEBOG256`` and ``STREEBOG512`` — Streebog family of hashes
+- ``BLAKE2B`` and ``BLAKE2S`` -- BLAKE2 family of hashes [#BLAKE2]_,
+- ``SHA3_256`` and ``SHA3_512`` -- SHA-3 family of hashes [#SHA3]_,
+- ``STREEBOG256`` and ``STREEBOG512`` -- Streebog family of hashes
[#STREEBOG]_.
The method of introducing new hashes is defined by GLEP 59 [#GLEP59]_.
@@ -370,9 +371,9 @@ the following content::
IGNORE lost+found
IGNORE packages
MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
- …
+ ...
MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
- …
+ ...
An example modern Manifest (disregarding backwards compatibility)
for a package directory would have the following content::
@@ -484,15 +485,17 @@ files, and symbolic links to directories are followed as if they were
regular directories.
Dotfiles are implicitly ignored as that is a common notion used
-in software written for POSIX systems. All other common filenames
-require explicit ``IGNORE`` lines.
+in software written for POSIX systems. All other filenames require
+explicit ``IGNORE`` lines.
An ability to inject additional ignore entries is provided to account
-for site configuration affecting the repository tree — placing
+for site configuration affecting the repository tree -- placing
additional files in it, skipping some of the categories from syncing.
+This configuration can extend beyond the limits of this GLEP,
+e.g. by allowing wildcards or regular expressions.
The algorithm is restricted to work on a single filesystem. This is
-mostly relevant when scanning for top-level Manifest — we do not want
+mostly relevant when scanning for top-level Manifest -- we do not want
to cross filesystem boundaries then. However, to ensure consistent
bidirectional behavior we need to also ban them when operating downwards
the tree.
@@ -551,9 +554,12 @@ However, the usefulness of ``MISC`` in both cases is doubtful.
The cases for stripping unnecessary files mostly focused around space
savings. For this purpose, stripping ``metadata.xml`` and similar files
has little value. It is much more common for users to strip whole
-categories which can not be handled via the ``MISC`` type, and needs
-a dedicated package manager mechanism. The same mechanism can also
-handle files that used the ``MISC`` type.
+packages or categories. The ``MISC`` type is not suitable for that,
+and so a dedicated package manager mechanism needs to be developed
+instead. The same mechanism can also handle files that historically used
+the ``MISC`` type. As an example, the package manager may choose
+to generate both the rsync exclusion list and Manifest ignore list
+using a single source list.
The cases for autogenerated files involve such cache files
as ``use.local.desc``. However, we can not include ``md5-cache`` there
@@ -673,8 +679,8 @@ in a single file inside the package directory. It has been specifically
pointed out that:
- since distfiles are sometimes reused across different packages,
- the repeating checksums are redundant,
-
+ the repeating checksums are redundant [#DIST]_.
+
- mirror admins were interested in the possibility of verifying all
the distfiles with a single tool.
@@ -833,7 +839,7 @@ References
.. [#WHIRLPOOL] The WHIRLPOOL Hash Function
(http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
-.. [#BLAKE2] BLAKE2 — fast secure hashing
+.. [#BLAKE2] BLAKE2 -- fast secure hashing
(https://blake2.net/)
.. [#SHA3] FIPS PUB 202: SHA-3 Standard: Permutation-Based Hash
@@ -846,6 +852,10 @@ References
.. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
(https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
+.. [#DIST] According to Robin H. Johnson, 8.4% of all DIST entries
+ at the time of writing are duplicate, representing a 2 MiB
+ out of 25 MiB of DIST entries altogether.
+
.. [#GEMATO] gemato: Gentoo Manifest Tool
(https://github.com/mgorny/gemato/)
* [gentoo-commits] data/glep:master commit in: /
@ 2017-11-13 16:08 Michał Górny
2017-11-13 17:35 ` [gentoo-commits] data/glep:glep-manifest " Michał Górny
0 siblings, 1 reply; 61+ messages in thread
From: Michał Górny @ 2017-11-13 16:08 UTC
To: gentoo-commits
commit: eccf28560c75997f7a4fbbde84f8cf11de1245e4
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 13 16:01:40 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:07:54 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=eccf2856
glep-0066: Mark Final per 2017-11-12 Council meeting
glep-0066.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/glep-0066.rst b/glep-0066.rst
index cc284d3..a605cf2 100644
--- a/glep-0066.rst
+++ b/glep-0066.rst
@@ -3,10 +3,10 @@ GLEP: 66
Title: Gentoo Git Workflow
Author: Michał Górny <mgorny@gentoo.org>
Type: Standards Track
-Status: Draft
+Status: Final
Version: 1
Created: 2017-07-24
-Last-Modified: 2017-10-14
+Last-Modified: 2017-11-13
Post-History: 2017-07-25, 2017-09-28, 2017-10-11
Content-Type: text/x-rst
---
* [gentoo-commits] data/glep:glep-manifest commit in: /
2017-11-13 16:08 [gentoo-commits] data/glep:master " Michał Górny
@ 2017-11-13 17:35 ` Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: eccf28560c75997f7a4fbbde84f8cf11de1245e4
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 13 16:01:40 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:07:54 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=eccf2856
glep-0066: Mark Final per 2017-11-12 Council meeting
glep-0066.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/glep-0066.rst b/glep-0066.rst
index cc284d3..a605cf2 100644
--- a/glep-0066.rst
+++ b/glep-0066.rst
@@ -3,10 +3,10 @@ GLEP: 66
Title: Gentoo Git Workflow
Author: Michał Górny <mgorny@gentoo.org>
Type: Standards Track
-Status: Draft
+Status: Final
Version: 1
Created: 2017-07-24
-Last-Modified: 2017-10-14
+Last-Modified: 2017-11-13
Post-History: 2017-07-25, 2017-09-28, 2017-10-11
Content-Type: text/x-rst
---
* [gentoo-commits] data/glep:master commit in: /
@ 2017-11-13 16:08 Michał Górny
2017-11-13 17:35 ` [gentoo-commits] data/glep:glep-manifest " Michał Górny
0 siblings, 1 reply; 61+ messages in thread
From: Michał Górny @ 2017-11-13 16:08 UTC
To: gentoo-commits
commit: 5296812ec1b0d8155480261a49120b2b9347bd0f
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Wed Oct 25 07:18:17 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:07:56 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=5296812e
glep-0065: Apply suggestions from Michael Orlitzky
glep-0065.rst | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/glep-0065.rst b/glep-0065.rst
index a8a7321..af641d7 100644
--- a/glep-0065.rst
+++ b/glep-0065.rst
@@ -18,9 +18,9 @@ This GLEP provides two kinds of QA check: checks run on the installation image
once ``src_install`` returns, and checks run on the live system once
``pkg_postinst`` returns. The checks can be provided by the Package Manager,
repositories, packages (installed system-wide) and the system administrator.
-The QA checks can inspect the installation image or live system respectively,
-output and store both user- and machine-oriented QA warning logs, manipulate
-the files and abort the install, as necessary.
+The QA checks can inspect the installation image or live system, output
+and store both user- and machine-oriented QA warning logs, manipulate files
+and abort the install.
Motivation
@@ -34,8 +34,9 @@ the installed files. This is where post-install QA checks become useful.
Over time, many different QA checks have been added to Portage. That includes
checks corresponding to generic Gentoo rules (like filesystem hierarchy,
-security requirements), checks enforcing Gentoo team policies and correct
-eclass uses. Some of the checks depend on external tools being present.
+security requirements), checks enforcing Gentoo team policies, and checks
+enforcing correct eclass usage. Some of the checks depend on external tools
+being present.
Keeping those checks directly in Portage sources has two major disadvantages:
@@ -58,7 +59,7 @@ There are two kinds of QA checks defined in this specification:
1. Post-install QA checks (``install-qa-check.d``),
-2. Post-merge (postinst) QA checks (``postinst-qa-check.d``).
+2. Post-merge QA checks (``postinst-qa-check.d``).
The post-install QA checks are are executed after the ``src_install`` ebuild
phase finishes but before the binary package is built or the ``pkg_preinst``
@@ -117,7 +118,7 @@ run in an isolated subshell, and therefore can safely alter the environment
and change the working directory. QA scripts must always end with a command
terminating with a successful exit code.
-Aside to the standard PMS functions, two additional commands are provided:
+Aside from the standard PMS functions, two additional commands are provided:
1. ``eqawarn`` to output QA warnings to user,
2. ``eqatag`` to store machine-readable information about QA issues.
@@ -150,8 +151,8 @@ Synopsis
Tag the package with specific QA issues. The *tag* parameter is
a well-defined name identifying specific QA issue. The tag can be additionally
associated with some data in key-value form and/or one or more *files*.
-The file paths are relative to installation image (``${D}``), and need to
-start with a leading slash.
+The file paths are relative to the installation root (``${D}`` in post-install
+checks or ``${ROOT}`` in post-merge), and need to start with a leading slash.
If ``-v`` (verbose) parameter is passed, the function will also output
newline-delimited list of files using ``eqawarn``. This is intended
@@ -181,7 +182,7 @@ account for various problems caused by the ebuild code up to and including
``src_install``, the upstream code executed as part of any of those phases
and the supplied files.
-Post-postinst QA checks can be used to verify the state of system after
+Post-merge QA checks can be used to verify the state of system after
the package is merged and its ``pkg_postinst`` phase is executed. They mostly
aim to detect missing postinst actions but can do other live system integrity
checks.
* [gentoo-commits] data/glep:glep-manifest commit in: /
2017-11-13 16:08 [gentoo-commits] data/glep:master " Michał Górny
@ 2017-11-13 17:35 ` Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 5296812ec1b0d8155480261a49120b2b9347bd0f
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Wed Oct 25 07:18:17 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:07:56 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=5296812e
glep-0065: Apply suggestions from Michael Orlitzky
glep-0065.rst | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/glep-0065.rst b/glep-0065.rst
index a8a7321..af641d7 100644
--- a/glep-0065.rst
+++ b/glep-0065.rst
@@ -18,9 +18,9 @@ This GLEP provides two kinds of QA check: checks run on the installation image
once ``src_install`` returns, and checks run on the live system once
``pkg_postinst`` returns. The checks can be provided by the Package Manager,
repositories, packages (installed system-wide) and the system administrator.
-The QA checks can inspect the installation image or live system respectively,
-output and store both user- and machine-oriented QA warning logs, manipulate
-the files and abort the install, as necessary.
+The QA checks can inspect the installation image or live system, output
+and store both user- and machine-oriented QA warning logs, manipulate files
+and abort the install.
Motivation
@@ -34,8 +34,9 @@ the installed files. This is where post-install QA checks become useful.
Over time, many different QA checks have been added to Portage. That includes
checks corresponding to generic Gentoo rules (like filesystem hierarchy,
-security requirements), checks enforcing Gentoo team policies and correct
-eclass uses. Some of the checks depend on external tools being present.
+security requirements), checks enforcing Gentoo team policies, and checks
+enforcing correct eclass usage. Some of the checks depend on external tools
+being present.
Keeping those checks directly in Portage sources has two major disadvantages:
@@ -58,7 +59,7 @@ There are two kinds of QA checks defined in this specification:
1. Post-install QA checks (``install-qa-check.d``),
-2. Post-merge (postinst) QA checks (``postinst-qa-check.d``).
+2. Post-merge QA checks (``postinst-qa-check.d``).
The post-install QA checks are are executed after the ``src_install`` ebuild
phase finishes but before the binary package is built or the ``pkg_preinst``
@@ -117,7 +118,7 @@ run in an isolated subshell, and therefore can safely alter the environment
and change the working directory. QA scripts must always end with a command
terminating with a successful exit code.
-Aside to the standard PMS functions, two additional commands are provided:
+Aside from the standard PMS functions, two additional commands are provided:
1. ``eqawarn`` to output QA warnings to user,
2. ``eqatag`` to store machine-readable information about QA issues.
@@ -150,8 +151,8 @@ Synopsis
Tag the package with specific QA issues. The *tag* parameter is
a well-defined name identifying specific QA issue. The tag can be additionally
associated with some data in key-value form and/or one or more *files*.
-The file paths are relative to installation image (``${D}``), and need to
-start with a leading slash.
+The file paths are relative to the installation root (``${D}`` in post-install
+checks or ``${ROOT}`` in post-merge), and need to start with a leading slash.
If ``-v`` (verbose) parameter is passed, the function will also output
newline-delimited list of files using ``eqawarn``. This is intended
@@ -181,7 +182,7 @@ account for various problems caused by the ebuild code up to and including
``src_install``, the upstream code executed as part of any of those phases
and the supplied files.
-Post-postinst QA checks can be used to verify the state of system after
+Post-merge QA checks can be used to verify the state of system after
the package is merged and its ``pkg_postinst`` phase is executed. They mostly
aim to detect missing postinst actions but can do other live system integrity
checks.
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: ae58fa356094a90ee1f2fffee6a8f94fcc847054
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 13 16:06:58 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:07:56 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=ae58fa35
glep-0065: Mark as Accepted per 2017-11-12 meeting
glep-0065.rst | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/glep-0065.rst b/glep-0065.rst
index af641d7..3158ad6 100644
--- a/glep-0065.rst
+++ b/glep-0065.rst
@@ -3,14 +3,21 @@ GLEP: 65
Title: Post-install QA checks
Author: Michał Górny <mgorny@gentoo.org>
Type: Standards Track
-Status: Draft
+Status: Accepted
Version: 2
Created: 2014-10-26
-Last-Modified: 2017-10-17
+Last-Modified: 2017-11-13
Post-History: 2014-10-30, 2017-10-17
Content-Type: text/x-rst
---
+Status
+======
+
+This GLEP has been accepted on the 2017-11-12 Council meeting. However,
+full tree signing needs to be deployed before it is implemented.
+
+
Abstract
========
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
0 siblings, 0 replies; 61+ messages in thread
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: a0b5ca20ae53c8867b45d734cfe25d31de738dbe
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Tue Oct 17 18:04:37 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:07:56 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=a0b5ca20
glep-0065: Provide post-postinst QA checks
glep-0065.rst | 139 +++++++++++++++++++++++++++++++++++++---------------------
1 file changed, 89 insertions(+), 50 deletions(-)
diff --git a/glep-0065.rst b/glep-0065.rst
index e628184..a8a7321 100644
--- a/glep-0065.rst
+++ b/glep-0065.rst
@@ -4,22 +4,23 @@ Title: Post-install QA checks
Author: Michał Górny <mgorny@gentoo.org>
Type: Standards Track
Status: Draft
-Version: 1
+Version: 2
Created: 2014-10-26
-Last-Modified: 2014-12-14
-Post-History: 2014-10-30
+Last-Modified: 2017-10-17
+Post-History: 2014-10-30, 2017-10-17
Content-Type: text/x-rst
---
Abstract
========
-This GLEP provides a mechanism for running QA checks on installation image
-after ``src_install`` phase exits. The checks can be provided by the Package
-Manager, repositories, packages (installed system-wide) and the system
-administrator. The QA checks can inspect the installation image, output and
-store both user- and machine-oriented QA warning logs, manipulate the files
-and abort the install, as necessary.
+This GLEP provides two kinds of QA check: checks run on the installation image
+once ``src_install`` returns, and checks run on the live system once
+``pkg_postinst`` returns. The checks can be provided by the Package Manager,
+repositories, packages (installed system-wide) and the system administrator.
+The QA checks can inspect the installation image or live system respectively,
+output and store both user- and machine-oriented QA warning logs, manipulate
+the files and abort the install, as necessary.
Motivation
@@ -39,7 +40,7 @@ eclass uses. Some of the checks depend on external tools being present.
Keeping those checks directly in Portage sources has two major disadvantages:
1. The checks can not be properly updated without Portage upgrade.
- In particular, a change in QA check becomes fully effective when
+ In particular, a change in a QA check becomes fully effective when
the relevant Portage version becomes stable and the user upgrades.
There is no easy way to keep QA checks in sync with eclasses.
@@ -50,14 +51,41 @@ Keeping those checks directly in Portage sources has two major disadvantages:
Specification
=============
-QA check format & locations
----------------------------
+QA check types
+--------------
+
+There are two kinds of QA checks defined in this specification:
+
+1. Post-install QA checks (``install-qa-check.d``),
+
+2. Post-merge (postinst) QA checks (``postinst-qa-check.d``).
+
+The post-install QA checks are are executed after the ``src_install`` ebuild
+phase finishes but before the binary package is built or the ``pkg_preinst``
+phase is executed. They can use the same commands as are permitted
+in ``src_install``, and access the installation image ``${D}``
+and the temporary directory ``${T}``.
+
+In case of severe QA issues, the checks are allowed to alter the contents of
+the installation image in order to sanitize them, or call the ``die`` function
+to abort the build.
+
+The post-merge QA checks are executed after the ``pkg_postinst`` ebuild phase
+finishes. They can use the same commands as are permitted in ``pkg_postinst``,
+and access the installed system location ``${ROOT}`` and the temporary
+directory ``${T}``.
+
+The checks are allowed to alter the contents of the filesystem to the same
+degree as ``pkg_postinst`` phase is. They must not call ``die``.
+
+QA check file format & locations
+--------------------------------
QA checks are stored as bash scripts. The checks are identified and ordered
by file name. If files with same names are present in multiple locations,
the file in location with the highest priority is used.
-The specification defines four types of QA checks, listed in the order
+The specification defines four sources of QA checks, listed in the order
of increasing priority:
1. internal checks included in the Package Manager,
@@ -71,13 +99,15 @@ generic checks are included in the Package Manager and not checks specific to
Gentoo policies, packages or eclasses included in Gentoo.
Repository-specific QA checks are included in ``metadata/install-qa-check.d``
-directory of a repository. For an ebuild in question, the repository
-containing it and its masters are traversed for QA checks, with priority
-decreasing with each inheritance level.
+and ``metadata/postinst-qa-check.d`` directories of a repository.
+For an ebuild in question, the repository containing it and its masters are
+traversed for QA checks, with priority decreasing with each inheritance level.
The package-installed QA checks are located in ``/usr/lib/install-qa-check.d``
-and are intended to be installed by packages. The sysadmin-defined QA checks
-are located in ``/usr/local/lib/install-qa-check.d``.
+and ``/usr/lib/postinst-qa-check.d``, and are intended to be installed
+by packages. The sysadmin-defined QA checks are located
+in ``/usr/local/lib/install-qa-check.d``
+and ``/usr/local/lib/postinst-qa-check.d``.
QA check script format
----------------------
@@ -87,19 +117,11 @@ run in an isolated subshell, and therefore can safely alter the environment
and change the working directory. QA scripts must always end with a command
terminating with a successful exit code.
-The QA checks are executed after the ``src_install`` ebuild phase finishes
-and before the binary package is built or the ``pkg_preinst`` phase is
-executed. They can use the same commands as allowed in ``src_install``,
-and use the installation image ``${D}`` and the temporary directory ``${T}``.
-Aside to standard PMS functions, two additional commands are provided:
+Aside to the standard PMS functions, two additional commands are provided:
1. ``eqawarn`` to output QA warnings to user,
2. ``eqatag`` to store machine-readable information about QA issues.
-In case of severe QA issues, the checks are allowed to alter the contents of
-the installation image in order to sanitize them, or call the ``die`` function
-to abort the build.
-
Repository-defined QA checks are allowed to ``inherit`` eclasses from
the repository providing the check or any of its masters. The same
inheritance rules apply as to ebuilds in the particular repository. Sourced
@@ -147,37 +169,54 @@ the tags used by ``60bash-completion`` check would be named
Rationale
=========
-QA check format & locations
----------------------------
+QA check types
+--------------
+
+The two types of QA checks were created to account for different kinds
+of common mistakes in ebuilds.
+
+Post-install QA checks can be used to verify the installation image before
+it is merged to a live system or published as a binary package. They can
+account for various problems caused by the ebuild code up to and including
+``src_install``, the upstream code executed as part of any of those phases
+and the supplied files.
+
+Post-postinst QA checks can be used to verify the state of the system after
+the package is merged and its ``pkg_postinst`` phase is executed. They mostly
+aim to detect missing postinst actions but can do other live system integrity
+checks.
+
+QA check file format & locations
+--------------------------------
The multiple locations for QA checks aim to get the best coverage for various
requirements.
-The checks installed along with the Package Manager are meant to cover the
-generic cases and other checks that rely on Package Manager internals. Unlike
-other categories of QA checks, those checks apply to a single Package Manager
-only and can therefore use internal API. However, it is recommended that this
-category is used scarcely.
+The checks installed along with the Package Manager are meant to cover
+the generic cases and other checks that rely on Package Manager internals.
+Unlike other categories of QA checks, those checks apply to a single Package
+Manager only and can therefore use internal API. However, it is recommended
+that this category is used sparingly.
Storing checks in the repository allows developers to strictly bind them to
a specific version of the distribution and update them along with the relevant
-policies and/or eclasses. In particular, rules enforced by Gentoo policies and
-eclasses don't have to apply to other distributions using Portage.
+policies and/or eclasses. In particular, rules enforced by Gentoo policies
+and eclasses don't have to apply to other distributions using Portage.
The QA checks are applied to sub-repositories (via ``masters`` attribute)
-likewise eclasses. This makes sure that the common repositories don't lose QA
-checks. The QA checks related to eclasses are inherited the same way as
-eclasses are. Similarly to eclasses, sub-repositories can override (or
-disable) QA checks.
+likewise eclasses. This makes sure that the majority of repositories don't
+lose QA checks. The QA checks related to eclasses are inherited the same way
+as eclasses are. Similarly to eclasses, sub-repositories can override
+(or disable) QA checks.
System-wide QA checks present the opportunity of installing QA checks along
with packages. In the past, some QA checks were run only conditionally
-depending on existence of external checker software. Instead, the software can
-install its own QA checks directly.
+depending on existence of external checker software. Instead, the software
+packages can install their own QA checks directly.
-The administrative override via ``/usr/local`` is a natural extension of
-system-wide QA checks. Additionally, it can be used by the sysadmin to
-override or disable practically any other QA check, either internal Portage
+The administrative override via ``/usr/local`` is a natural extension
+of system-wide QA checks. Additionally, it can be used by the sysadmin
+to override or disable practically any other QA check, either internal Portage
or repository-wide.
Sharing the QA checks has the additional advantage of having unified QA tools
@@ -186,9 +225,8 @@ for all Package Managers.
QA check script format
----------------------
-Use of bash is aimed to match the ebuild format at ''src_install'' phase.
-The choice of functions aims at providing portability between Package
-Managers.
+Use of bash is aimed to match the ebuild format. The choice of functions aims
+at portability between Package Managers.
The scripts are run in isolated subshell to simplify the checks and reduce
the risk of accidental cross-script issues.
@@ -289,8 +327,9 @@ be used from the repository anyway.
Reference implementation
========================
-The reference implementation is available in Portage starting with version
-2.2.15 (released 2014-12-04).
+The reference implementation of ``install-qa-check.d`` is available in Portage
+starting with version 2.2.15 (released 2014-12-04). The support
+for ``postinst-qa-check.d`` was added in 2.3.9 (released 2017-09-19).
Copyright
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: e0079a728e8eb4bd44015d7cc54b95ec8b0b226d
Author: Ulrich Müller <ulm <AT> gentoo <DOT> org>
AuthorDate: Sun Nov 12 21:13:52 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Sun Nov 12 21:13:52 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=e0079a72
glep-0008: Mark as Moribund.
Bug: https://bugs.gentoo.org/634100
glep-0008.rst | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/glep-0008.rst b/glep-0008.rst
index 695a5fa..32781c2 100644
--- a/glep-0008.rst
+++ b/glep-0008.rst
@@ -4,10 +4,10 @@ Title: Adopt-A-Developer
Author: Brian Jackson <iggy@gentoo.org>,
Thomas Cort <tcort@gentoo.org>
Type: Standards Track
-Status: Final
+Status: Moribund
Version: 1
Created: 2003-07-02
-Last-Modified: 2014-01-15
+Last-Modified: 2017-11-12
Post-History: 2003-07-09, 2004-04-04, 2006-09-03
Content-Type: text/x-rst
---
@@ -15,10 +15,7 @@ Content-Type: text/x-rst
Status
======
-Reactivated by tcort, now existing at
-http://www.gentoo.org/proj/en/userrel/adopt-a-dev/.
-Since the community has bought into it, GLEP editor g2boojum
-has marked it as "final".
+Marked as Moribund by decision of the Gentoo Council on 2017-11-12.
Credits
=======
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 1fc314fbfc8678058f3d3bef534fed00435ff586
Author: Ulrich Müller <ulm <AT> gentoo <DOT> org>
AuthorDate: Sun Nov 12 21:14:18 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Sun Nov 12 21:14:18 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=1fc314fb
glep-0036: Mark as Moribund.
Bug: https://bugs.gentoo.org/634100
glep-0036.rst | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/glep-0036.rst b/glep-0036.rst
index 64f6a79..5b3f414 100644
--- a/glep-0036.rst
+++ b/glep-0036.rst
@@ -3,14 +3,19 @@ GLEP: 36
Title: Subversion/CVS for Gentoo Hosted Projects
Author: Aaron Walker <ka0ttic@gentoo.org>
Type: Standards Track
-Status: Final
+Status: Moribund
Version: 1
Created: 2004-11-11
-Last-Modified: 2014-01-21
+Last-Modified: 2017-11-12
Post-History: 2005-03-13, 2005-03-21
Content-Type: text/x-rst
---
+Status
+======
+
+Marked as Moribund by decision of the Gentoo Council on 2017-11-12.
+
Abstract
========
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 7d14c52af5c4abc429a85184a71f1360b1ba41a6
Author: Ulrich Müller <ulm <AT> gentoo <DOT> org>
AuthorDate: Sun Nov 12 21:14:39 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Sun Nov 12 21:14:39 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=7d14c52a
glep-0059: Mark as Final.
Bug: https://bugs.gentoo.org/634100
glep-0059.rst | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/glep-0059.rst b/glep-0059.rst
index a44c70f..17b7540 100644
--- a/glep-0059.rst
+++ b/glep-0059.rst
@@ -3,15 +3,20 @@ GLEP: 59
Title: Manifest2 hash policies and security implications
Author: Robin Hugh Johnson <robbat2@gentoo.org>
Type: Standards Track
-Status: Accepted
+Status: Final
Version: 1
Created: 2008-10-22
-Last-Modified: 2014-01-23
+Last-Modified: 2017-11-12
Post-History: 2009-12-01, 2010-01-31
Content-Type: text/x-rst
Requires: 44
---
+Status
+======
+Implementation is complete. Marked as Final by decision of the Gentoo
+Council on 2017-11-12.
+
Abstract
========
While Manifest2 format allows multiple hashes, the question of which
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 2e0621bfffc48ca8f842215f39f1bf218f708902
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sat Oct 28 11:49:39 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=2e0621bf
glep-0074: Update based on feedback from Robin H. Johnson
glep-0074.rst | 66 ++++++++++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 52 insertions(+), 14 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index e9f8bad..425381f 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,7 +8,7 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-10-26
+Last-Modified: 2017-10-29
Post-History: 2017-10-26
Content-Type: text/x-rst
Requires: 59, 61
@@ -49,7 +49,7 @@ This specification is designed with the following goals in mind:
1. It should provide means to ensure the authenticity of the complete
repository, including preventing the injection of additional files.
-2. Alike the original Manifest2, the files should be split into two
+2. Like the original Manifest2, the files should be split into two
groups — files whose authenticity is critical, and those whose
mismatch may be accepted in non-strict mode. The same classification
should apply both to files listed in Manifests, and to stray files
@@ -115,11 +115,11 @@ The file entries (except for ``IGNORE``) can be specified for regular
files only. Symbolic links are followed when opening files. It is
an error to specify an entry for a different file type.
-All the files covered by a Manifest tree must reside on the same
-filesystem. It is an error to specify entries applying to files
-on another filesystem. If subdirectories of the Manifest tree reside
-on a different filesystem, they must be explicitly excluded
-via ``IGNORE``.
+All the local (non-``DIST``) files covered by a Manifest tree must
+reside on the same filesystem. It is an error to specify entries
+applying to files on another filesystem. If subdirectories
+of the Manifest tree reside on a different filesystem, they must
+be explicitly excluded via ``IGNORE``.
File verification
@@ -156,7 +156,8 @@ The Manifest files can specify the following tags:
combined date and time in UTC timezone, i.e. using the following
``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``. Optionally used
in the top-level Manifest file. The package manager can use it
- to detect an outdated repository checkout.
+ to detect an outdated repository checkout as described in `Timestamp
+ verification`_.
``MANIFEST <path> <size> <checksums>…``
Specifies a sub-Manifest. The sub-Manifest must be verified like
@@ -209,6 +210,28 @@ allowed at the package directory level:
to ``files/`` subdirectory.
+Timestamp verification
+----------------------
+
+The Manifest file can contain a ``TIMESTAMP`` entry to account
+for attacks against tree update distribution. If such an entry
+is present, it should be updated every time at least one
+of the Manifests changes. Every unique timestamp value must correspond
+to a single tree state.
+
+During the verification process, the client should compare the timestamp
+against the update time obtained from a local clock or a trusted time
+source. If the comparison result indicates that the Manifest at the time
+of receiving was already significantly outdated, the client should
+either fail the verification or require manual confirmation from user.
+
+Furthermore, the Manifest provider may employ additional methods
+of distributing the timestamps of recently generated Manifests
+using a secure channel from a trusted source for exact comparison.
+The exact details of such a solution are outside the scope of this
+specification.
+
+
Algorithm for full-tree verification
------------------------------------
@@ -218,8 +241,9 @@ can be used:
1. Collect all files present in the repository into *present* set.
2. Start at the top-level Manifest file. Verify its OpenPGP signature.
- Optionally verify the ``TIMESTAMP`` entry if present. Remove
- the top-level Manifest from the *present* set.
+ Optionally verify the ``TIMESTAMP`` entry if present as specified
+ in `timestamp verification`_. Remove the top-level Manifest
+ from the *present* set.
3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
files according to `file verification`_ section, and include their
@@ -232,7 +256,11 @@ can be used:
5. Collect all files covered by ``DATA``, ``MISC``, ``OPTIONAL``,
``EBUILD`` and ``AUX`` entries into the *covered* set.
-6. Verify all the files in the union of the *present* and *covered*
+6. Verify the entries in *covered* set for incompatible duplicates
+ and collisions with ignored files as explained in `Manifest file
+ locations and nesting`_.
+
+7. Verify all the files in the union of the *present* and *covered*
sets, according to `file verification`_ section.
@@ -489,8 +517,15 @@ The top-level Manifests optionally allows using a ``TIMESTAMP`` tag
to include a generation timestamp in the Manifest. A similar feature
was originally proposed in GLEP 58 [#GLEP58]_.
-The timestamp can be used to detect delay or replay attacks against
-Gentoo mirrors.
+A malicious third-party may use the principles of exclusion and replay
+to deny an update to clients, while at the same time recording
+the identity of clients to attack. The timestamp field can be used
+to detect that.
+
+In order to provide a more complete protection, the Gentoo
+Infrastructure should provide an ability to obtain the timestamps
+of all Manifests from a recent timeframe over a secure channel
+from a trusted source for comparison.
Strictly speaking, this is already provided by the various
``metadata/timestamp.*`` files provided already by Gentoo which are also
@@ -662,7 +697,10 @@ ensured:
the deprecated ``EBUILD`` tag (rather than ``DATA``),
- the Manifest files inside the package directory can be signed
- to provide authenticity verification.
+ to provide authenticity verification,
+
+- if the Manifest files inside the package directory are compressed,
+ an uncompressed file of identical content must coexist.
Once the backwards compatibility is no longer a concern, the above
no longer needs to hold and the deprecated tags can be removed.
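
The timestamp check described in the patch above could look roughly like
the following Python sketch. It is an illustration only, not the Portage
implementation: the function names and the one-day MAX_AGE threshold are
assumptions made here, since the specification only says "significantly
outdated" and leaves the threshold to the client. Only the strftime()
format string is taken from the GLEP text.

```python
from datetime import datetime, timedelta, timezone

# Format string given in the specification for TIMESTAMP entries.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

# Hypothetical threshold; the specification does not define a value.
MAX_AGE = timedelta(days=1)


def parse_timestamp(line):
    """Parse a 'TIMESTAMP <iso8601>' Manifest line into an aware UTC datetime."""
    tag, value = line.split()
    if tag != "TIMESTAMP":
        raise ValueError("not a TIMESTAMP entry: %r" % line)
    parsed = datetime.strptime(value, TIMESTAMP_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def is_outdated(manifest_time, update_time, max_age=MAX_AGE):
    """Return True if the Manifest was older than max_age when received."""
    return update_time - manifest_time > max_age
```

A client would call is_outdated() with the update time obtained from its
local clock or a trusted time source, and either fail the verification or
ask the user for confirmation when it returns true.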
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 2d8523aa998e8d98953c162aac4b51a23a3e155f
Author: Ulrich Müller <ulm <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 13 10:20:09 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 10:20:09 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=2d8523aa
Update remaining dates to ISO 8601 format.
As required by GLEP 45: "... all dates in existing GLEPs should be
changed to be ISO-8601 compliant."
glep-0022.rst | 2 +-
glep-0023.rst | 2 +-
glep-0027.rst | 2 +-
glep-0028.rst | 2 +-
glep-0030.rst | 2 +-
glep-0031.rst | 2 +-
glep-0033.rst | 2 +-
glep-0039.rst | 2 +-
glep-0040.rst | 4 ++--
glep-0041.rst | 2 +-
glep-0043.rst | 3 +--
glep-0061.rst | 2 +-
glep-0065.rst | 2 +-
13 files changed, 14 insertions(+), 15 deletions(-)
diff --git a/glep-0022.rst b/glep-0022.rst
index a1ae7aa..a39eff2 100644
--- a/glep-0022.rst
+++ b/glep-0022.rst
@@ -19,7 +19,7 @@ After withdrawing this GLEP temporarily, a rewritten version has
now been resubmitted. This version no longer tries to prevent a
keyword explosion, but merely tries to make it manageable.
-This version was approved on 14-Jun-2004, with the amendment that cascading
+This version was approved on 2004-06-14, with the amendment that cascading
profiles should be used.
Credits
diff --git a/glep-0023.rst b/glep-0023.rst
index 8bf4135..7223874 100644
--- a/glep-0023.rst
+++ b/glep-0023.rst
@@ -18,7 +18,7 @@ Status Update
Repoman has been updated to check for the LICENSE syntax. Portage now handles
ACCEPT_LICENSE and license groups, with NON-MUST-HAVE-READ's role handled
-by @EULA. Marking as Final as of 1/16/2014.
+by @EULA. Marking as Final as of 2014-01-16.
Abstract
========
diff --git a/glep-0027.rst b/glep-0027.rst
index 2ce7f1b..11b1063 100644
--- a/glep-0027.rst
+++ b/glep-0027.rst
@@ -15,7 +15,7 @@ Content-Type: text/x-rst
Status
======
-This GLEP was approved as-is on 14-Jun-2004.
+This GLEP was approved as-is on 2004-06-14.
Implementation not completed. Marked deferred by GLEP editor Michał Górny
on 2017-10-13.
diff --git a/glep-0028.rst b/glep-0028.rst
index f3cf365..a8622af 100644
--- a/glep-0028.rst
+++ b/glep-0028.rst
@@ -15,7 +15,7 @@ Content-Type: text/x-rst
Status
======
-This GLEP was approved on 14-Jun-2004 and marked as final on 1/16/2014.
+This GLEP was approved on 2004-06-14 and marked as final on 2014-01-16.
Abstract
========
diff --git a/glep-0030.rst b/glep-0030.rst
index 29961c6..594702c 100644
--- a/glep-0030.rst
+++ b/glep-0030.rst
@@ -14,7 +14,7 @@ Content-Type: text/x-rst
Status
======
-The new `Planet Gentoo`_ came online 10-Mar-2005, so this GLEP is now Final.
+The new `Planet Gentoo`_ came online 2005-03-10, so this GLEP is now Final.
.. _Planet Gentoo: http://planet.gentoo.org/
diff --git a/glep-0031.rst b/glep-0031.rst
index 1d428c0..bb4b1ad 100644
--- a/glep-0031.rst
+++ b/glep-0031.rst
@@ -20,7 +20,7 @@ portage tree and how they should be encoded is required.
Status
======
-Approved on 8-Nov-2004 assuming that implementation will include
+Approved on 2004-11-08 assuming that implementation will include
documentation for correctly encoding files within nano.
Motivation
diff --git a/glep-0033.rst b/glep-0033.rst
index 93f29a0..7008f5c 100644
--- a/glep-0033.rst
+++ b/glep-0033.rst
@@ -15,7 +15,7 @@ Content-Type: text/x-rst
Status
======
-Approved by the Gentoo Council on 15 September 2005. As of Sept. 2006
+Approved by the Gentoo Council on 2005-09-15. As of September 2006
this GLEP is on hold, pending future revisions.
Abstract
diff --git a/glep-0039.rst b/glep-0039.rst
index b0e3dc0..8f61643 100644
--- a/glep-0039.rst
+++ b/glep-0039.rst
@@ -16,7 +16,7 @@ Replaces: 4
Status
======
-Implemented. GLEP amended on 09 Feb 2006 to add the final bullet point to
+Implemented. GLEP amended on 2006-02-09 to add the final bullet point to
list B in `Specification`_.
Abstract
diff --git a/glep-0040.rst b/glep-0040.rst
index ac177c0..9862f1d 100644
--- a/glep-0040.rst
+++ b/glep-0040.rst
@@ -14,8 +14,8 @@ Content-Type: text/x-rst
Status
======
-Approved by the Gentoo Council on 15 September 2005. As of 20060903
-we have a robust x86 arch team, so this GLEP is final
+Approved by the Gentoo Council on 2005-09-15. As of 2006-09-03 we have
+a robust x86 arch team, so this GLEP is final.
Credits
=======
diff --git a/glep-0041.rst b/glep-0041.rst
index e9d3b88..2699be1 100644
--- a/glep-0041.rst
+++ b/glep-0041.rst
@@ -20,7 +20,7 @@ Arch Testers should be treated as official Gentoo staff.
Status
======
-Rejected by the Gentoo Council on 13 Oct. 2005. This GLEP may be resubmitted
+Rejected by the Gentoo Council on 2005-10-13. This GLEP may be resubmitted
if the issues brought up in the council meeting,
http://www.gentoo.org/proj/en/council/meeting-logs/20051013.txt,
are addressed in a new version of this GLEP.
diff --git a/glep-0043.rst b/glep-0043.rst
index 2785c6e..03b7df7 100644
--- a/glep-0043.rst
+++ b/glep-0043.rst
@@ -20,8 +20,7 @@ sample code) associated with GLEPs.
Status
======
-This GLEP has been approved by the GLEP editor and marked Final on
-13 Nov. 2005.
+This GLEP has been approved by the GLEP editor and marked Final on 2005-11-13.
Motivation
==========
diff --git a/glep-0061.rst b/glep-0061.rst
index d9cde9d..3eaf938 100644
--- a/glep-0061.rst
+++ b/glep-0061.rst
@@ -64,7 +64,7 @@ compression.
Example Results with a 32KiB cut-off, gzip algorithm
====================================================
-As of 2010/01/30, the suggested cut-off would impact the following 21 existing
+As of 2010-01-30, the suggested cut-off would impact the following 21 existing
Manifests, for a saving of nearly 900KiB::
Size Path
diff --git a/glep-0065.rst b/glep-0065.rst
index 4889cf2..e628184 100644
--- a/glep-0065.rst
+++ b/glep-0065.rst
@@ -290,7 +290,7 @@ Reference implementation
========================
The reference implementation is available in Portage starting with version
-2.2.15 (released 4 Dec 2014).
+2.2.15 (released 2014-12-04).
Copyright
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: a8ec3d4f1350cdffd3f5d4f7f3fbd8e3c5c7ac40
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sun Oct 22 13:19:20 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=a8ec3d4f
glep-0074: Full-tree verification using Manifest files
glep-0074.rst | 749 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 749 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
new file mode 100644
index 0000000..e9f8bad
--- /dev/null
+++ b/glep-0074.rst
@@ -0,0 +1,749 @@
+---
+GLEP: 74
+Title: Full-tree verification using Manifest files
+Author: Michał Górny <mgorny@gentoo.org>,
+ Robin Hugh Johnson <robbat2@gentoo.org>,
+ Ulrich Müller <ulm@gentoo.org>
+Type: Standards Track
+Status: Draft
+Version: 1
+Created: 2017-10-21
+Last-Modified: 2017-10-26
+Post-History: 2017-10-26
+Content-Type: text/x-rst
+Requires: 59, 61
+Replaces: 44, 58, 60
+---
+
+Abstract
+========
+
+This GLEP extends the Manifest file format to cover full-tree file
+integrity and authenticity checks. The format aims to be future-proof,
+efficient and provide means of backwards compatibility.
+
+
+Motivation
+==========
+
+The Manifest files as defined by GLEP 44 [#GLEP44]_ provide the current
+means of verifying the integrity of distfiles and package files
+in Gentoo. Combined with OpenPGP signatures, they provide means to
+ensure the authenticity of the covered files. However, as noted
+in GLEP 57 [#GLEP57]_ they lack the ability to provide full-tree
+authenticity verification as they do not cover any files outside
+the package directory. In particular, they provide multiple ways
+for a third party to inject malicious code into the ebuild environment.
+
+Historically, the topic of providing authenticity coverage for the whole
+repository has been mentioned multiple times. The most noteworthy efforts
+are GLEPs 58 [#GLEP58]_ and 60 [#GLEP60]_ by Robin H. Johnson from 2008.
+They were accepted by the Council in 2010 but have never been
+implemented. When potential implementation work started in 2017, a new
+discussion about the specification arose. It prompted the creation
+of a competing GLEP that would provide a redesigned alternative to
+the old GLEPs.
+
+This specification is designed with the following goals in mind:
+
+1. It should provide means to ensure the authenticity of the complete
+ repository, including preventing the injection of additional files.
+
+2. Alike the original Manifest2, the files should be split into two
+ groups — files whose authenticity is critical, and those whose
+ mismatch may be accepted in non-strict mode. The same classification
+ should apply both to files listed in Manifests, and to stray files
+ present only in the repository.
+
+3. The format should be universal enough to work both for the Gentoo
+ repository and third-party repositories of different characteristics.
+
+4. The Manifest files should be verifiable stand-alone, that is without
+ knowing any details about the underlying repository format.
+
+
+Specification
+=============
+
+Manifest file format
+--------------------
+
+This specification reuses and extends the Manifest file format defined
+in GLEP 44 [#GLEP44]_. For the purpose of it, the *file type* field is
+repurposed as a generic *tag* that could also indicate additional
+(non-checksum) metadata. Appropriately, those tags can be followed by
+other space-separated values.
+
+Unless specified otherwise, the paths used in the Manifest files
+are relative to the directory containing the Manifest file. The paths
+must not reference the parent directory (``..``).
+
+
+Manifest file locations and nesting
+-----------------------------------
+
+The ``Manifest`` file located in the root directory of the repository
+is called top-level Manifest, and it is used to perform the full-tree
+verification. In order to verify the authenticity, it must be signed
+using OpenPGP, using the armored cleartext format.
+
+The top-level Manifest may reference sub-Manifests contained
+in subdirectories of the repository. The sub-Manifests are traditionally
+named ``Manifest``; however, the implementation must support arbitrary
+names, including the possibility of multiple (split) Manifests
+for a single directory. The sub-Manifest can only cover the files inside
+the directory tree where it resides.
+
+The sub-Manifest can also be signed using OpenPGP armored cleartext
+format. However, the signature verification can be omitted if it is
+covered by a signed top-level Manifest.
+
+The Manifest files can also specify ``IGNORE`` entries to skip Manifest
+verification of subdirectories and/or files. Files and directories
+starting with a dot are always implicitly ignored. All files that
+are not ignored must be covered by at least one of the Manifests.
+
+A single file may be matched by multiple identical or equivalent
+Manifest entries, if and only if the entries have the same semantics,
+specify the same size and the checksums common to both entries match.
+It is an error for a single file to be matched by multiple entries
+of different semantics, file size or checksum values. It is an error
+to specify another entry for a file matching ``IGNORE``, or one of its
+subdirectories.
+
+The file entries (except for ``IGNORE``) can be specified for regular
+files only. Symbolic links are followed when opening files. It is
+an error to specify an entry for a different file type.
+
+All the files covered by a Manifest tree must reside on the same
+filesystem. It is an error to specify entries applying to files
+on another filesystem. If subdirectories of the Manifest tree reside
+on a different filesystem, they must be explicitly excluded
+via ``IGNORE``.
+
+
+File verification
+-----------------
+
+When verifying a file against the Manifest, the following rules are
+used:
+
+- if a file listed in Manifest is not present, then the verification
+ for the file fails,
+
+- if a file listed in Manifest is present but has a different size
+ or one of the checksums does not match, the verification fails,
+
+- if a file is present but not listed in Manifest, the verification
+ fails,
+
+- otherwise, the verification succeeds.
+
+Unless specified otherwise, the package manager must not allow using
+any files for which the verification failed. The package manager may
+reject any package or even the whole repository if it may refer to files
+for which the verification failed.
+
+
+New Manifest tags
+-----------------
+
+The Manifest files can specify the following tags:
+
+``TIMESTAMP <iso8601>``
+ Specifies a timestamp of when the Manifest file was last updated.
+ The timestamp must be a valid second-precision ISO8601 extended format
+ combined date and time in UTC timezone, i.e. using the following
+ ``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``. Optionally used
+ in the top-level Manifest file. The package manager can use it
+ to detect an outdated repository checkout.
+
+``MANIFEST <path> <size> <checksums>…``
+ Specifies a sub-Manifest. The sub-Manifest must be verified like
+ a regular file. If the verification succeeds, the entries from
+ the sub-Manifest are included for verification as described
+ in `Manifest file locations and nesting`_.
+
+``IGNORE <path>``
+ Ignores a subdirectory or file from Manifest checks. If the specified
+ path is present, it and its contents are omitted from the Manifest
+ verification (always pass).
+
+``DATA <path> <size> <checksums>…``
+ Specifies a file subject to obligatory Manifest verification.
+ The file is required to pass verification. Used for all files directly
+ affecting package manager operation (ebuilds, eclasses, profiles).
+
+``MISC <path> <size> <checksums>…``
+ Specifies a file subject to non-obligatory Manifest verification.
+ The package manager may ignore a verification failure if operating
+ in non-strict mode. Used for files that do not affect the installed
+ packages (``metadata.xml``, ``use.desc``).
+
+``OPTIONAL <path>``
+ Specifies a file that would be subject to non-obligatory Manifest
+ verification if it existed. The package manager may ignore a stray file
+ matching this entry if operating in non-strict mode. Used for paths
+ that would match ``MISC`` if they existed.
+
+``DIST <filename> <size> <checksums>…``
+ Specifies a distfile entry used to verify files fetched as part
+ of ``SRC_URI``. The filename must match the filename used to store
+ the fetched file as specified in the PMS [#PMS-FETCH]_. The package
+ manager must reject the fetched file if it fails verification.
+ ``DIST`` entries apply to all packages below the Manifest file
+ specifying them.
+
+
+Deprecated Manifest tags
+------------------------
+
+For backwards compatibility, the following tags are additionally
+allowed at the package directory level:
+
+``EBUILD <filename> <size> <checksums>…``
+ Equivalent to the ``DATA`` type.
+
+``AUX <filename> <size> <checksums>…``
+ Equivalent to the ``DATA`` type, except that the filename is relative
+ to ``files/`` subdirectory.
+
+
+Algorithm for full-tree verification
+------------------------------------
+
+In order to perform full-tree verification, the following algorithm
+can be used:
+
+1. Collect all files present in the repository into *present* set.
+
+2. Start at the top-level Manifest file. Verify its OpenPGP signature.
+ Optionally verify the ``TIMESTAMP`` entry if present. Remove
+ the top-level Manifest from the *present* set.
+
+3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
+ files according to `file verification`_ section, and include their
+ entries in the current Manifest entry list (using paths relative
+ to directories containing the Manifests).
+
+4. Process all ``IGNORE`` entries. Remove any paths matching them
+ from the *present* set.
+
+5. Collect all files covered by ``DATA``, ``MISC``, ``OPTIONAL``,
+ ``EBUILD`` and ``AUX`` entries into the *covered* set.
+
+6. Verify all the files in the union of the *present* and *covered*
+ sets, according to `file verification`_ section.
+
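The set logic of steps 2 and 4 to 6 can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the function names and the ``(tag, path)`` entry representation are hypothetical, and the sketch assumes that step 3 has already verified and merged all sub-Manifest entries, with ``AUX`` paths already rewritten relative to the top directory. It also assumes that an ``IGNORE`` path covers the path itself and everything below it.

```python
# Tags whose entries describe local files that must be looked at (step 5).
COVERING_TAGS = {"DATA", "MISC", "OPTIONAL", "EBUILD", "AUX"}


def is_ignored(path, ignores):
    """True if path equals an IGNORE path or lies below one."""
    return any(path == ign or path.startswith(ign + "/") for ign in ignores)


def files_to_verify(present, entries, top_manifest="Manifest"):
    """Return the paths that would be handed to file verification.

    present: set of relative paths found on disk (step 1).
    entries: list of (tag, path) pairs merged from all Manifests.
    """
    # Step 2: the top-level Manifest is checked via its signature instead.
    present = set(present) - {top_manifest}
    # Step 4: drop everything matched by an IGNORE entry.
    ignores = [path for tag, path in entries if tag == "IGNORE"]
    present = {p for p in present if not is_ignored(p, ignores)}
    # Step 5: files that have a checksum-bearing entry.
    covered = {path for tag, path in entries if tag in COVERING_TAGS}
    # Step 6: the union catches both stray files and missing files.
    return present | covered
```

Taking the union rather than only the *covered* set is what lets the later per-file check notice both stray files (present but not covered) and removed files (covered but not present).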
+
+Algorithm for finding parent Manifests
+--------------------------------------
+
+In order to find the top-level Manifest from the current directory
+the following algorithm can be used:
+
+1. Store the current directory as *original* and the device ID
+ of the containing filesystem (``st_dev``) as *startdev*.
+
+2. If the device ID of the containing filesystem (``st_dev``)
+ of the current directory is different from *startdev*, stop.
+
+3. If the current directory contains a ``Manifest`` file:
+
+ a. If an ``IGNORE`` entry in the ``Manifest`` file covers
+ the *original* directory (or one of the parent directories), stop.
+
+ b. Otherwise, store the current directory as *last_found*.
+
+4. If the current directory is the root system directory (``/``), stop.
+
+5. Otherwise, enter the parent directory and jump to step 2.
+
+Once the algorithm stops, *last_found* will contain the relevant
+top-level Manifest. If *last_found* is null, then the directory tree
+does not contain any valid top-level Manifest candidates and one should
+be created in the *original* directory.
+
+Once the top-level Manifest is found, its ``MANIFEST`` entries should
+be used to find any sub-Manifests below the top-level Manifest,
+up to and including the *original* directory. Note that those
+sub-Manifests can use different filenames than ``Manifest``.
+
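The upward search can be sketched in Python as follows. The helper names are hypothetical, and the sketch makes simplifying assumptions: it only looks for an uncompressed file named ``Manifest``, it treats any two-field line starting with ``IGNORE`` as an ignore entry, and it does not retrace sub-Manifests via ``MANIFEST`` entries afterwards.

```python
import os


def ignore_covers(manifest_path, original):
    """True if an IGNORE entry in manifest_path covers the original directory."""
    base = os.path.dirname(manifest_path)
    rel = os.path.relpath(original, base)
    with open(manifest_path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if len(fields) == 2 and fields[0] == "IGNORE":
                if rel == fields[1] or rel.startswith(fields[1] + os.sep):
                    return True
    return False


def find_top_level_dir(start):
    """Return the directory holding the top-level Manifest, or None."""
    original = os.path.realpath(start)
    startdev = os.stat(original).st_dev            # step 1
    current, last_found = original, None
    while True:
        if os.stat(current).st_dev != startdev:    # step 2
            break
        manifest = os.path.join(current, "Manifest")
        if os.path.isfile(manifest):               # step 3
            if ignore_covers(manifest, original):  # step 3a
                break
            last_found = current                   # step 3b
        parent = os.path.dirname(current)
        if parent == current:                      # step 4: reached the root
            break
        current = parent                           # step 5
    return last_found
```

A return value of ``None`` corresponds to the "*last_found* is null" case, in which a new top-level Manifest would be created in the starting directory.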
+
+Checksum algorithms
+-------------------
+
+This section is informational only. Specifying the exact set
+of supported algorithms is outside the scope of this specification.
+
+The algorithm names reserved at the time of writing are:
+
+- ``MD5`` [#MD5]_,
+- ``RMD160`` — RIPEMD-160 [#RIPEMD160]_,
+- ``SHA1`` [#SHS]_,
+- ``SHA256`` and ``SHA512`` — SHA-2 family of hashes [#SHS]_,
+- ``WHIRLPOOL`` [#WHIRLPOOL]_,
+- ``BLAKE2B`` and ``BLAKE2S`` — BLAKE2 family of hashes [#BLAKE2]_,
+- ``SHA3_256`` and ``SHA3_512`` — SHA-3 family of hashes [#SHA3]_,
+- ``STREEBOG256`` and ``STREEBOG512`` — Streebog family of hashes
+ [#STREEBOG]_.
+
+The method of introducing new hashes is defined by GLEP 59 [#GLEP59]_.
+It is recommended that any new hashes are named after the Python
+``hashlib`` module algorithm names, transformed into uppercase.
+
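The naming recommendation can be illustrated with a short Python sketch. The function name is hypothetical. The sketch only handles names that follow the ``hashlib`` convention and that ``hashlib`` guarantees on every platform; reserved names such as ``RMD160``, ``WHIRLPOOL`` or the Streebog variants would need an explicit mapping to whatever provider an implementation uses.

```python
import hashlib


def manifest_checksum(manifest_name, data):
    """Compute a hex digest for a Manifest hash name such as ``SHA512``.

    The Manifest name is assumed to be the hashlib name in uppercase.
    """
    algo = manifest_name.lower()
    if algo not in hashlib.algorithms_guaranteed:
        raise ValueError("no direct hashlib equivalent for " + manifest_name)
    return hashlib.new(algo, data).hexdigest()
```

For example, ``manifest_checksum("SHA3_256", b"...")`` resolves to ``hashlib.new("sha3_256", ...)``, while ``manifest_checksum("RMD160", b"...")`` raises ``ValueError`` because that reserved name predates the convention.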
+
+Manifest compression
+--------------------
+
+The topic of Manifest file compression is covered by GLEP 61 [#GLEP61]_.
+This section merely addresses interoperability issues between Manifest
+compression and this specification.
+
+Compressed Manifest files are required to carry a suffix identifying
+their compression algorithm. This suffix should be used to recognize
+the compression and decompress Manifests transparently. The exact list
+of algorithms and their corresponding suffixes are outside the scope
+of this specification.
+
+Whenever this specification refers to top-level Manifest file,
+the implementation should account for compressed variants of this file
+with appropriate suffixes (e.g. ``Manifest.gz``).
+
+Whenever this specification refers to sub-Manifests, they can use any
+names but are also required to use a specific compression suffix.
+The ``MANIFEST`` entries are required to specify the full name including
+compression suffix, and the verification is performed on the compressed
+file.
+
+The specification permits uncompressed Manifests to exist alongside
+their compressed counterparts, and multiple compressed formats
+to coexist. If that is the case, the files must have the same
+uncompressed content and the implementation is free to choose any
+of the files using the same base name.
+
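Transparent decompression by suffix can be sketched in Python as follows. The suffix table is an assumption made for illustration: only ``.gz`` appears in this specification, and the authoritative list of algorithms and suffixes belongs to GLEP 61. The function name is hypothetical.

```python
import bz2
import gzip
import lzma

# Assumed suffix-to-opener table; only ".gz" is named in this GLEP.
OPENERS = {
    ".gz": gzip.open,
    ".bz2": bz2.open,
    ".xz": lzma.open,
}


def open_manifest(path):
    """Open a Manifest for reading as text, decompressing by suffix."""
    for suffix, opener in OPENERS.items():
        if path.endswith(suffix):
            return opener(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")
```

Note that checksum verification of a sub-Manifest still operates on the compressed bytes on disk; the decompression above is only for reading the entries once the file has passed verification.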
+
+Rationale
+=========
+
+Stand-alone format
+------------------
+
+The first question that needed to be asked before proceeding with
+the design was whether the Manifest file format was supposed to be
+stand-alone, or tightly bound to the repository format.
+
+The stand-alone format has been selected because of its three
+advantages:
+
+1. It is more future-proof. If an incompatible change to the repository
+ format is introduced, only developers need to upgrade the tools
+ they use to generate the Manifests. The tools used to verify
+ the updated Manifests will continue to work.
+
+2. It is more flexible and universal. With a dedicated tool,
+ the Manifest files can be used to sign and verify arbitrary file
+ sets.
+
+3. It keeps the verification tool simpler. In particular, we can easily
+ write an independent verification tool that could work on any
+ distribution without needing to depend on a package manager
+ implementation or rewrite parts of it.
+
+Designing a stand-alone format requires that the Manifest carries enough
+information to perform the verification following all the rules specific
+to the Gentoo repository.
+
+
+Tree design
+-----------
+
+The second important point of the design was determining whether
+the Manifest files should be structured hierarchically or independently.
+Both options have their advantages.
+
+In the hierarchical model, each sub-Manifest file is covered by a higher
+level Manifest. As a result, only the top-level Manifest has to be
+OpenPGP-signed, and subsequent Manifests need to be only verified by
+checksum stored in the parent Manifest. This has the following
+implications:
+
+- Verifying any set of files in the repository requires using checksums
+ from the most relevant Manifests and the parent Manifests.
+
+- The OpenPGP signature of the top-level Manifest needs to be verified
+ only once per process.
+
+- Altering any set of files requires updating the relevant Manifests,
+ and their parent Manifests up to the top-level Manifest, and signing
+ the last one.
+
+- As a result, the top-level Manifest changes on every commit,
+ and various middle-level Manifests change (and need to be transferred)
+ frequently.
+
+In the independent model, each sub-Manifest file is independent
+of the parent Manifests. As a result, each of them needs to be signed
+and verified independently. However, the parent Manifests still need
+to list sub-Manifests (albeit without verification data) in order
+to detect removal or replacement of subdirectories. This has
+the following implications:
+
+- Verifying any set of files in the repository requires using checksums
+ and verifying signatures of the most relevant Manifest files.
+
+- Altering any set of files requires updating the relevant Manifests
+ and signing them again.
+
+- Parent Manifests are updated only when Manifests are added or removed
+ from subdirectories. As a result, they change infrequently.
+
+While both models have their advantages, the hierarchical model was
+selected because it keeps the number of comparatively costly OpenPGP
+operations to a minimum.
+
+
+Tree layout restrictions
+------------------------
+
+The algorithm is meant to work primarily with ebuild repositories which
+normally contain only files and directories. Directories provide
+no useful metadata for verification, and specifying special entries
+for additional file types is purposeless. Therefore, the specification
+is restricted to dealing with regular files.
+
+The Gentoo repository does not use symbolic links. Some Gentoo
+repositories do, however. To provide a simple solution for dealing with
+symlinks without having to take care to implement special handling for
+them, the common behavior of implicitly resolving them is used.
+Therefore, symbolic links to files are stored as if they were regular
+files, and symbolic links to directories are followed as if they were
+regular directories.
+
+Dotfiles are implicitly ignored as that is a common notion used
+in software written for POSIX systems. All other filenames require
+explicit ``IGNORE`` lines.
+
+The algorithm is restricted to work on a single filesystem. This is
+mostly relevant when scanning for top-level Manifest — we do not want
+to cross filesystem boundaries then. However, to ensure consistent
+bidirectional behavior we also need to ban such crossings when operating
+down the tree.
+
+The directories and files on different filesystems need to be ignored
+explicitly as implicitly skipping them would cause confusion.
+In particular, tools might then claim that a file does not exist when
+it clearly does because it was skipped due to filesystem boundaries.
+
+
+File verification model
+-----------------------
+
+The verification model aims to provide full coverage against different
+forms of attack. In particular, three different kinds of manipulation
+are considered:
+
+1. Alteration of the file content.
+
+2. Removal of a file.
+
+3. Addition of a new file.
+
+In order to protect against all three, the system requires that all
+files in the repository are listed in Manifests and verified against
+them.
+
+As a special case, ignores are allowed to account for directories
+that are not part of the repository but were traditionally placed inside
+it. Those directories were ``distfiles``, ``local`` and ``packages``.
+Ignores could also be used to skip VCS directories such as ``CVS``.
+
+
+Non-obligatory Manifest verification
+------------------------------------
+
+While this specification recommends all tools to use strict verification
+by default, it allows declaring some files as non-obligatory like
+the original Manifest2 format did. This could be used on files that do
+not affect the normal package manager operation.
+
+It aims to account for two use cases:
+
+1. Stripping down files that are not strictly required to install
+ packages from repository checkouts.
+
+2. Accounting for automatically generated files that might be updated
+ by standard tooling.
+
+The traditional ``MISC`` type is amended with a complementary
+``OPTIONAL`` tag to account for files that are not provided
+in the specific repository. It aims to ensure that the same path would
+be non-fatal when provided by the repository but fatal when created
+by the user tooling.
+
+
+Timestamp field
+---------------
+
+The top-level Manifest optionally allows using a ``TIMESTAMP`` tag
+to include a generation timestamp in the Manifest. A similar feature
+was originally proposed in GLEP 58 [#GLEP58]_.
+
+The timestamp can be used to detect delay or replay attacks against
+Gentoo mirrors.
+
+Strictly speaking, this is already provided by the various
+``metadata/timestamp.*`` files provided already by Gentoo which are also
+covered by the Manifest. However, including the value in the Manifest
+itself has little cost and provides the ability to perform
+the verification stand-alone.
+
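A staleness check of this kind can be sketched in Python as follows. The timestamp format (ISO 8601 in UTC, ``YYYY-MM-DDThh:mm:ssZ``), the function name and the one-day threshold are assumptions made for illustration; what counts as significantly outdated is left to the implementation.

```python
from datetime import datetime, timedelta, timezone


def is_outdated(timestamp_field, now, max_age=timedelta(days=1)):
    """True if the Manifest timestamp is older than max_age relative to now.

    timestamp_field: value of the TIMESTAMP entry, assumed to be UTC.
    now: timezone-aware datetime from a local clock or trusted source.
    """
    stamp = datetime.strptime(timestamp_field, "%Y-%m-%dT%H:%M:%SZ")
    stamp = stamp.replace(tzinfo=timezone.utc)
    return now - stamp > max_age
```

A client would typically call this with ``datetime.now(timezone.utc)`` right after syncing and either fail or ask for confirmation when it returns ``True``.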
+
+New vs deprecated tags
+----------------------
+
+Out of the four types defined by Manifest2, two are reused and two are
+marked deprecated.
+
+The ``DIST`` and ``MISC`` tags are reused since they map relatively
+clearly onto the new concept.
+
+The ``EBUILD`` tag could potentially be reused for generic file
+verification data. However, it would be confusing if all the different
+data files were marked as ``EBUILD``. Therefore, an equivalent ``DATA``
+type was introduced as a replacement.
+
+The ``AUX`` tag is deprecated as it is redundant to ``DATA``, and has
+the limiting property of implicit ``files/`` path prefix.
+
+
+Finding top-level Manifest
+--------------------------
+
+The development of a reference implementation for this GLEP has brought
+the following problem: how to find all the relevant Manifests when
+the Manifest tool is run inside a subdirectory of the repository?
+
+One of the options would be to provide a bi-directional linking
+of Manifests via a ``PARENT`` tag. However, that would not solve
+the problem when a new Manifest file is being created.
+
+Instead, an algorithm for iterating over parent directories is proposed.
+Since there is no obligatory explicit indicator for the top-level
+Manifest, the algorithm assumes that the top-level Manifest
+is the highest ``Manifest`` in the directory hierarchy that can cover
+the current directory. This generally makes sense since the Manifest
+files are required to provide coverage for all subdirectories, so all
+Manifests starting from that one need to be updated.
+
+If independent Manifest trees are nested in the directory structure,
+then an ``IGNORE`` entry needs to be used to separate them.
+
+Since sub-Manifests can use any filenames, the Manifest finding
+algorithm must not short-cut the procedure by storing all ``Manifest``
+files along the parent directories. Instead, it needs to retrace
+the relevant sub-Manifest files along ``MANIFEST`` entries
+in the top-level Manifest.
+
+
+Injecting ChangeLogs into the checkout
+--------------------------------------
+
+One of the problems considered in the new Manifest format was that
+of injecting historical and autogenerated ChangeLogs into the repository.
+Normally we are not including those files to reduce the checkout size.
+However, some users have shown interest in them and Infra is working
+on providing them via an additional rsync module.
+
+If such files were injected into the repository, they would cause strict
+verification failures of Manifests. To account for this, Infra could
+provide either ``OPTIONAL`` entries for the Manifest files to allow them
+in non-strict verification mode, or ``IGNORE`` entries to allow them
+in the strict mode.
+
+
+Splitting distfile checksums from file checksums
+------------------------------------------------
+
+Another problem with the current Manifest format is that the checksums
+for fetched files are combined with checksums for local files
+in a single file inside the package directory. It has been specifically
+pointed out that:
+
+- since distfiles are sometimes reused across different packages,
+ the repeating checksums are redundant,
+
+- mirror admins were interested in the possibility of verifying all
+ the distfiles with a single tool.
+
+This specification does not provide a clean solution to this problem.
+It technically permits moving ``DIST`` entries to higher-level Manifests
+but the usefulness of such a solution is doubtful.
+
+However, for the second problem we will probably deliver a dedicated
+tool working with this Manifest format.
+
+
+Hash algorithms
+---------------
+
+While maintaining a consistent supported hash set is important
+for interoperability, it is not a good fit for the generic layout of this
+GLEP. Furthermore, it would require updating the GLEP in the future
+every time the used algorithms change.
+
+Instead, the specification focuses on listing the currently used
+algorithm names for interoperability, and sets a recommendation
+for consistent naming of algorithms in the future. The Python
+``hashlib`` module is used as a reference since it is used
+as the provider of hash functions for most of the Python software,
+including Portage and PkgCore.
+
+The basic rules for changing hash algorithms are defined in GLEP 59
+[#GLEP59]_. The implementations can focus only on those algorithms
+that are actually used or planned on being used. It may be feasible
+to devise a new GLEP that specifies the currently used hashes (or update
+GLEP 59 accordingly).
+
+
+Manifest compression
+--------------------
+
+The support for Manifest compression is introduced with minimal changes
+to the file format. The ``MANIFEST`` entries are required to provide
+the real (compressed) file path for compatibility with other file
+entries and to avoid confusion.
+
+The existence of additional entries for uncompressed Manifest checksums
+was debated. However, plain entries for the uncompressed file would
+be confusing if only compressed file existed, and conflicting if both
+uncompressed and compressed variants existed. Furthermore, it has been
+pointed out that ``DIST`` entries do not have uncompressed variant
+either.
+
+
+Performance considerations
+--------------------------
+
+Performing a full-tree verification on every sync raises some
+performance concerns for end-user systems. The initial testing has shown
+that a cold-cache verification on a btrfs file system can take around
+4 minutes, with the process being mostly I/O bound. On the other hand,
+it can be expected that the verification will be performed directly
+after syncing, taking advantage of warm filesystem cache.
+
+To improve speed on I/O and/or CPU-restrained systems even further,
+the algorithms can be easily extended to perform incremental
+verification. Given that rsync does not preserve mtimes by default,
+the tool can take advantage of mtime and Manifest comparisons to recheck
+only the parts of the repository that have changed.
+
+Furthermore, the package manager implementations can restrict checking
+only to the parts of the repository that are actually being used.
+
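The incremental approach mentioned above can be sketched in Python as follows. This is a heuristic illustration, not part of the specification: the function name and the cache layout are hypothetical, and the sketch assumes that the sync tool leaves the modification time or size of a file different whenever its content changes.

```python
import os


def changed_paths(paths, cache):
    """Return the paths whose (mtime, size) differ from the cached values.

    cache: dict mapping path -> (st_mtime_ns, st_size) from the last run;
    it is updated in place so that the next run sees the new state.
    """
    changed = []
    for path in paths:
        st = os.stat(path)
        key = (st.st_mtime_ns, st.st_size)
        if cache.get(path) != key:
            changed.append(path)
            cache[path] = key
    return changed
```

Only the returned paths would then need their checksums recomputed. A tool relying on this shortcut should still offer a full verification mode, since an attacker with local write access can preserve both mtime and size.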
+
+Backwards Compatibility
+=======================
+
+This GLEP provides optional means of preserving backwards compatibility.
+To preserve the backwards compatibility, the following needs to be
+ensured:
+
+- all files within the package directory must be covered by ``Manifest``
+ file inside that package directory,
+
+- all distfiles used by the package must be covered by ``Manifest``
+ file inside the package directory,
+
+- all files inside the ``files/`` subdirectory of a package directory
+ need to use the deprecated ``AUX`` tag (rather than ``DATA``),
+
+- all ``.ebuild`` files inside the package directory need to use
+ the deprecated ``EBUILD`` tag (rather than ``DATA``),
+
+- the Manifest files inside the package directory can be signed
+ to provide authenticity verification.
+
+Once the backwards compatibility is no longer a concern, the above
+no longer needs to hold and the deprecated tags can be removed.
+
+
+Reference Implementation
+========================
+
+The reference implementation for this GLEP is being developed
+as the gemato project [#GEMATO]_.
+
+
+Credits
+=======
+
+Thanks to all the people whose contributions were invaluable
+to the creation of this GLEP. This includes but is not limited to:
+
+- Robin Hugh Johnson,
+- Ulrich Müller.
+
+Additionally, thanks to Robin Hugh Johnson for the original
+MetaManifest GLEP series which served both as inspiration and source
+of many concepts used in this GLEP. Recursively, also thanks to all
+the people who contributed to the original GLEPs.
+
+
+References
+==========
+
+.. [#GLEP44] GLEP 44: Manifest2 format
+ (https://www.gentoo.org/glep/glep-0044.html)
+
+.. [#GLEP57] GLEP 57: Security of distribution of Gentoo software
+ - Overview
+ (https://www.gentoo.org/glep/glep-0057.html)
+
+.. [#GLEP58] GLEP 58: Security of distribution of Gentoo software
+ - Infrastructure to User distribution - MetaManifest
+ (https://www.gentoo.org/glep/glep-0058.html)
+
+.. [#GLEP59] GLEP 59: Manifest2 hash policies and security implications
+ (https://www.gentoo.org/glep/glep-0059.html)
+
+.. [#GLEP60] GLEP 60: Manifest2 filetypes
+ (https://www.gentoo.org/glep/glep-0060.html)
+
+.. [#GLEP61] GLEP 61: Manifest2 compression
+ (https://www.gentoo.org/glep/glep-0061.html)
+
+.. [#PMS-FETCH] Package Manager Specification: Dependency Specification
+ Format - SRC_URI
+ (https://projects.gentoo.org/pms/6/pms.html#x1-940008.2.10)
+
+.. [#MD5] RFC1321: The MD5 Message-Digest Algorithm
+ (https://www.ietf.org/rfc/rfc1321.txt)
+
+.. [#RIPEMD160] The hash function RIPEMD-160
+ (https://homes.esat.kuleuven.be/~bosselae/ripemd160.html)
+
+.. [#SHS] FIPS PUB 180-4: Secure Hash Standard (SHS)
+ (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf)
+
+.. [#WHIRLPOOL] The WHIRLPOOL Hash Function
+ (http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
+
+.. [#BLAKE2] BLAKE2 — fast secure hashing
+ (https://blake2.net/)
+
+.. [#SHA3] FIPS PUB 202: SHA-3 Standard: Permutation-Based Hash
+ and Extendable-Output Functions
+ (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf)
+
+.. [#STREEBOG] GOST R 34.11-2012: Streebog Hash Function
+ (https://www.streebog.net/)
+
+.. [#GEMATO] gemato: Gentoo Manifest Tool
+ (https://github.com/mgorny/gemato/)
+
+Copyright
+=========
+This work is licensed under the Creative Commons Attribution-ShareAlike 3.0
+Unported License. To view a copy of this license, visit
+http://creativecommons.org/licenses/by-sa/3.0/.
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 0bf20d23d3b092ec5fde4738678b4bb9bec888e9
Author: Ulrich Müller <ulm <AT> gentoo <DOT> org>
AuthorDate: Sun Nov 12 21:13:22 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Sun Nov 12 21:13:22 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=0bf20d23
glep-0007: Mark as Moribund.
Bug: https://bugs.gentoo.org/634100
glep-0007.rst | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/glep-0007.rst b/glep-0007.rst
index f589f70..fb6b252 100644
--- a/glep-0007.rst
+++ b/glep-0007.rst
@@ -3,17 +3,19 @@ GLEP: 7
Title: New ombudsman position
Author: Grant Goodyear <g2boojum@gentoo.org>
Type: Standards Track
-Status: Final
+Status: Moribund
Version: 1
Created: 2003-07-06
-Last-Modified: 2014-01-15
+Last-Modified: 2017-11-12
Post-History:
Content-Type: text/x-rst
---
Status
======
+
Obsolete, this function is now handled by comrel.
+Marked as Moribund by decision of the Gentoo Council on 2017-11-12.
Abstract
========
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: b01783e9a95ea330c8c5c3876102a36d8ff4218e
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 13 16:36:55 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:36:55 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=b01783e9
glep-0074: Clarify sub-Manifest signing paragraph
Clarify that the sub-Manifests are always covered by the top-level
Manifest. The previous version may have wrongly suggested that a signed
sub-Manifest does not have to be included in top-level Manifest.
Spotted by k_f, fixed wording by dilfridge.
glep-0074.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 86b2361..97d7829 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -89,8 +89,8 @@ for a single directory. The sub-Manifest can only cover the files inside
the directory tree where it resides.
The sub-Manifest can also be signed using OpenPGP armored cleartext
-format. However, the signature verification can be omitted if it is
-covered by a signed top-level Manifest.
+format. However, the signature verification can be omitted since it
+already is covered by the signed top-level Manifest.
Directory tree coverage
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: cbc0cdfe7057db459d4d9d56af3da8100279fcb2
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:28:34 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=cbc0cdfe
glep-0074: Reorganize to have tag references after basic algos
Reorganize so that file & timestamp verification come first, then tag
references, then specialized algos and other informational sections.
Rename 'new Manifest tags' to 'modern ...' since some of them are old.
glep-0074.rst | 48 ++++++++++++++++++++++++------------------------
1 file changed, 24 insertions(+), 24 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index d476ff3..a37ad34 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -163,8 +163,30 @@ reject any package or even the whole repository if it may refer to files
for which the verification failed.
-New Manifest tags
------------------
+Timestamp verification
+----------------------
+
+The Manifest file can contain a ``TIMESTAMP`` entry to account
+for attacks against tree update distribution. If such an entry
+is present, it should be updated every time at least one
+of the Manifests changes. Every unique timestamp value must correspond
+to a single tree state.
+
+During the verification process, the client should compare the timestamp
+against the update time obtained from a local clock or a trusted time
+source. If the comparison result indicates that the Manifest at the time
+of receiving was already significantly outdated, the client should
+either fail the verification or require manual confirmation from user.
+
+Furthermore, the Manifest provider may employ additional methods
+of distributing the timestamps of recently generated Manifests
+using a secure channel from a trusted source for exact comparison.
+The exact details of such a solution are outside the scope of this
+specification.
+
+
+Modern Manifest tags
+--------------------
The Manifest files can specify the following tags:
@@ -228,28 +250,6 @@ allowed at the package directory level:
to ``files/`` subdirectory.
-Timestamp verification
-----------------------
-
-The Manifest file can contain a ``TIMESTAMP`` entry to account
-for attacks against tree update distribution. If such an entry
-is present, it should be updated every time at least one
-of the Manifests changes. Every unique timestamp value must correspond
-to a single tree state.
-
-During the verification process, the client should compare the timestamp
-against the update time obtained from a local clock or a trusted time
-source. If the comparison result indicates that the Manifest at the time
-of receiving was already significantly outdated, the client should
-either fail the verification or require manual confirmation from user.
-
-Furthermore, the Manifest provider may employ additional methods
-of distributing the timestamps of recently generated Manifests
-using a secure channel from a trusted source for exact comparison.
-The exact details of such a solution are outside the scope of this
-specification.
-
-
Algorithm for full-tree verification
------------------------------------
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 29d3b185220083178af1ce1680af68dc25da94e7
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Sun Nov 5 21:11:03 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=29d3b185
glep-0074: More suggestions from Robin H. Johnson
glep-0074.rst | 64 ++++++++++++++++++++++++++++++++++-------------------------
1 file changed, 37 insertions(+), 27 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index e4d6a80..86b2361 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,7 +8,7 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-10-30
+Last-Modified: 2017-11-06
Post-History: 2017-10-26
Content-Type: text/x-rst
Requires: 59, 61
@@ -125,9 +125,10 @@ that are not otherwise ignored, they need to be covered by an explicit
All the local (non-``DIST``) files covered by a Manifest tree must
reside on the same filesystem. It is an error to specify entries
-applying to files on another filesystem. If subdirectories
-that are not otherwise ignored reside on a different filesystem, they
-must be explicitly excluded via ``IGNORE``.
+applying to files on another filesystem. If files or directories that
+are not otherwise ignored reside on a different filesystem, or symbolic
+links point to targets on a different filesystem, they must
+be explicitly excluded via ``IGNORE``.
File verification
@@ -194,7 +195,7 @@ The Manifest files can specify the following tags:
to detect an outdated repository checkout as described in `Timestamp
verification`_.
-``MANIFEST <path> <size> <checksums>…``
+``MANIFEST <path> <size> <checksums>...``
Specifies a sub-Manifest. The sub-Manifest must be verified like
a regular file. If the verification succeeds, the entries from
the sub-Manifest are included for verification as described
@@ -206,12 +207,12 @@ The Manifest files can specify the following tags:
verification (always pass). *Path* must be a plain file or directory
path without a trailing slash, and must not contain wildcards.
-``DATA <path> <size> <checksums>…``
+``DATA <path> <size> <checksums>...``
Specifies a regular file subject to Manifest verification. The file
is required to pass verification. Used for all files that do not match
any other type.
-``DIST <filename> <size> <checksums>…``
+``DIST <filename> <size> <checksums>...``
Specifies a distfile entry used to verify files fetched as part
of ``SRC_URI``. The filename must match the filename used to store
the fetched file as specified in the PMS [#PMS-FETCH]_. The package
@@ -226,15 +227,15 @@ Deprecated Manifest tags
For backwards compatibility, the following tags are additionally
allowed at the package directory level:
-``EBUILD <filename> <size> <checksums>…``
+``EBUILD <filename> <size> <checksums>...``
Equivalent to the ``DATA`` type.
-``MISC <path> <size> <checksums>…``
+``MISC <path> <size> <checksums>...``
Equivalent to the ``DATA`` type. Historically indicated that
the package manager may ignore a verification failure if operating
in non-strict mode. However, that behavior is deprecated.
-``AUX <filename> <size> <checksums>…``
+``AUX <filename> <size> <checksums>...``
Equivalent to the ``DATA`` type, except that the filename is relative
to ``files/`` subdirectory.
@@ -314,13 +315,13 @@ of supported algorithms is outside the scope of this specification.
The algorithm names reserved at the time of writing are:
- ``MD5`` [#MD5]_,
-- ``RMD160`` — RIPEMD-160 [#RIPEMD160]_,
+- ``RMD160`` -- RIPEMD-160 [#RIPEMD160]_,
- ``SHA1`` [#SHS]_,
-- ``SHA256`` and ``SHA512`` — SHA-2 family of hashes [#SHS]_,
+- ``SHA256`` and ``SHA512`` -- SHA-2 family of hashes [#SHS]_,
- ``WHIRLPOOL`` [#WHIRLPOOL]_,
-- ``BLAKE2B`` and ``BLAKE2S`` — BLAKE2 family of hashes [#BLAKE2]_,
-- ``SHA3_256`` and ``SHA3_512`` — SHA-3 family of hashes [#SHA3]_,
-- ``STREEBOG256`` and ``STREEBOG512`` — Streebog family of hashes
+- ``BLAKE2B`` and ``BLAKE2S`` -- BLAKE2 family of hashes [#BLAKE2]_,
+- ``SHA3_256`` and ``SHA3_512`` -- SHA-3 family of hashes [#SHA3]_,
+- ``STREEBOG256`` and ``STREEBOG512`` -- Streebog family of hashes
[#STREEBOG]_.
The method of introducing new hashes is defined by GLEP 59 [#GLEP59]_.
@@ -370,9 +371,9 @@ the following content::
IGNORE lost+found
IGNORE packages
MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
- …
+ ...
MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
- …
+ ...
An example modern Manifest (disregarding backwards compatibility)
for a package directory would have the following content::
@@ -484,15 +485,17 @@ files, and symbolic links to directories are followed as if they were
regular directories.
Dotfiles are implicitly ignored as that is a common notion used
-in software written for POSIX systems. All other common filenames
-require explicit ``IGNORE`` lines.
+in software written for POSIX systems. All other filenames require
+explicit ``IGNORE`` lines.
An ability to inject additional ignore entries is provided to account
-for site configuration affecting the repository tree — placing
+for site configuration affecting the repository tree -- placing
additional files in it, skipping some of the categories from syncing.
+This configuration can extend beyond the limits of this GLEP,
+e.g. by allowing wildcards or regular expressions.
The algorithm is restricted to work on a single filesystem. This is
-mostly relevant when scanning for top-level Manifest — we do not want
+mostly relevant when scanning for top-level Manifest -- we do not want
to cross filesystem boundaries then. However, to ensure consistent
bidirectional behavior we need to also ban them when operating downwards
the tree.
@@ -551,9 +554,12 @@ However, the usefulness of ``MISC`` in both cases is doubtful.
The cases for stripping unnecessary files mostly focused around space
savings. For this purpose, stripping ``metadata.xml`` and similar files
has little value. It is much more common for users to strip whole
-categories which can not be handled via the ``MISC`` type, and needs
-a dedicated package manager mechanism. The same mechanism can also
-handle files that used the ``MISC`` type.
+packages or categories. The ``MISC`` type is not suitable for that,
+and so a dedicated package manager mechanism needs to be developed
+instead. The same mechanism can also handle files that historically used
+the ``MISC`` type. As an example, the package manager may choose
+to generate both the rsync exclusion list and Manifest ignore list
+using a single source list.
The cases for autogenerated files involve such cache files
as ``use.local.desc``. However, we can not include ``md5-cache`` there
@@ -673,8 +679,8 @@ in a single file inside the package directory. It has been specifically
pointed out that:
- since distfiles are sometimes reused across different packages,
- the repeating checksums are redundant,
-
+ the repeating checksums are redundant [#DIST]_.
+
- mirror admins were interested in the possibility of verifying all
the distfiles with a single tool.
@@ -833,7 +839,7 @@ References
.. [#WHIRLPOOL] The WHIRLPOOL Hash Function
(http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
-.. [#BLAKE2] BLAKE2 — fast secure hashing
+.. [#BLAKE2] BLAKE2 -- fast secure hashing
(https://blake2.net/)
.. [#SHA3] FIPS PUB 202: SHA-3 Standard: Permutation-Based Hash
@@ -846,6 +852,10 @@ References
.. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
(https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
+.. [#DIST] According to Robin H. Johnson, 8.4% of all DIST entries
+ at the time of writing are duplicates, representing 2 MiB
+ out of 25 MiB of DIST entries altogether.
+
.. [#GEMATO] gemato: Gentoo Manifest Tool
(https://github.com/mgorny/gemato/)
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: c6d51030fc780226977607a57e7006fdbe9f2b15
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 2 18:19:35 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=c6d51030
glep-0074: Remove OPTIONAL
glep-0074.rst | 29 ++++-------------------------
1 file changed, 4 insertions(+), 25 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index b7b5a8c..f256451 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -148,13 +148,7 @@ used:
c. otherwise, the verification succeeds.
-3. If the file is covered by an entry of the ``OPTIONAL`` type:
-
- a. if the file is present, then the verification fails,
-
- b. otherwise, the verification succeeds.
-
-4. If the file is present but not listed in Manifest, the verification
+3. If the file is present but not listed in Manifest, the verification
fails.
Unless specified otherwise, the package manager must not allow using
@@ -221,13 +215,6 @@ The Manifest files can specify the following tags:
in non-strict mode. Used for files that do not affect the installed
packages (``metadata.xml``, ``use.desc``).
-``OPTIONAL <path>``
- Specifies a file that does not exist in the distribution but if it
- did, it would be marked as ``MISC``. In the strict mode, the file
- must not exist for the verification to pass. The package manager
- may ignore a stray file matching this entry if operating in non-strict
- mode.
-
``DIST <filename> <size> <checksums>…``
Specifies a distfile entry used to verify files fetched as part
of ``SRC_URI``. The filename must match the filename used to store
@@ -272,8 +259,8 @@ can be used:
4. Process all ``IGNORE`` entries. Remove any paths matching them
from the *present* set.
-5. Collect all files covered by ``DATA``, ``MISC``, ``OPTIONAL``,
- ``EBUILD`` and ``AUX`` entries into the *covered* set.
+5. Collect all files covered by ``DATA``, ``MISC``, ``EBUILD``
+ and ``AUX`` entries into the *covered* set.
6. Verify the entries in *covered* set for incompatible duplicates
and collisions with ignored files as explained in `Manifest file
@@ -550,12 +537,6 @@ It aims to account for two use cases:
2. Accounting for automatically generated files that might be updated
by standard tooling.
-The traditional ``MISC`` type is amended with a complementary
-``OPTIONAL`` tag to account for files that are not provided
-in the specific repository. It aims to ensure that the same path would
-be non-fatal when provided by the repository but fatal when created
-by the user tooling.
-
Timestamp field
---------------
@@ -643,9 +624,7 @@ on providing them via an additional rsync module.
If such files were injected into the repository, they would cause strict
verification failures of Manifests. To account for this, Infra could
-provide either ``OPTIONAL`` entries for the Manifest files to allow them
-in non-strict verification mode, or ``IGNORE`` entries to allow them
-in the strict mode.
+provide ``IGNORE`` entries to allow them to exist.
Splitting distfile checksums from file checksums
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 70fd19782e826360637029bcba4c237ac7ccfedb
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:28:16 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=70fd1978
glep-0074: Rewrite the file verificaton to cover OPTIONAL
glep-0074.rst | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 49fe0ca..d476ff3 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -135,16 +135,27 @@ File verification
When verifying a file against the Manifest, the following rules are
used:
-- if a file listed in Manifest is not present, then the verification
- for the file fails,
+1. If the file is covered directly or indirectly by an entry
+ of the ``IGNORE`` type, the verification always succeeds.
-- if a file listed in Manifest is present but has a different size
- or one of the checksums does not match, the verification fails,
+2. If the file is covered by an entry of the ``MANIFEST``, ``DATA``,
+ ``MISC``, ``EBUILD`` or ``AUX`` type:
-- if a file is present but not listed in Manifest, the verification
- fails,
+ a. if the file is not present, then the verification fails,
-- otherwise, the verification succeeds.
+ b. if the file is present but has a different size or one
+ of the checksums does not match, the verification fails,
+
+ c. otherwise, the verification succeeds.
+
+3. If the file is covered by an entry of the ``OPTIONAL`` type:
+
+ a. if the file is present, then the verification fails,
+
+ b. otherwise, the verification succeeds.
+
+4. If the file is present but not listed in Manifest, the verification
+ fails.
Unless specified otherwise, the package manager must not allow using
any files for which the verification failed. The package manager may
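
The four verification rules introduced by this commit can be sketched in Python as follows. This is only an illustrative sketch, not the reference implementation: the `entry` dictionary layout, the `verify_file` name and the `HASHLIB_NAMES` mapping (limited here to SHA256 and SHA512) are assumptions made for the example and are not defined by the GLEP.

```python
import hashlib
import os

# Entry types for which the file must exist and match (rule 2).
CHECKED_TYPES = {"MANIFEST", "DATA", "MISC", "EBUILD", "AUX"}

# Assumed mapping from Manifest hash names to hashlib names (subset only).
HASHLIB_NAMES = {"SHA256": "sha256", "SHA512": "sha512"}


def verify_file(path, entry):
    """Return True if `path` passes verification against `entry`.

    `entry` is a hypothetical dict such as
    {"type": "DATA", "size": 3, "checksums": {"SHA256": "<hex>"}},
    or None when no Manifest entry covers the path.
    """
    exists = os.path.isfile(path)  # follows symbolic links
    if entry is None:
        # Rule 4: a file that is present but not listed fails.
        return not exists
    if entry["type"] == "IGNORE":
        # Rule 1: ignored paths always pass.
        return True
    if entry["type"] == "OPTIONAL":
        # Rule 3: the file must not be present.
        return not exists
    if entry["type"] not in CHECKED_TYPES:
        raise ValueError("unsupported entry type: %s" % entry["type"])
    # Rule 2a: the file must be present.
    if not exists:
        return False
    # Rule 2b: the size and every listed checksum must match.
    if os.path.getsize(path) != entry["size"]:
        return False
    with open(path, "rb") as f:
        data = f.read()
    for name, expected in entry["checksums"].items():
        digest = hashlib.new(HASHLIB_NAMES[name], data).hexdigest()
        if digest != expected.lower():
            return False
    # Rule 2c: otherwise the verification succeeds.
    return True
```

A real tool would hash the file in chunks and support the full list of reserved algorithms; both are left out to keep the sketch short.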
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 1f9e9d784f07fd8957613e4aab1e2cb80fd90cd8
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:29:41 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=1f9e9d78
glep-0074: Add two example files for reference
glep-0074.rst | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index a37ad34..65f32c3 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -369,6 +369,34 @@ uncompressed content and the specification is free to choose either
of the files using the same base name.
+An example Manifest file (informational)
+----------------------------------------
+
+An example top-level Manifest file for the Gentoo repository would have
+the following content::
+
+ TIMESTAMP 2017-10-30T10:11:12Z
+ IGNORE distfiles
+ IGNORE local
+ IGNORE lost+found
+ IGNORE packages
+ MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
+ ...
+ MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
+ ...
+
+An example modern Manifest (disregarding backwards compatibility)
+for a package directory would have the following content::
+
+ DATA SphinxTrain-0.9.1-r1.ebuild 932 SHA256 3d3b.. SHA512 be4d..
+ DATA SphinxTrain-1.0.8.ebuild 912 SHA256 f681.. SHA512 0749..
+ DATA files/gcc.patch 816 SHA256 b56e.. SHA512 2468..
+ DATA files/gcc34.patch 333 SHA256 c107.. SHA512 9919..
+ DIST SphinxTrain-0.9.1-beta.tar.gz 469617 SHA256 c1a4.. SHA512 1b33..
+ DIST sphinxtrain-1.0.8.tar.gz 8925803 SHA256 548e.. SHA512 465d..
+ MISC metadata.xml 664 SHA256 97c6.. SHA512 1175..
+
+
Rationale
=========
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 9de08400de2f199c2e457edeedd7b88e9a02be8c
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 13 16:56:46 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:56:46 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=9de08400
glep-0074: Clarify timestamp handling of sub-Manifests
glep-0074.rst | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index b4dd7a0..e8fc849 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -162,7 +162,7 @@ for which the verification failed.
Timestamp verification
----------------------
-The Manifest file can contain a ``TIMESTAMP`` entry to account
+The top-level Manifest file can contain a ``TIMESTAMP`` entry to account
for attacks against tree update distribution. If such an entry
is present, it should be updated every time at least one
of the Manifests changes. Every unique timestamp value must correspond
@@ -180,6 +180,11 @@ using a secure channel from a trusted source for exact comparison.
The exact details of such a solution are outside the scope of this
specification.
+``TIMESTAMP`` entries may also be present in sub-Manifests. Those
+timestamps must not be newer than the timestamp of the top-level
+Manifest (if present). This specification does not define any specific
+use for them.
+
Modern Manifest tags
--------------------
@@ -190,10 +195,9 @@ The Manifest files can specify the following tags:
Specifies a timestamp of when the Manifest file was last updated.
The timestamp must be a valid second-precision ISO8601 extended format
combined date and time in UTC timezone, i.e. using the following
- ``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``. Optionally used
- in the top-level Manifest file. The package manager can use it
- to detect an outdated repository checkout as described in `Timestamp
- verification`_.
+ ``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``. Optional.
+ The package manager can use it to detect an outdated repository
+ checkout as described in `Timestamp verification`_.
``MANIFEST <path> <size> <checksums>...``
Specifies a sub-Manifest. The sub-Manifest must be verified like
@@ -605,6 +609,9 @@ in the distribution process, past the Manifest generation phase. Those
files will most likely receive ``IGNORE`` entries and therefore
not be suitable for safe use.
+The specification permits additional timestamps in sub-Manifest files
+for local use. A generic testing tool should ignore them.
+
New vs deprecated tags
----------------------
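
The ``TIMESTAMP`` format and the sub-Manifest constraint clarified in this commit can be sketched in Python as follows. The function names are hypothetical. The only details taken from the specification are the ``strftime()`` format string and the rule that a sub-Manifest timestamp must not be newer than the top-level one.

```python
from datetime import datetime, timezone

# Format string given in the specification for TIMESTAMP values.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_timestamp(value):
    """Parse a TIMESTAMP value into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(value, TIMESTAMP_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def format_timestamp(moment):
    """Format a timezone-aware datetime as a TIMESTAMP value."""
    return moment.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)


def sub_timestamp_ok(sub_value, top_value):
    """A sub-Manifest timestamp must not be newer than the top-level one."""
    return parse_timestamp(sub_value) <= parse_timestamp(top_value)
```

Since the specification defines no use for sub-Manifest timestamps, a generic tool would probably only run the last check, if it looks at them at all.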
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 516c2ecec8f48f2f8ab7ee47cb9aebcac8347ef5
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 13 16:49:55 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:49:55 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=516c2ece
glep-0074: Forbid compressing top-level Manifest
glep-0074.rst | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 97d7829..b4dd7a0 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -342,9 +342,11 @@ the compression and decompress Manifests transparently. The exact list
of algorithms and their corresponding suffixes are outside the scope
of this specification.
-Whenever this specification refers to top-level Manifest file,
-the implementation should account for compressed variants of this file
-with appropriate suffixes (e.g. ``Manifest.gz``).
+The top-level Manifest file must not be compressed. Since the OpenPGP
+signature covers the uncompressed text and is compressed itself,
+the data would have to be decompressed without any prior verification.
+This could expose users e.g. to zip bombs or exploits on decompressor
+vulnerabilities.
Whenever this specification refers to sub-Manifests, they can use any
names but are also required to use a specific compression suffix.
@@ -722,6 +724,23 @@ to the file format. The ``MANIFEST`` entries are required to provide
the real (compressed) file path for compatibility with other file
entries and to avoid confusion.
+The compression of top-level Manifest file has been prohibited
+as the specification currently does not provide any means of verifying
+the file prior to decompression. This would make it possible for
+a malicious third party to provide a compressed Manifest exposing
+decompressor vulnerabilities, or being a zip bomb, and the tooling
+would have to unpack it before being able to verify the contents.
+
+The OpenPGP cleartext signature covers the contents of the Manifest,
+and is therefore compressed along with them. The possibility of using
+detached signature has been considered but it was rejected as
+unnecessary complexity for minor gain.
+
+Technically, a similar result could be effected via moving all the data
+into a compressed sub-Manifest in the top directory (e.g.
+``Manifest.sub.gz``), and including a ``MANIFEST`` entry for this file
+in a signed, uncompressed top-level Manifest.
+
The existence of additional entries for uncompressed Manifest checksums
was debated. However, plain entries for the uncompressed file would
be confusing if only compressed file existed, and conflicting if both
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 6a676ea8b5b593cfbf1ffc0ee2575f55c57ed0e5
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:45:28 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=6a676ea8
glep-0074: Clarify OPTIONAL desc
glep-0074.rst | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 65f32c3..b7b5a8c 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -222,10 +222,11 @@ The Manifest files can specify the following tags:
packages (``metadata.xml``, ``use.desc``).
``OPTIONAL <path>``
- Specifies a file that would be subject to non-obligatory Manifest
- verification if it existed. The package may ignore a stray file
- matching this entry if operating in non-strict mode. Used for paths
- that would match ``MISC`` if they existed.
+ Specifies a file that does not exist in the distribution but if it
+ did, it would be marked as ``MISC``. In the strict mode, the file
+ must not exist for the verification to pass. The package manager
+ may ignore a stray file matching this entry if operating in non-strict
+ mode.
``DIST <filename> <size> <checksums>…``
Specifies a distfile entry used to verify files fetched as part
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: b7d7db208c421db0a30170f80d3e41328a4dc7db
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 2 18:43:14 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=b7d7db20
glep-0074: Deprecate MISC and remove non-strict behavior
glep-0074.rst | 93 +++++++++++++++++++++++++++++++++++++----------------------
1 file changed, 59 insertions(+), 34 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index f256451..eee863a 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -49,16 +49,10 @@ This specification is designed with the following goals in mind:
1. It should provide means to ensure the authenticity of the complete
repository, including preventing the injection of additional files.
-2. Like the original Manifest2, the files should be split into two
- groups — files whose authenticity is critical, and those whose
- mismatch may be accepted in non-strict mode. The same classification
- should apply both to files listed in Manifests, and to stray files
- present only in the repository.
-
-3. The format should be universal enough to work both for the Gentoo
+2. The format should be universal enough to work both for the Gentoo
repository and third-party repositories of different characteristics.
-4. The Manifest files should be verifiable stand-alone, that is without
+3. The Manifest files should be verifiable stand-alone, that is without
knowing any details about the underlying repository format.
@@ -205,15 +199,9 @@ The Manifest files can specify the following tags:
verification (always pass).
``DATA <path> <size> <checksums>…``
- Specifies a file subject to obligatory Manifest verification.
- The file is required to pass verification. Used for all files directly
- affecting package manager operation (ebuilds, eclasses, profiles).
-
-``MISC <path> <size> <checksums>…``
- Specifies a file subject to non-obligatory Manifest verification.
- The package manager may ignore a verification failure if operating
- in non-strict mode. Used for files that do not affect the installed
- packages (``metadata.xml``, ``use.desc``).
+ Specifies a regular file subject to Manifest verification. The file
+ is required to pass verification. Used for all files that do not match
+ any other type.
``DIST <filename> <size> <checksums>…``
Specifies a distfile entry used to verify files fetched as part
@@ -233,6 +221,11 @@ allowed at the package directory level:
``EBUILD <filename> <size> <checksums>…``
Equivalent to the ``DATA`` type.
+``MISC <path> <size> <checksums>…``
+ Equivalent to the ``DATA`` type. Historically indicated that
+ the package manager may ignore a verification failure if operating
+ in non-strict mode. However, that behavior is deprecated.
+
``AUX <filename> <size> <checksums>…``
Equivalent to the ``DATA`` type, except that the filename is relative
to ``files/`` subdirectory.
@@ -378,11 +371,11 @@ for a package directory would have the following content::
DATA SphinxTrain-0.9.1-r1.ebuild 932 SHA256 3d3b.. SHA512 be4d..
DATA SphinxTrain-1.0.8.ebuild 912 SHA256 f681.. SHA512 0749..
+ DATA metadata.xml 664 SHA256 97c6.. SHA512 1175..
DATA files/gcc.patch 816 SHA256 b56e.. SHA512 2468..
DATA files/gcc34.patch 333 SHA256 c107.. SHA512 9919..
DIST SphinxTrain-0.9.1-beta.tar.gz 469617 SHA256 c1a4.. SHA512 1b33..
DIST sphinxtrain-1.0.8.tar.gz 8925803 SHA256 548e.. SHA512 465d..
- MISC metadata.xml 664 SHA256 97c6.. SHA512 1175..
Rationale
@@ -521,21 +514,48 @@ it. Those directories were ``distfiles``, ``local`` and ``packages``. It
could be also used to ignore VCS directories such as ``CVS``.
-Non-obligatory Manifest verification
-------------------------------------
+Non-strict Manifest verification
+--------------------------------
-While this specification recommends all tools to use strict verification
-by default, it allows declaring some files as non-obligatory like
-the original Manifest2 format did. This could be used on files that do
-not affect the normal package manager operation.
+Originally the Manifest2 format provided a special ``MISC`` tag that
+was used for ``metadata.xml`` and ``ChangeLog`` files. This tag
+indicated that the Manifest verification failures could be ignored for
+those files unless the package manager was working in strict mode.
-It aims to account for two use cases:
+The first versions of this specification continued the use of this tag.
+However, after a long debate it was decided to deprecate it along with
+the non-strict behavior, and require all files to strictly match.
-1. Stripping down files that are not strictly required to install
- packages from repository checkouts.
+Two arguments were mentioned for the usefulness of a ``MISC`` type:
-2. Accounting for automatically generated files that might be updated
- by standard tooling.
+1. being able to reduce the checkout size by stripping unnecessary
+ files out, and
+
+2. being able to update automatically generated files locally
+ without causing unnecessary verification failures.
+
+However, the usefulness of ``MISC`` in both cases is doubtful.
+
+The cases for stripping unnecessary files mostly focused around space
+savings. For this purpose, stripping ``metadata.xml`` and similar files
+has little value. It is much more common for users to strip whole
+categories which can not be handled via the ``MISC`` type, and needs
+a dedicated package manager mechanism. The same mechanism can also
+handle files that used the ``MISC`` type.
+
+The cases for autogenerated files involve such cache files
+as ``use.local.desc``. However, we can not include ``md5-cache`` there
+due to security concerns which results in inconsistent cache handling.
+Furthermore, the tools were historically modified to provide stable
+output which means that their content can not change without
+a non-``MISC`` content being changed first. This practically defeats
+the purpose of using ``MISC``.
+
+Finally, the non-strict mode could be used as means to an attack.
+The allowance of missing or modified documentation file could be used
+to spread misinformation, resulting in bad decisions made by the user.
+A modified file could also be used e.g. to exploit vulnerabilities
+of an XML parser.
Timestamp field
@@ -569,17 +589,22 @@ be not suitable to safe use.
New vs deprecated tags
----------------------
-Out of the four types defined by Manifest2, two are reused and two are
-marked deprecated.
+Out of the four types defined by Manifest2, only one is reused
+and the remaining three are replaced by a single, universal ``DATA``
+type.
-The ``DIST`` and ``MISC`` tags are reused since they can be relatively
-clearly marked into the new concept.
+The ``DIST`` tag is reused since the specification does not change
+anything with regard to distfile handling.
The ``EBUILD`` tag could potentially be reused for generic file
verification data. However, it would be confusing if all the different
data files were marked as ``EBUILD``. Therefore, an equivalent ``DATA``
type was introduced as a replacement.
+The ``MISC`` tag and the relevant non-strict mode have been removed
+as being of little value, as detailed in the `Non-strict Manifest
+verification`_ section.
+
The ``AUX`` tag is deprecated as it is redundant to ``DATA``, and has
the limiting property of implicit ``files/`` path prefix.
@@ -622,7 +647,7 @@ Normally we are not including those files to reduce the checkout size.
However, some users have shown interest in them and Infra is working
on providing them via an additional rsync module.
-If such files were injected into the repository, they would cause strict
+If such files were injected into the repository, they would cause
verification failures of Manifests. To account for this, Infra could
provide ``IGNORE`` entries to allow them to exist.
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 9e9da087261ed280adad4c52e243b8cc5f89b23e
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:27:31 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=9e9da087
glep-0074: Apply more suggestions from Robin
glep-0074.rst | 40 +++++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 425381f..1147e62 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,7 +8,7 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-10-29
+Last-Modified: 2017-10-30
Post-History: 2017-10-26
Content-Type: text/x-rst
Requires: 59, 61
@@ -99,9 +99,12 @@ format. However, the signature verification can be omitted if it is
covered by a signed top-level Manifest.
The Manifest files can also specify ``IGNORE`` entries to skip Manifest
-verification of subdirectories and/or files. Files and directories
-starting with a dot are always implicitly ignored. All files that
-are not ignored must be covered by at least one of the Manifests.
+verification of subdirectories and/or files. The package manager can
+support injecting ignore paths to account for additional files created,
+modified or removed by user's processes that would not be ignored
+by existing rules. Files and directories starting with a dot are always
+implicitly ignored. All files that are not ignored must be covered
+by at least one of the Manifests.
A single file may be matched by multiple identical or equivalent
Manifest entries, if and only if the entries have the same semantics,
@@ -517,21 +520,25 @@ The top-level Manifest optionally allows using a ``TIMESTAMP`` tag
to include a generation timestamp in the Manifest. A similar feature
was originally proposed in GLEP 58 [#GLEP58]_.
-A malicious third-party may use the principles of exclusion and replay
-to deny an update to clients, while at the same time recording
-the identity of clients to attack. The timestamp field can be used
-to detect that.
+A malicious third-party may use the principles of exclusion or replay
+[#C08]_ to deny an update to clients, while at the same time recording
+the identity of clients to attack. The timestamp field can be used to
+detect that.
In order to provide a more complete protection, the Gentoo
Infrastructure should provide an ability to obtain the timestamps
of all Manifests from a recent timeframe over a secure channel
from a trusted source for comparison.
-Strictly speaking, this is already provided by the various
-``metadata/timestamp.*`` files provided already by Gentoo which are also
-covered by the Manifest. However, including the value in the Manifest
-itself has a little cost and provides the ability to perform
-the verification stand-alone.
+Strictly speaking, this information is already provided by the various
+``metadata/timestamp*`` files that are already present. However,
+including the value in the Manifest itself has little cost
+and provides the ability to perform the verification stand-alone.
+
+Furthermore, some of the timestamp files are added very late
+in the distribution process, past the Manifest generation phase. Those
+files will most likely receive ``IGNORE`` entries and therefore
+not be suitable for safe use.
New vs deprecated tags
@@ -699,8 +706,8 @@ ensured:
- the Manifest files inside the package directory can be signed
to provide authenticity verification,
-- if the Manifest files inside the package directory are compressed,
- a uncompressed file of identical content must coexist.
+- an uncompressed Manifest file must exist in the package directory,
+ and a compressed Manifest of identical content may be present.
Once the backwards compatibility is no longer a concern, the above
no longer needs to hold and the deprecated tags can be removed.
@@ -777,6 +784,9 @@ References
.. [#STREEBOG] GOST R 34.11-2012: Streebog Hash Function
(https://www.streebog.net/)
+.. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
+ (https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
+
.. [#GEMATO] gemato: Gentoo Manifest Tool
(https://github.com/mgorny/gemato/)
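
The ignore rules touched by this commit (explicit ``IGNORE`` entries, ignore paths injected by the package manager, and implicitly ignored dotfiles) could be checked along the following lines. This is a sketch under stated assumptions: paths are taken to be normalized and relative to the directory of the Manifest, explicit and injected ignore paths are passed in one combined list, and the `is_ignored` name is hypothetical.

```python
from pathlib import PurePosixPath


def is_ignored(rel_path, ignore_paths):
    """Return True if `rel_path` is skipped by Manifest verification.

    `ignore_paths` combines paths from IGNORE entries with any paths
    injected through package manager configuration (an assumption of
    this sketch; the specification does not prescribe how to merge them).
    """
    parts = PurePosixPath(rel_path).parts
    # Files and directories starting with a dot are always ignored,
    # which also covers everything below a dot-directory.
    if any(part.startswith(".") for part in parts):
        return True
    for ignored in ignore_paths:
        ignored_parts = PurePosixPath(ignored).parts
        # An ignore path covers itself and, recursively, its contents.
        if ignored_parts and parts[:len(ignored_parts)] == ignored_parts:
            return True
    return False
```

Comparing path components rather than string prefixes keeps `distfiles` from accidentally matching a sibling such as `distfiles2`.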
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 1c5bfc9902a8e00c1d1f0bab183e8925376d2249
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Oct 30 16:27:51 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=1c5bfc99
glep-0074: Split 'Directory tree coverage' section out
glep-0074.rst | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index 1147e62..49fe0ca 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -98,6 +98,10 @@ The sub-Manifest can also be signed using OpenPGP armored cleartext
format. However, the signature verification can be omitted if it is
covered by a signed top-level Manifest.
+
+Directory tree coverage
+-----------------------
+
The Manifest files can also specify ``IGNORE`` entries to skip Manifest
verification of subdirectories and/or files. The package manager can
support injecting ignore paths to account for additional files created,
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: 85a30b31496898b73dbdb06c4ffb665dad23b653
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 2 19:08:12 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 16:33:01 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=85a30b31
glep-0074: Further cleanup
glep-0074.rst | 73 ++++++++++++++++++++++++++++++++++-------------------------
1 file changed, 42 insertions(+), 31 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index eee863a..e4d6a80 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -96,13 +96,17 @@ covered by a signed top-level Manifest.
Directory tree coverage
-----------------------
-The Manifest files can also specify ``IGNORE`` entries to skip Manifest
-verification of subdirectories and/or files. The package manager can
-support injecting ignore paths to account for additional files created,
-modified or removed by user's processes that would not be ignored
-by existing rules. Files and directories starting with a dot are always
-implicitly ignored. All files that are not ignored must be covered
-by at least one of the Manifests.
+The specification provides three ways of skipping Manifest verification
+of specific files and directories (recursively):
+
+1. explicit ``IGNORE`` entries in Manifest files,
+
+2. injected ignore paths via package manager configuration,
+
+3. using names starting with a dot (``.``) which are always skipped.
+
+All files that are not ignored must be covered by at least one
+of the Manifests.
A single file may be matched by multiple identical or equivalent
Manifest entries, if and only if the entries have the same semantics,
@@ -113,14 +117,17 @@ to specify another entry for a file matching ``IGNORE``, or one of its
subdirectories.
The file entries (except for ``IGNORE``) can be specified for regular
-files only. Symbolic links are followed when opening files. It is
-an error to specify an entry for a different file type.
+files only. Symbolic links are followed when opening files
+and traversing directories. It is an error to specify an entry for
+a different file type. If the tree contain files of other types
+that are not otherwise ignored, they need to be covered by an explicit
+``IGNORE``.
All the local (non-``DIST``) files covered by a Manifest tree must
reside on the same filesystem. It is an error to specify entries
applying to files on another filesystem. If subdirectories
-of the Manifest tree reside on a different filesystem, they must
-be explicitly excluded via ``IGNORE``.
+that are not otherwise ignored reside on a different filesystem, they
+must be explicitly excluded via ``IGNORE``.
File verification
@@ -196,7 +203,8 @@ The Manifest files can specify the following tags:
``IGNORE <path>``
Ignores a subdirectory or file from Manifest checks. If the specified
path is present, it and its contents are omitted from the Manifest
- verification (always pass).
+ verification (always pass). *Path* must be a plain file or directory
+ path without a trailing slash, and must not contain wildcards.
``DATA <path> <size> <checksums>…``
Specifies a regular file subject to Manifest verification. The file
@@ -362,9 +370,9 @@ the following content::
IGNORE lost+found
IGNORE packages
MANIFEST app-accessibility/Manifest 14821 SHA256 1b5f.. SHA512 f7eb..
- ...
+ …
MANIFEST eclass/Manifest.gz 50812 SHA256 8c55.. SHA512 2915..
- ...
+ …
An example modern Manifest (disregarding backwards compatibility)
for a package directory would have the following content::
@@ -476,8 +484,12 @@ files, and symbolic links to directories are followed as if they were
regular directories.
Dotfiles are implicitly ignored as that is a common notion used
-in software written for POSIX systems. All other filenames require
-explicit ``IGNORE`` lines.
+in software written for POSIX systems. All other common filenames
+require explicit ``IGNORE`` lines.
+
+An ability to inject additional ignore entries is provided to account
+for site configuration affecting the repository tree — placing
+additional files in it, skipping some of the categories from syncing.
The algorithm is restricted to work on a single filesystem. This is
mostly relevant when scanning for top-level Manifest — we do not want
@@ -485,7 +497,7 @@ to cross filesystem boundaries then. However, to ensure consistent
bidirectional behavior we need to also ban them when operating downwards
the tree.
-The directories and files on different filesystems needs to be ignored
+The directories and files on different filesystems need to be ignored
explicitly as implicitly skipping them would cause confusion.
In particular, tools might then claim that a file does not exist when
it clearly does because it was skipped due to filesystem boundaries.
@@ -736,26 +748,25 @@ Backwards Compatibility
=======================
This GLEP provides optional means of preserving backwards compatibility.
-To preserve the backwards compatibility, the following needs to be
-ensured:
+To preserve the backwards compatibility, the following needs to hold
+for the ``Manifest`` file in every package directory:
+
+- all files must be covered by the single ``Manifest`` file,
-- all files within the package directory must be covered by ``Manifest``
- file inside that package directory,
+- all distfiles used by the package must be included,
-- all distfiles used by the package must be covered by ``Manifest``
- file inside the package directory,
+- all files inside the ``files/`` subdirectory need to use
+ the ``AUX`` tag (rather than ``DATA``),
-- all files inside the ``files/`` subdirectory of a package directory
- need to be use the deprecated ``AUX`` tag (rather than ``DATA``),
+- all ``.ebuild`` files need to use the ``EBUILD`` tag,
-- all ``.ebuild`` files inside the package directory need to use
- the deprecated ``EBUILD`` tag (rather than ``DATA``),
+` the ``metadata.xml`` and ``ChangeLog`` files need to use
+ the ``MISC`` tag,
-- the Manifest files inside the package directory can be signed
- to provide authenticity verification,
+- the Manifest can be signed to provide authenticity verification,
-- an uncompressed Manifest file must exist in the package directory,
- and a compressed Manifest of identical content may be present.
+- an uncompressed Manifest must always exist, and a compressed Manifest
+ of identical content may be present.
Once the backwards compatibility is no longer a concern, the above
no longer needs to hold and the deprecated tags can be removed.
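As an illustration of the three skipping rules listed in the 'Directory tree coverage' hunk above, here is a minimal Python sketch. The function name `is_ignored` and its parameters are hypothetical and do not come from the GLEP or from any existing tool. It assumes that paths are POSIX-style and relative to the directory holding the Manifest, and that the dot rule applies to every path component.

```python
from pathlib import PurePosixPath


def is_ignored(path, ignore_entries, injected=()):
    """Return True if `path` would be skipped during verification.

    `ignore_entries` stands for IGNORE paths read from Manifest files,
    `injected` for extra paths supplied by package manager configuration.
    Both names are illustrative only.
    """
    parts = PurePosixPath(path).parts
    # Rule 3: names starting with a dot are always skipped (recursively).
    if any(part.startswith(".") for part in parts):
        return True
    # Rules 1 and 2: an entry matches the path itself and everything
    # below it. Entries are compared literally, without wildcards.
    for entry in list(ignore_entries) + list(injected):
        entry_parts = PurePosixPath(entry).parts
        if entry_parts and parts[:len(entry_parts)] == entry_parts:
            return True
    return False
```

Comparing whole path components rather than string prefixes keeps an entry such as `packages` from accidentally matching a sibling like `packages-extra`.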
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-13 17:35 Michał Górny
From: Michał Górny @ 2017-11-13 17:35 UTC
To: gentoo-commits
commit: b53ced9b3e8900dd5584b768d622fcbff15e78be
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 13 17:06:27 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 13 17:06:27 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=b53ced9b
glep-0074: Explain combining multiple Manifest trees
The idea has been originally suggested by Robin H. Johnson.
glep-0074.rst | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index e8fc849..aa26147 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -365,6 +365,34 @@ uncompressed content and the specification is free to choose either
of the files using the same base name.
+Combining multiple Manifest trees (informational)
+-------------------------------------------------
+
+This specification permits nesting multiple hierarchical Manifest trees.
+In this layout, the specific directories of the Manifest tree can
+be verified both as a part of another top-level Manifest,
+and as an independent Manifest tree (when obtained without the parent
+directory).
+
+For this to work, the sub-Manifest file in the directory must also
+satisfy the requirements for the top-level Manifest file. That is:
+
+- it must be named ``Manifest`` and not compressed,
+
+- it must cover all the files in this directory and its subdirectories
+ (i.e. no files from the directory tree can be covered by parent
+ Manifest),
+
+- if authenticity verification is desired, it must be OpenPGP-signed.
+
+It should be noted that if such a directory is a subdirectory of a valid
+Manifest tree, the sub-Manifest needs to be valid according
+to the top-level Manifest and the OpenPGP signature is disregarded
+as detailed in `Manifest file locations and nesting`_. The top-level
+behavior is exhibited only when the directory is obtained without parent
+directories.
+
+
An example Manifest file (informational)
----------------------------------------
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-20 17:26 Michał Górny
From: Michał Górny @ 2017-11-20 17:26 UTC
To: gentoo-commits
commit: 09ed01f5a480b0b8042bddc93ef19a02e02326d0
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 13 17:06:27 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 16 10:17:02 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=09ed01f5
glep-0074: Explain combining multiple Manifest trees
The idea has been originally suggested by Robin H. Johnson.
glep-0074.rst | 34 +++++++++++++++++++++++++++++++---
1 file changed, 31 insertions(+), 3 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index e8fc849..42c0c9e 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,8 +8,8 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-11-06
-Post-History: 2017-10-26
+Last-Modified: 2017-11-16
+Post-History: 2017-10-26, 2017-11-16
Content-Type: text/x-rst
Requires: 59, 61
Replaces: 44, 58, 60
@@ -365,6 +365,34 @@ uncompressed content and the specification is free to choose either
of the files using the same base name.
+Combining multiple Manifest trees (informational)
+-------------------------------------------------
+
+This specification permits nesting multiple hierarchical Manifest trees.
+In this layout, the specific directories of the Manifest tree can
+be verified both as a part of another top-level Manifest,
+and as an independent Manifest tree (when obtained without the parent
+directory).
+
+For this to work, the sub-Manifest file in the directory must also
+satisfy the requirements for the top-level Manifest file. That is:
+
+- it must be named ``Manifest`` and not compressed,
+
+- it must cover all the files in this directory and its subdirectories
+ (i.e. no files from the directory tree can be covered by parent
+ Manifest),
+
+- if authenticity verification is desired, it must be OpenPGP-signed.
+
+It should be noted that if such a directory is a subdirectory of a valid
+Manifest tree, the sub-Manifest needs to be valid according
+to the top-level Manifest and the OpenPGP signature is disregarded
+as detailed in `Manifest file locations and nesting`_. The top-level
+behavior is exhibited only when the directory is obtained without parent
+directories.
+
+
An example Manifest file (informational)
----------------------------------------
@@ -792,7 +820,7 @@ for the ``Manifest`` file in every package directory:
- all ``.ebuild`` files need to use the ``EBUILD`` tag,
-` the ``metadata.xml`` and ``ChangeLog`` files need to use
+- the ``metadata.xml`` and ``ChangeLog`` files need to use
the ``MISC`` tag,
- the Manifest can be signed to provide authenticity verification,
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-20 17:26 Michał Górny
From: Michał Górny @ 2017-11-20 17:26 UTC
To: gentoo-commits
commit: 7f9bd9fa8aa0f21950a4c42e20fca1bc10f4c22c
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 20 17:22:40 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 20 17:22:40 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=7f9bd9fa
glep-0074: Include suggestions from Daniel Campbell
glep-0074.rst | 59 ++++++++++++++++++++++++++++++-----------------------------
1 file changed, 30 insertions(+), 29 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 42c0c9e..6081937 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -19,7 +19,7 @@ Abstract
========
This GLEP extends the Manifest file format to cover full-tree file
-integrity and authenticity checks.The format aims to be future-proof,
+integrity and authenticity checks. The format aims to be future-proof,
efficient and provide means of backwards compatibility.
@@ -435,7 +435,7 @@ The stand-alone format has been selected because of its three
advantages:
1. It is more future-proof. If an incompatible change to the repository
- format is introduced, only developers need to be upgrade the tools
+ format is introduced, only developers need to upgrade the tools
they use to generate the Manifests. The tools used to verify
the updated Manifests will continue to work.
@@ -498,7 +498,7 @@ the following implications:
While both models have their advantages, the hierarchical model was
selected because it reduces the number of OpenPGP operations
-which are comparatively costly to the minimum.
+(which are comparatively costly) to the minimum.
Tree layout restrictions
@@ -606,14 +606,14 @@ the purpose of using ``MISC``.
Finally, the non-strict mode could be used as means to an attack.
The allowance of missing or modified documentation file could be used
to spread misinformation, resulting in bad decisions made by the user.
-A modified file could also be used e.g. to exploit vulnerabilities
+A modified file could also be used, e.g. to exploit vulnerabilities
of an XML parser.
Timestamp field
---------------
-The top-level Manifests optionally allows using a ``TIMESTAMP`` tag
+The top-level Manifest optionally allows using a ``TIMESTAMP`` tag
to include a generation timestamp in the Manifest. A similar feature
was originally proposed in GLEP 58 [#GLEP58]_.
@@ -622,10 +622,10 @@ A malicious third-party may use the principles of exclusion or replay
the identity of clients to attack. The timestamp field can be used to
detect that.
-In order to provide a more complete protection, the Gentoo
-Infrastructure should provide an ability to obtain the timestamps
-of all Manifests from a recent timeframe over a secure channel
-from a trusted source for comparison.
+In order to provide more complete protection, the Gentoo Infrastructure
+should provide an ability to obtain the timestamps of all Manifests
+from a recent timeframe over a secure channel from a trusted source
+for comparison.
Strictly speaking, this information is already provided by the various
``metadata/timestamp*`` files that are already present. However,
@@ -635,7 +635,7 @@ and provides the ability to perform the verification stand-alone.
Furthermore, some of the timestamp files are added very late
in the distribution process, past the Manifest generation phase. Those
files will most likely receive ``IGNORE`` entries and therefore
-be not suitable to safe use.
+be unsafe to use.
The specification permits additional timestamps in sub-Manifest files
for local use. A generic testing tool should ignore them.
@@ -645,7 +645,7 @@ New vs deprecated tags
----------------------
Out of the four types defined by Manifest2, only one is reused
-and the remaining three is replaced by a single, universal ``DATA``
+and the remaining three are replaced by a single, universal ``DATA``
type.
The ``DIST`` tag is reused since the specification does not change
@@ -696,11 +696,11 @@ in the top-level Manifest.
Injecting ChangeLogs into the checkout
--------------------------------------
-One of the problems considered in the new Manifest format was that
-of injecting historical and autogenerated ChangeLog into the repository.
-Normally we are not including those files to reduce the checkout size.
-However, some users have shown interest in them and Infra is working
-on providing them via an additional rsync module.
+One of the problems considered in the new Manifest format was injecting
+historical and autogenerated ChangeLog into the repository. We normally
+don't include those files, to reduce the checkout size. However, some
+users have shown interest in them and Infra is working on providing them
+via an additional rsync module.
If such files were injected into the repository, they would cause
verification failures of Manifests. To account for this, Infra could
@@ -733,9 +733,9 @@ Hash algorithms
---------------
While maintaining a consistent supported hash set is important
-for interoperability, it is no good fit for the generic layout of this
-GLEP. Furthermore, it would require updating the GLEP in the future
-every time the used algorithms change.
+for interoperability, it is not a good fit for the generic layout
+of this GLEP. Furthermore, it would require updating the GLEP
+in the future every time the used algorithms change.
Instead, the specification focuses on listing the currently used
algorithm names for interoperability, and sets a recommendation
@@ -761,10 +761,11 @@ entries and to avoid confusion.
The compression of top-level Manifest file has been prohibited
as the specification currently does not provide any means of verifying
-the file prior to decompression. This would make it possibly for
-a malicious third party to provide a compressed Manifest exposing
-decompressor vulnerabilities, or being a zip bomb, and the tooling
-would have to unpack it before being able to verify the contents.
+the file prior to decompression. If the top-level Manifest is
+compressed, tooling will have to unpack the file before being able
+to verify the contents. This makes it possible for a malicious third
+party to attack the system by providing a compressed Manifest that
+exposes decompressor vulnerabilities, or a zip bomb.
The OpenPGP cleartext signature covers the contents of the Manifest,
and is therefore compressed along with them. The possibility of using
@@ -778,10 +779,10 @@ in a signed, uncompressed top-level Manifest.
The existence of additional entries for uncompressed Manifest checksums
was debated. However, plain entries for the uncompressed file would
-be confusing if only compressed file existed, and conflicting if both
-uncompressed and compressed variants existed. Furthermore, it has been
-pointed out that ``DIST`` entries do not have uncompressed variant
-either.
+be confusing if only the compressed file existed, and conflicting
+if both uncompressed and compressed variants existed. Furthermore,
+it has been pointed out that ``DIST`` entries do not have uncompressed
+variant either.
Performance considerations
@@ -792,7 +793,7 @@ performance concerns for end-user systems. The initial testing has shown
that a cold-cache verification on a btrfs file system can take up around
4 minutes, with the process being mostly I/O bound. On the other hand,
it can be expected that the verification will be performed directly
-after syncing, taking advantage of warm filesystem cache.
+after syncing, taking advantage of a warm filesystem cache.
To improve speed on I/O and/or CPU-restrained systems even further,
the algorithms can be easily extended to perform incremental
@@ -849,7 +850,7 @@ to the creation of this GLEP. This includes but is not limited to:
- Ulrich Müller.
Additionally, thanks to Robin Hugh Johnson for the original
-MataManifest GLEP series which served both as inspiration and source
+MetaManifest GLEP series which served both as inspiration and source
of many concepts used in this GLEP. Recursively, also thanks to all
the people who contributed to the original GLEPs.
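The compression rationale reworded in this change rests on one point: a compressed file should be verified before it is unpacked. For a compressed sub-Manifest, the parent's ``MANIFEST`` entry carries a size and checksums. The Python sketch below assumes those values describe the compressed file as stored on disk and that SHA512 is one of the listed hashes. The function name `load_sub_manifest` and its parameters are hypothetical.

```python
import gzip
import hashlib


def load_sub_manifest(data, expected_size, expected_sha512):
    """Verify compressed Manifest bytes, then decompress and decode them.

    `expected_size` and `expected_sha512` stand for the values taken from
    the parent's MANIFEST entry (assumed to describe the compressed file).
    """
    # Check the cheap property first, then the checksum, and only then
    # hand the data to the decompressor.
    if len(data) != expected_size:
        raise ValueError("size mismatch")
    if hashlib.sha512(data).hexdigest() != expected_sha512:
        raise ValueError("checksum mismatch")
    return gzip.decompress(data).decode("utf-8")
```

A top-level Manifest has no parent entry to check against, which is why the text above prohibits compressing it.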
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-20 18:41 Michał Górny
From: Michał Górny @ 2017-11-20 18:41 UTC
To: gentoo-commits
commit: 4124b2fc77b25a8951c4de95ff79d2623ff02361
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 20 17:41:13 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 20 17:41:13 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=4124b2fc
glep-0074: Explicitly specify UTF-8 encoding
glep-0074.rst | 2 ++
1 file changed, 2 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index 6081937..f96a58e 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -72,6 +72,8 @@ Unless specified otherwise, the paths used in the Manifest files
are relative to the directory containing the Manifest file. The paths
must not reference the parent directory (``..``).
+The Manifest files use UTF-8 encoding.
+
Manifest file locations and nesting
-----------------------------------
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-20 18:41 Michał Górny
From: Michał Górny @ 2017-11-20 18:41 UTC
To: gentoo-commits
commit: 9d819c9a981416936dcda2f55e54ea70e494e59e
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Mon Nov 20 18:40:41 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Mon Nov 20 18:40:41 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=9d819c9a
glep-0074: Disallow filenames containing whitespace
glep-0074.rst | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index f96a58e..46ad9fe 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -132,6 +132,13 @@ are not otherwise ignored reside on a different filesystem, or symbolic
links point to targets on a different filesystem, they must
be explicitly excluded via ``IGNORE``.
+All paths specified in the Manifest file must consist of characters
+corresponding to valid UTF-8 code points excluding the NULL character
+(``U+0000``) and characters classified as whitespace in the current
+version of the Unicode standard [#UNICODE]_. It is an error to use
+Manifest files in directories containing files whose names contain
+the disallowed characters.
+
File verification
-----------------
@@ -542,6 +549,45 @@ In particular, tools might then claim that a file does not exist when
it clearly does because it was skipped due to filesystem boundaries.
+Filename character set restriction
+----------------------------------
+
+The valid set of filename characters for the Gentoo repository
+is restricted by the devmanual 'File Naming Rules' section
+[#FILE-NAMING-RULES]_, and enforced via a git hook. The valid distfile
+names are not restricted explicitly -- however, the PMS dependency
+specification syntax [#PMS-FETCH]_ implicitly makes it impossible to use
+filenames containing whitespace.
+
+This specification aims to avoid arbitrary restrictions. For this
+reason, the filename characters are only restricted by excluding two
+technically problematic groups:
+
+1. The NULL character (``U+0000``) is normally used to indicate the end
+ of a null-terminated string. Its use could therefore break programs
+ written using C. Furthermore, it is not allowed in any known
+ filesystem.
+
+2. The whitespace characters are used to separate Manifest fields. While
+ technically it would be enough to restrict space (``U+0020``)
+ character that is normally used as the separator, all whitespace
+ characters are forbidden to avoid confusion and implementation
+ errors.
+
+While the specification could be extended to allow such filenames
+by using some form of escaping, there is currently no apparent need
+for such a feature.
+
+Historically, Portage attempted to overcome the whitespace limitation
+by attempting to locate the size field and take everything before it
+as filename. This was terribly fragile and even if it worked, it would
+solve the problem only partially.
+
+Since the same restrictions apply to ``IGNORE`` rules, it is currently
+not possible to either list or ignore the file using whitespace
+characters. Therefore, the presence of such files is forbidden entirely.
+
+
File verification model
-----------------------
@@ -880,10 +926,16 @@ References
.. [#GLEP61] GLEP 61: Manifest2 compression
(https://www.gentoo.org/glep/glep-0061.html)
+.. [#UNICODE] The Unicode standard
+ (https://unicode.org/versions/latest/)
+
.. [#PMS-FETCH] Package Manager Specification: Dependency Specification
Format - SRC_URI
(https://projects.gentoo.org/pms/6/pms.html#x1-940008.2.10)
+.. [#FILE-NAMING-RULES] Ebuild File Format -- Gentoo Development Guide
+ (https://devmanual.gentoo.org/ebuild-writing/file-format/#file-naming-rules)
+
.. [#MD5] RFC1321: The MD5 Message-Digest Algorithm
(https://www.ietf.org/rfc/rfc1321.txt)
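The path restriction added by this change (valid UTF-8, no NULL character, no whitespace) can be approximated with a short Python check. This is a sketch under stated assumptions: the function name `is_valid_manifest_path` is hypothetical, and Python's `str.isspace()` is used as a stand-in for "characters classified as whitespace in the Unicode standard". Its definition is close to, but not identical with, the Unicode White_Space property, so a strict implementation may want an explicit character table instead.

```python
def is_valid_manifest_path(raw):
    """Return True if the byte string `raw` is acceptable as a Manifest path."""
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    if not text:
        return False
    # Reject the NULL character and anything Python treats as whitespace
    # (an approximation of the Unicode whitespace classification).
    return not any(ch == "\x00" or ch.isspace() for ch in text)
```

The function takes bytes rather than a string because filenames on POSIX systems are byte sequences, so the UTF-8 requirement has to be checked at decode time.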
* [gentoo-commits] data/glep:glep-manifest commit in: /
@ 2017-11-21 17:48 Michał Górny
From: Michał Górny @ 2017-11-21 17:48 UTC
To: gentoo-commits
commit: 54cc3ef5fad2ce24b8f05c2d76ea05e43d4cb1ab
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Tue Nov 21 17:14:53 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Tue Nov 21 17:14:53 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=54cc3ef5
glep-0074: Apply suggestions from Ulrich Müller
glep-0074.rst | 47 +++++++++++++++++++++++++----------------------
1 file changed, 25 insertions(+), 22 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 46ad9fe..278882d 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -181,7 +181,8 @@ During the verification process, the client should compare the timestamp
against the update time obtained from a local clock or a trusted time
source. If the comparison result indicates that the Manifest at the time
of receiving was already significantly outdated, the client should
-either fail the verification or require manual confirmation from user.
+either fail the verification or require manual confirmation from
+the user.
Furthermore, the Manifest provider may employ additional methods
of distributing the timestamps of recently generated Manifests
@@ -202,11 +203,11 @@ The Manifest files can specify the following tags:
``TIMESTAMP <iso8601>``
Specifies a timestamp of when the Manifest file was last updated.
- The timestamp must be a valid second-precision ISO8601 extended format
- combined date and time in UTC timezone, i.e. using the following
- ``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``. Optional.
- The package manager can use it to detect an outdated repository
- checkout as described in `Timestamp verification`_.
+ The timestamp must be a valid second-precision ISO 8601 extended
+ format combined date and time in UTC timezone, i.e. using
+ the following ``strftime()`` format string: ``%Y-%m-%dT%H:%M:%SZ``.
+ Optional. The package manager can use it to detect an outdated
+ repository checkout as described in `Timestamp verification`_.
``MANIFEST <path> <size> <checksums>...``
Specifies a sub-Manifest. The sub-Manifest must be verified like
@@ -218,7 +219,8 @@ The Manifest files can specify the following tags:
Ignores a subdirectory or file from Manifest checks. If the specified
path is present, it and its contents are omitted from the Manifest
verification (always pass). *Path* must be a plain file or directory
- path without a trailing slash, and must not contain wildcards.
+ path without a trailing slash. Wildcards are not supported
+ and wildcard characters are interpreted literally.
``DATA <path> <size> <checksums>...``
Specifies a regular file subject to Manifest verification. The file
@@ -250,7 +252,7 @@ allowed at the package directory level:
``AUX <filename> <size> <checksums>...``
Equivalent to the ``DATA`` type, except that the filename is relative
- to ``files/`` subdirectory.
+ to the ``files/`` subdirectory.
Algorithm for full-tree verification
@@ -267,9 +269,9 @@ can be used:
from the *present* set.
3. Process all ``MANIFEST`` entries, recursively. Verify the Manifest
- files according to `file verification`_ section, and include their
- entries in the current Manifest entry list (using paths relative
- to directories containing the Manifests).
+ files according to the `file verification`_ section, and include
+ their entries in the current Manifest entry list (using paths
+ relative to directories containing the Manifests).
4. Process all ``IGNORE`` entries. Remove any paths matching them
from the *present* set.
@@ -277,12 +279,12 @@ can be used:
5. Collect all files covered by ``DATA``, ``MISC``, ``EBUILD``
and ``AUX`` entries into the *covered* set.
-6. Verify the entries in *covered* set for incompatible duplicates
+6. Verify the entries in the *covered* set for incompatible duplicates
and collisions with ignored files as explained in `Manifest file
locations and nesting`_.
7. Verify all the files in the union of the *present* and *covered*
- sets, according to `file verification`_ section.
+ sets, according to the `file verification`_ section.
Algorithm for finding parent Manifests
@@ -299,7 +301,7 @@ the following algorithm can be used:
3. If the current directory contains a ``Manifest`` file:
- a. If a ``IGNORE`` entry in the ``Manifest`` file covers
+ a. If an ``IGNORE`` entry in the ``Manifest`` file covers
the *original* directory (or one of the parent directories), stop.
b. Otherwise, store the current directory as *last_found*.
@@ -560,7 +562,7 @@ specification syntax [#PMS-FETCH]_ implicitly makes it impossible to use
filenames containing whitespace.
This specification aims to avoid arbitrary restrictions. For this
-reason, the filename characters are only restricted by excluding two
+reason, filename characters are only restricted by excluding two
technically problematic groups:
1. The NULL character (``U+0000``) is normally used to indicate the end
@@ -568,7 +570,7 @@ technically problematic groups:
written using C. Furthermore, it is not allowed in any known
filesystem.
-2. The whitespace characters are used to separate Manifest fields. While
+2. Whitespace characters are used to separate Manifest fields. While
technically it would be enough to restrict space (``U+0020``)
character that is normally used as the separator, all whitespace
characters are forbidden to avoid confusion and implementation
@@ -628,7 +630,7 @@ Two arguments were mentioned for the usefulness of a ``MISC`` type:
1. being able to reduce the checkout size by stripping unnecessary
files out, and
-2. being able to run update automatically generated files locally
+2. being able to update automatically generated files locally
without causing unnecessary verification failures.
However, the usefulness of ``MISC`` in both cases is doubtful.
@@ -675,7 +677,7 @@ should provide an ability to obtain the timestamps of all Manifests
from a recent timeframe over a secure channel from a trusted source
for comparison.
-Strictly speaking, this information is already provided by the various
+Strictly speaking, this information is provided by the various
``metadata/timestamp*`` files that are already present. However,
including the value in the Manifest itself has a little cost
and provides the ability to perform the verification stand-alone.
@@ -817,7 +819,7 @@ exposes decompressor vulnerabilities, or a zip bomb.
The OpenPGP cleartext signature covers the contents of the Manifest,
and is therefore compressed along with them. The possibility of using
-detached signature has been considered but it was rejected as
+a detached signature has been considered but it was rejected as
unnecessary complexity for minor gain.
Technically, a similar result could be effected via moving all the data
@@ -829,8 +831,8 @@ The existence of additional entries for uncompressed Manifest checksums
was debated. However, plain entries for the uncompressed file would
be confusing if only the compressed file existed, and conflicting
if both uncompressed and compressed variants existed. Furthermore,
-it has been pointed out that ``DIST`` entries do not have uncompressed
-variant either.
+it has been pointed out that ``DIST`` entries do not have
+an uncompressed variant either.
Performance considerations
@@ -962,12 +964,13 @@ References
(https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
.. [#DIST] According to Robin H. Johnson, 8.4% of all DIST entries
- at the time of writing are duplicate, representing a 2 MiB
+ at the time of writing are duplicate, representing 2 MiB
out of 25 MiB of DIST entries altogether.
.. [#GEMATO] gemato: Gentoo Manifest Tool
(https://github.com/mgorny/gemato/)
+
Copyright
=========
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0
* [gentoo-commits] data/glep:glep-manifest commit in: /
From: Michał Górny @ 2017-11-21 17:48 UTC
To: gentoo-commits
commit: d3b65ba5692a031f4221dee5754f16fbcaa70919
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Tue Nov 21 17:16:19 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Tue Nov 21 17:16:19 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=d3b65ba5
glep-0074: Mention that newline needs to be restricted too in rationale
glep-0074.rst | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 278882d..d0750f5 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -570,11 +570,12 @@ technically problematic groups:
written using C. Furthermore, it is not allowed in any known
filesystem.
-2. Whitespace characters are used to separate Manifest fields. While
- technically it would be enough to restrict space (``U+0020``)
- character that is normally used as the separator, all whitespace
- characters are forbidden to avoid confusion and implementation
- errors.
+2. Whitespace characters are used to separate Manifest fields
+ and entries. While technically it would be enough to restrict space
+ (``U+0020``) character that is normally used as the separator
+ and newline (``U+000A``) character that is used to separate lines,
+ all whitespace characters are forbidden to avoid confusion
+ and implementation errors.
While the specification could be extended to allow such filenames
by using some form of escaping, there is currently no apparent need
* [gentoo-commits] data/glep:glep-manifest commit in: /
From: Michał Górny @ 2017-11-21 17:48 UTC
To: gentoo-commits
commit: 5ba06543e8d104d86ee6823d16b2167b31ee3b87
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Tue Nov 21 17:22:34 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Tue Nov 21 17:22:34 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=5ba06543
glep-0074: Specify slash as path separator, disallow backwards slash
glep-0074.rst | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index d0750f5..6288175 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -134,10 +134,11 @@ be explicitly excluded via ``IGNORE``.
All paths specified in the Manifest file must consist of characters
corresponding to valid UTF-8 code points excluding the NULL character
-(``U+0000``) and characters classified as whitespace in the current
-version of the Unicode standard [#UNICODE]_. It is an error to use
-Manifest files in directories containing files whose names contain
-the disallowed characters.
+(``U+0000``), the backwards slash (``\``) and characters classified
+as whitespace in the current version of the Unicode standard
+[#UNICODE]_. It is an error to use Manifest files in directories
+containing files whose names contain the disallowed characters.
+The forward slash (``/``) must be used as path separator.
File verification
@@ -570,7 +571,14 @@ technically problematic groups:
written using C. Furthermore, it is not allowed in any known
filesystem.
-2. Whitespace characters are used to separate Manifest fields
+2. The backwards slash character (``\``) is frequently used as an escape
+ character, in particular in the languages derived from C and in shell
+ script. Furthermore, it is used as path separator on Windows systems.
+ It is forbidden to avoid implementation mistakes (in particular,
+ attempting to use it to escape whitespace or as path separator
+ on Windows) but also reserved for possible future extension.
+
+3. Whitespace characters are used to separate Manifest fields
and entries. While technically it would be enough to restrict space
(``U+0020``) character that is normally used as the separator
and newline (``U+000A``) character that is used to separate lines,
* [gentoo-commits] data/glep:glep-manifest commit in: /
From: Michał Górny @ 2017-11-23 18:45 UTC
To: gentoo-commits
commit: b3964b65293b26d75c2f71008736d508a0dd2b6b
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Wed Nov 22 16:51:48 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Wed Nov 22 16:51:48 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=b3964b65
glep-0074: Recommend escaping control characters, suggested by ulm
glep-0074.rst | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/glep-0074.rst b/glep-0074.rst
index 3dc6730..8687969 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -163,6 +163,10 @@ and a backwards slash present in filename must be encoded. Backwards
slash used as path component separator should be replaced by forward
slash instead.
+The encoding can be used for other characters as well. In particular,
+escaping control characters is recommended to ensure that the file
+works correctly in text editors.
+
File verification
-----------------
* [gentoo-commits] data/glep:glep-manifest commit in: /
From: Michał Górny @ 2017-11-23 18:45 UTC
To: gentoo-commits
commit: d39f865f5bbad9523ad6c2cfd06af95d9fa7d402
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 23 18:44:54 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 23 18:44:54 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=d39f865f
glep-0074: Make extended filename encoding optional
glep-0074.rst | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 6db6caa..5270b7a 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -142,8 +142,15 @@ corresponding to valid UTF-8 code points excluding the backwards slash
(``\``) and characters classified as control characters and whitespace
in the current version of the Unicode standard [#UNICODE]_.
-Any of the excluded characters that are present in path must be encoded
-using one of the following escape sequences:
+The implementation can optionally support extended filename encoding
+to support those paths. If the encoding is not supported,
+the implementation must reject directories containing any files using
+non-compliant names, as well as Manifest files whose filename field
+contains such filenames.
+
+If the encoding is supported, then all of the excluded characters that
+are present in path must be encoded using one of the following escape
+sequences:
- characters in the ``U+0000`` to ``U+007F`` range can be encoded
as ``\xHH`` where ``HH`` specifies the zero-padded, hexadecimal
@@ -615,6 +622,13 @@ by attempting to locate the size field and take everything before it
as filename. This was terribly fragile and even if it worked, it would
solve the problem only partially.
+To preserve compatibility with the current implementations and given
+that all of the listed characters are not allowed for the foreseeable
+Gentoo uses, the extended encoding support is optional. If such support
+is not provided, the implementation must unconditionally reject any
+such files. Ignoring them implicitly would be confusing, and it is
+not possible to use them in explicit ``IGNORE`` entries.
+
The character encoding method provides means to overcome the character
restrictions to extend the tool usability beyond immediate Gentoo uses.
The backslash escape form based on Python unicode strings is used
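Note on the patch above: when extended filename encoding is not supported, an implementation has to reject non-compliant names rather than skip them. A minimal sketch of such a check follows. The function name `is_compliant_path` is hypothetical, and "control characters or whitespace" is approximated here by Unicode general category `Cc` plus Python's `str.isspace()`; a real implementation may want to consult the Unicode White_Space property directly.

```python
import unicodedata


def is_compliant_path(path: str) -> bool:
    """Return True if `path` needs no extended encoding.

    Assumption: disallowed characters are the backslash, anything in
    Unicode general category Cc (control), and anything Python's
    str.isspace() reports as whitespace.
    """
    for ch in path:
        if ch == "\\":
            return False
        if unicodedata.category(ch) == "Cc":
            return False
        if ch.isspace():
            return False
    return True


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in ("files/fix.patch", "files/my patch", "files\\fix.patch"):
        status = "ok" if is_compliant_path(name) else "reject"
        print(f"{name!r}: {status}")
```

An implementation without encoding support would apply this check both to names found on disk and to the path field of each Manifest entry, and fail on the first non-compliant name.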
* [gentoo-commits] data/glep:glep-manifest commit in: /
From: Michał Górny @ 2017-11-23 18:45 UTC
To: gentoo-commits
commit: da2aacef9e1ae5fa923836c463f54a963e63fb40
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Wed Nov 22 00:07:50 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Wed Nov 22 00:07:50 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=da2aacef
glep-0074: Clarify ignoring directories
glep-0074.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 6288175..b0daa05 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -115,8 +115,8 @@ Manifest entries, if and only if the entries have the same semantics,
specify the same size and the checksums common to both entries match.
It is an error for a single file to be matched by multiple entries
of different semantics, file size or checksum values. It is an error
-to specify another entry for a file matching ``IGNORE``, or one of its
-subdirectories.
+to specify another entry for a file that matches ``IGNORE``, or that
+is located inside an ignored directory.
The file entries (except for ``IGNORE``) can be specified for regular
files only. Symbolic links are followed when opening files
* [gentoo-commits] data/glep:glep-manifest commit in: /
From: Michał Górny @ 2017-11-23 18:45 UTC
To: gentoo-commits
commit: 11f19f96fea48ab780fdece39460e8fb8211909f
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Wed Nov 22 11:40:34 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Wed Nov 22 11:40:34 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=11f19f96
glep-0074: Provide encoding for disallowed characters
glep-0074.rst | 75 ++++++++++++++++++++++++++++++++++++++++++++---------------
1 file changed, 56 insertions(+), 19 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index b0daa05..3dc6730 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -70,7 +70,8 @@ other space-separated values.
Unless specified otherwise, the paths used in the Manifest files
are relative to the directory containing the Manifest file. The paths
-must not reference the parent directory (``..``).
+must not reference the parent directory (``..``). Forward slash (``/``)
+is used as path component separator.
The Manifest files use UTF-8 encoding.
@@ -132,13 +133,35 @@ are not otherwise ignored reside on a different filesystem, or symbolic
links point to targets on a different filesystem, they must
be explicitly excluded via ``IGNORE``.
-All paths specified in the Manifest file must consist of characters
+
+Path and filename encoding
+--------------------------
+
+The path fields in the Manifest file must consist of characters
corresponding to valid UTF-8 code points excluding the NULL character
(``U+0000``), the backwards slash (``\``) and characters classified
as whitespace in the current version of the Unicode standard
-[#UNICODE]_. It is an error to use Manifest files in directories
-containing files whose names contain the disallowed characters.
-The forward slash (``/``) must be used as path separator.
+[#UNICODE]_.
+
+Any of the excluded characters that are present in path must be encoded
+using one of the following escape sequences:
+
+- characters in the ``U+0000`` to ``U+007F`` range can be encoded
+ as ``\xHH`` where ``HH`` specifies the zero-padded, hexadecimal
+ character code,
+
+- characters in the ``U+0000`` to ``U+FFFF`` range can be encoded
+ as ``\uHHHH`` where ``HHHH`` specifies the zero-padded, hexadecimal
+ character code,
+
+- characters in the UCS-4 range can be encoded as ``\UHHHHHHHH``
+ where ``HHHHHHHH`` specifies the zero-padded, hexadecimal character
+ code.
+
+It is invalid for backwards slash to be used in any other context,
+and a backwards slash present in filename must be encoded. Backwards
+slash used as path component separator should be replaced by forward
+slash instead.
File verification
@@ -563,7 +586,7 @@ specification syntax [#PMS-FETCH]_ implicitly makes it impossible to use
filenames containing whitespace.
This specification aims to avoid arbitrary restrictions. For this
-reason, filename characters are only restricted by excluding two
+reason, filename characters are only restricted by excluding three
technically problematic groups:
1. The NULL character (``U+0000``) is normally used to indicate the end
@@ -571,12 +594,10 @@ technically problematic groups:
written using C. Furthermore, it is not allowed in any known
filesystem.
-2. The backwards slash character (``\``) is frequently used as an escape
- character, in particular in the languages derived from C and in shell
- script. Furthermore, it is used as path separator on Windows systems.
- It is forbidden to avoid implementation mistakes (in particular,
- attempting to use it to escape whitespace or as path separator
- on Windows) but also reserved for possible future extension.
+2. The backwards slash character (``\``) is used as path separator
+ on Windows systems, so it's extremely unlikely to be used in real
+ filenames. For this reason it is used to implement character
+ encoding with minimal risk of breaking backwards compatibility.
3. Whitespace characters are used to separate Manifest fields
and entries. While technically it would be enough to restrict space
@@ -585,18 +606,34 @@ technically problematic groups:
all whitespace characters are forbidden to avoid confusion
and implementation errors.
-While the specification could be extended to allow such filenames
-by using some form of escaping, there is currently no apparent need
-for such a feature.
-
Historically, Portage attempted to overcome the whitespace limitation
by attempting to locate the size field and take everything before it
as filename. This was terribly fragile and even if it worked, it would
solve the problem only partially.
-Since the same restrictions apply to ``IGNORE`` rules, it is currently
-not possible to either list or ignore the file using whitespace
-characters. Therefore, the presence of such files is forbidden entirely.
+The character encoding method provides means to overcome the character
+restrictions to extend the tool usability beyond immediate Gentoo uses.
+The backslash escape form based on Python unicode strings is used
+since it can encode all characters within the Unicode range, the syntax
+is familiar to many programmers and the backwards slash character
+is extremely unlikely to appear in real filenames.
+
+Syntax is limited to the minimum necessary to implement the encoding.
+Shorthand forms (e.g. ``\t`` or ``\\``) are omitted to avoid unnecessary
+complexity, and to reduce the risk of shell users using backslash
+to escape space directly. The ``\x`` form is limited to ``\x00..\x7F``
+range to avoid ambiguity of higher values which might be interpreted
+either as UCS-2 code points or part of a UTF-8 encoded character.
+
+Encoding stores UCS-2/UCS-4 characters directly rather than hex-encoded
+UTF-8 string to simplify the implementation. In particular, it makes it
+possible to process the Manifest file as UTF-8 encoded text without
+having to perform additional UTF-8 decoding (and verification)
+of the escaped data.
+
+URL-encoding was considered as an alternative. However, it could collide
+with ``DIST`` entries that are implicitly named after the URL filename
+part where URL-encoding is pretty common.
File verification model
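Note on the patch above: the three escape forms (`\xHH`, `\uHHHH`, `\UHHHHHHHH`) can be decoded with a single pass over the path field. The sketch below is only an illustration, not the reference implementation; the name `decode_path` is hypothetical, and accepting both upper- and lower-case hex digits is an assumption that the text of the patch does not settle.

```python
import re

# One of: \xHH, \uHHHH, \UHHHHHHHH (hex digit case is an assumption).
_ESCAPE = re.compile(
    r"\\(?:x([0-9A-Fa-f]{2})|u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8}))"
)


def decode_path(field: str) -> str:
    """Decode the escape sequences in a Manifest path field."""
    out = []
    pos = 0
    while pos < len(field):
        if field[pos] != "\\":
            out.append(field[pos])
            pos += 1
            continue
        match = _ESCAPE.match(field, pos)
        if match is None:
            # Shorthand forms such as \t or \\ are deliberately not part
            # of the syntax, so a backslash in any other context is an error.
            raise ValueError(f"invalid escape at offset {pos}")
        x_digits, u_digits, big_u_digits = match.groups()
        code = int(x_digits or u_digits or big_u_digits, 16)
        if x_digits is not None and code > 0x7F:
            # The \x form is limited to the \x00..\x7F range.
            raise ValueError(f"\\x escape out of range at offset {pos}")
        out.append(chr(code))  # chr() raises ValueError above U+10FFFF
        pos = match.end()
    return "".join(out)
```

For example, `decode_path("my\\x20file")` yields `"my file"`, while `decode_path("my\\ file")` raises `ValueError` because a backslash followed by a space is not one of the defined forms.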
* [gentoo-commits] data/glep:glep-manifest commit in: /
From: Michał Górny @ 2017-11-23 18:45 UTC
To: gentoo-commits
commit: ed111f85c3e7ab98678ee0379589281a2c92380c
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 23 18:37:39 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 23 18:37:39 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=ed111f85
glep-0074: Always exclude control characters
glep-0074.rst | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 8687969..6db6caa 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -138,10 +138,9 @@ Path and filename encoding
--------------------------
The path fields in the Manifest file must consist of characters
-corresponding to valid UTF-8 code points excluding the NULL character
-(``U+0000``), the backwards slash (``\``) and characters classified
-as whitespace in the current version of the Unicode standard
-[#UNICODE]_.
+corresponding to valid UTF-8 code points excluding the backwards slash
+(``\``) and characters classified as control characters and whitespace
+in the current version of the Unicode standard [#UNICODE]_.
Any of the excluded characters that are present in path must be encoded
using one of the following escape sequences:
@@ -164,8 +163,7 @@ slash used as path component separator should be replaced by forward
slash instead.
The encoding can be used for other characters as well. In particular,
-escaping control characters is recommended to ensure that the file
-works correctly in text editors.
+escaping non-printable characters might be desirable.
File verification
@@ -593,16 +591,18 @@ This specification aims to avoid arbitrary restrictions. For this
reason, filename characters are only restricted by excluding three
technically problematic groups:
-1. The NULL character (``U+0000``) is normally used to indicate the end
- of a null-terminated string. Its use could therefore break programs
- written using C. Furthermore, it is not allowed in any known
- filesystem.
-
-2. The backwards slash character (``\``) is used as path separator
+1. The backwards slash character (``\``) is used as path separator
on Windows systems, so it's extremely unlikely to be used in real
filenames. For this reason it is used to implement character
encoding with minimal risk of breaking backwards compatibility.
+2. The control characters can trigger special behavior in various
+ programs and confuse them from recognizing text files. In particular,
+ the NULL character (``U+0000``) is normally used to indicate the end
+ of a null-terminated string. Its use could therefore break
+ implementations written in the C language. Other control characters
+ could trigger various formatting routines, garbling text output.
+
3. Whitespace characters are used to separate Manifest fields
and entries. While technically it would be enough to restrict space
(``U+0020``) character that is normally used as the separator
* [gentoo-commits] data/glep:glep-manifest commit in: /
From: Michał Górny @ 2017-11-23 20:52 UTC
To: gentoo-commits
commit: 27c2a9e43e4c887ebecaf07538c045b97502b91e
Author: Michał Górny <mgorny <AT> gentoo <DOT> org>
AuthorDate: Thu Nov 23 20:51:33 2017 +0000
Commit: Michał Górny <mgorny <AT> gentoo <DOT> org>
CommitDate: Thu Nov 23 20:51:33 2017 +0000
URL: https://gitweb.gentoo.org/data/glep.git/commit/?id=27c2a9e4
glep-0074: Grammar corrections from Ulrich Müller
glep-0074.rst | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/glep-0074.rst b/glep-0074.rst
index 5270b7a..7791c1d 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -8,7 +8,7 @@ Type: Standards Track
Status: Draft
Version: 1
Created: 2017-10-21
-Last-Modified: 2017-11-16
+Last-Modified: 2017-11-23
Post-History: 2017-10-26, 2017-11-16
Content-Type: text/x-rst
Requires: 59, 61
@@ -139,17 +139,16 @@ Path and filename encoding
The path fields in the Manifest file must consist of characters
corresponding to valid UTF-8 code points excluding the backwards slash
-(``\``) and characters classified as control characters and whitespace
+(``\``) and characters classified as control characters or as whitespace
in the current version of the Unicode standard [#UNICODE]_.
The implementation can optionally support extended filename encoding
-to support those paths. If the encoding is not supported,
-the implementation must reject directories containing any files using
-non-compliant names, as well as Manifest files whose filename field
-contains such filenames.
+to support those paths. If encoding is not supported, the implementation
+must reject directories containing any files using non-compliant names,
+as well as Manifest files whose filename field contains such filenames.
-If the encoding is supported, then all of the excluded characters that
-are present in path must be encoded using one of the following escape
+If encoding is supported, then all of the excluded characters that
+are present in paths must be encoded using one of the following escape
sequences:
- characters in the ``U+0000`` to ``U+007F`` range can be encoded
@@ -164,9 +163,9 @@ sequences:
where ``HHHHHHHH`` specifies the zero-padded, hexadecimal character
code.
-It is invalid for backwards slash to be used in any other context,
-and a backwards slash present in filename must be encoded. Backwards
-slash used as path component separator should be replaced by forward
+It is invalid for the backwards slash to be used in any other context,
+and a backwards slash present in filename must be encoded. A backwards
+slash used as a path component separator should be replaced by a forward
slash instead.
The encoding can be used for other characters as well. In particular,
@@ -624,7 +623,7 @@ solve the problem only partially.
To preserve compatibility with the current implementations and given
that all of the listed characters are not allowed for the foreseeable
-Gentoo uses, the extended encoding support is optional. If such support
+Gentoo uses, extended encoding support is optional. If such support
is not provided, the implementation must unconditionally reject any
such files. Ignoring them implicitly would be confusing, and it is
not possible to use them in explicit ``IGNORE`` entries.
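Note on the patch above: for an implementation that does support the extended encoding, the writing side can be sketched as below. This is an illustration under assumptions, not the behaviour of any existing tool: the name `encode_path` is hypothetical, the shortest applicable escape form is chosen, hex digits are emitted in upper case, and the set of excluded characters is approximated by the backslash, Unicode general category `Cc`, and Python's `str.isspace()`.

```python
import unicodedata


def _must_escape(ch: str) -> bool:
    # Assumption: approximates "backslash, control characters or
    # whitespace" with category Cc and str.isspace().
    return ch == "\\" or unicodedata.category(ch) == "Cc" or ch.isspace()


def encode_path(path: str) -> str:
    """Escape excluded characters in `path` for use in a Manifest."""
    out = []
    for ch in path:
        if not _must_escape(ch):
            out.append(ch)
            continue
        code = ord(ch)
        if code <= 0x7F:
            out.append("\\x%02X" % code)   # \xHH, U+0000..U+007F only
        elif code <= 0xFFFF:
            out.append("\\u%04X" % code)   # \uHHHH
        else:
            out.append("\\U%08X" % code)   # \UHHHHHHHH
    return "".join(out)
```

With these choices a space becomes `\x20`, a literal backslash becomes `\x5C`, and a no-break space (`U+00A0`) becomes `\u00A0`; paths without excluded characters pass through unchanged.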