public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] [PATCH] glep-0075: Update for reference implementation
@ 2019-10-24 11:50 Michał Górny
  2019-10-24 20:39 ` Ulrich Mueller
  0 siblings, 1 reply; 4+ messages in thread
From: Michał Górny @ 2019-10-24 11:50 UTC
  To: gentoo-dev; +Cc: Michał Górny

Fill in the reference implementation section.  Reduce the requirements
for cutoffs to support only multiples of 4, as there is no point
in making the implementation more complex for something we aren't using
anyway.  Fix a typo.

Signed-off-by: Michał Górny <mgorny@gentoo.org>
---
 glep-0075.rst | 38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/glep-0075.rst b/glep-0075.rst
index 31553e7..4586463 100644
--- a/glep-0075.rst
+++ b/glep-0075.rst
@@ -7,8 +7,8 @@ Type: Standards Track
 Status: Draft
 Version: 1
 Created: 2018-01-26
-Last-Modified: 2018-12-01
-Post-History: 2018-01-27
+Last-Modified: 2019-10-24
+Post-History: 2018-01-27, 2019-10-24
 Content-Type: text/x-rst
 ---
 
@@ -100,11 +100,14 @@ and the policies for introducing new hashes are covered by GLEP 59
 The cutoffs list specifies one or more integers separated by colons
 (``:``), indicating the number of bits (starting with the most
 significant bit) of the hash used to form subsequent subdirectory names.
-For example, the list of ``2:4`` would indicate that top-level directory
-names are formed using 2 most significant bits of the hash (resulting
-in 2² = 4 directories), and each of this directories would have
-subdirectories formed using the next 4 bits of the hash (resulting
-in 2⁴ = 16 subdirectories each).
+For example, the list of ``4:8`` would indicate that top-level directory
+names are formed using 4 most significant bits of the hash (resulting
+in 2⁴ = 16 directories), and each of this directories would have
+subdirectories formed using the next 8 bits of the hash (resulting
+in 2⁸ = 256 subdirectories each).
+
+The implementations are only required to support cutoffs being multiples
+of 4.  Support for other values is optional.
 
 The exact algorithm for determining the distfile location follows:
 
@@ -296,6 +299,16 @@ relatively low complexity and being reasonably future-proof.
    (x — content checksum, + — filename checksum)
 
 
+Cutoff values
+-------------
+The original draft allowed any cutoff values.  This was changed since
+multiples of 4 are much easier to implement — they can be trivially cut
+from hexadecimal representation of the hash value.  This representation
+is commonly used by hash function implementations, including the Portage
+utility functions, pkgcore utility functions (snakeoil) and ``b2sum``
+utility from coreutils.
+
+
 Layout file
 -----------
 The presence of control file has been suggested in the original
@@ -363,7 +376,14 @@ to an appropriate subdirectory.
 
 Reference Implementation
 ========================
-TODO.
+The support for this specification has been implemented in Portage,
+as of version 2.3.77.  This includes both fetching distfiles,
+and maintaining mirrors via ``emirrordist``.  The implementation
+supports both listed layouts, with all hash functions supported
+by Portage and cutoffs being multiples of 4.
+
+As of 2019-10-18, the Gentoo Infrastructure team has successfully
+deployed the ``filename-hash BLAKE2B 8`` layout on Gentoo mirrors.
 
 
 References
@@ -389,7 +409,7 @@ References
    for each directory computed in a way to have the files distributed evenly'
    (https://archives.gentoo.org/gentoo-dev/message/611bdaa76be049c1d650e8995748e7b8)
 
-.. [#PKGNAME] Jason Zamal's reply including 'using the same dir layout
+.. [#PKGNAME] Jason Zaman's reply including 'using the same dir layout
    as the packages themselves)
    (https://archives.gentoo.org/gentoo-dev/message/f26ed870c3a6d4ecf69a821723642975)
 
-- 
2.23.0
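
The cutoff rule described in the patch above can be illustrated with
a minimal Python sketch.  It is not the Portage implementation: the
function name ``distfile_path``, the UTF-8 encoding of the filename and
the ``/`` separator are assumptions made for illustration, and the exact
algorithm is the one given in the GLEP text, which this hunk does not
reproduce.  The sketch only shows why multiples of 4 are convenient:
each 4 bits correspond to one hexadecimal digit of the hash.

```python
import hashlib


def distfile_path(filename: str, cutoffs: str = "8", algo: str = "blake2b") -> str:
    """Sketch of a filename-hash layout (hypothetical helper, not Portage API).

    cutoffs is a colon-separated list of bit counts, e.g. "4:8".
    Only positive multiples of 4 are accepted, matching the reduced
    requirement in the patch.
    """
    bits = [int(c) for c in cutoffs.split(":")]
    if any(b <= 0 or b % 4 for b in bits):
        raise ValueError("cutoffs must be positive multiples of 4")

    # Assumption: the filename is hashed as UTF-8 bytes.
    hexdigest = hashlib.new(algo, filename.encode("utf-8")).hexdigest()

    parts = []
    pos = 0
    for b in bits:
        ndigits = b // 4  # 4 bits per hexadecimal digit
        parts.append(hexdigest[pos:pos + ndigits])
        pos += ndigits
    parts.append(filename)
    return "/".join(parts)


if __name__ == "__main__":
    # One two-digit directory level, as in a layout with a single cutoff of 8.
    print(distfile_path("foo-1.0.tar.gz", "8"))
    # One-digit top level (16 directories), two-digit second level (256 each).
    print(distfile_path("foo-1.0.tar.gz", "4:8"))
```

With a single cutoff of 8 this yields one directory level named by two
hexadecimal digits (256 directories), which is the shape of the
``filename-hash BLAKE2B 8`` layout mentioned in the patch.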




* Re: [gentoo-dev] [PATCH] glep-0075: Update for reference implementation
  2019-10-24 11:50 [gentoo-dev] [PATCH] glep-0075: Update for reference implementation Michał Górny
@ 2019-10-24 20:39 ` Ulrich Mueller
  2019-10-24 21:00   ` Hanno Böck
  2019-10-25  6:26   ` Michał Górny
  0 siblings, 2 replies; 4+ messages in thread
From: Ulrich Mueller @ 2019-10-24 20:39 UTC
  To: Michał Górny; +Cc: gentoo-dev


>>>>> On Thu, 24 Oct 2019, Michał Górny wrote:

> +in 2⁴ = 16 directories), and each of this directories would have

s/this/these/ (This was there before, but can be corrected while at it.)

> +The implementations are only required to support cutoffs being multiples

s/The implementations/Implementations/

> +and maintaining mirrors via ``emirrordist``.  The implementation
> +supports both listed layouts, with all hash functions supported
> +by Portage and cutoffs being multiples of 4.

In the rationale section, one reason given for the choice of the hash
algorithm (BLAKE2B) was to "avoid code duplication". Isn't that argument
moot, if all hashes supported by Portage are implemented? (Or in other
words, couldn't a faster hash function like MD5 be used?)

Ulrich


* Re: [gentoo-dev] [PATCH] glep-0075: Update for reference implementation
  2019-10-24 20:39 ` Ulrich Mueller
@ 2019-10-24 21:00   ` Hanno Böck
  2019-10-25  6:26   ` Michał Górny
  1 sibling, 0 replies; 4+ messages in thread
From: Hanno Böck @ 2019-10-24 21:00 UTC
  To: gentoo-dev


On Thu, 24 Oct 2019 22:39:06 +0200
Ulrich Mueller <ulm@gentoo.org> wrote:

> In the rationale section, one reason given for the choice of the hash
> algorithm (BLAKE2B) was to "avoid code duplication". Isn't that
> argument moot, if all hashes supported by Portage are implemented?
> (Or in other words, couldn't a faster hash function like MD5 be used?)

FWIW blake2b is faster than md5. That was one of the design goals [1].


[1] https://blake2.net/
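
The speed claim above can be checked locally with a rough Python sketch
like the one below.  The helper name ``time_hash`` and the buffer size
are assumptions chosen for illustration; results depend on the CPU, the
Python build and the underlying crypto library, so this is a quick
sanity check and not a rigorous benchmark.

```python
import hashlib
import timeit


def time_hash(algo: str, data: bytes, repeat: int = 5) -> float:
    """Return the best-of-`repeat` time in seconds to hash `data` once.

    Hypothetical helper for illustration only.
    """
    timings = timeit.repeat(
        lambda: hashlib.new(algo, data).digest(),
        number=1,
        repeat=repeat,
    )
    return min(timings)


if __name__ == "__main__":
    payload = b"\0" * (1024 * 1024)  # 1 MiB of input; size is arbitrary
    for name in ("md5", "blake2b"):
        print(f"{name}: {time_hash(name, payload):.6f} s")
```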

-- 
Hanno Böck
https://hboeck.de/

mail/jabber: hanno@hboeck.de
GPG: FE73757FA60E4E21B937579FA5880072BBB51E42


* Re: [gentoo-dev] [PATCH] glep-0075: Update for reference implementation
  2019-10-24 20:39 ` Ulrich Mueller
  2019-10-24 21:00   ` Hanno Böck
@ 2019-10-25  6:26   ` Michał Górny
  1 sibling, 0 replies; 4+ messages in thread
From: Michał Górny @ 2019-10-25  6:26 UTC
  To: gentoo-dev


On Thu, 2019-10-24 at 22:39 +0200, Ulrich Mueller wrote:
> > > > > > On Thu, 24 Oct 2019, Michał Górny wrote:
> > +in 2⁴ = 16 directories), and each of this directories would have
> 
> s/this/these/ (This was there before, but can be corrected while at it.)
> 
> > +The implementations are only required to support cutoffs being multiples
> 
> s/The implementations/Implementations/

Both fixed in place.  Since they're grammar fixes, I suppose there's
no need to send v2 over it.

> 
> > +and maintaining mirrors via ``emirrordist``.  The implementation
> > +supports both listed layouts, with all hash functions supported
> > +by Portage and cutoffs being multiples of 4.
> 
> In the rationale section, one reason given for the choice of the hash
> algorithm (BLAKE2B) was to "avoid code duplication". Isn't that argument
> moot, if all hashes supported by Portage are implemented? (Or in other
> words, couldn't a faster hash function like MD5 be used?)

That's very Portage-centric thinking.  Technically, today's PM only
needs to implement SHA512 and BLAKE2B.  The former is legacy,
so in the future we will probably throw it away and either leave BLAKE2B
only, or add another new hash.  In either case, BLAKE2B is the most
future-proof choice today.

-- 
Best regards,
Michał Górny

