public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] proposal: use only one hash function in manifest files
@ 2022-04-04 23:41 Jason A. Donenfeld
  2022-04-05  1:48 ` John Helmert III
                   ` (3 more replies)
  0 siblings, 4 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-04 23:41 UTC
  To: gentoo-dev

Hi,

I'd like to propose the following for portage:

- Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
- Only generate and parse one hash function in Manifest files
- Remove support for multiple hash functions

In other words, what are we actually getting by having _both_ SHA2-512
and BLAKE2b for every file in every Manifest? It's not about file
integrity, since certainly a single hash handles that use case fine.
And it's not about security either, since for that we use gpg
signatures, and gpg signatures are carried out over a _single_ hash of
the plain text being hashed, so the security of the system reduces to
breaking SHA2-512 anyway. So, if it's not about file integrity and
it's not about security, what is it about?

I don't really care which one we use, so long as it's not already
broken or too obscure/new. So in other words, any one of SHA2-256,
SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
pick one and roll with it?

Jason

PS: there _is_ a good reason for recording the file size in Manifest
files as we do now: it's quicker to compare sizes on large files than
it is to read and hash the whole thing, so this gives us a "free" way
of quickly noticing corruption.
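
A minimal sketch of the size-before-hash idea, in Python. The function
name quick_check and its parameters are hypothetical and this is not
Portage's actual code; it only illustrates that a size mismatch can be
detected with a stat() call before any file contents are read.

```python
import hashlib
import os


def quick_check(path, expected_size, expected_sha512):
    """Return True only if the file matches both size and SHA-512."""
    # Cheap test first: a stat() call, no file contents are read.
    if os.path.getsize(path) != expected_size:
        return False
    # Expensive test only when the size already matches.
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha512
```

A truncated download is rejected by the first comparison, so the cost
of hashing is only paid for files that already have the right length.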


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-04 23:41 [gentoo-dev] proposal: use only one hash function in manifest files Jason A. Donenfeld
@ 2022-04-05  1:48 ` John Helmert III
  2022-04-05 13:37 ` [gentoo-dev] " Jason A. Donenfeld
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 35+ messages in thread
From: John Helmert III @ 2022-04-05  1:48 UTC
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1570 bytes --]

I don't really have any strong opinion, but I'll note this was
discussed here last year, too:

https://archives.gentoo.org/gentoo-dev/message/a51ef62765b577dccfde67d5d2d727ae

On Tue, Apr 05, 2022 at 01:41:50AM +0200, Jason A. Donenfeld wrote:
> Hi,
> 
> I'd like to propose the following for portage:
> 
> - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
> - Only generate and parse one hash function in Manifest files
> - Remove support for multiple hash functions
> 
> In other words, what are we actually getting by having _both_ SHA2-512
> and BLAKE2b for every file in every Manifest? It's not about file
> integrity, since certainly a single hash handles that use case fine.
> And it's not about security either, since for that we use gpg
> signatures, and gpg signatures are carried out over a _single_ hash of
> the plain text being hashed, so the security of the system reduces to
> breaking SHA2-512 anyway. So, if it's not about file integrity and
> it's not about security, what is it about?
> 
> I don't really care which one we use, so long as it's not already
> broken or too obscure/new. So in other words, any one of SHA2-256,
> SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
> pick one and roll with it?
> 
> Jason
> 
> PS: there _is_ a good reason for recording the file size in Manifest
> files as we do now: it's quicker to compare sizes on large files than
> it is to read and hash the whole thing, so this gives us a "free" way
> of noticing quick corruption.
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* [gentoo-dev] Re: proposal: use only one hash function in manifest files
  2022-04-04 23:41 [gentoo-dev] proposal: use only one hash function in manifest files Jason A. Donenfeld
  2022-04-05  1:48 ` John Helmert III
@ 2022-04-05 13:37 ` Jason A. Donenfeld
  2022-04-05 14:10   ` Ulrich Mueller
  2022-04-05 14:49 ` [gentoo-dev] " Michał Górny
  2022-04-05 21:13 ` Jonas Stein
  3 siblings, 1 reply; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-05 13:37 UTC
  To: gentoo-dev

To move things forward with something more concrete:

On 4/5/22, Jason A. Donenfeld <zx2c4@gentoo.org> wrote:
> Hi,
>
> I'd like to propose the following for portage:
>
> - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
> - Only generate and parse one hash function in Manifest files
> - Remove support for multiple hash functions
>
> [...]
> I don't really care which one we use, so long as it's not already
> broken or too obscure/new. So in other words, any one of SHA2-256,
> SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
> pick one and roll with it?

As you might have realized from my work on other projects, I like
BLAKE2 a lot. However, I think there are two strong reasons for going
with SHA512 exclusively here:

- GPG signatures are already over the SHA512 of the plain text, so
the security of the system already reduces to that. By choosing
SHA512, we don't add more risk, whilst choosing something else means
we're in trouble if either one has a problem.
- Other package managers use SHA512 in their recipes, so it makes it
easier to compare tarball checksums.

The principal advantage of BLAKE2b is 64-bit speed, but SHA512
performs okay enough in that regard anyway.

Therefore, to amend my proposal:

- Use SHA512 as the Manifest hash.

Any objections?

Jason



* Re: [gentoo-dev] Re: proposal: use only one hash function in manifest files
  2022-04-05 13:37 ` [gentoo-dev] " Jason A. Donenfeld
@ 2022-04-05 14:10   ` Ulrich Mueller
  2022-04-05 15:18     ` Jason A. Donenfeld
  0 siblings, 1 reply; 35+ messages in thread
From: Ulrich Mueller @ 2022-04-05 14:10 UTC
  To: Jason A. Donenfeld; +Cc: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 894 bytes --]

>>>>> On Tue, 05 Apr 2022, Jason A Donenfeld wrote:

> - GPG signatures are already over the SHA512 of the plain text, so
> the security of the system already reduces to that. By choosing
> SHA512, we don't add more risk, whilst choosing something else means
> we're in trouble if either one has a problem.

The OpenPGP signature is for the top-level Manifest only. In case there
was any trouble, it would be trivial to change the hash algorithm used
for this.

In contrast to that, updating the hashes in all Manifest files is a
huge pain in the neck. Basically, you must download all distfiles, which
is not trivial. For example, think of fetch-restricted files. (I've
helped twice with updating Manifest files, so I believe I know what I'm
talking about. :)

I think that the benefit of dropping one of the hashes would be close to
zero, especially if we would drop the faster one.

Ulrich

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 507 bytes --]


* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-04 23:41 [gentoo-dev] proposal: use only one hash function in manifest files Jason A. Donenfeld
  2022-04-05  1:48 ` John Helmert III
  2022-04-05 13:37 ` [gentoo-dev] " Jason A. Donenfeld
@ 2022-04-05 14:49 ` Michał Górny
  2022-04-05 21:13 ` Jonas Stein
  3 siblings, 0 replies; 35+ messages in thread
From: Michał Górny @ 2022-04-05 14:49 UTC
  To: gentoo-dev

On Tue, 2022-04-05 at 01:41 +0200, Jason A. Donenfeld wrote:
> Hi,
> 
> I'd like to propose the following for portage:
> 
> - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
> - Only generate and parse one hash function in Manifest files
> - Remove support for multiple hash functions
> 
> In other words, what are we actually getting by having _both_ SHA2-512
> and BLAKE2b for every file in every Manifest? It's not about file
> integrity, since certainly a single hash handles that use case fine.
> And it's not about security either, since for that we use gpg
> signatures, and gpg signatures are carried out over a _single_ hash of
> the plain text being hashed, so the security of the system reduces to
> breaking SHA2-512 anyway. So, if it's not about file integrity and
> it's not about security, what is it about?

If you mean "remove entirely", then that's a bad idea.  While
the original reasons for multiple hash functions might have been, erm,
not exactly correct, the dual-hash situation is needed for transitional
periods.  Particularly because we have a number of fetch-restricted
packages where we simply need to wait for someone with the distfile to
rehash them (or eventually remove them, if we can't get a new hash).
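
To illustrate the transitional case, here is a hedged Python sketch of
a verifier that accepts a DIST entry carrying one or both hashes. The
names SUPPORTED, parse_dist_line and verify are hypothetical, and this
is not how Portage implements it; it assumes the usual
"DIST <name> <size> <HASH> <hex> ..." line layout.

```python
import hashlib

# Map Manifest hash names to hashlib constructors (assumed subset).
SUPPORTED = {"BLAKE2B": hashlib.blake2b, "SHA512": hashlib.sha512}


def parse_dist_line(line):
    """Split 'DIST <name> <size> <HASH> <hex> ...' into its parts."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "DIST" or len(fields) % 2 == 0:
        raise ValueError("malformed DIST line")
    name, size = fields[1], int(fields[2])
    hashes = dict(zip(fields[3::2], fields[4::2]))
    return name, size, hashes


def verify(data, size, hashes):
    """Check data against every recorded hash this code understands."""
    known = {k: v for k, v in hashes.items() if k in SUPPORTED}
    if len(data) != size or not known:
        return False
    return all(SUPPORTED[k](data).hexdigest() == v for k, v in known.items())
```

With a parser like this, an entry that still records only the old hash
keeps verifying while other entries already carry the new one, which
is the situation described above for fetch-restricted distfiles.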

> I don't really care which one we use, so long as it's not already
> broken or too obscure/new. So in other words, any one of SHA2-256,
> SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
> pick one and roll with it?

Back when we added BLAKE2b, the idea was to eventually remove SHA512
(the previous hash).  However, this was rejected afterwards.

> PS: there _is_ a good reason for recording the file size in Manifest
> files as we do now: it's quicker to compare sizes on large files than
> it is to read and hash the whole thing, so this gives us a "free" way
> of noticing quick corruption.

The primary use of knowing the file size is to know whether to try to
resume fetching.
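
A small Python sketch of that decision, assuming only the recorded
size and the size of whatever is already on disk are known. The
function fetch_action and its return strings are hypothetical and not
taken from Portage.

```python
def fetch_action(local_size, expected_size):
    """Decide what to do with a partially or fully downloaded distfile.

    local_size is None when nothing has been downloaded yet.
    """
    if local_size is None:
        return "fetch"    # nothing on disk, start from scratch
    if local_size < expected_size:
        return "resume"   # partial download, continue where it stopped
    if local_size == expected_size:
        return "verify"   # complete by size, now check the hash
    return "refetch"      # larger than recorded, cannot be the right file
```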

-- 
Best regards,
Michał Górny




* Re: [gentoo-dev] Re: proposal: use only one hash function in manifest files
  2022-04-05 14:10   ` Ulrich Mueller
@ 2022-04-05 15:18     ` Jason A. Donenfeld
  0 siblings, 0 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-05 15:18 UTC
  To: Ulrich Mueller; +Cc: gentoo-dev

Hi Ulrich,

On Tue, Apr 5, 2022 at 4:10 PM Ulrich Mueller <ulm@gentoo.org> wrote:
> The OpenPGP signature is for the top-level Manifest only. In case there
> was any trouble, it would be trivial to change the hash algorithm used
> for this.
>
> In contrast to that, updating the hashes in all Manifest files is a
> huge pain in the neck. Basically, you must download all distfiles, which
> is not trivial. For example, think of fetch-restricted files. (I've
> helped twice with updating Manifest files, so I believe I know what I'm
> talking about. :)

The thing is, if SHA-512 is broken, that will really be the least of
our concerns. TLS itself will be broken....

Jason



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
       [not found] <14775bf9818049998577ba4310f1bc6b1a83db16.camel@gentoo.org>
@ 2022-04-05 16:25 ` Jason A. Donenfeld
  2022-04-05 16:25   ` Jason A. Donenfeld
  2022-04-05 18:57   ` Matt Turner
  0 siblings, 2 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-05 16:25 UTC
  To: Michał Górny; +Cc: gentoo-dev

Hi Michal,

On Tue, Apr 05, 2022 at 02:49:12PM +0000, Michał Górny wrote:
> > I don't really care which one we use, so long as it's not already
> > broken or too obscure/new. So in other words, any one of SHA2-256,
> > SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
> > pick one and roll with it?
> 
> Back when we added BLAKE2b, the idea was to eventually remove SHA512
> (the previous hash).  However, this was rejected afterwards.

Maybe we should pick that back up? Do you remember the ultimate
rationale for rejecting it? Do you suppose those are still valid?

Jason



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-05 16:25 ` Jason A. Donenfeld
  2022-04-05 16:25   ` Jason A. Donenfeld
@ 2022-04-05 18:57   ` Matt Turner
  2022-04-05 19:30     ` Jason A. Donenfeld
       [not found]     ` <0DBAAAB5-87A1-4A40-94A6-651E8FDCD264@gentoo.org>
  1 sibling, 2 replies; 35+ messages in thread
From: Matt Turner @ 2022-04-05 18:57 UTC
  To: gentoo development; +Cc: Michał Górny

On Tue, Apr 5, 2022 at 11:47 AM Jason A. Donenfeld <zx2c4@gentoo.org> wrote:
>
> Hi Michal,
>
> On Tue, Apr 05, 2022 at 02:49:12PM +0000, Michał Górny wrote:
> > > I don't really care which one we use, so long as it's not already
> > > broken or too obscure/new. So in other words, any one of SHA2-256,
> > > SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
> > > pick one and roll with it?
> >
> > Back when we added BLAKE2b, the idea was to eventually remove SHA512
> > (the previous hash).  However, this was rejected afterwards.
>
> Maybe we should pick that back up? Do you remember the ultimate
> rationale for rejecting it? Do you suppose those are still valid?

(Somehow you broke threading)

This was a topic in June 2021's Council meeting:

https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613-summary.txt#n33
https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613.txt#n137

Basically there was no great reason presented for making the change
and some (IMO specious) reasons for keeping multiple hashes. I don't
think anyone felt strongly enough about removing one hash to fight for
it.



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-05 18:57   ` Matt Turner
@ 2022-04-05 19:30     ` Jason A. Donenfeld
  2022-04-05 20:14       ` Ulrich Mueller
  2022-04-05 20:37       ` Matt Turner
       [not found]     ` <0DBAAAB5-87A1-4A40-94A6-651E8FDCD264@gentoo.org>
  1 sibling, 2 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-05 19:30 UTC
  To: gentoo-dev; +Cc: Matt Turner

Hi Matt,

On Tue, Apr 5, 2022 at 8:58 PM Matt Turner <mattst88@gentoo.org> wrote:
> This was a topic in June 2021's Council meeting:
>
> https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613-summary.txt#n33
> https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613.txt#n137
>
> Basically there was no great reason presented for making the change
> and some (IMO specious) reasons for keeping multiple hashes. I don't
> think anyone felt strongly enough about removing one hash to fight for
> it.

Huh. Something not brought up there or https://bugs.gentoo.org/784710
is the fact that the _security_ of the system reduces to SHA-512 as
used by our GPG signatures.

By the way, we're not currently _checking_ two hash functions during
src_prepare(), are we?

Jason



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-05 19:30     ` Jason A. Donenfeld
@ 2022-04-05 20:14       ` Ulrich Mueller
  2022-04-05 21:35         ` Jason A. Donenfeld
  2022-04-05 20:37       ` Matt Turner
  1 sibling, 1 reply; 35+ messages in thread
From: Ulrich Mueller @ 2022-04-05 20:14 UTC
  To: Jason A. Donenfeld; +Cc: gentoo-dev, Matt Turner

[-- Attachment #1: Type: text/plain, Size: 583 bytes --]

>>>>> On Tue, 05 Apr 2022, Jason A Donenfeld wrote:

> Huh. Something not brought up there or https://bugs.gentoo.org/784710
> is the fact that the _security_ of the system reduces to SHA-512 as
> used by our GPG signatures.

The hash algorithm would be the least of my concerns about the security
of these signatures.

IIUC, the secret signing key is stored on a machine that is connected to
the network (Infra, please correct me if I'm wrong). So there are other
more likely attack vectors than a preimage attack on a 512 bit hash
function.

Also: https://xkcd.com/538/ :)

Ulrich

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 507 bytes --]


* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-05 19:30     ` Jason A. Donenfeld
  2022-04-05 20:14       ` Ulrich Mueller
@ 2022-04-05 20:37       ` Matt Turner
  2022-04-05 21:49         ` Jason A. Donenfeld
  1 sibling, 1 reply; 35+ messages in thread
From: Matt Turner @ 2022-04-05 20:37 UTC
  To: Jason A. Donenfeld; +Cc: gentoo development

On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <zx2c4@gentoo.org> wrote:
> By the way, we're not currently _checking_ two hash functions during
> src_prepare(), are we?

I don't know, but hash-checking is definitely done before src_prepare().



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-04 23:41 [gentoo-dev] proposal: use only one hash function in manifest files Jason A. Donenfeld
                   ` (2 preceding siblings ...)
  2022-04-05 14:49 ` [gentoo-dev] " Michał Górny
@ 2022-04-05 21:13 ` Jonas Stein
  2022-04-05 21:38   ` Jason A. Donenfeld
  2022-04-06  0:05   ` Sam James
  3 siblings, 2 replies; 35+ messages in thread
From: Jonas Stein @ 2022-04-05 21:13 UTC
  To: gentoo-dev

Hi

> I'd like to propose the following for portage:
> 
> - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
> - Only generate and parse one hash function in Manifest files
> - Remove support for multiple hash functions

No, this has no benefit.

> In other words, what are we actually getting by having _both_ SHA2-512
> and BLAKE2b for every file in every Manifest? 

Implementations are often broken and we have to expect zero day attacks 
on hashes and on signatures. Hence it does not hurt to have a second hash.

It is very likely that we can not trust in X for a while in the next 
years, but it is very unlikely that two different implementations are 
affected.

Additionally calculating a second hash does not cost anything.
This was also the outcome of the discussion some time ago here.

-- 
Best,
Jonas



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-05 20:14       ` Ulrich Mueller
@ 2022-04-05 21:35         ` Jason A. Donenfeld
  0 siblings, 0 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-05 21:35 UTC
  To: Ulrich Mueller; +Cc: gentoo-dev, Matt Turner

Hi Ulrich,

On Tue, Apr 5, 2022 at 10:15 PM Ulrich Mueller <ulm@gentoo.org> wrote:
>
> >>>>> On Tue, 05 Apr 2022, Jason A Donenfeld wrote:
>
> > Huh. Something not brought up there or https://bugs.gentoo.org/784710
> > is the fact that the _security_ of the system reduces to SHA-512 as
> > used by our GPG signatures.
>
> The hash algorithm would be the least of my concerns about the security
> of these signatures.
>
> IIUC, the secret signing key is stored on a machine that is connected to
> the network (Infra, please correct me if I'm wrong). So there are other
> more likely attack vectors than a preimage attack on a 512 bit hash
> function.

You missed the point, which is that having two hashes, SHA512 and
BLAKE2b, doesn't actually help anything, since an attacker only must
attack SHA512 in order to break the signature system, which is
actually what we're relying on for security. Yes there are other
attacks too on the signature system. But in terms of hashing, my point
is that adding an additional hash to manifest files to the one used by
the signature doesn't help anything from a security perspective, since
if you have an attack on the signature's hash, then no additional
hashing is going to actually help.

Jason



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-05 21:13 ` Jonas Stein
@ 2022-04-05 21:38   ` Jason A. Donenfeld
  2022-04-06  0:05   ` Sam James
  1 sibling, 0 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-05 21:38 UTC
  To: gentoo-dev

Hi Jonas,

On Tue, Apr 5, 2022 at 11:20 PM Jonas Stein <jstein@gentoo.org> wrote:
> > In other words, what are we actually getting by having _both_ SHA2-512
> > and BLAKE2b for every file in every Manifest?
>
> Implementations are often broken and we have to expect zero day attacks
> on hashes and on signatures. Hence it does not hurt to have a second hash.
>
> It is very likely that we can not trust in X for a while in the next
> years, but it is very unlikely that two different implementations are
> affected.

This is the part that doesn't really make any sense to me. The
security of the system reduces to the SHA512 used by those GPG
signatures. If SHA512 breaks, the fact that our Manifest files also
use BLAKE2b isn't going to help us, since an attacker could
presumably, in that case, forge the signatures that we're using as a
root of trust. I don't see what a second hash buys us from a security
perspective here. What attack model do you have where it makes sense?

> Additionally calculating a second hash does not cost anything.

How is that possible? Doesn't calculating two things always cost more
than calculating one? If what you actually mean is, "performance is
not important," we can discuss that, but it sounds like you're saying
that there's zero performance impact. How does that work exactly? Is
only one calculated at emerge time or something clever like that?
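
One way to put numbers on this question is a rough micro-benchmark.
The Python sketch below is an assumption-laden illustration, not a
measurement of Portage: time_hashes is a hypothetical helper, the
buffer lives in memory so disk I/O is excluded, and results will vary
by CPU and by the hashlib build.

```python
import hashlib
import time


def time_hashes(data, names):
    """Return seconds spent hashing data with each named algorithm in turn."""
    start = time.perf_counter()
    for name in names:
        hashlib.new(name, data).digest()
    return time.perf_counter() - start


if __name__ == "__main__":
    buf = bytes(16 * 1024 * 1024)  # 16 MiB of zero bytes
    print("sha512 only:     ", time_hashes(buf, ["sha512"]))
    print("blake2b only:    ", time_hashes(buf, ["blake2b"]))
    print("sha512 + blake2b:", time_hashes(buf, ["sha512", "blake2b"]))
```

Comparing the last figure with the first two shows how much CPU time
the second hash adds on a given machine.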

Jason



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-05 20:37       ` Matt Turner
@ 2022-04-05 21:49         ` Jason A. Donenfeld
  2022-04-11 23:14           ` Joshua Kinard
  0 siblings, 1 reply; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-05 21:49 UTC
  To: Matt Turner; +Cc: gentoo development

Hi Matt,

On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <mattst88@gentoo.org> wrote:
>
> On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <zx2c4@gentoo.org> wrote:
> > By the way, we're not currently _checking_ two hash functions during
> > src_prepare(), are we?
>
> I don't know, but hash-checking is definitely done before src_prepare().

Er, during the builtin fetch phase. Anyway, you know what I meant. :)

Anyway, looking at the portage source code, to answer my own question,
it looks like the file is actually being read twice and both hashes
computed. I would have at least expected an optimization like:

hash1_init(&hash1);
hash2_init(&hash2);
for chunk in file:
    hash1_update(&hash1, chunk);
    hash2_update(&hash2, chunk);
hash1_final(&hash1, out1);
hash2_final(&hash2, out2);

But actually what's happening is the even less efficient:

hash1_init(&hash1);
for chunk in file:
    hash1_update(&hash1, chunk);
hash1_final(&hash1, out1);
hash2_init(&hash2);
for chunk in file:
    hash2_update(&hash2, chunk);
hash2_final(&hash2, out2);

So the file winds up being open and read twice. For huge tarballs like
chromium or libreoffice...

But either way you do it - the missed optimization above or the
unoptimized reality below - there's still twice as much work being
done. This is all unless I've misread the source code, which is
possible, so if somebody knows this code well and I'm wrong here,
please do speak up.
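
For reference, the single-pass variant from the first pseudocode
listing could look roughly like this in Python. This is a sketch, not
a patch against Portage; the function name hash_once and its
parameters are hypothetical.

```python
import hashlib


def hash_once(path, names=("blake2b", "sha512"), chunk_size=1 << 20):
    """Compute several digests while reading the file a single time."""
    digests = {name: hashlib.new(name) for name in names}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            # Feed the same chunk to every hash before reading the next one.
            for d in digests.values():
                d.update(chunk)
    return {name: d.hexdigest() for name, d in digests.items()}
```

This saves the second open and read of the file, but each chunk is
still processed by both algorithms, so the CPU work remains the sum of
the two hashes.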

Jason



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-05 21:13 ` Jonas Stein
  2022-04-05 21:38   ` Jason A. Donenfeld
@ 2022-04-06  0:05   ` Sam James
  2022-04-06  1:33     ` Rich Freeman
  1 sibling, 1 reply; 35+ messages in thread
From: Sam James @ 2022-04-06  0:05 UTC
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1678 bytes --]



> On 5 Apr 2022, at 22:13, Jonas Stein <jstein@gentoo.org> wrote:
> 
> Hi
> 
>> I'd like to propose the following for portage:
>> - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
>> - Only generate and parse one hash function in Manifest files
>> - Remove support for multiple hash functions
> 
> No, this has no benefit.

Which part has no benefit? I could see a case (although I don't think it's a super strong one)
for keeping support for multiple hash types in Portage, but only 1 in a Manifest.

I think Jason's made a fair case for dropping it.

> 
>> In other words, what are we actually getting by having _both_ SHA2-512
>> and BLAKE2b for every file in every Manifest?
> 
> Implementations are often broken and we have to expect zero day attacks on hashes and on signatures. Hence it does not hurt to have a second hash.

I don't think this is the case. They're not broken often, it's a very very big deal when they are, and we'd also have far bigger problems in such a case (as already pointed out, TLS would be an issue, but also GPG signatures, git commit hashes, ...).

> 
> It is very likely that we can not trust in X for a while in the next years, but it is very unlikely that two different implementations are affected.
> 

I don't think it is likely that e.g. SHA512 will be broken in the next few years, no, but if it is going to be, we have far bigger issues and we'd need to have double algorithms in our whole stack, which we don't have.

> Additionally calculating a second hash does not cost anything.

It does have a cost at both Manifest-generation time and emerge-time.

Thanks,
sam


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 618 bytes --]


* Re: [gentoo-dev] proposal: use only one hash function in manifest files
       [not found]     ` <0DBAAAB5-87A1-4A40-94A6-651E8FDCD264@gentoo.org>
@ 2022-04-06  0:15       ` Jason A. Donenfeld
  2022-04-06  0:25         ` Sam James
                           ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-06  0:15 UTC
  To: Sam James; +Cc: gentoo development, Michał Górny, Matt Turner

Hi Sam,

On Wed, Apr 6, 2022 at 2:02 AM Sam James <sam@gentoo.org> wrote:
> This matches my views and recollection. We could revisit it
> if there was a passionate advocate (which it looks like there may well be).
>
> While I wasn't against it before, I was sort of ambivalent given
> we had no strong reason to, but I'm more willing now given
> we're also cleaning out other Portage cruft at the same time.

I think actually the argument I'm making this time might be subtly
different from the motions that folks went through last year.
Specifically, the idea last year was to switch to using BLAKE2b only.
I think what the arguments I'm making now point to is switching to
SHA2-512 only.

There are two reasons for this.

1) Security: since the GPG signatures use SHA2-512, then the whole
system breaks if SHA2-512 breaks. If we choose BLAKE2b as our only
hash, then if either SHA2-512 or BLAKE2b break, then the system
breaks. But if we choose SHA2-512 as our only hash, then we only need
to worry about SHA2-512 breaking.

2) Comparability: other distros use SHA2-512, as well as various
upstreams, which means we can compare our hashes to theirs easily.
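
The security argument in 1) can be written down as a toy model. The
Python sketch below is a deliberate simplification under stated
assumptions: the system counts as broken if the signature's hash is
defeated, or if every hash recorded for a file in the Manifest is
defeated. The function system_broken and the algorithm labels are
hypothetical, and the model ignores every other attack path.

```python
def system_broken(broken, signature_hash, manifest_hashes):
    """Toy model: forging the signature suffices; otherwise every
    Manifest hash recorded for a file must be defeated to swap it."""
    if signature_hash in broken:
        return True
    return all(h in broken for h in manifest_hashes)


# Signature over SHA512 in every scenario; only BLAKE2B is defeated.
broken = {"BLAKE2B"}
print(system_broken(broken, "SHA512", ["BLAKE2B"]))            # True
print(system_broken(broken, "SHA512", ["SHA512"]))             # False
print(system_broken(broken, "SHA512", ["SHA512", "BLAKE2B"]))  # False
```

In this model the SHA512-only Manifest and the dual-hash Manifest fail
under exactly the same condition, namely SHA512 itself being defeated,
while a BLAKE2B-only Manifest fails if either algorithm is.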

A reason why some people might prefer BLAKE2b over SHA2-512 is a
performance improvement. However, seeing as right now we're opening
the file, reading it, computing BLAKE2b, closing the file, opening the
file again, reading it again, computing SHA2-512, closing the file, I
don't think performance is actually something people care about. Seen
differently, removing either one of them will already give us a
performance "boost" of sorts.

Jason



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06  0:15       ` Jason A. Donenfeld
@ 2022-04-06  0:25         ` Sam James
  2022-04-06  4:13         ` Ulrich Mueller
  2022-04-06 17:23         ` Robin H. Johnson
  2 siblings, 0 replies; 35+ messages in thread
From: Sam James @ 2022-04-06  0:25 UTC
  To: gentoo-dev; +Cc: Michał Górny, Matt Turner, Robin H. Johnson

[-- Attachment #1: Type: text/plain, Size: 2455 bytes --]



> On 6 Apr 2022, at 01:15, Jason A. Donenfeld <zx2c4@gentoo.org> wrote:
> 
> Hi Sam,
> 
> On Wed, Apr 6, 2022 at 2:02 AM Sam James <sam@gentoo.org> wrote:
>> This matches my views and recollection. We could revisit it
>> if there was a passionate advocate (which it looks like there may well be).
>> 
>> While I wasn't against it before, I was sort of ambivalent given
>> we had no strong reason to, but I'm more willing now given
>> we're also cleaning out other Portage cruft at the same time.
> 
> I think actually the argument I'm making this time might be subtly
> different from the motions that folks went through last year.
> Specifically, the idea last year was to switch to using BLAKE2b only.
> I think what the arguments I'm making now point to is switching to
> SHA2-512 only.

Oh, right. I see!

(Aside: I should've been clearer in my first email, what I meant was: I'm
fine with revisiting this, but I remember us feeling kind of lacklustre because
even the proposer (mgorny) ended up not having the oomph to push it through
given (small) opposition. I don't recall who had the stiff opposition at the time,
but I do recall it was only small, but nobody really felt like it was worth the hassle.

The overall Council feeling was "meh" without some momentum.)


> There are two reasons for this.
> 
> 1) Security: since the GPG signatures use SHA2-512, then the whole
> system breaks if SHA2-512 breaks. If we choose BLAKE2b as our only
> hash, then if either SHA2-512 or BLAKE2b break, then the system
> breaks. But if we choose SHA2-512 as our only hash, then we only need
> to worry about SHA2-512 breaking.
> 
> 2) Comparability: other distros use SHA2-512, as well as various
> upstreams, which means we can compare our hashes to theirs easily.
> 
> A reason why some people might prefer BLAKE2b over SHA2-512 is a
> performance improvement. However, seeing as right now we're opening
> the file, reading it, computing BLAKE2b, closing the file, opening the
> file again, reading it again, computing SHA2-512, closing the file, I
> don't think performance is actually something people care about. Seen
> differently, removing either one of them will already give us a
> performance "boost" of sorts.
> 

I think this seems pretty reasonable and I don't have any objection to it.

2) is a nice point and it's something Robin raised last time around too.

> Jason

best,
sam


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 618 bytes --]


* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06  0:05   ` Sam James
@ 2022-04-06  1:33     ` Rich Freeman
  2022-04-06 17:29       ` Jason A. Donenfeld
  0 siblings, 1 reply; 35+ messages in thread
From: Rich Freeman @ 2022-04-06  1:33 UTC (permalink / raw
  To: gentoo-dev

On Tue, Apr 5, 2022 at 8:05 PM Sam James <sam@gentoo.org> wrote:
> > On 5 Apr 2022, at 22:13, Jonas Stein <jstein@gentoo.org> wrote:
> >
> >> In other words, what are we actually getting by having _both_ SHA2-512
> >> and BLAKE2b for every file in every Manifest?
> >
> > Implementations are often broken and we have to expect zero day attacks on hashes and on signatures. Hence it does not hurt to have a second hash.
>
> I don't think this is the case. They're not broken often, it's a very very big deal when they do, and we'd also have far bigger problems in such a case (as already pointed out, TLS would be an issue, but also GPG signatures, git commit hashes, ...).

Our security fails currently if EITHER SHA2-512 or a hardened version
of SHA-1 is defeated.  Our top gpg signature is bound to a git commit
record by SHA2-512, and the git commit record is bound to everything
else in the repository (including the manifest objects) by SHA-1,
because git hasn't transitioned away from that (as far as I'm aware it
is still a work in progress - the SHA-1 algorithm it uses is hardened
against known attacks).
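
To make the SHA-1 binding concrete, here is a minimal sketch of how a
blob's object ID is derived in a SHA-1 git repository. The helper name
git_blob_id is made up for illustration, and plain hashlib SHA-1 stands
in for git's collision-detecting variant, which as far as I know yields
the same digest for ordinary (non-attack) inputs:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID a SHA-1 git repository assigns to a blob.

    Hypothetical helper for illustration; git hashes a short header
    ("blob <size>\\0") followed by the raw file content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Trees and commits are hashed the same way over the IDs they reference,
# so a signed commit transitively pins every Manifest via SHA-1.
print(git_blob_id(b""))
```

Anyone who could produce a second blob with the same ID could swap a
Manifest without invalidating the signed commit above it.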

That said, I think there is still an argument for having two hashes in
the manifests.  If we have two independent manifest hashes, then if either
SHA-1 or SHA2-512 is defeated all we need to do is update git+gpg to
the patched version (which no doubt would be rushed into a release
quickly), and then do a commit to the repo and sign it with the Gentoo
key.  The new commit would have a full set of new hashes using a
secure hash function, and then a back-reference to the previous commit
using SHA-1 (assuming we didn't rebase the entire tree and lose all
our historical gpg signatures - we might consider creating a new repo
and saving a historical one).   That would have new hashes all the way
from the top commit down to all the objects it references, so the top
commit would now be secure.  When signed with an updated gpg the
signature would be attached with a secure hash.  So now we're secure
again.  If we're concerned about old signatures getting recycled in
preimage attacks we could of course revoke the key and issue a new
one.

What we don't need to do is redo all the manifests, and that is
important because we don't actually have the ability to redo those
centrally.  Anybody can add a commit to the repo and re-sign it, but
we'd need all the maintainers to go through and generate new manifests
for anything that is fetch-restricted, or aggressively treeclean.

So it isn't that having two hashes can't fail, but rather that if it
does fail it is easier to recover.

>
> >
> > It is very likely that we can not trust in X for a while in the next years, but it is very unlikely that two different implementations are affected.
> >
>
> I don't think it is likely that e.g. SHA512 will be broken in the next few years, no, but if it is going to be, we have far bigger issues and we'd need to have double algorithms in our whole stack, which we don't have.

I agree that this is an unlikely scenario, so it is a judgement call
as to whether the ease of recovery in the event of a failure is worth
the cost to maintain the second hash.  I agree that we'd need double
algorithms in the whole stack to prevent a failure, but in the current
state we do have advantages for recovering from a failure after the
fact.

It seems that the likely scenario is that we get advance warning of
weaknesses in a hash function, but without a practical exploit being
readily available.  In that case we could do a more orderly
transition.  We'd still save time with the double hashed manifests,
and whether this makes a difference is hard to say.

>
> > Additionally calculating a second hash does not cost anything.
>
> It does have a cost at both Manifest-generation time and emerge-time.

This is certainly true, though if the current algorithm is reading the
files twice we could at least fix that.

I don't really have a strong opinion here.  I just wanted to point out
the recovery benefit of having two hashes on just the manifests, given
that it isn't easy to access all the distfiles.  I also wanted to
point out that we have SHA-1 exposure today, at least in git.

-- 
Rich


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06  0:15       ` Jason A. Donenfeld
  2022-04-06  0:25         ` Sam James
@ 2022-04-06  4:13         ` Ulrich Mueller
  2022-04-06 11:47           ` Jason A. Donenfeld
  2022-04-06 17:23         ` Robin H. Johnson
  2 siblings, 1 reply; 35+ messages in thread
From: Ulrich Mueller @ 2022-04-06  4:13 UTC (permalink / raw
  To: Jason A. Donenfeld
  Cc: Sam James, gentoo-dev, Michał Górny, Matt Turner

[-- Attachment #1: Type: text/plain, Size: 980 bytes --]

>>>>> On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

> I think actually the argument I'm making this time might be subtly
> different from the motions that folks went through last year.
> Specifically, the idea last year was to switch to using BLAKE2b only.
> I think what the arguments I'm making now point to is switching to
> SHA2-512 only.

Still, I think that if we drop one of the hashes then we should proceed
with the original plan. That is, keep the more modern BLAKE2B (which was
a participant of the SHA-3 competition [1]) and drop the older SHA512.

Back then, we had the choice between adding SHA3_512 and BLAKE2B, and we
preferred BLAKE2B for performance reasons.

I also think that the argument about the OpenPGP signature isn't very
strong, because replacing that signature by another one using a
different hash is trivial. As I said before, replacing all Manifest
files in the tree isn't.

Ulrich

[1] https://en.wikipedia.org/wiki/NIST_hash_function_competition

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 507 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06  4:13         ` Ulrich Mueller
@ 2022-04-06 11:47           ` Jason A. Donenfeld
  2022-04-06 16:38             ` Ulrich Mueller
  0 siblings, 1 reply; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-06 11:47 UTC (permalink / raw
  To: Ulrich Mueller; +Cc: Sam James, gentoo-dev, Michał Górny, Matt Turner

Hi Ulrich,

On 4/6/22, Ulrich Mueller <ulm@gentoo.org> wrote:
>>>>>> On Wed, 06 Apr 2022, Jason A Donenfeld wrote:
>
>> I think actually the argument I'm making this time might be subtly
>> different from the motions that folks went through last year.
>> Specifically, the idea last year was to switch to using BLAKE2b only.
>> I think what the arguments I'm making now point to is switching to
>> SHA2-512 only.
>
> Still, I think that if we drop one of the hashes then we should proceed
> with the original plan. That is, keep the more modern BLAKE2B (which was
> a participant of the SHA-3 competition [1]) and drop the older SHA512.

Why? Then we're dependent on two things, either of which could break,
rather than one.

To be clear, I'm a big fan of BLAKE2 myself and have used it in a
number of projects. And either one breaking would be a big deal. So
maybe it doesn't really matter that much. But strictly formally, it
seems like SHA512 is the most sound decision? I spelled out two
reasons for that to Sam; if you still disagree, maybe you can address
why you think my two reasons aren't very meaningful?

> I also think that the argument about the OpenPGP signature isn't very
> strong, because replacing that signature by another one using a
> different hash is trivial. As I said before, replacing all Manifest
> files in the tree isn't.

I looked into changing gnupg to use BLAKE2b for signatures, but it
doesn't appear to be supported. It's in gcrypt but not gpg. From
--version: `Hash: SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224`.
Since my argument rests on minimizing probability of a break, changing
the signature hash algo after it's broken doesn't help with much, so I
think this is something we'd want to happen now, rather than later, if
we're to use BLAKE2b exclusively.

I could potentially send a patch to gnupg for this if you want to take
the long path. But also: don't forget there's also the
interoperability argument that favors SHA512 too.

Jason


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06 11:47           ` Jason A. Donenfeld
@ 2022-04-06 16:38             ` Ulrich Mueller
  2022-04-06 17:06               ` Jason A. Donenfeld
  0 siblings, 1 reply; 35+ messages in thread
From: Ulrich Mueller @ 2022-04-06 16:38 UTC (permalink / raw
  To: Jason A. Donenfeld
  Cc: Sam James, gentoo-dev, Michał Górny, Matt Turner

[-- Attachment #1: Type: text/plain, Size: 436 bytes --]

>>>>> On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

> Why? Then we're dependent on two things, either of which could break,
> rather than one.

See? If either of these should happen, then we'll be happy that we still
have both hashes in our Manifest files.

OTOH, if that argument is not relevant because the probability of both
is close to zero, then (from a security POV) it doesn't matter which of
the two hashes we remove.

Ulrich

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 507 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06 16:38             ` Ulrich Mueller
@ 2022-04-06 17:06               ` Jason A. Donenfeld
  2022-04-06 17:31                 ` Robin H. Johnson
  2022-04-06 17:54                 ` Ulrich Mueller
  0 siblings, 2 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-06 17:06 UTC (permalink / raw
  To: Ulrich Mueller
  Cc: Sam James, gentoo development, Michał Górny,
	Matt Turner

Hi Ulrich,

On Wed, Apr 6, 2022 at 6:38 PM Ulrich Mueller <ulm@gentoo.org> wrote:
> > Why? Then we're dependent on two things, either of which could break,
> > rather than one.
>
> See? If either of these should happen, then we'll be happy that we still
> have both hashes in our Manifest files.
>
> OTOH, if that argument is not relevant because the probability of both
> is close to zero, then (from a security POV) it doesn't matter which of
> the two hashes we remove.

No, you're still missing the point.

If SHA-512 breaks, the security of the system fails, regardless of
what change we make. This is because GnuPG uses SHA-512 for its
signatures.

So I'll spell out the different possibilities:

1) GPG uses SHA-512. Manifest uses SHA-512 and BLAKE2b.
1a) Possibility: SHA-512 is broken. Result: system broken.
1b) Possibility: BLAKE2b is broken. Result: nothing.

2) GPG uses SHA-512. Manifest uses SHA-512.
2a) Possibility: SHA-512 is broken. Result: system broken.
2b) Possibility: BLAKE2b is broken. Result: nothing.

3) GPG uses SHA-512. Manifest uses BLAKE2b.
3a) Possibility: SHA-512 is broken. Result: system broken.
3b) Possibility: BLAKE2b is broken. Result: system broken.

See how from a security perspective, (2) is not worse than (1), but
(3) is worse than both (1) and (2)?
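
The same table can be written as a small sketch. The model is an
assumption that matches the cases above: the system is broken if the
signature hash is broken, or if every hash listed in the Manifest is
broken. The function and variable names are made up for illustration:

```python
def system_broken(signature_hash, manifest_hashes, broken):
    """Return True if a forgery is possible under this simple model.

    Assumed model: an attacker wins by breaking the signature hash, or
    by breaking every hash that the Manifest records for a file.
    """
    return signature_hash in broken or all(h in broken for h in manifest_hashes)


configs = {
    "(1) SHA512 + BLAKE2B": ["SHA512", "BLAKE2B"],
    "(2) SHA512 only": ["SHA512"],
    "(3) BLAKE2B only": ["BLAKE2B"],
}

for name, manifest_hashes in configs.items():
    for broken_hash in ("SHA512", "BLAKE2B"):
        result = system_broken("SHA512", manifest_hashes, {broken_hash})
        print(f"{name}: {broken_hash} broken -> system broken: {result}")
```

This only models whether a break compromises the system; it says
nothing about how hard it would be to recover afterwards.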

Jason


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06  0:15       ` Jason A. Donenfeld
  2022-04-06  0:25         ` Sam James
  2022-04-06  4:13         ` Ulrich Mueller
@ 2022-04-06 17:23         ` Robin H. Johnson
  2022-04-20  0:00           ` Robin H. Johnson
  2022-04-20 13:55           ` Jason A. Donenfeld
  2 siblings, 2 replies; 35+ messages in thread
From: Robin H. Johnson @ 2022-04-06 17:23 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 2899 bytes --]

On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:
> 2) Comparability: other distros use SHA2-512, as well as various
> upstreams, which means we can compare our hashes to theirs easily.
Can we expand on this specific thread for a moment?

I was the author of GLEP59 about changing the Manifest hashes, and I
noted at the time, with references, that the effective strength of a set
of hashes is only that of the strongest hash.

One of my regrets from GLEP59 is that it's made it harder for use cases
outside of the normal user distfile workflow.

The use case that impacted me the most was being able to compare what our
distfiles were over time vs external sources, esp. if the file goes
missing or was fetch-restricted and we can't produce a new hash of it.
Maybe upstream only ever published SHA1/SHA256, and we only ever
calculated SHA512/BLAKE2b on the file. Since we never had hashes from
both sides at the same time, we cannot prove it was the same file.

We need to be able to ship one or more hashes to users, for the specific
use case of validating the distfiles they download.

As a developer, I'd like to be able to track the other hashes for a
file, without forcing ourselves to retain the file. This might be to
compare with upstream published hashes, or to compare with other
distros.

In fact it would be really nice to have a semi-automated pipeline to
plug in signed upstream hashes to our Manifests, and make it possible to
prove our new SHA512/BLAKE2B hash was taken over the correct input in
the first place, and there wasn't any subtle supply-chain attack early
in the packaging process.
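
As a rough sketch of what one step of such a pipeline might look like
(not an existing tool; attest_hashes and its parameters are made-up
names): check the upstream-published hash and compute our own hashes
over the same bytes in one pass, so the new hashes are known to cover
the input that upstream vouched for. Verifying upstream's signature on
the published hash is assumed to happen elsewhere.

```python
import hashlib


def attest_hashes(path, upstream_algo, upstream_hex, wanted=("sha512", "blake2b")):
    """Verify an upstream hash and compute our hashes in the same pass.

    Hypothetical helper: raises ValueError if the file does not match
    the upstream-published digest, otherwise returns all digests.
    """
    hashers = {name: hashlib.new(name) for name in {upstream_algo, *wanted}}
    with open(path, "rb") as f:
        # Every hasher sees exactly the same chunks of the same file.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            for h in hashers.values():
                h.update(chunk)
    digests = {name: h.hexdigest() for name, h in hashers.items()}
    if digests[upstream_algo] != upstream_hex.lower():
        raise ValueError("file does not match the upstream-published hash")
    return digests
```

The returned dictionary could then be recorded somewhere (commit
message, notes, another repo) without having to keep the file itself.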

Where would those hashes go? They don't need to be in the Manifest, or
at the very least they don't need to be distributed via rsync to users
(it only costs a small amount of bytes to do so).

Where else could they go? 
- Commit messages could work.
- Git notes to a lesser degree.
- alternate repos?

> A reason why some people might prefer BLAKE2b over SHA2-512 is a
> performance improvement. However, seeing as right now we're opening
> the file, reading it, computing BLAKE2b, closing the file, opening the
> file again, reading it again, computing SHA2-512, closing the file, I
> don't think performance is actually something people care about. Seen
> differently, removing either one of them will already give us a
> performance "boost" of sorts.
Or just only verifying the "strongest" hash gives you that boost.

I do want to check into the code that you pointed out, because I'm
really sure much older versions of Portage did the CORRECT thing of only
reading the file in a single pass.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1113 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06  1:33     ` Rich Freeman
@ 2022-04-06 17:29       ` Jason A. Donenfeld
  2022-04-06 18:34         ` Rich Freeman
  0 siblings, 1 reply; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-06 17:29 UTC (permalink / raw
  To: gentoo development

Hi Rich,

On 4/6/22, Rich Freeman <rich0@gentoo.org> wrote:
> On Tue, Apr 5, 2022 at 8:05 PM Sam James <sam@gentoo.org> wrote:
> Our security fails currently if EITHER SHA2-512 or a hardened version
> of SHA-1 is defeated.  Our top gpg signature is bound to a git commit
> record by SHA2-512, and the git commit record is bound to everything
> else in the repository (including the manifest objects) by SHA-1,
> because git hasn't transitioned away from that (as far as I'm aware it
> is still a work in progress - the SHA-1 algorithm it uses is hardened
> against known attacks).

Sort of. The security between infra and users relies on SHA2-512. The
security between devs and infra relies on SHA-1. I guess the "full
system" depends on both, but I've been focused on the more likely
issue of a community-run mirror serving bogus files.

> I agree that this is an unlikely scenario, so it is a judgement call
> as to whether the ease of recovery in the event of a failure is worth
> the cost to maintain the second hash.  I agree that we'd need double
> algorithms in the whole stack to prevent a failure, but in the current
> state we do have advantages for recovering from a failure after the
> fact.
>
> It seems that the likely scenario is that we get advance warning of
> weaknesses in a hash function, but without a practical exploit being
> readily available.  In that case we could do a more orderly
> transition.  We'd still save time with the double hashed manifests,
> and whether this makes a difference is hard to say.

Yea I see this argument, but I don't quite buy it. Maintaining two
sets of hashes for the unlikely event that one gets broken AND we
absolutely cannot incrementally transition gradually to an unbroken
one seems rather overblown.

Jason


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06 17:06               ` Jason A. Donenfeld
@ 2022-04-06 17:31                 ` Robin H. Johnson
  2022-04-20 16:33                   ` Jason A. Donenfeld
  2022-04-06 17:54                 ` Ulrich Mueller
  1 sibling, 1 reply; 35+ messages in thread
From: Robin H. Johnson @ 2022-04-06 17:31 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1621 bytes --]

On Wed, Apr 06, 2022 at 07:06:30PM +0200, Jason A. Donenfeld wrote:
> No, you're still missing the point.
> 
> If SHA-512 breaks, the security of the system fails, regardless of
> what change we make. This is because GnuPG uses SHA-512 for its
> signatures.
Question directly for you Jason, because you make a professional study
of this: does the type of breakage/successful attack against
SHA-512 matter?

e.g. is it possible that some type of attack would only work against the
Manifest entry, but NOT against the GPG signature's embedded SHA-512 (or
the opposite).

The best hypothetical idea I had was that there exists some large
special input that lets an attacker reset the output to an arbitrary
hash after their malicious payload: but it wouldn't fit in the GPG
signature space.

> 
> So I'll spell out the different possibilities:
> 1) GPG uses SHA-512. Manifest uses SHA-512 and BLAKE2b.
score -1 + 0 = -1
> 2) GPG uses SHA-512. Manifest uses SHA-512.
score -1 + 0 = -1
> 3) GPG uses SHA-512. Manifest uses BLAKE2b.
score -1 + -1 = -2
> See how from a security perspective, (2) is not worse than (1), but
> (3) is worse than both (1) and (2)?
Yes, (2) is not worse than (1) for the overall security perspective.
That leaves the question of whether (1) has other benefits / value
propositions that make it worth more than (2). (see my other thread)

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1113 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06 17:06               ` Jason A. Donenfeld
  2022-04-06 17:31                 ` Robin H. Johnson
@ 2022-04-06 17:54                 ` Ulrich Mueller
  1 sibling, 0 replies; 35+ messages in thread
From: Ulrich Mueller @ 2022-04-06 17:54 UTC (permalink / raw
  To: Jason A. Donenfeld
  Cc: gentoo-dev, Sam James, Michał Górny, Matt Turner

[-- Attachment #1: Type: text/plain, Size: 956 bytes --]

>>>>> On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

> So I'll spell out the different possibilities:

> 1) GPG uses SHA-512. Manifest uses SHA-512 and BLAKE2b.
> 1a) Possibility: SHA-512 is broken. Result: system broken.
> 1b) Possibility: BLAKE2b is broken. Result: nothing.

> 2) GPG uses SHA-512. Manifest uses SHA-512.
> 2a) Possibility: SHA-512 is broken. Result: system broken.
> 2b) Possibility: BLAKE2b is broken. Result: nothing.

> 3) GPG uses SHA-512. Manifest uses BLAKE2b.
> 3a) Possibility: SHA-512 is broken. Result: system broken.
> 3b) Possibility: BLAKE2b is broken. Result: system broken.

> See how from a security perspective, (2) is not worse than (1), but
> (3) is worse than both (1) and (2)?

No it isn't. We can replace the top-level signature easily, but
replacing all Manifest hashes in the tree is hard (i.e. 1a and 3a are
trivial to fix, but 2a and 3b aren't).

I've said this multiple times now, so I'm out of here.

Ulrich

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 507 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06 17:29       ` Jason A. Donenfeld
@ 2022-04-06 18:34         ` Rich Freeman
  2022-04-07 15:21           ` Marek Szuba
  0 siblings, 1 reply; 35+ messages in thread
From: Rich Freeman @ 2022-04-06 18:34 UTC (permalink / raw
  To: gentoo-dev

On Wed, Apr 6, 2022 at 1:29 PM Jason A. Donenfeld <zx2c4@gentoo.org> wrote:
>
> Sort of. The security between infra and users relies on SHA2-512. The
> security between devs and infra relies on SHA-1. I guess the "full
> system" depends on both, but I've been focused on the more likely
> issue of a community-run mirror serving bogus files.

Well, that depends on how you're syncing the tree.  If you're using
rsync then there is a signed manifest in the root, so I agree in that
case it is just SHA2-512.  If you're syncing using git then the
manifests only reference distfiles, and the only link between the
commit and the tree/objects are their SHA-1 hashes until git adopts a
different hash function.
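
For reference, a minimal sketch of what a per-package Manifest entry
carries for a distfile, assuming the usual
"DIST <filename> <size> <ALGO> <hex> ..." layout. parse_dist_line is a
made-up helper, not Portage code, and the digests in the example are
shortened placeholders rather than real hashes:

```python
def parse_dist_line(line):
    """Split a DIST entry into (filename, size, {algorithm: hex digest}).

    Hypothetical helper; assumes whitespace-separated fields with
    algorithm/digest pairs following the size.
    """
    fields = line.split()
    # "DIST", name, size, then one or more ALGO/digest pairs -> odd count.
    if len(fields) < 5 or fields[0] != "DIST" or len(fields) % 2 == 0:
        raise ValueError("not a well-formed DIST entry")
    name, size = fields[1], int(fields[2])
    hashes = dict(zip(fields[3::2], fields[4::2]))
    return name, size, hashes


# Placeholder digests, shortened for readability.
example = "DIST foo-1.0.tar.gz 12345 BLAKE2B 0a1b2c SHA512 3d4e5f"
print(parse_dist_line(example))
```

Nothing in such an entry is signed on its own; its integrity comes from
whatever covers the Manifest file above it.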

> Yea I see this argument, but I don't quite buy it. Maintaining two
> sets of hashes for the unlikely event that one gets broken AND we
> absolutely cannot incrementally transition gradually to an unbroken
> one seems rather overblown.

It is very much a hand-waving judgement call.  This is one of those
low cost, low risk, high reward situations IMO.  The cost of
calculating hashes is fairly low (especially if done in a more sane
way).  The odds it will ever have a benefit are low.  If it does have
a benefit, it will be in a situation where the world is on fire and
we'll be very happy to not have to go verify a gazillion distfiles on
top of everything else we have to fix.  I'll defer to those wiser than
me to make the call.  :)

-- 
Rich


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06 18:34         ` Rich Freeman
@ 2022-04-07 15:21           ` Marek Szuba
  0 siblings, 0 replies; 35+ messages in thread
From: Marek Szuba @ 2022-04-07 15:21 UTC (permalink / raw
  To: gentoo-dev


[-- Attachment #1.1: Type: text/plain, Size: 220 bytes --]

On 2022-04-06 19:34, Rich Freeman wrote:

> This is one of those low cost, low risk, high reward situations IMO.

*puts on Council hat*

The above pretty much covers my own opinion on the subject.

-- 
Marecki

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-05 21:49         ` Jason A. Donenfeld
@ 2022-04-11 23:14           ` Joshua Kinard
  2022-04-12 12:44             ` Mike Gilbert
  0 siblings, 1 reply; 35+ messages in thread
From: Joshua Kinard @ 2022-04-11 23:14 UTC (permalink / raw
  To: gentoo-dev

On 4/5/2022 17:49, Jason A. Donenfeld wrote:
> Hi Matt,
> 
> On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <mattst88@gentoo.org> wrote:
>>
>> On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <zx2c4@gentoo.org> wrote:
>>> By the way, we're not currently _checking_ two hash functions during
>>> src_prepare(), are we?
>>
>> I don't know, but the hash-checking is definitely done before src_prepare().
> 
> Er, during the builtin fetch phase. Anyway, you know what I meant. :)
> 
> Anyway, looking at the portage source code, to answer my own question,
> it looks like the file is actually being read twice and both hashes
> computed. I would have at least expected an optimization like:
> 
> hash1_init(&hash1);
> hash2_init(&hash2);
> for chunk in file:
>     hash1_update(&hash1, chunk);
>     hash2_update(&hash2, chunk);
> hash1_final(&hash1, out1);
> hash2_final(&hash2, out2);
> 
> But actually what's happening is the even less efficient:
> 
> hash1_init(&hash1);
> for chunk in file:
>     hash1_update(&hash1, chunk);
> hash1_final(&hash1, out1);
> hash2_init(&hash2);
> for chunk in file:
>     hash2_update(&hash2, chunk);
> hash2_final(&hash2, out2);
> 
> So the file winds up being open and read twice. For huge tarballs like
> chromium or libreoffice...
> 
> But either way you do it - the missed optimization above or the
> unoptimized reality below - there's still twice as much work being
> done. This is all unless I've misread the source code, which is
> possible, so if somebody knows this code well and I'm wrong here,
> please do speak up.

Not to go off-topic, but where in Portage's source is this logic at?  It
seems like an easy fix for a slightly more efficient Portage.

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.  And
our lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-11 23:14           ` Joshua Kinard
@ 2022-04-12 12:44             ` Mike Gilbert
  0 siblings, 0 replies; 35+ messages in thread
From: Mike Gilbert @ 2022-04-12 12:44 UTC (permalink / raw
  To: Gentoo Dev

On Mon, Apr 11, 2022 at 7:14 PM Joshua Kinard <kumba@gentoo.org> wrote:
>
> On 4/5/2022 17:49, Jason A. Donenfeld wrote:
> > Hi Matt,
> >
> > On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <mattst88@gentoo.org> wrote:
> >>
> >> On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <zx2c4@gentoo.org> wrote:
> >>> By the way, we're not currently _checking_ two hash functions during
> >>> src_prepare(), are we?
> >>
> >> I don't know, but the hash-checking is definitely done before src_prepare().
> >
> > Er, during the builtin fetch phase. Anyway, you know what I meant. :)
> >
> > Anyway, looking at the portage source code, to answer my own question,
> > it looks like the file is actually being read twice and both hashes
> > computed. I would have at least expected an optimization like:
> >
> > hash1_init(&hash1);
> > hash2_init(&hash2);
> > for chunk in file:
> >     hash1_update(&hash1, chunk);
> >     hash2_update(&hash2, chunk);
> > hash1_final(&hash1, out1);
> > hash2_final(&hash2, out2);
> >
> > But actually what's happening is the even less efficient:
> >
> > hash1_init(&hash1);
> > for chunk in file:
> >     hash1_update(&hash1, chunk);
> > hash1_final(&hash1, out1);
> > hash2_init(&hash2);
> > for chunk in file:
> >     hash2_update(&hash2, chunk);
> > hash2_final(&hash2, out2);
> >
> > So the file winds up being open and read twice. For huge tarballs like
> > chromium or libreoffice...
> >
> > But either way you do it - the missed optimization above or the
> > unoptimized reality below - there's still twice as much work being
> > done. This is all unless I've misread the source code, which is
> > possible, so if somebody knows this code well and I'm wrong here,
> > please do speak up.
>
> Not to go off-topic, but where in Portage's source is this logic at?  It
> seems like an easy fix for a slightly more efficient Portage.

I believe it's the portage.checksum.verify_all() function.

https://gitweb.gentoo.org/proj/portage.git/tree/lib/portage/checksum.py?h=portage-3.0.30#n471
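
For comparison, a single-pass version could look roughly like the
sketch below. This is not Portage's code; hash_file_once and its
parameters are made-up names, and it only uses hashlib from the
standard library:

```python
import hashlib


def hash_file_once(path, algorithms=("sha512", "blake2b"), chunk_size=1 << 20):
    """Compute several digests while reading the file a single time.

    Hypothetical helper: returns {algorithm name: hex digest}.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # Read each chunk once and feed it to every hasher.
        while chunk := f.read(chunk_size):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

This halves the I/O for two hashes, but the CPU cost of computing both
digests is still there.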


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06 17:23         ` Robin H. Johnson
@ 2022-04-20  0:00           ` Robin H. Johnson
  2022-04-20 13:55           ` Jason A. Donenfeld
  1 sibling, 0 replies; 35+ messages in thread
From: Robin H. Johnson @ 2022-04-20  0:00 UTC (permalink / raw
  To: gentoo-dev

On Wed, Apr 06, 2022 at 05:23:25PM +0000, Robin H. Johnson wrote:
> On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:
> > 2) Comparability: other distros use SHA2-512, as well as various
> > upstreams, which means we can compare our hashes to theirs easily.
> Can we expand on this specific thread for a moment?
> 
> I was the author of GLEP59 about changing the Manifest hashes, and I
> noted at the time, with references, that the effective strength of a set
> of hashes is only that of the strongest hash.
Bump for my parent message; I'm very surprised at the lack of
responses to two messages in this thread.

https://archives.gentoo.org/gentoo-dev/message/18216da0128ee79733fa68bb77fa8b69
https://archives.gentoo.org/gentoo-dev/message/a9974ec34dfb25810dab47e3fa322a52

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06 17:23         ` Robin H. Johnson
  2022-04-20  0:00           ` Robin H. Johnson
@ 2022-04-20 13:55           ` Jason A. Donenfeld
  1 sibling, 0 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-20 13:55 UTC (permalink / raw
  To: gentoo-dev

Hey Robin,

Sorry for the delay in getting back to you. As mentioned on IRC, both of
your messages bounced earlier, and I was at a conference all last week.
Catching up with this thread now...

On Wed, Apr 06, 2022 at 05:23:25PM +0000, Robin H. Johnson wrote:
> On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:
> > 2) Comparability: other distros use SHA2-512, as well as various
> > upstreams, which means we can compare our hashes to theirs easily.
> Can we expand on this specific thread for a moment?
> 
> I was the author of GLEP59 about changing the Manifest hashes, and I
> noted at the time, with references, that the effective strength of a set
> of hashes is only that of the strongest hash.
> 
> One of my regrets from GLEP59 is that it's made it harder for use cases
> outside of the normal user distfile workflow.
> 
> The use case that impacted me the most was being able to compare what our
> distfiles were over time vs external sources, esp. if the file goes
> missing or was fetch-restricted and we can't produce a new hash of it.
> Maybe upstream only ever published SHA1/SHA256, and we only ever
> calculated SHA512/BLAKE2b on the file. Since we never had hashes from
> both sides at the same time, we cannot prove it was the same file.
> 
> We need to be able to ship one or more hashes to users, for the specific
> use case of validating the distfiles they download.
> 
> As a developer, I'd like to be able to track the other hashes for a
> file, without forcing ourselves to retain the file. This might be to
> compare with upstream published hashes, or to compare with other
> distros.
> 
> In fact it would be really nice to have a semi-automated pipeline to
> plug in signed upstream hashes to our Manifests, and make it possible to
> prove our new SHA512/BLAKE2B hash was taken over the correct input in
> the first place, and there wasn't any subtle supply-chain attack early
> in the packaging process.
> 
> Where would those hashes go? They don't need to be in the Manifest, or
> at the very least they don't need to be distributed via rsync to users
> (it only costs a small amount of bytes to do so).
> 
> Where else could they go? 
> - Commit messages could work.
> - Git notes to a lesser degree.
> - alternate repos?

Interesting idea. This seems orthogonal to my proposal ("just use one
hash in the manifest and call it a day; make it the same as what gpg
uses for signing to minimize moving pieces"), and so I'm hesitant to
indulge too much in this thread, for fear of it being derailed with this
different thing you want.

With that said, I'm not quite sure I understood everything you're asking
for. You said that you want "to have a semi-automated pipeline to plug
in signed upstream hashes to our Manifests, and make it possible to
prove our new SHA512/BLAKE2B hash was taken over the correct input", but
at the same time you also said that you want "to be able to track the
other hashes for a file, without forcing ourselves to retain the file."
What I'm wondering is: how do you propose that we calculate a SHA-512
hash of a file and "prove it correct" using, e.g., a signed SHA-256
hash, if we don't download the whole file?
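
To make the question concrete, here is a minimal sketch of what such a
cross-check would have to look like. It is not Portage code; the function
name and the record layout are made up for illustration. The point is that
the SHA-512 is only trustworthy because it is computed over the very same
bytes that were checked against the signed SHA-256, which means the whole
file has to be in hand at that moment.

```python
import hashlib


def record_after_upstream_check(path, upstream_sha256_hex, chunk_size=1 << 20):
    """Return size, SHA-256 and SHA-512 of `path`, but only if the SHA-256
    matches the upstream-published value; otherwise raise ValueError.

    Hypothetical helper: the name and the returned dict layout are
    assumptions, not an existing Portage or infra interface.
    """
    sha256 = hashlib.sha256()
    sha512 = hashlib.sha512()
    size = 0
    with open(path, "rb") as f:
        # Both digests are fed the same chunks, so the file is read in full.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            size += len(chunk)
            sha256.update(chunk)
            sha512.update(chunk)
    if sha256.hexdigest() != upstream_sha256_hex.lower():
        raise ValueError("SHA-256 mismatch: refusing to record a SHA-512")
    return {"size": size, "SHA256": sha256.hexdigest(), "SHA512": sha512.hexdigest()}
```

Once a record like this exists, the file itself no longer needs to be
retained to compare against other published hashes later, but producing
the record in the first place still requires the full download.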

It sounds like the thing that would be interesting to you would be for
infra to manage some sort of master hash database collecting all the
hashes from all over the internet of every file that hits distfiles,
verifying and then generating a bunch more hash variants of all kinds,
and then cross-verifying those with the hashes extracted from every
other distro, making for a wild hash verification aggregator machine. I
think I can see the utility of it. It would also unburden manifest
files, as those could then just have a SHA-512 hash and nothing else,
making things a bit lighter.


> > A reason why some people might prefer BLAKE2b over SHA2-512 is a
> > performance improvement. However, seeing as right now we're opening
> > the file, reading it, computing BLAKE2b, closing the file, opening the
> > file again, reading it again, computing SHA2-512, closing the file, I
> > don't think performance is actually something people care about. Seen
> > differently, removing either one of them will already give us a
> > performance "boost" of sorts.
> Or just only verifying the "strongest" hash gives you that boost.
> 
> I do want to check into the code that you pointed out, because I'm
> really sure much older versions of Portage did the CORRECT thing of only
> reading the file in a single pass.

Let me know if your findings are different from mine...
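
For reference, the single-pass behavior being described would look
roughly like the sketch below. This is an illustration only, not what any
version of Portage actually does; the function name and its defaults are
assumptions.

```python
import hashlib


def hash_file_once(path, algorithms=("blake2b", "sha512"), chunk_size=1 << 20):
    """Compute several digests of one file while opening and reading it once.

    Hypothetical helper for illustration; algorithm names are those
    accepted by hashlib.new().
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Every hasher sees each chunk, so no second open/read is needed.
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

With this shape, dropping one algorithm saves the CPU time of that hash
but not any I/O, whereas with the open/read/close-twice pattern it saves
both.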

Jason



* Re: [gentoo-dev] proposal: use only one hash function in manifest files
  2022-04-06 17:31                 ` Robin H. Johnson
@ 2022-04-20 16:33                   ` Jason A. Donenfeld
  0 siblings, 0 replies; 35+ messages in thread
From: Jason A. Donenfeld @ 2022-04-20 16:33 UTC
  To: gentoo-dev

Hi Robin,

On Wed, Apr 06, 2022 at 05:31:09PM +0000, Robin H. Johnson wrote:
> On Wed, Apr 06, 2022 at 07:06:30PM +0200, Jason A. Donenfeld wrote:
> > No, you're still missing the point.
> > 
> > If SHA-512 breaks, the security of the system fails, regardless of
> > what change we make. This is because GnuPG uses SHA-512 for its
> > signatures.
> Question directly for you Jason, because you make a professional study
> of this: does the type of breakage/successful attack against
> SHA-512 matter?
> 
> e.g. is it possible that some type of attack would only work against the
> Manifest entry, but NOT against the GPG signature's embedded SHA-512 (or
> the opposite).
> 
> The best hypothetical idea I had was that there exists some large
> special input that lets an attacker reset the output to an arbitrary
> hash after their malicious payload: but it wouldn't fit in the GPG
> signature space.
 
Generally speaking, the more control an attacker has over the input, the
easier certain types of attacks might be. So maybe in the most general
sense that applies. I wouldn't model a security analysis around that,
though. Rather, the usual way to apply that sort of thinking is to
design algorithms that rely on certain properties of hash functions, but
not others; for example, Ed25519 does not rely on the hash function
being collision resistant due to its construction.

Jason



Thread overview: 35+ messages
2022-04-04 23:41 [gentoo-dev] proposal: use only one hash function in manifest files Jason A. Donenfeld
2022-04-05  1:48 ` John Helmert III
2022-04-05 13:37 ` [gentoo-dev] " Jason A. Donenfeld
2022-04-05 14:10   ` Ulrich Mueller
2022-04-05 15:18     ` Jason A. Donenfeld
2022-04-05 14:49 ` [gentoo-dev] " Michał Górny
2022-04-05 21:13 ` Jonas Stein
2022-04-05 21:38   ` Jason A. Donenfeld
2022-04-06  0:05   ` Sam James
2022-04-06  1:33     ` Rich Freeman
2022-04-06 17:29       ` Jason A. Donenfeld
2022-04-06 18:34         ` Rich Freeman
2022-04-07 15:21           ` Marek Szuba
     [not found] <14775bf9818049998577ba4310f1bc6b1a83db16.camelgentoo!org>
2022-04-05 16:25 ` Jason A. Donenfeld
2022-04-05 16:25   ` Jason A. Donenfeld
2022-04-05 18:57   ` Matt Turner
2022-04-05 19:30     ` Jason A. Donenfeld
2022-04-05 20:14       ` Ulrich Mueller
2022-04-05 21:35         ` Jason A. Donenfeld
2022-04-05 20:37       ` Matt Turner
2022-04-05 21:49         ` Jason A. Donenfeld
2022-04-11 23:14           ` Joshua Kinard
2022-04-12 12:44             ` Mike Gilbert
     [not found]     ` <0DBAAAB5-87A1-4A40-94A6-651E8FDCD264@gentoo.org>
2022-04-06  0:15       ` Jason A. Donenfeld
2022-04-06  0:25         ` Sam James
2022-04-06  4:13         ` Ulrich Mueller
2022-04-06 11:47           ` Jason A. Donenfeld
2022-04-06 16:38             ` Ulrich Mueller
2022-04-06 17:06               ` Jason A. Donenfeld
2022-04-06 17:31                 ` Robin H. Johnson
2022-04-20 16:33                   ` Jason A. Donenfeld
2022-04-06 17:54                 ` Ulrich Mueller
2022-04-06 17:23         ` Robin H. Johnson
2022-04-20  0:00           ` Robin H. Johnson
2022-04-20 13:55           ` Jason A. Donenfeld
