From: Michał Górny
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Manifest2 hashes, take n+1-th: 3 hashes for the tie-breaker case
Date: Mon, 13 Nov 2017 08:37:07 +0100
Message-ID: <1510558627.1239.5.camel@gentoo.org>
In-Reply-To: <88fa2503-11de-2f34-b4a9-58159f14a1ac@gentoo.org>

On Sun, 2017-11-12 at 21:22 -0500, Joshua Kinard wrote:
> On 10/24/2017 00:11, Michał Górny wrote:
> > On Tue, 2017-10-24 at 06:04 +0200, Michał Górny wrote:
>
> [snip]
>
> > > > [BOBO06] is relevant research here, I cited it in the work that went into
> > > > GLEP59, the last time we updated the hashes. The less-technical explanation of it is:
> > > > "If you can express the output of H1(x)H2(x) in LESS bits than the combined
> > > > output size of H1,H2, then the attacks get a little bit easier"
> > > >
> > > > Some important pieces from it:
> > > > [J04] "showed that the concatenation of two Merkle-Damgard functions is not
> > > > much more secure than the individual functions.", but this holds ONLY if
> > > > the hash functions chosen are of the same construction (MD).
> > > > Choosing hashes with different constructions (Merkle-Damgard, HAIFA,
> > > > Sponge) is important, and given a black box environment,
> > > >
> > > > The original mail reached the same approximate decision, just to look
> > > > for diverse hashes, but decided that 2 was enough.
> > > >
> > > > Q: What are the odds of a simultaneous successful attack against two hashes?
> > > > A: IDK, but if the hash functions are truly independent, it must be provably
> > > > lower than the odds of an attack against a single hash.
> > >
> > > We're talking about really huge (→∞) numbers here. It's not a 'random'
> > > attack against one hash. It's an attack that allows to sneak a malicious
> > > code with unchanged file size (since we store that too), and no apparent
> > > side effects (what's the point of spending up that much resources
> > > if the user is going to notice?).
> > >
> > > > Q: What's the big difference between a bug and a successful attack in a hash?
> > > > A: Bugs are more likely initially, and attacks come later.
> > >
> > > Sounds like an entirely abstract point in time ;-).
> > >
> > > > All of that said, is there really a significant long-term gain in
> > > > multiple hashes? (setting aside the short-term advantage in a transition
> > > > period for changing hashes)
> > >
> > > No, and that's my point. One hash is perfectly fine.
> > >
> > > Two hashes are useful for transition purposes. If we take two fast
> > > hashes (e.g. proposed SHA512 + BLAKE2B which have similar speed),
> > > we can use 2 threads to prevent the speed loss (except for old single-
> > > core machines).
>
> Minor clarification, old single core //and// uni-processor. Some older
> machines have multiple physical CPUs that are single-core. Threading should be
> okay on these, as long as the thread count stays under NR_CPUS.
>
> I also have a really old single-CPU system, MIPS (obviously). Not the fastest
> on the block compared to the other equipment I've got, but does anyone know of
> any simple timing scripts/programs available that can benchmark some of these
> proposed digest hashes? If they turn out to be reasonably quick on my old
> machine, I doubt then that speed will be too much of an issue.

You could play with utils/benchmark.py inside gemato [1]. Note that it's
not very precise, but it should give a rough measurement. Also note that
it is suited for one big file, while we mostly deal with a lot of small
files, and that changes things a bit.

[1]:https://github.com/mgorny/gemato

> Also, for whatever hashes we ultimately go with, what are considerations for
> the userland support for them on non-glibc systems? E.g., are they provided by
> third-party libraries or do they need implementations in
> uclibc/uclibc-ng/musl/*? And what about the Alt/BSD side of things? I assume
> FreeBSD implements this already, but worth verifying with all of the
> combinations of arches/libc's/kernels and whatnot. I mean, there still might
> be a lonely m68k install out there...

We've selected the hashes that are guaranteed to be included in CPython 3.6+.
For older versions of Python, we are using pyblake2, a Python extension
based on the reference implementation (just like the code in CPython).

-- 
Best regards,
Michał Górny
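
For the "lots of small files" case mentioned above, a rough timing could be obtained with something like the sketch below. It is not taken from gemato. The names make_blobs and time_hash, the blob count and the blob size are all made up for illustration. It hashes random in-memory data, so it measures the digest cost only and ignores disk I/O.

```python
import hashlib
import os
import time

try:
    from hashlib import blake2b      # available in CPython 3.6+
except ImportError:
    from pyblake2 import blake2b     # fallback for older Pythons, same interface

# The two hashes discussed in the thread; the dict itself is illustrative.
HASHES = {
    "SHA512": hashlib.sha512,
    "BLAKE2B": blake2b,
}


def make_blobs(count, size):
    """Return `count` random byte strings of `size` bytes (stand-ins for small files)."""
    return [os.urandom(size) for _ in range(count)]


def time_hash(constructor, blobs):
    """Hash every blob with a fresh hash object; return (elapsed seconds, hex digests)."""
    start = time.perf_counter()
    digests = [constructor(blob).hexdigest() for blob in blobs]
    return time.perf_counter() - start, digests


if __name__ == "__main__":
    # Assumed workload: 2000 blobs of 4 KiB each; adjust to taste.
    blobs = make_blobs(count=2000, size=4096)
    for name, constructor in HASHES.items():
        elapsed, _ = time_hash(constructor, blobs)
        print("%-8s %.3f s for %d blobs" % (name, elapsed, len(blobs)))
```

Treat the numbers as relative only: running it a few times and comparing the two hashes on the same machine is more meaningful than any single absolute figure.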