To: gentoo-amd64@lists.gentoo.org
From: Duncan <1i5t5.duncan@cox.net>
Subject: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?)
Date: Thu, 7 Aug 2014 21:18:56 +0000 (UTC)

Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted:

> So that's all looking pretty good, as a first step. If it's a matter of
> 3 1/2 minutes instead of 1-2 minutes then I can live with that part.
> However that's just (I think) the portage tree and not signed source
> code, correct?

[I just posted a reply to the gpg specific stuff.]

Technically correct, but not really so in implementation. See below...

> Now, is the idea that I have a validated portage snapshot at this point
> and stiff have to actually get the code using the regular emerge which
> will do the checking because I have:
>
> FEATURES="buildpkg strict webrsync-gpg"

No... It doesn't work that way.

> I don't see any evidence that emerge checked what it downloaded, but
> maybe those checks are only done when I really build the code?

Here's what happens.

FEATURES=webrsync-gpg simply tells the webrsync stuff to gpg-verify the snapshot-tarball that webrsync downloads. Without that, it'd still download it the same, but it wouldn't verify the signature. This allows people who use the webrsync only because they're behind a firewall that wouldn't allow normal rsync, but who don't care about the gpg signing security stuff, to use the same tool as the people who actually use webrsync for the security aspect, regardless of whether they could use normal rsync or not.

So that gets you a signed and verified tree. Correct so far. But as part of that tree, there are digest files for each package that verify the integrity of the ebuild as well as of the sources tarballs (distfiles).

Now it's important to grasp the difference between gpg signing and simple hash digests, here. Anybody with the appropriate tools (md5sum, for example, does md5 hashes, but there are sha and other hashes as well, and the portage tree uses several hash algorithms in case one is broken) can take a hash of a file, and provided it's exactly the same bit-for-bit file they should get exactly the same hash.

In fact, that's how portage checks the hashes of both the ebuild files and the distfiles it uses, regardless of this webrsync-gpg stuff. The tree ships the hash values that the gentoo package maintainer took of the files in its digest files, and portage takes its own hash of the files and compares it to the hash value stored in the digest files. If they match, portage is happy. If they don't, depending on how strict you have portage set to be (FEATURES=strict), it will either warn about it (without strict) or entirely refuse to merge that package (with strict), until either the digest is updated, or a new file matching the old digest is downloaded.

So far so good. But while the hashes protect against accidental damage as the file is being downloaded, anyone can take a hash of a file. So without something stronger, if say one of the mirror operators was a bad guy, they could replace the files with hacked files, and as long as they replaced the digest files with new ones they created for the hacked files at the same time, portage wouldn't know.

So while hashes/digests alone protect quite well from accidental damage, they can't protect, by themselves, from deliberate replacement of those files with malware-infested copies. Which is where the gpg signed tree snapshots come in.

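The digest check and its blind spot can be sketched in a few lines of Python. This is only an illustration, not portage's actual code: the algorithm list and the function names (take_digests, matches) are made up for the example, and the "tarballs" are just short byte strings.

```python
import hashlib

# Hash algorithms to check; an illustrative choice, not portage's exact list.
ALGORITHMS = ("sha256", "sha512")

def take_digests(data: bytes) -> dict:
    """Return a mapping of algorithm name to hex digest for the given bytes."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}

def matches(data: bytes, recorded: dict) -> bool:
    """True only if every recorded digest equals a freshly computed one."""
    fresh = take_digests(data)
    return all(fresh.get(name) == value for name, value in recorded.items())

original = b"pretend this is a source tarball"
recorded = take_digests(original)                # what the maintainer shipped

print(matches(original, recorded))               # True: intact download
print(matches(original[:-1] + b"X", recorded))   # False: accidental corruption

# A hostile mirror swaps the file AND regenerates the digests to match.
hacked = b"pretend this is a trojaned tarball"
hacked_digests = take_digests(hacked)
print(matches(hacked, hacked_digests))           # True: hashes alone can't tell
```

The last line is the whole problem in miniature: the check only compares the file against whatever digests arrived with it, so it says nothing about who produced those digests.
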
But before we can understand how they help, we need to understand how gpg signing differs from simple hashes.

PGP, gpg, and various other public/private key-pair signing (and encryption) schemes take advantage of a particular mathematical relationship between the public and private keys. I'm neither a cryptographer nor a mathematician, so I'm content to leave it at that rather handwavy assertion and not get into the details, but enough people I trust say the same thing about the details, and enough of our modern Internet banking and the like depends upon the same idea, that I'm relatively confident in the general principle, at least.

It works like this. People keep the private key from the pair private -- if it gets out, they've lost the secret. But people publish the public half of the key. The relationship of the keys is such that people can't figure out the private key from the public key, but if you have the private key, you can sign stuff with it, and people with the public key can verify the signature and thus trust that it really was the person with that key that signed the content. Similarly, people can use the public key to encrypt something, and only the person with the private key will be able to decrypt it -- having the public key doesn't help.

Actually, as I understand it, signing is simply a combination of hashing and encryption: a hash of the content to be signed is taken, and then that hash is encrypted with the private key. Now anyone with the public key can "decrypt" the hash and check it against the content, thereby verifying that the matching private key was the one used to sign. If some other key had been used, attempting to decrypt the hash with an unmatched public key would simply produce gibberish, and the supposedly "decrypted" hash wouldn't match the hash produced when checking the content, so it would fail to verify that the signed content actually came from the person it was claimed to have come from.

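Here's a toy Python sketch of that "hash, then encrypt the hash with the private key" idea, using textbook RSA. It is an assumption-laden illustration only: gpg itself uses padding schemes, far larger keys, and possibly other algorithms entirely, and the function names and the small Mersenne-prime keys below are invented for the example. It needs Python 3.8 or later for the modular inverse via pow().

```python
import hashlib

def make_keypair(p: int, q: int, e: int = 65537):
    """Build a textbook RSA key pair from two primes (toy sizes, no padding)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)            # private exponent: modular inverse of e
    return (n, e), (n, d)          # (public key, private key)

def content_hash(data: bytes, n: int) -> int:
    """Hash the content and reduce it into the range the key can handle."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data: bytes, private_key) -> int:
    """'Encrypt' the hash of the content with the private key."""
    n, d = private_key
    return pow(content_hash(data, n), d, n)

def verify(data: bytes, signature: int, public_key) -> bool:
    """'Decrypt' the signature with the public key and compare hashes."""
    n, e = public_key
    return pow(signature, e, n) == content_hash(data, n)

# Two unrelated key pairs, each built from a pair of known Mersenne primes.
signer_public, signer_private = make_keypair(2**61 - 1, 2**89 - 1)
other_public, _ = make_keypair(2**107 - 1, 2**127 - 1)

message = b"portage snapshot contents"
signature = sign(message, signer_private)

print(verify(message, signature, signer_public))      # True: right key, same content
print(verify(b"tampered", signature, signer_public))  # False: content changed
print(verify(message, signature, other_public))       # False: unmatched public key
```

Note that verify() never touches the private key. Anyone holding the public half can check a signature, but only the holder of the private half could have produced one that checks out.
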
OK, we've now established that hashes simply verify that the content didn't get modified in transit, but they do NOT by themselves verify who SENT that content, so indeed, a man-in-the-middle could have replaced BOTH the content and the hash, and someone relying on just hashes couldn't tell the difference.

And we've also established that a signature verifies that the content actually came from the person who had the private key matching the public key used to verify it. The mechanism is encrypting the hash of that content with the private key, so that only "decrypting" it with the matching public key yields a hash that matches the one taken at the other end.

*NOW* we're equipped to see how the portage tree snapshot signing method actually allows us to verify distfiles as well. Because the tree includes digests that we can now verify came from our trusted source, gentoo, NOW those digests can be used to verify the distfiles, because the digests were part of the signed tree and nobody could tamper with that signed tree including those digests without detection.

If our nefarious gentoo mirror operator tried to switch out the source tarballs AND the digests, he could do so for normal rsync users, and for webrsync users not doing gpg verification, without detection. But should he try that with someone that's using webrsync-gpg, he has no way to sign the tampered-with snapshot tarball with the correct private key since he doesn't have it, and those using webrsync with FEATURES=webrsync-gpg would detect the tampered tarball as portage (via webrsync, via eix in your case) would reject that tarball as unverified.

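Putting the two pieces together, here's a rough Python sketch of why signed digests stop the mirror-operator attack. Everything in it is a stand-in: the toy textbook-RSA key replaces the real snapshot signing key, the one-file "manifest" replaces the tree's digest files, and accept() is a made-up name. The real tools gpg-verify the snapshot tarball itself rather than doing anything like this.

```python
import hashlib

# Toy textbook-RSA key from two Mersenne primes; a stand-in for the signing key.
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, known only to the signer

def digest_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    return pow(digest_int(data), D, N)

def signature_ok(data: bytes, signature: int) -> bool:
    return pow(signature, E, N) == digest_int(data)

def make_manifest(files: dict) -> bytes:
    """One 'name digest' line per file, like a very small digest file."""
    lines = [f"{name} {hashlib.sha256(body).hexdigest()}"
             for name, body in sorted(files.items())]
    return "\n".join(lines).encode()

def accept(files: dict, manifest: bytes, signature: int) -> bool:
    """Step 1: the digests must be validly signed. Step 2: files must match them."""
    if not signature_ok(manifest, signature):
        return False
    return manifest == make_manifest(files)

good_files = {"foo-1.0.tar.gz": b"real sources"}
good_manifest = make_manifest(good_files)
good_signature = sign(good_manifest)
print(accept(good_files, good_manifest, good_signature))   # True

# The attacker swaps the sources but leaves the digests alone: digest mismatch.
bad_files = {"foo-1.0.tar.gz": b"trojaned sources"}
print(accept(bad_files, good_manifest, good_signature))    # False

# The attacker swaps the sources AND the digests, but cannot re-sign them.
bad_manifest = make_manifest(bad_files)
print(accept(bad_files, bad_manifest, good_signature))     # False
```

The third case is the one plain hashes can't catch. It fails here only because the attacker would need the private exponent D to produce a signature that matches the new digests.
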
So the hash-digest method used to protect ordinary rsync users (and webrsync users without webrsync-gpg turned on) from ACCIDENTAL damage, now protects webrsync-gpg users from DELIBERATE man-in-the-middle attacks as well, not because the digests themselves are different, but because we can now trust and verify that they came from a legitimate source.

Tho it should be noted that "legitimate source" is defined as anyone having access to that private signing key. So should someone break in to the snapshotting server and steal the private key doing the signing, they now become a "legitimate source" as far as webrsync-gpg is concerned.

So where does that leave us in practice?

Basically here: You're now verifying that the snapshot tarballs are coming from a source with the private signing key, and we're assuming that gentoo security hasn't been broken and thus that only gentoo's snapshot signing servers (and their admins, of course) have access to the private signing key, which in turn means we're assuming the machine with that signing key must be gentoo, and thus that the snapshotted tarballs are legit.

But it's actually webrsync in combination with FEATURES=webrsync-gpg that's doing that verification. Once the verified tarball is actually unpacked on our system, portage operates just as it normally does, simply verifying the usual hash digests against the ebuilds and the distfiles /exactly/ as it normally would.

Repeating in different words to hopefully ensure it's understood: It's *ONLY* the fact that we have actually gpg-verified that snapshot tarball and thus the digests within it, that gives us any more security than an ordinary rsync user. After that's downloaded, verified and unpacked, portage operates exactly as it normally does.

Meanwhile, part of that normal operation includes FEATURES=strict, if you've set it, which causes portage to refuse to merge the package if those digests don't match. But that part of things is just normal portage operation.
Rsync users get it too -- they just don't have the additional assurance that those digest files actually came from gentoo (or at least from someone with gentoo's private signing key), that webrsync with FEATURES=webrsync-gpg provides.

(Meanwhile, one further personal note FWIW. You may think that all these long explanations take quite some time to type up, and you'd be correct. But don't make the mistake of thinking that I don't get a benefit from it myself. My dad was a teacher, and one of the things he used to say that I've found to be truer than true, is that the best way to /learn/ something is to try to teach it to someone.

That's exactly what I'm doing, and all the unexpected questions and corner cases that I'd have never thought about on my own, that people bring up and force me to think about in order to answer them, help me improve my own previously more handwavy and fuzzy "general concept" understanding as well. I'm much more confident in my own understanding of the general public/private key concepts, how gpg actually uses them and how its web-of-trust works, and more specifically, how portage can use that via webrsync-gpg to actually improve the gentooer's own security, than I ever was before.

And it has been quite some time since I worked with gpg and saw it in interactive mode like that, too, and it turns out that in the intervening years, I've actually come to understand quite a bit more about how it all works than I did back then, thus my ability to dig that all up and present it here. Back a few years ago, I was just as clueless about how all that web-of-trust stuff worked, and made exactly the same mistake of "ultimately trusting" the distro's package-signing key, for exactly the same reasons.
Turns out I absorbed rather more from all those security and encryption articles I've read over the years than I realized, but it actually took my replies right here in this thread to lay it all out logically so I too realized how much more I understand what's going on now, than I did back then.)

So... Thanks for the thread! =:^)

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman