From: Duncan <1i5t5.duncan@cox.net>
To: gentoo-amd64@lists.gentoo.org
Subject: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?)
Date: Tue, 5 Aug 2014 05:52:00 +0000 (UTC)
Message-ID: <pan$d1f99$43cd96ae$721d7088$6ef9ba6d@cox.net>
In-Reply-To: CAK2H+ecr0cvFdY=mEsMKok6QyM6By0WS4GB1eQZACcnfmGXL-Q@mail.gmail.com
Mark Knecht posted on Mon, 04 Aug 2014 15:04:12 -0700 as excerpted:
> As the line in that favorite song goes "Paranoia strikes deep"...
FWIW, while my lists sig is the proprietary-master quote from Richard
Stallman below, since the (anti-)patriot bill was passed in the reaction
to 9-11, my private email sig is a famous quote from Benjamin Franklin:
"They that can give up essential liberty to obtain a little
temporary safety, deserve neither liberty nor safety."
So "I'm with ya..."
> <NOTE>
> I am NOT trying to start ANY political discussion here. I hope no one
> will go too far down that path, at least here on this list. There are
> better places to do that.
>
> I am also NOT suggesting anything like what I ask next has happened,
> either here or elsewhere. It's just a question.
>
> Thanks in advance.
> </NOTE>
>
> I'm currently reading a new book by Glen Greenwald called "No Place To
> Hide" which is about Greenwald's introduction to Edward Snowden and the
> release of all of the confidential NSA documents Snowden acquired. This
> got me wondering about Gentoo, or even just Linux in general. If the
> underlying issue in all of that Snowden stuff is that the NSA has the
> ability to intercept and hack into whatever they please, then how do I
> know that the source code I build on my Gentoo machines hasn't been
> modified by someone to provide access to my machine, networks, etc.?
These are good questions to ask, and to have some idea of the answers to,
as well.
Big picture, at some level you pretty much have to accept that you
/don't/ know. There's /some/ level of security, tho honestly a bit less
on Gentoo than on some of the other distros (see below). Subverting the
tree widely would still not be /entirely/ easy (targeting an individual
downloader is another question), but it could be done.
> Essentially, what is the security model for all this source code and how
> do I verify that it hasn't been tampered with in some manner?
>
> 1) That the code I build is exactly as written and accepted by the OS
> community?
At a basic level, source and ebuild integrity is what the ebuild and
sources digests are all about. They protect against accidental
corruption (where they're pretty good) and against deliberate tampering
(where the protection may or may not be considered "acceptable", but
someone with the resources who wanted to badly enough could subvert it).
The idea is that the gentoo package maintainer creates hash digests of
multiple types for both the ebuild and the sources. Should the copy a
gentoo user gets not match the copy the gentoo maintainer created, the
package manager (PM, normally portage), if configured to do so (mainly
FEATURES=strict, also see stricter and assume-digests, plus the
webrsync-gpg feature mentioned below), will error out and refuse to
emerge that package.
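To make the digest idea concrete, here's a minimal sketch of checking a file against several recorded digests at once. It is NOT portage's actual code; the function names and the dict layout are my own assumptions for illustration, and it uses only Python's standard hashlib.

```python
import hashlib


def file_digests(path, algorithms=("sha256", "sha512")):
    """Return {algorithm: hexdigest} for the file at path, reading in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_recorded(path, recorded):
    """True only if at least one digest is recorded and every one matches.

    `recorded` is a hypothetical {algorithm: hexdigest} mapping, standing in
    for whatever the maintainer published alongside the file.
    """
    if not recorded:
        return False
    actual = file_digests(path, tuple(recorded))
    return all(actual[name] == value for name, value in recorded.items())
```

A package manager doing this would refuse to continue when matches_recorded() returns False. Note that the check is only as trustworthy as the place the recorded digests came from.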
But there are serious limits to that protection. Here are a few points
to consider:
1) While the ebuilds and sources are digested, those digests do *NOT*
extend to the rest of the tree, the various files in the profile
directory, the various eclasses, etc. So in theory at least, someone
could mess with say the package.mask file in profiles, or one of the
eclasses, and could potentially get away with it. But see point #3 as
there's a (partial) workaround for the paranoid.
2) Meanwhile, a plain hash digest (unlike a gpg signature) says nothing
about who created it. Digest verification confirms that a file matches
its digest, which protects against accidental damage, but it doesn't
confirm that the digest itself came from the gentoo maintainer. So
there's some risk that one or more gentoo rsync mirrors could be
compromised, or be run by a bad actor in the first place. Should that
occur, the bad
actor could attempt to replace BOTH the digested ebuild and/or sources
AND the digest files, updating the latter to reflect his compromised
version instead of the version originally digested by the gentoo
maintainer. Similarly, someone such as the NSA could at least in theory
do the same thing in transit, targeting a specific user's downloads while
leaving everyone else's downloads from the same mirror alone, so only the
target got the compromised version. While there's a reasonable chance
someone would catch a bad mirror, if a single downloader is specifically
targeted, unless they're specifically validating against other mirrors as
well and/or comparing digests (over a secure channel) against those
someone else downloaded, there's little chance they'd detect the
problem. So even digest-protected files aren't immune to compromise.
But as I said above, there's a (partial) workaround. See point #3.
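A small sketch of why that matters. Someone who controls what you download can swap the file AND recompute the plain digest, so the digest check still passes. A check that depends on a secret the attacker doesn't hold fails instead. The HMAC here is only a stand-in for a gpg signature (real signing is asymmetric, so verifiers need just the public key), and the key, file contents and names are all made up for the example.

```python
import hashlib
import hmac


def plain_digest(data):
    """Unkeyed hash: anyone can recompute it for any data."""
    return hashlib.sha256(data).hexdigest()


def keyed_tag(key, data):
    """Keyed tag: can only be produced by someone holding the key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


maintainer_key = b"known only to the maintainer"  # hypothetical secret
tampered = b"src_install() { default; backdoor; }"  # hypothetical ebuild text

# The attacker serves a modified file and recomputes whatever they can.
served_file = tampered
served_digest = plain_digest(tampered)
served_tag = keyed_tag(b"attacker has to guess the key", tampered)

# The plain digest matches the tampered file, so this check passes.
digest_check_passes = plain_digest(served_file) == served_digest

# The keyed tag cannot be forged without the maintainer's key, so this fails.
tag_check_passes = hmac.compare_digest(
    keyed_tag(maintainer_key, served_file), served_tag
)
```

Here digest_check_passes ends up True and tag_check_passes ends up False, which is the whole difference between "nothing changed relative to the digest I was handed" and "this came from who I think it came from".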
3) While #1 applies to the tree in general when it is rsynced, gentoo
does have a somewhat higher security sync method for the paranoid and to
support users behind firewalls which don't pass rsync. Instead of
running emerge --sync, this method uses the emerge-webrsync tool, which
downloads the entire main gentoo tree as a gpg-signed tarball. If you
have FEATURES=webrsync-gpg set (see the make.conf manpage, FEATURES,
webrsync-gpg), portage will verify the gpg signature on this tarball.
The two caveats here are (1) that the webrsync tarball is generated only
once per day, while the main tree is synced every few minutes, so the
rsynced tree is going to be more current, and (2) that each snapshot is
the entire tree, not just the changes, so for those updating daily or
close to it, fetching the full tarball every day instead of just the
changes will be more network traffic. Tho I think the tarball is
compressed (I've never tried this method personally so can't say for
sure) while the rsync tree isn't, so if you're updating monthly, I'd
guess it's less traffic to get the tarball.
The tarball is gpg-signed which is more secure than simple hash digests,
but the signature covers the entire thing, not individual files, so the
granularity of the digests is better. Additionally, the tarball signing
is automated, so while a signature validation pretty well ensures that
the tarball did indeed come from gentoo, should someone compromise gentoo
infrastructure security and somehow get a bad file in place, the daily
snapshot tarball would blindly sign and package up the bad file along
with all the rest.
So sync-method bottom line, if you're paranoid or simply want additional
gpg-signed security, use emerge-webrsync along with FEATURES=webrsync-gpg,
instead of normal rsync-based emerge --sync. That pretty well ensures that
you're getting exactly the gentoo tree tarball gentoo built and signed,
which is certainly far more secure than normal rsync syncing, but because
the tarballing and signing is automated and covers the entire tree,
there's still the possibility that one or more files in that tarball are
compromised and that it hasn't been detected yet.
Meanwhile, I mentioned above that gentoo isn't as secure in this regard
as a number of other Linux distros. This is DEFINITELY the case for
normal rsync syncers, but even for webrsync-gpg syncers it remains the
case to some extent. Unfortunately, in practice it seems that isn't
likely to change in the near-term, and possibly not in the medium or
longer term either, unless some big gentoo compromise is detected and
makes the news. THEN we're likely to see changes.
Alternatively, when that big pie-in-the-sky main gentoo tree switch from
cvs (yes, still) to git eventually happens, the switch to full-signing
will be quite a bit easier, tho there will still be policies to enforce,
etc. But they've been talking about the switch to git for years, as
well, and... incrementally... drawing closer, including the fact that
major portions of gentoo are actually developed in git-based overlays
these days. But will the main tree ever actually switch to git? Who
knows? As of now it's still pie-in-the-sky, with no nailed down plans.
Perhaps at some point somebody and some gentoo council together will
decide it's time and move whatever mountains or molehills remain to get
it done. At this point I think that's mostly what it'll take. But unless
that somebody steps up and makes that push come hell or high water, then
come 2025, assuming gentoo's still around by then, we could still be
talking about doing it... someday...
Back to secure-by-policy gpg-signing...
The problem is that while we've known what must be done, and what other
distros have already done, for years, and while gentoo has made some
progress down the security road, in the absence of that ACTIVE KNOWN
COMPROMISE RIGHT NOW immediate threat, other things simply continue to be
higher priority, while REAL gentoo security continues to be back-burnered.
Basically, what must be done, thru all the way to policy enforcement and
refusing gentoo developer commits if they don't match policy, is enforce
a policy that every gentoo dev has a registered gpg key (AFAIK that much
is already the case), and that every commit they make is SIGNED by that
personal developer key, with gentoo-infra verification of those
signatures, rejecting any commit that doesn't verify.
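As a rough sketch of what such a gate amounts to (this is NOT gentoo-infra code; the Commit class, the accept_commit() name and the verify callable are hypothetical, with verify standing in for a real gpg signature check):

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Commit:
    author: str
    payload: bytes
    signature: Optional[bytes]  # None for an unsigned commit


def accept_commit(
    commit: Commit,
    registered_keys: Dict[str, bytes],
    verify: Callable[[bytes, bytes, bytes], bool],
) -> Tuple[bool, str]:
    """Apply the policy: registered key, signed, and the signature verifies.

    `verify(key, payload, signature)` is an assumed hook that would wrap
    real gpg verification; nothing here implements gpg itself.
    """
    key = registered_keys.get(commit.author)
    if key is None:
        return False, "author has no registered key"
    if commit.signature is None:
        return False, "commit is unsigned"
    if not verify(key, commit.payload, commit.signature):
        return False, "signature does not verify"
    return True, "ok"
```

The point of the sketch is the "no exceptions" part: every path that isn't a verified signature from a registered key ends in a rejection.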
FWIW, there are GLEPs detailing most of this. They've just never been
fully implemented, tho incrementally, bits and pieces have been, over
time.
As I said, other distros have done this, generally when they HAD to, when
they had that compromise hitting the news. Tho I think a few distros
have implemented such a signed-no-exceptions policy when some OTHER
distro got hit. Gentoo hasn't had that happen yet, and while the
infrastructure is generally there to sign at least individual package
commits, and some devs actually do so (you can see the signed digests for
some packages, for instance), that hasn't been enforced tree-wide, and in
fact, there's a few relatively minor but still important policy questions
to resolve first, before such enforcement is actually activated.
Here's one such signing-policy question to consider. Currently, package
maintainer devs make changes to their ebuilds, and later, after a period
of testing, arch-devs keyword a particular ebuild stable for their arch.
Occasionally arch-devs may add a bit of conditional code that applies to
their arch only, as well.
Now consider this. Suppose a compromised package is detected after the
package has been keyworded stable. The last several signed commits to
that package were keywording only, while the commit introducing the
compromise was sometime earlier.
Question: Are those arch-devs that signed their keywording-only commits
responsible too, because they signed off on the package, meaning they now
have to inspect every package they keyword, checking for compromises that
might not be entirely obvious to them, or are they only responsible for
the keywording changes they actually committed, and aren't obligated to
actually inspect the rest of the ebuild they're now signing?
OK, so we say that they're only responsible for the keywording. Simple
enough. But what about this? Suppose they add an arch-conditional that
combined with earlier code in the package results in a compromise. But
the conditional code they added looks straightforward enough on its own,
and really does solve a problem on that arch, and without that code, the
original code looks innocently functional as well. But together, anyone
installing that package on that arch is now open to the world. Both devs
signed, the code of both devs is legit and looks innocent enough on its
own, but taken together, they result in a bad situation. Now it's not so
clear that an arch-dev shouldn't have to inspect and sign for the results
of the package after his commit, is it? Yet enforcing that as policy
will seriously slow-down arch stable keywording, and some archs can't
keep up as it is, so such a policy will be an effective death sentence
for them as a gentoo-stable supported arch.
Certainly there are answers to that sort of question, and various distros
have faced and come up with their own policy answers, often because in
the face of a REAL DISTRO COMPROMISE making the news, they've had no
other choice. To some extent, gentoo is lucky in that it hasn't been
faced with making those hard choices yet. But the fact is, all gentoo
users remain less safe than we could be, because those hard choices
haven't been made and enforced... because we've not been forced to do so.
Meanwhile, even were we to have done so, there's still the possibility
that upstream development might be compromised. Every year or two, some
upstream project or another makes news due to some compromise or
another. Sometimes vulnerable versions have been distributed for a
while, and various distros have picked them up. In an upstream-compromise
situation like that, there's little a distro can do, with the exception
of going slow enough that its packages are all effectively outdated.
That happens to be a relatively effective counter to this sort of issue:
if a several-years-old version changes, it'll be detected right away, and
(one hopes) most compromises of a project server will be detected within
months at the longest. So anything a year or more old should be
relatively safe from this sort of issue, simply by virtue of its age.
Obviously the people and enterprise distros willing to run years outdated
code do have that advantage, and that's a risk that people wishing to run
reasonably current code simply have to take as a result of that choice,
regardless of the distro they chose to get that current code from.
But even if you choose to run an old distro, so you aren't likely to be
hit by current upstream compromises, and one that has and enforces a full
signing policy, so every commit can be accounted for, and even if none of
the developers at either the distro or upstream levels deliberately
breaks the trust and goes bad, there's still the issue below...
> 2) That the compilers and interpreters don't do anything except build
> the code?
There's a paper, very famous in security circles, that effectively
demonstrates that unless you can absolutely trust every single layer in
the build
line, including the hardware layer (which means its sources) and the
compiler and tools used to build your operational tools, and the compiler
and tools used to build them, and... all the way back... you simply
cannot absolutely trust the results, period.
I never kept the link, but it seems the title actually stuck in memory
well enough for me to google it: "Reflections on Trusting Trust"
(Ken Thompson's Turing Award lecture, published in 1984).
=:^) Here's the google link:
https://www.google.com/search?q=%22reflections+on+trusting+trust%22
That means that in order to absolutely prove the gcc (for example) on
our own systems, even if we can read and understand every line of gcc
source, we must absolutely prove the tools on the original installation
media and in the stage tarballs that we used to build our system. Which
means we must not only have the code to them and trust the builders, but
we must have the code and trust the builders of the tools they used, and
the builders and tools of those tools, and...
Meanwhile, the same rule effectively applies to the hardware as well.
And while Richard Stallman may run a computer that is totally open source
hardware and firmware (down to the BIOS or equivalent), for which he has
all the schematics, etc, most of us run at least some semi-proprietary
hardware of /some/ sort. Which means even if we /could/ fully understand
the sources ourselves, without them and without that full understanding,
at that level, we simply have to trust... someone... basically, the
people who design and manufacture that hardware.
Thus, in practice, (nearly) everyone ends up drawing the line
/somewhere/. The Stallmans of the world draw it pretty strictly,
refusing to run anything with replaceable firmware unless sources for
that firmware are available. (As Stallman defines it, if the
firmware is effectively burned in such that the manufacturer themselves
can't update it, then that's good enough for the line he draws. Tho that
leads to absurdities such as an OpenMOKO phone that at extra expense has
the firmware burned onto a separate chip such that it can't be replaced
by anyone, in order to be able to use hardware that would otherwise be
running firmware that the supplier refuses to open-source -- because the
extra expense to do it that way means the manufacturer can't replace the
firmware either, so it's on the OK side of Stallman's line.)
Meanwhile, I personally draw the line at what runs at the OS level on my
computer. That means I won't run proprietary graphics drivers or flash,
but I will and do load source-less firmware onto the Radeon-based
graphics hardware I do run, in order to use the freedomware kernel
drivers for the same hardware that I refuse to run the proprietary fglrx
drivers on.
Other people are fine running flash and/or proprietary graphics drivers,
but won't run a mostly-proprietary full OS such as MS Windows or Apple
OSX.
Still others prefer to run open source where it fits their needs, but
won't go out of their way to do so if proprietary works better for them,
and still others simply don't care either way, running whatever works
best regardless of the freedom or lack thereof of its sources.
Anyway, when it comes to hardware and compiler, in practice the best you
can do is run a FLOSS compiler such as gcc, while trusting the tools you
used to build the first ancestor, basically, the gcc and tools in the
stage tarballs, as well as whatever you booted (probably either a
gentoo-installer or another distro) in order to chroot into that unpacked
stage and build from there. Beyond that, well... good luck, but you're
still going to end up drawing the line /somewhere/.
> There's certainly lots of other issues about security, like protecting
> passwords, protecting physical access to the network and machines, root
> kits and the like, etc., but assuming none of that is in question (I
> don't have any reason to think the NSA has been in my home!) ;-) I'm
> looking for info on how the code is protected from the time it's signed
> off until it's built and running here.
>
> If someone knows of a good web site to read on this subject let me know.
> I've gone through my Linux life more or less like most everyone went
> through life 20 years ago, but paranoia strikes deep.
Indeed. Hope the above was helpful. I think it's a pretty accurate
picture from at least my own perspective, as someone who cares enough
about it to at least spend a not insignificant amount of time keeping up
on the current situation in this area, both for linux in general, and for
gentoo in particular.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman