* [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) @ 2014-08-04 22:04 Mark Knecht 2014-08-05 5:52 ` [gentoo-amd64] " Duncan 2014-08-05 19:16 ` [gentoo-amd64] " Frank Peters 0 siblings, 2 replies; 17+ messages in thread From: Mark Knecht @ 2014-08-04 22:04 UTC (permalink / raw To: Gentoo AMD64 [-- Attachment #1: Type: text/plain, Size: 1863 bytes --] As the line in that favorite song goes "Paranoia strikes deep"... <NOTE> I am NOT trying to start ANY political discussion here. I hope no one will go too far down that path, at least here on this list. There are better places to do that. I am also NOT suggesting anything like what I ask next has happened, either here or elsewhere. It's just a question. Thanks in advance. </NOTE> I'm currently reading a new book by Glen Greenwald called "No Place To Hide" which is about Greenwald's introduction to Edward Snowden and the release of all of the confidential NSA documents Snowden acquired. This got me wondering about Gentoo, or even just Linux in general. If the underlying issue in all of that Snowden stuff is that the NSA has the ability to intercept and hack into whatever they please, then how do I know that the source code I build on my Gentoo machines hasn't been modified by someone to provide access to my machine, networks, etc.? Essentially, what is the security model for all this source code and how do I verify that it hasn't been tampered with in some manner? 1) That the code I build is exactly as written and accepted by the OS community? 2) That the compilers and interpreters don't do anything except build the code? There's certainly lots of other issues about security, like protecting passwords, protecting physical access to the network and machines, root kits and the like, etc., but assuming none of that is in question (I don't have any reason to think the NSA has been in my home!) ;-) I'm looking for info on how the code is protected from the time it's signed off until it's built and running here. If someone knows of a good web site to read on this subject let me know. I've gone through my Linux life more or less like most everyone went through life 20 years ago, but paranoia strikes deep. Thanks in advance, Mark [-- Attachment #2: Type: text/html, Size: 2283 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-04 22:04 [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) Mark Knecht @ 2014-08-05 5:52 ` Duncan 2014-08-05 18:50 ` Mark Knecht 2014-08-06 21:33 ` Mark Knecht 2014-08-05 19:16 ` [gentoo-amd64] " Frank Peters 1 sibling, 2 replies; 17+ messages in thread From: Duncan @ 2014-08-05 5:52 UTC (permalink / raw To: gentoo-amd64 Mark Knecht posted on Mon, 04 Aug 2014 15:04:12 -0700 as excerpted: > As the line in that favorite song goes "Paranoia strikes deep"... FWIW, while my lists sig is the proprietary-master quote from Richard Stallman below, since the (anti-)patriot bill was passed in the reaction to 9-11, my private email sig is a famous quote from Benjamin Franklin: "They that can give up essential liberty to obtain a little temporary safety, deserve neither liberty nor safety." So "I'm with ya..." > <NOTE> > I am NOT trying to start ANY political discussion here. I hope no one > will go too far down that path, at least here on this list. There are > better places to do that. > > I am also NOT suggesting anything like what I ask next has happened, > either here or elsewhere. It's just a question. > > Thanks in advance. > </NOTE> > > I'm currently reading a new book by Glen Greenwald called "No Place To > Hide" which is about Greenwald's introduction to Edward Snowden and the > release of all of the confidential NSA documents Snowden acquired. This > got me wondering about Gentoo, or even just Linux in general. If the > underlying issue in all of that Snowden stuff is that the NSA has the > ability to intercept and hack into whatever they please, then how do I > know that the source code I build on my Gentoo machines hasn't been > modified by someone to provide access to my machine, networks, etc.? These are good questions to ask, and to have some idea of the answers to, as well. Big picture, at some level, you pretty much have to accept that you /don't/ know. However, there's /some/ level of security... tho honestly a bit less on Gentoo than on some of the other distros (see below), tho it'd still not be /entirely/ easy to subvert at least widely (for an individual downloader is another question), but it could be done. > Essentially, what is the security model for all this source code and how > do I verify that it hasn't been tampered with in some manner? > > 1) That the code I build is exactly as written and accepted by the OS > community? At a basic level, source and ebuild integrity, protecting both from accidental corruption (where it's pretty good) and from deliberate tampering (where it may or may not be considered "acceptable", but if someone with the resources wanted to bad enough, they could subvert), is what ebuild and sources digests are all about. The idea is that the gentoo package maintainer creates hash digests of multiple types for both the ebuild and the sources, such that should the copy that a gentoo user gets not match the copy that a gentoo maintainer created, the package manager (PM, normally portage), if configured to do so (mainly FEATURES=strict, also see stricter and assume-digests, plus the webrsync- gpg feature mentioned below) will error out and refuse to emerge that package. But there are serious limits to that protection. 
Here's a few points to consider: 1) While the ebuilds and sources are digested, those digests do *NOT* extend to the rest of the tree, the various files in the profile directory, the various eclasses, etc. So in theory at least, someone could mess with say the package.mask file in profiles, or one of the eclasses, and could potentially get away with it. But see point #3 as there's a (partial) workaround for the paranoid. 2) Meanwhile, since hashing (unlike gpg signing) isn't designed to be secure, primarily protecting against accidental damage not so much deliberate compromise, with digest verification verifying that nothing changed in transit but not who did the digest in the first place, there's some risk that one or more gentoo rsync mirrors could be compromised or be run by a bad actor in the first place. Should that occur, the bad actor could attempt to replace BOTH the digested ebuild and/or sources AND the digest files, updating the latter to reflect his compromised version instead of the version originally digested by the gentoo maintainer. Similarly, someone such as the NSA could at least in theory do the same thing in transit, targeting a specific user's downloads while leaving everyone else's downloads from the same mirror alone, so only the target got the compromised version. While there's a reasonable chance someone would catch a bad mirror, if a single downloader is specifically targeted, unless they're specifically validating against other mirrors as well and/or comparing digests (over a secure channel) against those someone else downloaded, there's little chance they'd detect the problem. So even digest-protected files aren't immune to compromise. But as I said above, there's a (partial) workaround. See point #3. 3) While #1 applies to the tree in general when it is rsynced, gentoo does have a somewhat higher security sync method for the paranoid and to support users behind firewalls which don't pass rsync. Instead of running emerge sync, this method uses the emerge-webrsync tool, which downloads the entire main gentoo tree as a gpg-signed tarball. If you have FEATURES=webrsync-gpg set (see the make.conf manpage, FEATURES, webrsync-gpg), portage will verify the gpg signature on this tarball. The two caveats here are (1) that the webrsync tarball is generated only once per day, while the main tree is synced every few minutes, so the rsynced tree is going to be more current, and (2) that each snapshot is the entire tree, not just the changes, so for those updating daily or close to it, fetching the full tarball every day instead of just the changes will be more network traffic. Tho I think the tarball is compressed (I've never tried this method personally so can't say for sure) while the rsync tree isn't, so if you're updating monthly, I'd guess it's less traffic to get the tarball. The tarball is gpg-signed which is more secure than simple hash digests, but the signature covers the entire thing, not individual files, so the granularity of the digests is better. Additionally, the tarball signing is automated, so while a signature validation pretty well ensures that the tarball did indeed come from gentoo, should someone compromise gentoo infrastructure security and somehow get a bad file in place, the daily snapshot tarball would blindly sign and package up the bad file along with all the rest. 
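To give an idea of what the webrsync-gpg setup looks like in practice (a sketch only -- I sync with plain rsync myself, so check the handbook's webrsync-gpg section for the authoritative steps, including importing gentoo's snapshot signing key into the keyring directory):

# in /etc/portage/make.conf
FEATURES="webrsync-gpg"
PORTAGE_GPG_DIR="/etc/portage/gpg"

...and then sync with emerge-webrsync instead of a normal emerge sync. Without the key imported into that PORTAGE_GPG_DIR, verification will simply fail, which is at least the safe direction to fail in.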
So sync-method bottom line, if you're paranoid or simply want additional gpg-signed security, use emerge-webrsync along with FEATURES=webrsync-gpg, instead of normal rsync-based emerge sync. That pretty well ensures that you're getting exactly the gentoo tree tarball gentoo built and signed, which is certainly far more secure than normal rsync syncing, but because the tarballing and signing is automated and covers the entire tree, there's still the possibility that one or more files in that tarball are compromised and that it hasn't been detected yet. Meanwhile, I mentioned above that gentoo isn't as secure in this regard as a number of other Linux distros. This is DEFINITELY the case for normal rsync syncers, but even for webrsync-gpg syncers it remains the case to some extent. Unfortunately, in practice it seems that isn't likely to change in the near-term, and possibly not in the medium or longer term either, unless some big gentoo compromise is detected and makes the news. THEN we're likely to see changes. Alternatively, when that big pie-in-the-sky main gentoo tree switch from cvs (yes, still) to git eventually happens, the switch to full-signing will be quite a bit easier, tho there will still be policies to enforce, etc. But they've been talking about the switch to git for years, as well, and... incrementally... drawing closer, including the fact that major portions of gentoo are actually developed in git-based overlays these days. But will the main tree ever actually switch to git? Who knows? As of now it's still pie-in-the-sky, with no nailed down plans. Perhaps at some point somebody and some gentoo council together will decide it's time and move whatever mountains or molehills remain to get it done, and at this point I think that's mostly what it'll take, perhaps not, but unless that somebody steps up and makes that push come hell or high water, assuming gentoo's still around by then, come 2025 we could still be talking about doing it... someday... Back to secure-by-policy gpg-signing... The problem is that while we've known what must be done, and what other distros have already done, for years, and while gentoo has made some progress down the security road, in the absence of that ACTIVE KNOWN COMPROMISE RIGHT NOW immediate threat, other things simply continue to be higher priority, while REAL gentoo security continues to be back-burnered. Basically, what must be done, thru all the way to policy enforcement and refusing gentoo developer commits if they don't match policy, is enforce a policy that every gentoo dev has a registered gpg key (AFAIK that much is already the case), and that every commit they make is SIGNED by that personal developer key, with gentoo-infra verification of those signatures, rejecting any commit that doesn't verify. FWIW, there's GLEPs detailing most of this. They've just never been fully implemented, tho incrementally, bits and pieces have been, over time. As I said, other distros have done this, generally when they HAD to, when they had that compromise hitting the news. Tho I think a few distros have implemented such a signed-no-exceptions policy when some OTHER distro got hit. 
Gentoo hasn't had that happen yet, and while the infrastructure is generally there to sign at least individual package commits, and some devs actually do so (you can see the signed digests for some packages, for instance), that hasn't been enforced tree-wide, and in fact, there's a few relatively minor but still important policy questions to resolve first, before such enforcement is actually activated. Here's one such signing-policy question to consider. Currently, package maintainer devs make changes to their ebuilds, and later, after a period of testing, arch-devs keyword a particular ebuild stable for their arch. Occasionally arch-devs may add a bit of conditional code that applies to their arch only, as well. Now consider this. Suppose a compromised package is detected after the package has been keyworded stable. The last several signed commits to that package were keywording only, while the commit introducing the compromise was sometime earlier. Question: Are those arch-devs that signed their keywording-only commits responsible too, because they signed off on the package, meaning they now have to inspect every package they keyword, checking for compromises that might not be entirely obvious to them, or are they only responsible for the keywording changes they actually committed, and aren't obligated to actually inspect the rest of the ebuild they're now signing? OK, so we say that they're only responsible for the keywording. Simple enough. But what about this? Suppose they add an arch-conditional that combined with earlier code in the package results in a compromise. But the conditional code they added looks straightforward enough on its own, and really does solve a problem on that arch, and without that code, the original code looks innocently functional as well. But together, anyone installing that package on that arch is now open to the world. Both devs signed, the code of both devs is legit and looks innocent enough on its own, but taken together, they result in a bad situation. Now it's not so clear that an arch-dev shouldn't have to inspect and sign for the results of the package after his commit, is it? Yet enforcing that as policy will seriously slow-down arch stable keywording, and some archs can't keep up as it is, so such a policy will be an effective death sentence for them as a gentoo-stable supported arch. Certainly there are answers to that sort of question, and various distros have faced and come up with their own policy answers, often because in the face of a REAL DISTRO COMPROMISE making the news, they've had no other choice. To some extent, gentoo is lucky in that it hasn't been faced with making those hard choices yet. But the fact is, all gentoo users remain less safe than we could be, because those hard choices haven't been made and enforced... because we've not been forced to do so. Meanwhile, even were we to have done so, there's still the possibility that upstream development might be compromised. Every year or two, some upstream project or another makes news due to some compromise or another. Sometimes vulnerable versions have been distributed for awhile, and various distros have picked them up. 
In an upstream-compromise situation like that, there's little a distro can do, with the exception of going slow enough that their packages are all effectively outdated, which also happens to be a relatively effective counter to this sort of issue since if a several-years-old version changes it'll be detected right away, and (one hopes) most compromises to a project server will be detected within months at the longest, so anything a year or more old should be relatively safe from this sort of issue, simply by virtue of its age. Obviously the people and enterprise distros willing to run years-outdated code do have that advantage, and that's a risk that people wishing to run reasonably current code simply have to take as a result of that choice, regardless of the distro they choose to get that current code from. But even if you choose to run an old distro, so you aren't likely to be hit by current upstream compromises, and one that has and enforces a full signing policy so every commit can be accounted for, and even if none of those developers at either the distro or upstream level deliberately breaks the trust and goes bad, there's still the issue below... > 2) That the compilers and interpreters don't do anything except build > the code? There's a paper, very famous in security circles, that effectively proves that unless you can absolutely trust every single layer in the build chain, including the hardware layer (which means its sources) and the compiler and tools used to build your operational tools, and the compiler and tools used to build them, and... all the way back... you simply cannot absolutely trust the results, period. I never kept the link, but it seems the title actually stuck in memory well enough for me to google it: "Reflections on Trusting Trust" =:^) Here's the google link: https://www.google.com/search?q=%22reflections+on+trusting+trust%22 That means that in order to absolutely prove the gcc (for example) on our own systems, even if we can read and understand every line of gcc source, we must absolutely prove the tools on the original installation media and in the stage tarballs that we used to build our system. Which means we must not only have the code to them and trust the builders, but we must have the code and trust the builders of the tools they used, and the builders and tools of those tools, and... Meanwhile, the same rule effectively applies to the hardware as well. And while Richard Stallman may run a computer that is totally open source hardware and firmware (down to the BIOS or equivalent), for which he has all the schematics, etc, most of us run at least some semi-proprietary hardware of /some/ sort. Which means even if we /could/ fully understand the sources ourselves, without them and without that full understanding, at that level, we simply have to trust... someone... basically, the people who design and manufacture that hardware. Thus, in practice, (nearly) everyone ends up drawing the line /somewhere/. The Stallmans of the world draw it pretty strictly, refusing to run anything which at minimum has replaceable firmware which doesn't itself have sources available. (As Stallman defines it, if the firmware is effectively burned in such that the manufacturer themselves can't update it, then that's good enough for the line he draws. 
Tho that leads to absurdities such as an OpenMOKO phone that at extra expense has the firmware burned onto a separate chip such that it can't be replaced by anyone, in order to be able to use hardware that would otherwise be running firmware that the supplier refuses to open-source -- because the extra expense to do it that way means the manufacturer can't replace the firmware either, so it's on the OK side of Stallman's line.) Meanwhile, I personally draw the line at what runs at the OS level on my computer. That means I won't run proprietary graphics drivers or flash, but I will and do load source-less firmware onto the Radeon-based graphics hardware I do run, in order to use the freedomware kernel drivers for the same hardware on which I refuse to run the proprietary fglrx drivers. Other people are fine running flash and/or proprietary graphics drivers, but won't run a mostly-proprietary full OS such as MS Windows or Apple OSX. Still others prefer to run open source where it fits their needs, but won't go out of their way to do so if proprietary works better for them, and still others simply don't care either way, running whatever works best regardless of the freedom or lack thereof of its sources. Anyway, when it comes to hardware and compiler, in practice the best you can do is run a FLOSS compiler such as gcc, while trusting the tools you used to build the first ancestor, basically, the gcc and tools in the stage tarballs, as well as whatever you booted (probably either a gentoo-installer or another distro) in order to chroot into that unpacked stage and build from there. Beyond that, well... good luck, but you're still going to end up drawing the line /somewhere/. > There's certainly lots of other issues about security, like protecting > passwords, protecting physical access to the network and machines, root > kits and the like, etc., but assuming none of that is in question (I > don't have any reason to think the NSA has been in my home!) ;-) I'm > looking for info on how the code is protected from the time it's signed > off until it's built and running here. > > If someone knows of a good web site to read on this subject let me know. > I've gone through my Linux life more or less like most everyone went > through life 20 years ago, but paranoia strikes deep. Indeed. Hope the above was helpful. I think it's a pretty accurate picture from at least my own perspective, as someone who cares enough about it to at least spend a not insignificant amount of time keeping up on the current situation in this area, both for linux in general, and for gentoo in particular. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-05 5:52 ` [gentoo-amd64] " Duncan @ 2014-08-05 18:50 ` Mark Knecht 2014-08-06 21:33 ` Mark Knecht 1 sibling, 0 replies; 17+ messages in thread From: Mark Knecht @ 2014-08-05 18:50 UTC (permalink / raw To: Gentoo AMD64 [-- Attachment #1: Type: text/plain, Size: 1587 bytes --] On Mon, Aug 4, 2014 at 10:52 PM, Duncan <1i5t5.duncan@cox.net> wrote: > > Mark Knecht posted on Mon, 04 Aug 2014 15:04:12 -0700 as excerpted: > > > As the line in that favorite song goes "Paranoia strikes deep"... > > FWIW, I __LOVE__ the idea that my favorite old song has ended up being a contraction everyone uses... > while my lists sig is the proprietary-master quote from Richard > Stallman below, since the (anti-)patriot bill was passed in the reaction > to 9-11, my private email sig is a famous quote from Benjamin Franklin: > > "They that can give up essential liberty to obtain a little > temporary safety, deserve neither liberty nor safety." > > So "I'm with ya..." Good to know. (Not that I didn't already!) <SNIP> > These are good questions to ask, and to have some idea of the answers to, > as well. > > Big picture, at some level, you pretty much have to accept that you > /don't/ know. OK. <SNIP> > I never kept the link, but it seems the title actually stuck in memory > well enough for me to google it: "Reflections on Trusting Trust" > =:^) Here's the google link: > > https://www.google.com/search?q=%22reflections+on+trusting+trust%22 > This is a great paper and the Moral section is dead on right. The line: "No amount of source-level verification or scrutiny will protect you from using untrusted code." is spot on and just about impossible for folks like me. (And I'm _way_ beyond the average computer use, as is anyone reading this list.) I'll respond/etc. to other parts of your post later but want to give a quick thanks right now. Cheers, Mark [-- Attachment #2: Type: text/html, Size: 2199 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-05 5:52 ` [gentoo-amd64] " Duncan 2014-08-05 18:50 ` Mark Knecht @ 2014-08-06 21:33 ` Mark Knecht 2014-08-07 0:58 ` Duncan 1 sibling, 1 reply; 17+ messages in thread From: Mark Knecht @ 2014-08-06 21:33 UTC (permalink / raw To: Gentoo AMD64 Hi Duncan On Mon, Aug 4, 2014 at 10:52 PM, Duncan <1i5t5.duncan@cox.net> wrote: <SNIP> > > 3) While #1 applies to the tree in general when it is rsynced, gentoo > does have a somewhat higher security sync method for the paranoid and to > support users behind firewalls which don't pass rsync. Instead of > running emerge sync, this method uses the emerge-webrsync tool, which > downloads the entire main gentoo tree as a gpg-signed tarball. If you > have FEATURES=webrsync-gpg set (see the make.conf manpage, FEATURES, > webrsync-gpg), portage will verify the gpg signature on this tarball. > I'm finally able to investigate this today. I'm not finding very detailed instructions anywhere , more like notes people would use if they've done this before and understand all the issues. Being that it's my first excursion down this road I have much to learn. OK, I've modified make.conf as such: FEATURES="buildpkg strict webrsync-gpg" PORTAGE_GPG_DIR="/etc/portage/gpg" and created /etc/portage/gpg: c2RAID6 portage # ls -al total 72 drwxr-xr-x 13 root root 4096 Aug 6 14:25 . drwxr-xr-x 87 root root 4096 Aug 6 09:10 .. drwxr-xr-x 2 root root 4096 Apr 27 10:26 bin -rw-r--r-- 1 root root 22 Jan 1 2014 categories drwxr-xr-x 2 root root 4096 Jul 6 09:42 env drwx------ 2 root root 4096 Aug 6 14:03 gpg -rw-r--r-- 1 root root 1573 Aug 6 14:03 make.conf lrwxrwxrwx 1 root root 63 Mar 5 2013 make.profile -> ../../usr/portage/profiles/default/linux/amd64/13.0/desktop/kde [the rest deleted...] eix-sync seems to be working but it may (or may not) be caught in some loop where it just keeps looking for older data. I let it go until it got back into July and then did a Ctrl-C: c2RAID6 portage # eix-sync -wa * Running emerge-webrsync Fetching most recent snapshot ... Trying to retrieve 20140805 snapshot from http://gentoo.osuosl.org ... Fetching file portage-20140805.tar.xz.md5sum ... Fetching file portage-20140805.tar.xz.gpgsig ... Fetching file portage-20140805.tar.xz ... Checking digest ... Checking signature ... gpg: Signature made Tue Aug 5 17:55:23 2014 PDT using RSA key ID C9189250 gpg: Can't check signature: No public key Fetching file portage-20140805.tar.bz2.md5sum ... Fetching file portage-20140805.tar.bz2.gpgsig ... Fetching file portage-20140805.tar.bz2 ... Checking digest ... Checking signature ... gpg: Signature made Tue Aug 5 17:55:22 2014 PDT using RSA key ID C9189250 gpg: Can't check signature: No public key Fetching file portage-20140805.tar.gz.md5sum ... 20140805 snapshot was not found Trying to retrieve 20140804 snapshot from http://gentoo.osuosl.org ... Fetching file portage-20140804.tar.xz.md5sum ... Fetching file portage-20140804.tar.xz.gpgsig ... Fetching file portage-20140804.tar.xz ... Checking digest ... Checking signature ... gpg: Signature made Mon Aug 4 17:55:27 2014 PDT using RSA key ID C9189250 gpg: Can't check signature: No public key QUESTIONS: 1) Is the 'No public key' message talking about me, or something at the source? I haven't got any keys so maybe i need to generate one? 
2) Once I do get this working correctly it would make sense to me that I need to delete all existing distfiles to ensure that anything on my system actually came from this tarball. Is that correct? <SNIP> > So sync-method bottom line, if you're paranoid or simply want additional > gpg-signed security, use emerge-webrsync along with FEATURES=webrsync-gpg, > instead of normal rsync-based emerge sync. That pretty well ensures that > you're getting exactly the gentoo tree tarball gentoo built and signed, > which is certainly far more secure than normal rsync syncing, but because > the tarballing and signing is automated and covers the entire tree, > there's still the possibility that one or more files in that tarball are > compromised and that it hasn't been detected yet. Or, as we both have alluded to, the bad guy is intercepting the transmission and giving me a different tarball... For now, it's more than enough to take a baby first step. Thanks for all your sharing of info! Cheers, Mark ^ permalink raw reply [flat|nested] 17+ messages in thread
* [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-06 21:33 ` Mark Knecht @ 2014-08-07 0:58 ` Duncan 2014-08-07 18:16 ` Mark Knecht 0 siblings, 1 reply; 17+ messages in thread From: Duncan @ 2014-08-07 0:58 UTC (permalink / raw To: gentoo-amd64 Mark Knecht posted on Wed, 06 Aug 2014 14:33:28 -0700 as excerpted: > OK, I've modified make.conf as such: > > FEATURES="buildpkg strict webrsync-gpg" > PORTAGE_GPG_DIR="/etc/portage/gpg" > > and created /etc/portage/gpg: > drwxr-xr-x 2 root root 4096 Jul 6 09:42 > eix-sync seems to be working but it may (or may not) be caught in some > loop where it just keeps looking for older data. I let it go until it > got back into July and then did a Ctrl-C: > > c2RAID6 portage # eix-sync -wa > * Running emerge-webrsync > Fetching most recent snapshot ... > Trying to retrieve 20140805 snapshot from http://gentoo.osuosl.org ... > Fetching file portage-20140805.tar.xz.md5sum ... > Fetching file portage-20140805.tar.xz.gpgsig ... > Fetching file portage-20140805.tar.xz ... > Checking digest ... > Checking signature ... > gpg: Signature made Tue Aug 5 17:55:23 2014 PDT using RSA key ID > C9189250 > gpg: Can't check signature: No public key > Fetching file [repeat in a loop with older dates] > QUESTIONS: > > 1) Is the 'No public key' message talking about me, or something at the > source? I haven't got any keys so maybe i need to generate one? It's saying you need to (separately) download the *GENTOO* key using some other method, and put it in the appropriate place so it can verify the signatures its finding. Note that while I've not used webrsync, for some years (until I switched from signed kernel tarball to git-cloned kernel) I ran a script that I wrote up myself, that downloaded the kernel tarball as well as its gpg- signatures, and gpg-verified the signature on the tarball before unpacking it and going ahead with reconfiguration and build. So I have a reasonable grasp of the general concepts -- good enough I could script it -- but I don't know the webrsync specifics. But that's definitely a missing separately downloaded public key, so it can't verify the signatures on the tarballs it's downloading, and is thus rejecting them. Of course in this case such a rejection is a good thing, since if it was acting as if nothing was wrong and simply trusting the tarball even when it couldn't verify the signature, it would be broken in security terms anyway! =:^) So that's what you'll need to do, presumably based on instructions you find for getting that key and putting it in the right spot so webrsync can access it. But unfortunately since I've not used it myself, I can't supply you those instructions. Or wait! Actually I can, as google says that's actually part of the gentoo handbook! =:^) (Watch the link-wrap and reassemble as necessary, I'm lazy today. The arch doesn't matter for this bit so x86/amd64, it's all the same.) https://www.gentoo.org/doc/en/handbook/handbook-x86.xml? part=2&chap=3#webrsync-gpg Based on the above, it seems you've created the gpg directory and set appropriate permissions, but either you haven't downloaded the keys as described in the link above, or perhaps you're missing the PORTAGE_GPG_DIR setting. > 2) Once I do get this working correctly it would make sense to me that I > need to delete all existing distfiles to ensure that anything on my > system actually came from this tarball. Is that correct? 
Not unless you're extremely paranoid, tho it wouldn't hurt anything, just mean you blew away your cache and have more downloading to do until you have it again. Once you're verifying the tarball, part of the tarball's signed and verified contents is going to be the distfile digests. Once they're coming from the tarball, you have a reasonable confidence that they haven't been tampered with, and given the multi-algorithm hashing, even if one algorithm was hacked and the file could be faked based on only it, the other hashes should catch the problem. Of course, once you're doing this, it's even MORE important not to simply redigest the occasional source or ebuild that comes up with an error due to a bad digest. For people not verifying things to do that is one thing, but once you're verifying, simply doing a redigest yourself on anything that comes up bad directly bypasses all that extra safety you're trying to give yourself. So if a distfile tarball comes up bad, go ahead and delete it and try again, but if it comes up bad repeatedly, don't just redigest it, either check for a bad-digest bug on that package and file one if necessary, or simply wait a day or two and try again, as the problem is generally caught and fixed by then. (I've seen rumors of people on the forums, etc, suggesting a redigest at any failure, and it always scares me. Those digests are there for a reason, and if they're failing, you better make **** sure you know it's a legitimate bug (say kde making a last minute change to the tarballs after release to the distros for testing, but before public release, as they do occasionally) before redigesting. And even then, if you can just wait a couple days it should normally work itself out, without you having to worry about it or take that additional risk.) -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-07 0:58 ` Duncan @ 2014-08-07 18:16 ` Mark Knecht 2014-08-07 19:53 ` Duncan 2014-08-07 21:18 ` Duncan 0 siblings, 2 replies; 17+ messages in thread From: Mark Knecht @ 2014-08-07 18:16 UTC (permalink / raw To: Gentoo AMD64 This is a bit long but it's mostly just stuff copied from my terminal for completeness. -MWK On Wed, Aug 6, 2014 at 5:58 PM, Duncan <1i5t5.duncan@cox.net> wrote: > Mark Knecht posted on Wed, 06 Aug 2014 14:33:28 -0700 as excerpted: > >> OK, I've modified make.conf as such: >> >> FEATURES="buildpkg strict webrsync-gpg" >> PORTAGE_GPG_DIR="/etc/portage/gpg" >> >> and created /etc/portage/gpg: > >> drwxr-xr-x 2 root root 4096 Jul 6 09:42 > <SNIP> > > Or wait! Actually I can, as google says that's actually part of the > gentoo handbook! =:^) (Watch the link-wrap and reassemble as necessary, > I'm lazy today. The arch doesn't matter for this bit so x86/amd64, it's > all the same.) > > https://www.gentoo.org/doc/en/handbook/handbook-x86.xml? > part=2&chap=3#webrsync-gpg > Great link! Thanks. So I think the important stuff is here, the first 2 lines I managed on my own, but the gpg part is what's new to me: [QUOTE] # mkdir -p /etc/portage/gpg # chmod 0700 /etc/portage/gpg (... Substitute the keys with those mentioned on the release engineering site ...) # gpg --homedir /etc/portage/gpg --keyserver subkeys.pgp.net --recv-keys 0xDB6B8C1F96D8BF6D # gpg --homedir /etc/portage/gpg --edit-key 0xDB6B8C1F96D8BF6D trust [/QOUTE] From the comment about the Release Engineering site, I think that's here: https://www.gentoo.org/proj/en/releng/ And the keys match with is good. Anyway, running the first command is fine. The second command wants me to make a choice. For now I chose to 'ultimately trust'. (Aren't I gullible!?!) [COPY] c2RAID6 ~ # gpg --homedir /etc/portage/gpg --edit-key 0xDB6B8C1F96D8BF6D trust gpg (GnuPG) 2.0.25; Copyright (C) 2013 Free Software Foundation, Inc. This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. pub 4096R/96D8BF6D created: 2011-11-25 expires: 2015-11-24 usage: C trust: unknown validity: unknown sub 4096R/C9189250 created: 2011-11-25 expires: 2015-11-24 usage: S [ unknown] (1). Gentoo Portage Snapshot Signing Key (Automated Signing Key) pub 4096R/96D8BF6D created: 2011-11-25 expires: 2015-11-24 usage: C trust: unknown validity: unknown sub 4096R/C9189250 created: 2011-11-25 expires: 2015-11-24 usage: S [ unknown] (1). Gentoo Portage Snapshot Signing Key (Automated Signing Key) Please decide how far you trust this user to correctly verify other users' keys (by looking at passports, checking fingerprints from different sources, etc.) 1 = I don't know or won't say 2 = I do NOT trust 3 = I trust marginally 4 = I trust fully 5 = I trust ultimately m = back to the main menu Your decision? 5 Do you really want to set this key to ultimate trust? (y/N) y pub 4096R/96D8BF6D created: 2011-11-25 expires: 2015-11-24 usage: C trust: ultimate validity: unknown sub 4096R/C9189250 created: 2011-11-25 expires: 2015-11-24 usage: S [ unknown] (1). Gentoo Portage Snapshot Signing Key (Automated Signing Key) Please note that the shown key validity is not necessarily correct unless you restart the program. 
gpg> list pub 4096R/96D8BF6D created: 2011-11-25 expires: 2015-11-24 usage: C trust: ultimate validity: unknown sub 4096R/C9189250 created: 2011-11-25 expires: 2015-11-24 usage: S [ unknown] (1)* Gentoo Portage Snapshot Signing Key (Automated Signing Key) gpg> check uid Gentoo Portage Snapshot Signing Key (Automated Signing Key) sig!3 96D8BF6D 2011-11-25 [self-signature] 6 signatures not checked due to missing keys gpg> quit c2RAID6 ~ # [/COPY] I'm not sure how to short of a reboot 'restart the program', nor what the line 6 signatures not checked due to missing keys really means. That said it appears to be working better than yesterday: c2RAID6 ~ # eix-sync -w * Running emerge-webrsync Fetching most recent snapshot ... Trying to retrieve 20140806 snapshot from http://gentoo.osuosl.org ... Fetching file portage-20140806.tar.xz.md5sum ... Fetching file portage-20140806.tar.xz.gpgsig ... Fetching file portage-20140806.tar.xz ... Checking digest ... Checking signature ... gpg: Signature made Wed Aug 6 17:55:26 2014 PDT using RSA key ID C9189250 gpg: checking the trustdb gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model gpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u gpg: next trustdb check due at 2015-11-24 gpg: Good signature from "Gentoo Portage Snapshot Signing Key (Automated Signing Key)" [ultimate] Getting snapshot timestamp ... Syncing local tree ... Number of files: 178933 Number of files transferred: 6846 Total file size: 327.27M bytes Total transferred file size: 19.96M bytes Literal data: 19.96M bytes Matched data: 0 bytes File list size: 4.32M File list generation time: 0.001 seconds File list transfer time: 0.000 seconds Total bytes sent: 12.38M Total bytes received: 156.23K sent 12.38M bytes received 156.23K bytes 166.03K bytes/sec total size is 327.27M speedup is 26.11 Cleaning up ... * Copying old database to /var/cache/eix/previous.eix * Running eix-update Reading Portage settings .. <SNIP> [474] "zx2c4" layman/zx2c4 (cache: eix* /tmp/eix-remote.MbcFER9d/zx2c4.eix [*/zx2c4]) Reading Packages .. Finished Applying masks .. Calculating hash tables .. Writing database file /var/cache/eix/remote.eix .. Database contains 31587 packages in 234 categories. 
* Calling eix-diff Diffing databases (17596 -> 17598 packages) [>] == games-util/umodpack (0.5_beta16-r1 -> 0.5_beta16-r2): portable and useful [un]packer for Unreal Tournament's Umod files [U] == media-libs/libbluray (0.5.0-r1{tbz2}@06/19/14; (~)0.5.0-r1{tbz2} -> (~)0.6.1): Blu-ray playback libraries [>] == net-misc/chrony (1.30^t -> 1.30-r1^t): NTP client and server programs [U] == sys-devel/gnuconfig (20131128{tbz2}@02/18/14; 20131128{tbz2} -> 20140212): Updated config.sub and config.guess file from GNU [U] == virtual/libgudev (215(0/0){tbz2}@08/05/14; 215(0/0){tbz2} -> 215-r1(0/0)): Virtual for libgudev providers [U] == virtual/libudev (215(0/1){tbz2}@08/05/14; 215(0/1){tbz2} -> 215-r1(0/1)): Virtual for libudev providers [D] == www-client/google-chrome-beta (37.0.2062.58_p1{tbz2}@08/05/14; (~)37.0.2062.58_p1^msd{tbz2} -> ~37.0.2062.68_p1^msd): The web browser from Google [U] == www-client/google-chrome-unstable (38.0.2107.3_p1{tbz2}@08/06/14; (~)38.0.2107.3_p1^msd{tbz2} -> (~)38.0.2114.2_p1^msd): The web browser from Google [N] >> dev-ruby/prawn-table (~0.1.0): Provides support for tables in Prawn [N] >> sys-apps/cv (~0.4.1): Coreutils Viewer: show progress for cp, rm, dd, and so forth * Time statistics: 136 seconds for syncing 43 seconds for eix-update 2 seconds for eix-diff 197 seconds total c2RAID6 ~ # So that's all looking pretty good, as a first step. If it's a matter of 3 1/2 minutes instead of 1-2 minutes then I can live with that part. However that's just (I think) the portage tree and not signed source code, correct? Now, is the idea that I have a validated portage snapshot at this point and stiff have to actually get the code using the regular emerge which will do the checking because I have: FEATURES="buildpkg strict webrsync-gpg" I don't see any evidence that emerge checked what it downloaded, but maybe those checks are only done when I really build the code? c2RAID6 ~ # emerge -fDuN @world Calculating dependencies... done! >>> Fetching (1 of 5) sys-devel/gnuconfig-20140212 >>> Downloading 'http://gentoo.osuosl.org/distfiles/gnuconfig-20140212.tar.bz2' --2014-08-07 11:12:11-- http://gentoo.osuosl.org/distfiles/gnuconfig-20140212.tar.bz2 Resolving gentoo.osuosl.org... 140.211.166.134 Connecting to gentoo.osuosl.org|140.211.166.134|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 44808 (44K) [application/x-bzip2] Saving to: '/usr/portage/distfiles/gnuconfig-20140212.tar.bz2' 100%[================================================================>] 44,808 113KB/s in 0.4s 2014-08-07 11:12:13 (113 KB/s) - '/usr/portage/distfiles/gnuconfig-20140212.tar.bz2' saved [44808/44808] * gnuconfig-20140212.tar.bz2 SHA256 SHA512 WHIRLPOOL size ;-) ... [ ok ] >>> Fetching (2 of 5) media-libs/libbluray-0.6.1 >>> Downloading 'http://gentoo.osuosl.org/distfiles/libbluray-0.6.1.tar.bz2' --2014-08-07 11:12:13-- http://gentoo.osuosl.org/distfiles/libbluray-0.6.1.tar.bz2 Resolving gentoo.osuosl.org... 140.211.166.134 Connecting to gentoo.osuosl.org|140.211.166.134|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 586646 (573K) [application/x-bzip2] Saving to: '/usr/portage/distfiles/libbluray-0.6.1.tar.bz2' 100%[================================================================>] 586,646 716KB/s in 0.8s 2014-08-07 11:12:15 (716 KB/s) - '/usr/portage/distfiles/libbluray-0.6.1.tar.bz2' saved [586646/586646] * libbluray-0.6.1.tar.bz2 SHA256 SHA512 WHIRLPOOL size ;-) ... 
[ ok ] >>> Fetching (3 of 5) virtual/libudev-215-r1 >>> Fetching (4 of 5) virtual/libgudev-215-r1 >>> Fetching (5 of 5) www-client/google-chrome-unstable-38.0.2114.2_p1 >>> Downloading 'http://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-unstable/google-chrome-unstable_38.0.2114.2-1_amd64.deb' --2014-08-07 11:12:16-- http://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-unstable/google-chrome-unstable_38.0.2114.2-1_amd64.deb Resolving dl.google.com... 74.125.239.2, 74.125.239.6, 74.125.239.4, ... Connecting to dl.google.com|74.125.239.2|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 47472462 (45M) [application/x-debian-package] Saving to: '/usr/portage/distfiles/google-chrome-unstable_38.0.2114.2-1_amd64.deb' 100%[================================================================>] 47,472,462 6.81MB/s in 7.1s 2014-08-07 11:12:23 (6.37 MB/s) - '/usr/portage/distfiles/google-chrome-unstable_38.0.2114.2-1_amd64.deb' saved [47472462/47472462] * google-chrome-unstable_38.0.2114.2-1_amd64.deb SHA256 SHA512 WHIRLPOOL size ;-) ... [ ok ] c2RAID6 ~ # Cheers, Mark ^ permalink raw reply [flat|nested] 17+ messages in thread
* [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-07 18:16 ` Mark Knecht @ 2014-08-07 19:53 ` Duncan 2014-08-07 21:18 ` Duncan 1 sibling, 0 replies; 17+ messages in thread From: Duncan @ 2014-08-07 19:53 UTC (permalink / raw To: gentoo-amd64 Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted: > From the comment about the Release Engineering site, I think that's > here: > > https://www.gentoo.org/proj/en/releng/ > > And the keys match with is good. > > Anyway, running the first command is fine. The second command wants me > to make a choice. For now I chose to 'ultimately trust'. (Aren't I > gullible!?!) [...] > Please decide how far you trust this user to correctly verify other > users' keys (by looking at passports, checking fingerprints from > different sources, etc.) > > 1 = I don't know or won't say > 2 = I do NOT trust > 3 = I trust marginally > 4 = I trust fully > 5 = I trust ultimately > m = back to the main menu > > Your decision? 5 > Do you really want to set this key to ultimate trust? (y/N) y GPG is built on a "web of trust" idea. Basically, the idea is that if you know and trust someone (well, in this case their key), then they can vouch for someone else that you don't know. At various community conferences and the like, there's often a "key signing party". Attending devs and others active in the community check passports and etc, in theory validating the identity of the other person, then sign their key, saying they've checked and the person using this key is really who they say they are. What you're doing here is giving gpg some idea of how much trust you want to put in the key, not just to verify that whatever that person sends you did indeed get signed with their key, but more importantly, to what extent you trust that key to vouch for OTHER keys it has signed that you don't know about yet. If an otherwise unknown-trust key is signed by an ultimate-trust key, it'll automatically be considered valid, tho it won't itself be trusted to sign /other/ keys until you specifically assign a level of trust to it, too. OTOH, it'd probably take a fully-trusted key plus perhaps a marginally trusted key, to validate an otherwise unknown key signed by both but not signed by an ultimately-trusted key. And it'd take more (at least three, maybe five or something, I get the idea but have forgotten the specifics) marginal trust key signatures to verify an otherwise unknown key in the absence of a stronger-trust key signature of it as well. Don't know or won't say I think means it doesn't count either way, and do NOT trust probably counts as a negative vote, thus requiring more votes from at least marginal-trust signatures to validate than it would otherwise. I'm sure the details are in the gpg docs if you're interested in reading up... Meanwhile, the key in question here is the gentoo snapshot-validation key, which should only be used to sign the tree tarballs, not a personal key, and gentoo should use a different key to, for instance, sign personal gentoo dev keys, so you're not likely to see it used to sign other keys and the above web-of-trust stuff doesn't matter so much in this case. OTOH... (more below) > Please note that the shown key validity is not necessarily correct > unless you restart the program. 
> gpg> check uid Gentoo Portage Snapshot Signing Key > (Automated Signing Key) > sig!3 96D8BF6D 2011-11-25 [self-signature] > 6 signatures not checked due to missing keys > I'm not sure how to short of a reboot 'restart the program' That's simply saying that you're in gpg interactive mode, and any edits you make in that gpg session won't necessarily show up or become effective until you quit gpg and start a new session. For example, I believe if you change the level of trust of some key, then in the same gpg interactive session check the validity of another key that the first one signed, the edit to the trust level of the first key won't necessarily be reflected in the validity assigned to the second key signed by the first. If you quit gpg it'll write the change you made, and restarting gpg should then give you an accurate assessment of the second key, reflecting the now saved change you made to the trust level of the first key that signed the second. > nor what the line [means:] > > 6 signatures not checked due to missing keys That simply indicates that the gentoo key is signed by a bunch of (six) others, probably gentoo infra, perhaps the foundation, etc, that if you had a larger web of trust already built, would vouch for the validity of the portage snapshotting key. Since you don't have that web of trust built yet, you gotta do without, but you gotta start somewhere... ... Which is the "more below" I referred to above. The snapshot- validation key shouldn't be used to sign other keys, because that's not its purpose. Restricting a key to a single purpose helps a bit to keep it from leaking, but more importantly, restricts the damage should it indeed leak. If the snapshotting key gets stolen, it means snapshots signed by it can be no longer trusted, but since it's not used to sign other keys, at least the thief can't use the stolen key to vouch for other keys, because the key isn't used for that. At least... he couldn't if you hadn't set the key to ultimate trust, that you indeed trust it to vouch for other keys, alone, without any other vouching for them as well. So I'd definitely recommend editing that key again, and reducing the trust level. I /think/ you can actually set it to do NOT trust for the purpose of signing other keys, since that is indeed the case, without affecting its usage for signing the portage tree snapshots. However, I'm not positive of that. I'd test that to be sure, and simply set it back to "don't want to say" or to "marginally", if that turns out to be required to validate the snapshot with it. (Tho I don't believe it should, because that would break the whole way the web of trust is supposed to work and the concept of using a key for only one thing, not letting you simply accept the signature for content signing, without also requiring you to accept it as trustworthy for signing other keys.) I believe I have a different aspect of your post to reply to as well, but that can be a different reply... -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 17+ messages in thread
* [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-07 18:16 ` Mark Knecht 2014-08-07 19:53 ` Duncan @ 2014-08-07 21:18 ` Duncan 2014-08-08 18:34 ` Mark Knecht 1 sibling, 1 reply; 17+ messages in thread From: Duncan @ 2014-08-07 21:18 UTC (permalink / raw To: gentoo-amd64 Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted: > So that's all looking pretty good, as a first step. If it's a matter of > 3 1/2 minutes instead of 1-2 minutes then I can live with that part. > However that's just (I think) the portage tree and not signed source > code, correct? [I just posted a reply to the gpg specific stuff.] Technically correct, but not really so in implementation. See below... > Now, is the idea that I have a validated portage snapshot at this point > and stiff have to actually get the code using the regular emerge which > will do the checking because I have: > > FEATURES="buildpkg strict webrsync-gpg" No... It doesn't work that way. > I don't see any evidence that emerge checked what it downloaded, but > maybe those checks are only done when I really build the code? Here's what happens. FEATURES=webrsync-gpg simply tells the webrsync stuff to gpg-verify the snapshot-tarball that webrsync downloads. Without that, it'd still download it the same, but it wouldn't verify the signature. This allows people who use the webrsync only because they're behind a firewall that wouldn't allow normal rsync, but who don't care about the gpg signing security stuff, to use the same tool as the people who actually use webrsync for the security aspect, regardless of whether they could use normal rsync or not. So that gets you a signed and verified tree. Correct so far. But as part of that tree, there are digest files for each package that verify the integrity of the ebuild as well as of the sources tarballs (distfiles). Now it's important to grasp the difference between gpg signing and simple hash digests, here. Anybody with the appropriate tools (md5sum, for example, does md5 hashes, but there's sha and other hashes as well, and the portage tree uses several hash algorithms in case one is broken) can take a hash of a file, and provided it's exactly the same bit-for-bit file they should get exactly the same hash. In fact, that's how portage checks the hashes of both the ebuild files and the distfiles it uses, regardless of this webrsync-gpg stuff. The tree ships the hash values that the gentoo package maintainer took of the files in its digest files, and portage takes its own hash of the files and compares it to the hash value stored in the digest files. If they match, portage is happy. If they don't, depending on how strict you have portage set to be (FEATURES=strict), it will either warn about (without strict) or entirely refuse to merge that package (with strict), until either the digest is updated, or a new file matching the old digest is downloaded. So far so good, but while the hashes protect against accidental damage as the file was being downloaded, because anyone can take a hash of the file, without something stronger, if say one of the mirror operators was a bad guy, they could replace the files with hacked files and as long as they replaced the digest files with the new ones they created for the hacked files at the same time, portage wouldn't know. 
So while hashes/digests alone protect quite well from accidental damage, they can't protect, by themselves, from deliberate replacement of those files with malware infested copies. Which is where the gpg signed tree snapshots come in. But before we can understand how they help, we need to understand how gpg signing differs from simple hashes. PGP, gpg, and various other public/private-pair key signing (and encryption) take advantage of a particular mathematical relationship property between the public and private keys. I'm not a cryptographer nor a mathematician, so I'm content to leave it at that rather handwavy assertion and not get into the details, but enough people I trust say the same thing about the details, and enough of our modern Internet banking and the like, depends upon the same idea, that I'm relatively confident in the general principle, at least. It works like this. People keep the private key from the pair private -- if it gets out, they've lost the secret. But people publish the public half of the key. The relationship of the keys is such that people can't figure out the private key from the public key, but if you have the private key, you can sign stuff with it, and people with the public key can verify the signature and thus trust that it really was the person with that key that signed the content. Similarly, people can use the public key to encrypt something, and only the person with the private key will be able to decrypt it -- having the public key doesn't help. Actually, as I understand it signing is simply a combination of hashing and encryption, such that a hash of the content to be signed is taken, and then that hash is encrypted with the private key. Now anyone with the public key can "decrypt" the hash and verify the content with it, thereby verifying that the private key used to sign the content by encrypting the hash was the one used. If some other key had been used, attempting to decrypt the hash with an unmatched public key would simply produce gibberish, and the supposedly "decrypted" hash wouldn't be the hash produced when checking the content, thereby failing to verify that the signed content actually came from the person that it was claimed to have come from. OK, we've now established that hashes simply verify that the content didn't get modified in transit, but they do NOT by themselves verify who SENT that content, so indeed, a man-in-the-middle could have replaced BOTH the content and the hash, and someone relying on just hashes couldn't tell the difference. And we've also established that a signature verifies that the content actually came from the person who had the private key matching the public key used to verify it, by mechanism of encrypting the hash of that content with the private key, so only by "decrypting" it with the matching public key, does the hash of the content match the one taken at the other end and encrypted with the private key. *NOW* we're equipped to see how the portage tree snapshot signing method actually allows us to verify distfiles as well. Because the tree includes digests that we can now verify came from our trusted source, gentoo, NOW those digests can be used to verify the distfiles, because the digests were part of the signed tree and nobody could tamper with that signed tree including those digests without detection. 
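(Similarly, you can redo by hand the verification that emerge-webrsync just did for you, on the snapshot files it downloaded -- a sketch from memory, so double-check the location; I believe they land in your DISTDIR, /usr/portage/distfiles by default:

$ gpg --homedir /etc/portage/gpg --verify portage-20140806.tar.xz.gpgsig portage-20140806.tar.xz
$ md5sum -c portage-20140806.tar.xz.md5sum

The first command checks the detached gpg signature against the tarball, which is the part that actually proves origin; the second just re-checks the plain md5, which only guards against transfer damage.)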
If our nefarious gentoo mirror operator tried to switch out the source tarballs AND the digests, he could do so for normal rsync users, and for webrsync users not doing gpg verification, without detection. But should he try that with someone that's using webrsync-gpg, he has no way to sign the tampered-with tarball with the correct private key since he doesn't have it, and those using webrsync with FEATURES=webrsync-gpg would detect the tampered tarball as portage (via webrsync, via eix in your case) would reject that tarball as unverified.

So the hash-digest method used to protect ordinary rsync users (and webrsync users without webrsync-gpg turned on) from ACCIDENTAL damage, now protects webrsync-gpg users from DELIBERATE man-in-the-middle attacks as well, not because the digests themselves are different, but because we can now trust and verify that they came from a legitimate source.

Tho it should be noted that "legitimate source" is defined as anyone having access to that private signing key. So should someone break in to the snapshotting server and steal that private key doing the signing, they now become a "legitimate source" as far as webrsync-gpg is concerned.

So where does that leave us in practice?

Basically here:

You're now verifying that the snapshot tarballs are coming from a source with the private signing key, and we're assuming that gentoo security hasn't been broken and thus that only gentoo's snapshot signing servers (and their admins, of course) have access to the private signing key, which in turn means we're assuming the machine with that signing key must be gentoo, and thus that the snapshotted tarballs are legit.

But it's actually webrsync in combination with FEATURES=webrsync-gpg that's doing that verification.

Once the verified tarball is actually unpacked on our system, portage operates just as it normally does, simply verifying the usual hash digests against the ebuilds and the distfiles /exactly/ as it normally would.

Repeating in different words to hopefully ensure it's understood:

It's *ONLY* the fact that we have actually gpg-verified that snapshot tarball and thus the digests within it, that gives us any more security than an ordinary rsync user. After that's downloaded, verified and unpacked, portage operates exactly as it normally does.

Meanwhile, part of that normal operation includes FEATURES=strict, if you've set it, which causes portage to refuse to merge the package if those digests don't match. But that part of things is just normal portage operation. Rsync users get it too -- they just don't have the additional assurance that those digest files actually came from gentoo (or at least from someone with gentoo's private signing key), that webrsync with FEATURES=webrsync-gpg provides.

(Meanwhile, one further personal note FWIW. You may think that all these long explanations take quite some time to type up, and you'd be correct. But don't make the mistake of thinking that I don't get a benefit from it myself. My dad was a teacher, and one of the things he used to say that I've found to be truer than true, is that the best way to /learn/ something is to try to teach it to someone. That's exactly what I'm doing, and all the unexpected questions and corner cases that I'd have never thought about on my own, that people bring up and force me to think about in order to answer them, help me improve my own previously more handwavy and fuzzy "general concept" understanding as well. 
I'm much more confident in my own understanding of the general public/private key concepts, how gpg actually uses them and how its web-of-trust works, and more specifically, how portage can use that via webrsync-gpg to actually improve the gentooer's own security, than I ever was before.

And it has been quite some time since I worked with gpg and saw it in interactive mode like that, too, and it turns out that in the intervening years I've actually understood quite a bit more about how it all works than I did back then, thus my ability to dig that all up and present it here. Back a few years ago I was just as clueless about how all that web-of-trust stuff worked, and made exactly the same mistake of "ultimately trusting" the distro's package-signing key, for exactly the same reasons. Turns out I absorbed rather more from all those security and encryption articles I've read over the years than I realized, but it actually took my replies right here in this thread to lay it all out logically, so I too realized how much more I understand what's going on now than I did back then.)

So... Thanks for the thread! =:^)

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

^ permalink raw reply [flat|nested] 17+ messages in thread
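(For anyone wanting to reproduce the setup discussed above, the moving parts are roughly these -- a hedged sketch; the key ID is deliberately left as a placeholder, since the current snapshot-signing key should be taken from gentoo's own documentation rather than from a list post:)

  # /etc/portage/make.conf
  FEATURES="webrsync-gpg strict"
  PORTAGE_GPG_DIR="/etc/portage/gpg"

  # one-time keyring setup: import the snapshot-signing key into that homedir
  # (assumes a reachable keyserver; alternatively fetch the key file and gpg --import it)
  mkdir -p /etc/portage/gpg && chmod 700 /etc/portage/gpg
  gpg --homedir /etc/portage/gpg --recv-keys <snapshot-signing-key-id>

  # then sync via the verified snapshot instead of plain rsync
  emerge-webrsync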
* Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-07 21:18 ` Duncan @ 2014-08-08 18:34 ` Mark Knecht 2014-08-09 1:38 ` Duncan 0 siblings, 1 reply; 17+ messages in thread From: Mark Knecht @ 2014-08-08 18:34 UTC (permalink / raw To: Gentoo AMD64 Hi Duncan, Responding to one thing here, the rest in-line: [QUOTE] (Meanwhile, one further personal note FWIW. You may think that all these long explanations take quite some time to type up, and you'd be correct. But don't make the mistake of thinking that I don't get a benefit from it myself. My dad was a teacher, and one of the things he used to say that I've found to be truer than true, is that the best way to /learn/ something is to try to teach it to someone. [/QUOTE] I couldn't agree more and appreciate your efforts. And even if I might already understand some of what you document I'm sure there are others that come later looking for answers who get lots from these conversations, solve problems and we never hear about it. Anyway, a big thanks. On Thu, Aug 7, 2014 at 2:18 PM, Duncan <1i5t5.duncan@cox.net> wrote: > Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted: > >> So that's all looking pretty good, as a first step. If it's a matter of >> 3 1/2 minutes instead of 1-2 minutes then I can live with that part. >> However that's just (I think) the portage tree and not signed source >> code, correct? > > [I just posted a reply to the gpg specific stuff.] > > Technically correct, but not really so in implementation. See below... > >> Now, is the idea that I have a validated portage snapshot at this point >> and stiff have to actually get the code using the regular emerge which >> will do the checking because I have: >> >> FEATURES="buildpkg strict webrsync-gpg" > > No... It doesn't work that way. > >> I don't see any evidence that emerge checked what it downloaded, but >> maybe those checks are only done when I really build the code? > > Here's what happens. > > FEATURES=webrsync-gpg simply tells the webrsync stuff to gpg-verify the > snapshot-tarball that webrsync downloads. Without that, it'd still > download it the same, but it wouldn't verify the signature. This allows > people who use the webrsync only because they're behind a firewall that > wouldn't allow normal rsync, but who don't care about the gpg signing > security stuff, to use the same tool as the people who actually use > webrsync for the security aspect, regardless of whether they could use > normal rsync or not. > And to clarify, I believe this step is responsible for putting into place on a Gentoo machine much of what's in /usr/portage, most specifically in the app categorization directories. In the old days the Gentoo Install Guide used to have us download the portage snapshots for a location such as http://distfiles.gentoo.org/snapshots/ That's now been replaced by a call to emerge-webrsync so newbies might not have that view. Additionally, even if we're downloading the snapshot tarball it appears, at least on my system, it's deleted after it's expanded/ Or at least it's not showing up in a locate command. > So that gets you a signed and verified tree. Correct so far. > > But as part of that tree, there are digest files for each package that > verify the integrity of the ebuild as well as of the sources tarballs > (distfiles). > Yep. > Now it's important to grasp the difference between gpg signing and simple > hash digests, here. 
> > Anybody with the appropriate tools (md5sum, for example, does md5 hashes, > but there's sha and other hashes as well, and the portage tree uses > several hash algorithms in case one is broken) can take a hash of a file, > and provided it's exactly the same bit-for-bit file they should get > exactly the same hash. > > In fact, that's how portage checks the hashes of both the ebuild files > and the distfiles it uses, regardless of this webrsync-gpg stuff. The > tree ships the hash values that the gentoo package maintainer took of the > files in its digest files, and portage takes its own hash of the files > and compares it to the hash value stored in the digest files. If they > match, portage is happy. If they don't, depending on how strict you have > portage set to be (FEATURES=strict), it will either warn about (without > strict) or entirely refuse to merge that package (with strict), until > either the digest is updated, or a new file matching the old digest is > downloaded. > > So far so good, but while the hashes protect against accidental damage as > the file was being downloaded, because anyone can take a hash of the > file, without something stronger, if say one of the mirror operators was > a bad guy, they could replace the files with hacked files and as long as > they replaced the digest files with the new ones they created for the > hacked files at the same time, portage wouldn't know. > > So while hashes/digests alone protect quite well from accidental damage, > they can't protect, by themselves, from deliberate replacement of those > files with malware infested copies. > > Which is where the gpg signed tree snapshots come in. But before we can > understand how they help, we need to understand how gpg signing differs > from simple hashes. > Some years ago (1997/98) I purchased one of Bruce Schneier's books - looking at Amazon I recollect "Applied Cryptography: Protocols, Algorithms, and Source Code in C" - so I've been through a lot of this in the area of semiconductor design. (5C Encryption model for 'protecting' movie content. What a joke...) > PGP, gpg, and various other public/private-pair key signing (and > encryption) take advantage of a particular mathematical relationship > property between the public and private keys. I'm not a cryptographer > nor a mathematician, so I'm content to leave it at that rather handwavy > assertion and not get into the details, but enough people I trust say the > same thing about the details, and enough of our modern Internet banking > and the like, depends upon the same idea, that I'm relatively confident > in the general principle, at least. > > It works like this. People keep the private key from the pair private -- > if it gets out, they've lost the secret. But people publish the public > half of the key. The relationship of the keys is such that people can't > figure out the private key from the public key, but if you have the > private key, you can sign stuff with it, and people with the public key > can verify the signature and thus trust that it really was the person > with that key that signed the content. Similarly, people can use the > public key to encrypt something, and only the person with the private key > will be able to decrypt it -- having the public key doesn't help. > > Actually, as I understand it signing is simply a combination of hashing > and encryption, such that a hash of the content to be signed is taken, > and then that hash is encrypted with the private key. 
Now anyone with > the public key can "decrypt" the hash and verify the content with it, > thereby verifying that the private key used to sign the content by > encrypting the hash was the one used. If some other key had been used, > attempting to decrypt the hash with an unmatched public key would simply > produce gibberish, and the supposedly "decrypted" hash wouldn't be the > hash produced when checking the content, thereby failing to verify that > the signed content actually came from the person that it was claimed to > have come from. > If I recall correctly the flow looks like: File -> (Sender Private/Receiver Public) -> Encrypted File Encrypted File -> (Sender Public/Receiver Private) -> File and this should be safe, albeit Rich's comment early on was "3. Have an army of the best cryptographers in the world, etc." coupled with lots of compute power leaves me with little doubt it's not a 100% thing... > > OK, we've now established that hashes simply verify that the content > didn't get modified in transit, but they do NOT by themselves verify who > SENT that content, so indeed, a man-in-the-middle could have replaced > BOTH the content and the hash, and someone relying on just hashes > couldn't tell the difference. > > And we've also established that a signature verifies that the content > actually came from the person who had the private key matching the public > key used to verify it, by mechanism of encrypting the hash of that > content with the private key, so only by "decrypting" it with the > matching public key, does the hash of the content match the one taken at > the other end and encrypted with the private key. > > *NOW* we're equipped to see how the portage tree snapshot signing method > actually allows us to verify distfiles as well. Because the tree > includes digests that we can now verify came from our trusted source, > gentoo, NOW those digests can be used to verify the distfiles, because > the digests were part of the signed tree and nobody could tamper with > that signed tree including those digests without detection. > Correct. Hashes for all that stuff is in the Manifest files and I don't create my own Manifests ever. > If our nefarious gentoo mirror operator tried to switch out the source > tarballs AND the digests, he could do so for normal rsync users, and for > webrsync users not doing gpg verification, without detection. But should > he try that with someone that's using webrsync-gpg, he has no way to sign > the tampered with tarball with the correct private key since he doesn't > have it, and those using webrsync with FEATURES=webrsync-gpg would detect > the tampered tarball as portage (via webrsync, via eix in your case) > would reject that tarball as unverified. > Well, maybe yes, maybe no as per the comment above, but agreed in general. > So the hash-digest method used to protect ordinary rsync users (and > webrsync users without webrsync-gpg turned on) from ACCIDENTAL damage, > now protects webrsync-gpg users from DELIBERATE man-in-the-middle attacks > as well, not because the digests themselves are different, but because we > can now trust and verify that they came from a legitimate source. > > Tho it should be noted that "legitimate source" is defined as anyone > having access to that that private signing key. So should someone breakin > to the snapshotting server and steal that private key doing the signing, > they now become a "legitimate source" as far as webrsync-gpg is concerned. > Yep. > > So where does that leave us in practice? 
> > Basically here: > > You're now verifying that the snapshot tarballs are coming from a source > with the private signing key, and we're assuming that gentoo security > hasn't been broken and thus that only gentoo's snapshot signing servers > (and their admins, of course) have access to the private signing key, > which in turn means we're assuming the machine with that signing key must > be gentoo, and thus that the snapshotted tarballs are legit. > > But it's actually webrsync in combination with FEATURES=webrsync-gpg > that's doing that verification. > > Once the verified tarball is actually unpacked on our system, portage > operate just as it normally does, simply verifying the usual hash digests > against the ebuilds and the distfiles /exactly/ as it normally would. > Understood. > Repeating in different words to hopefully ensure it's understood: > > It's *ONLY* the fact that we have actually gpg-verified that snapshot > tarball and thus the digests within it, that gives us any more security > than an ordinary rsync user. After that's downloaded, verified and > unpacked, portage operates exactly as it normally does. > > > Meanwhile, part of that normal operation includes FEATURES=strict, if > you've set it, which causes portage to refuse to merge the package if > those digests don't match. But that part of things is just normal > portage operation. Rsync users get it too -- they just don't have the > additional assurance that those digest files actually came from gentoo > (or at least from someone with gentoo's private signing key), that > webrsync with FEATURES=webrsync-gpg provides. > Yep, I set that first before I got the gpg stuff working. I'll leave it in place for now. > > (Meanwhile, one further personal note FWIW. You may think that all these > long explanations take quite some time to type up, and you'd be correct. > But don't make the mistake of thinking that I don't get a benefit from it > myself. My dad was a teacher, and one of the things he used to say that > I've found to be truer than true, is that the best way to /learn/ > something is to try to teach it to someone. That's exactly what I'm > doing, and all the unexpected questions and corner cases that I'd have > never thought about on my own, that people bring up and force me to think > about in ordered to answer them, help me improve my own previously more > handwavy and fuzzy "general concept" understanding as well. I'm much > more confident in my own understanding of the general public/private key > concepts, how gpg actually uses them and how its web-of-trust works, and > more specifically, how portage can use that via webrsync-gpg to actually > improve the gentooer's own security, than I ever was before. > > And it has been quite some time since I worked with gpg and saw it in > interactive mode like that, too, and it turns out that in the intervening > years, I've actually understood quite a bit more about how it all works > than I did back then, thus my ability to dig that all up and present it > here, while back a few years ago, I was just as clueless about how all > that web-of-trust stuff worked, and make exactly the same mistake of > "ultimately trusting" the distro's package-signing key, for exactly the > same reasons. 
Turns out I absorbed rather more from all those security > and encryption articles I've read over the years than I realized, but it > actually took my replies right here in this thread to lay it all out > logically so I too realized how much more I understand what's going on > now, than I did back then.) > > So... Thanks for the thread! =:^) > > -- > Duncan - List replies preferred. No HTML msgs. > "Every nonfree program has a lord, a master -- > and if you use the program, he is your master." Richard Stallman > > ^ permalink raw reply [flat|nested] 17+ messages in thread
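(Mapping the File -> (Sender Private/Receiver Public) -> Encrypted File flow Mark recalls above onto actual gpg invocations, roughly -- the recipient address and file name are placeholders:)

  # sender: sign with own private key, encrypt to the receiver's public key
  gpg --sign --encrypt --recipient receiver@example.org report.txt
  # produces report.txt.gpg

  # receiver: decrypt with own private key; gpg checks the sender's signature
  # against the sender's public key at the same time
  gpg --decrypt report.txt.gpg > report.txt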
* [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-08 18:34 ` Mark Knecht @ 2014-08-09 1:38 ` Duncan 0 siblings, 0 replies; 17+ messages in thread From: Duncan @ 2014-08-09 1:38 UTC (permalink / raw To: gentoo-amd64 Mark Knecht posted on Fri, 08 Aug 2014 11:34:54 -0700 as excerpted: > On Thu, Aug 7, 2014 at 2:18 PM, Duncan <1i5t5.duncan@cox.net> wrote: >> Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted: >> >>> I don't see any evidence that emerge checked what it downloaded, but >>> maybe those checks are only done when I really build the code? >> >> Here's what happens. >> >> FEATURES=webrsync-gpg simply tells the webrsync stuff to gpg-verify the >> snapshot-tarball that webrsync downloads. Without that, it'd still >> download it the same, but it wouldn't verify the signature. This >> allows people who use the webrsync only because they're behind a >> firewall that wouldn't allow normal rsync, but who don't care about the >> gpg signing security stuff, to use the same tool as the people who >> actually use webrsync for the security aspect, regardless of whether >> they could use normal rsync or not. >> > And to clarify, I believe this step is responsible for putting into > place on a Gentoo machine much of what's in /usr/portage, most > specifically in the app categorization directories. Yes. It's basically the entire $PORTDIR tree (/usr/portage/ by default), the app categories and ebuilds plus digest files and patches, eclasses, metadata, the profiles, the whole thing. That's what emerge sync would normally update (via rsync), and emerge-webrsync replaces the normal emerge sync with a tarball download, signature verify if FEATURES=webrsyncgpg, and tarball extraction to $PORTDIR (while normally /usr/portage/, my $PORTDIR is set to put it elsewhere). The only bits of $PORTDIR that wouldn't be included would be $DISTDIR (/usr/portage/distfiles/ by default, but again I have mine set to something else), as source files are downloaded and hash-verified against against the hash-digest stored in the digest files (which are part of the signed tarball), and $PKGDIR (/usr/portage/packages/ by default, but again, I've set the variable to put them elsewhere), since that's binpkgs that portage creates if you have FEATURES=buildpkg or FEATURES=buildsyspkg set. Additionally, anything else that you configure to be placed in $PORTDIR won't be in the tarball, as you've configured that for yourself. Here, I have layman's overlays under $PORTDIR as well (the storage setting in layman.conf, by default set to /var/lib/layman), with an appropriate rsync-exclude set so emerge sync doesn't clear them out when I sync. Were I to switch to webrsync I might have to do something different as I guess webrsync would clear them out. Which reminds me, in all the discussion so far we've not talked about overlays or layman. But since that is optional, it can be treated as a separate topic. Suffice it to say here that the webrsync discussion does /not/ cover overlays, etc, only the main gentoo tree. > In the old days the Gentoo Install Guide used to have us download the > portage snapshots for a location such as > > http://distfiles.gentoo.org/snapshots/ > > That's now been replaced by a call to emerge-webrsync so newbies might > not have that view. Good point. I had noticed that change in passing when I found and referenced the handbook webrsync stuff too, but didn't think it worth mentioning. 
But you're correct, without the perspective of what it replaced, newbies might miss the connection. > Additionally, even if we're downloading the snapshot > tarball it appears, at least on my system, it's deleted after it's > expanded/ Or at least it's not showing up in a locate command. Interesting. Deleting by default after extraction does make sense, however, since otherwise you'd have an ever-growing cache of mostly identical content with only incremental change, tho I imagine there's some sort of config option to turn it off, in case you don't want it deleted. Tho I don't use locate here and in fact don't even have it installed. I never found it particularly useful. But are you sure locate would show it anyway, given that locate only knows about what is indexed, and the indexing only runs periodically, once a day or week or some such? If it hasn't indexed files since you started doing the emerge-webrsync thing, it probably won't know anything about them, even if they are kept. (Actually, that was my problem with locate in the first place. My schedule is never regular enough to properly select a time when the computer will be on to do the indexing, yet I won't be using it for something else so it can do the indexing without bothering whatever else I'm doing. Additionally, since it only knows about what it has already indexed, I could never completely rely on it having indexed the file I was looking for anyway, so it was easier to simply forget about locate and to use other means to find files. So at some point, when I was doing an update and the locate/slocate/whatever package was set to update, since I had never actually used it in years, I just decided to unmerge it instead. That must have been years ago now, and I've never missed it...) -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 17+ messages in thread
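(For readers following along, the three locations Duncan distinguishes above are plain make.conf variables; the values shown are just the defaults he mentions, not his own nonstandard paths:)

  # /etc/portage/make.conf
  PORTDIR="/usr/portage"             # the tree: ebuilds, digest/Manifest files, eclasses, profiles
  DISTDIR="/usr/portage/distfiles"   # downloaded source tarballs, hash-checked against the Manifests
  PKGDIR="/usr/portage/packages"     # binpkgs built locally with FEATURES=buildpkg/buildsyspkg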
* Re: [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-04 22:04 [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) Mark Knecht 2014-08-05 5:52 ` [gentoo-amd64] " Duncan @ 2014-08-05 19:16 ` Frank Peters 2014-08-05 19:57 ` Rich Freeman 1 sibling, 1 reply; 17+ messages in thread From: Frank Peters @ 2014-08-05 19:16 UTC (permalink / raw To: gentoo-amd64 On Mon, 4 Aug 2014 15:04:12 -0700 Mark Knecht <markknecht@gmail.com> wrote: > > then how do I know that the > source code I build on my Gentoo machines hasn't been modified by someone > to provide access to my machine, networks, etc.? > There are two approaches to system development that tend to mitigate all security concerns: 1) Highly distributed development 2) Simplicity of design If the component pieces of a system are independently developed by widely scattered and unrelated development teams then there is much less chance for any integrated security attacks. Also, if the overall system remains simple and each component is narrowly focused then the result is better transparency for the user which insures less opportunity for attack. Linux _used_ to adhere to these two principles, but currently it is more and more moving toward monolithic development and much reduced simplicity. I refer especially to the Freedesktop project, which is slowly becoming the centralized headquarters for everything graphical. I also mention systemd, with its plethora of system daemons that obscure all system transparency. From the beginning, Linux, due to its faithfulness to the above two principles, allowed the user to fully control and easily understand the operation of his system. This situation is now being threatened with freedesktop, systemd, etc., and security attacks can only become more feasible. We, as a community of Linux users, have to adamantly oppose these monolithic projects that attempt to destroy choice and transform Linux into another Microsoft Windows. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-05 19:16 ` [gentoo-amd64] " Frank Peters @ 2014-08-05 19:57 ` Rich Freeman 0 siblings, 0 replies; 17+ messages in thread From: Rich Freeman @ 2014-08-05 19:57 UTC (permalink / raw To: gentoo-amd64 On Tue, Aug 5, 2014 at 3:16 PM, Frank Peters <frank.peters@comcast.net> wrote: > Linux _used_ to adhere to these two principles, but currently it > is more and more moving toward monolithic development and much > reduced simplicity. I refer especially to the Freedesktop > project, which is slowly becoming the centralized headquarters > for everything graphical. I also mention systemd, with its plethora > of system daemons that obscure all system transparency. Everybody loves to argue about which design is "simpler," the "unix way," etc. The fact is that while systemd does bundle a fairly end-to-end solution, many of its components are modular. I can run systemd without running networkd, or resolved, etc. The modular components have interfaces, though they aren't really intended to work with anything other than systemd. Honestly, I think the main differences are that it doesn't do things the traditional way. Nothing prevents you from talking to daemons via DBus, or inspecting their traffic. Also, a set of modular components engineered to work together is less likely to have integration-related bugs than a bunch of components designed to operate on their own. SystemD also allows some security-oriented optimizations, like private tmpdirs, making the filesystem read-only, reduced capabilities/etc. That isn't to say that you can't do this with traditional service scripts, but there are more barriers to doing it. Ultimately it is a lot more functional than a traditional init, so I do agree that the attack surface is larger. Still, most of the stuff that is incorporated into systemd is going to be running in some process on a typical server - much of it as root. The use of DBus also means that you can use policies to control who can do what more granularly. If you want a user to be able to shut down the system, I imagine that is just a DBus message to systemd and you could probably give an otherwise-nonprivileged user the ability to send that message without having to create suid helpers with their own private rules. The ability to further secure message-passing in this way is one of the reasons for kdbus, and Linus endorses that (but not some of the practices of its maintainers). I do suggest that you try using systemd in a VM just to see what it is about. If nothing else you might appreciate some of the things it attempts to solve just so that you can come up with better ways of solving them. :) Rich ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <46751df7496f4e4f97fb23e10fc9f5b4@mail10.futurewins.com>]
* Re: [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) [not found] <46751df7496f4e4f97fb23e10fc9f5b4@mail10.futurewins.com> @ 2014-08-05 11:36 ` Rich Freeman 2014-08-05 17:50 ` Mark Knecht 0 siblings, 1 reply; 17+ messages in thread From: Rich Freeman @ 2014-08-05 11:36 UTC (permalink / raw To: gentoo-amd64 On Mon, Aug 4, 2014 at 6:04 PM, Mark Knecht <markknecht@gmail.com> wrote: > > Essentially, what is the security model for all this source code and how do > I verify that it hasn't been tampered with in some manner? Duncan already gave a fairly comprehensive response. I believe the intent is to refactor and generally improve things when we move to git. Even today there aren't a lot of avenues for slipping code in without compromising a gentoo server or manipulating your rsync data transfer (if it isn't secured). But... > There's certainly lots of other issues about security, like protecting > passwords, protecting physical access to the network and machines, root kits > and the like, etc., but assuming none of that is in question (I don't have > any reason to think the NSA has been in my home!) ;-) I'm looking for info > on how the code is protected from the time it's signed off until it's built > and running here. You may very well be underestimating the NSA here. It has already come out that they hack into peoples systems just to get their ssh keys to hack into other people's systems, even if the admins that they're targeting aren't of any interest otherwise. That is, you don't have to be a suspected terrorist/etc to be on their list. I run a relay-only tor node (which doesn't seem to keep everybody and their uncle from blocking me as if I'm an exit node it seems). I'd be surprised if the NSA hasn't rooted my server just so that they can monitor my tor traffic - if they did this to all the tor relays they could monitor the entire network, so I would think that this would be a priority for them. To root your system the NSA doesn't have to compromise some Gentoo server, or even tamper with your rsync feed. The simplest solution would be to just target a zero-day vulnerability in some software you're running. They might use a zero-day in some daemon that runs as root, maybe a zero-day in the kernel network stack, or a zero-day in your browser (those certainly exist) combined with a priv escalation attack. If they're just after your ssh keys they don't even need priv escalation. Those attacks don't require targeting Gentoo in particular. If your goal is to be safe from "the NSA" then I think you need to fundamentally rethink your approach to security. I'd recommend verifying, signing, and verifying all code that runs (think iOS). I doubt that any linux distro is going to suit your needs unless you just use it as a starting point for a fork. However, I do think that Gentoo can do a better job of securing code than it does today, and that is a worthwhile goal. I doubt it would stop the NSA, but we certainly can do something about lesser threats that don't: 1. Have a 12-figure budget. 2. Have complete immunity from prosecution. 3. Have an army of the best cryptographers in the world, etc. 4. Have privileged access to the routers virtually all of your traffic travels over. 5. Have the ability to obtain things like trusted SSL certs at will (though I don't think anybody has caught them doing this one). In the early post-Snowden days I was more paranoid, but these days I've basically given up worrying about the NSA. 
After the ssh key revelations I just assume they have root on my box - I just wish they'd be nice enough to close up any other vulnerabilities they find so that others don't get root, and maybe let me access whatever backups they've made if for some reason I lose access to my own backups. I still try to keep things as secure as I can to keep everybody else out, but hiding from the NSA is a tall order. Oh yeah, if they have compromised my box you can assume they have my Gentoo ssh key and password and gpg key if they actually want them... :) Rich ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-05 11:36 ` Rich Freeman @ 2014-08-05 17:50 ` Mark Knecht 2014-08-05 20:36 ` Frank Peters 2014-08-07 15:36 ` [gentoo-amd64] " Max Cizauskas 0 siblings, 2 replies; 17+ messages in thread From: Mark Knecht @ 2014-08-05 17:50 UTC (permalink / raw To: Gentoo AMD64 [-- Attachment #1: Type: text/plain, Size: 5555 bytes --] Hi Rich, Thanks for the response. I'll likely respond over the next few hours & days in dribs and drabs... On Tue, Aug 5, 2014 at 4:36 AM, Rich Freeman <rich0@gentoo.org> wrote: > > On Mon, Aug 4, 2014 at 6:04 PM, Mark Knecht <markknecht@gmail.com> wrote: > > > > Essentially, what is the security model for all this source code and how do > > I verify that it hasn't been tampered with in some manner? > > Duncan already gave a fairly comprehensive response. I believe the > intent is to refactor and generally improve things when we move to > git. Even today there aren't a lot of avenues for slipping code in > without compromising a gentoo server or manipulating your rsync data > transfer (if it isn't secured). > > But... > > > There's certainly lots of other issues about security, like protecting > > passwords, protecting physical access to the network and machines, root kits > > and the like, etc., but assuming none of that is in question (I don't have > > any reason to think the NSA has been in my home!) ;-) I'm looking for info > > on how the code is protected from the time it's signed off until it's built > > and running here. > > You may very well be underestimating the NSA here. It has already > come out that they hack into peoples systems just to get their ssh > keys to hack into other people's systems, even if the admins that > they're targeting aren't of any interest otherwise. That is, you > don't have to be a suspected terrorist/etc to be on their list. > Yeah, I've read that. It's my basic POV at this time that if the NSA (or any other organization) wants something I have then they have it already. However a good portion of my original thoughts are along the line of your zero-day point below. > I run a relay-only tor node (which doesn't seem to keep everybody and > their uncle from blocking me as if I'm an exit node it seems). I'd be > surprised if the NSA hasn't rooted my server just so that they can > monitor my tor traffic - if they did this to all the tor relays they > could monitor the entire network, so I would think that this would be > a priority for them. The book I referenced made it clear that the NSA has a whole specific program & toolset to target tor so I suspect you're correct, or even underestimating yourself. That said, running tor is legal so more power to you. I ran it a little to play with and found all the 2-level security stuff at GMail and YahooMail too much trouble to deal with. > > To root your system the NSA doesn't have to compromise some Gentoo > server, or even tamper with your rsync feed. The simplest solution > would be to just target a zero-day vulnerability in some software > you're running. They might use a zero-day in some daemon that runs as > root, maybe a zero-day in the kernel network stack, or a zero-day in > your browser (those certainly exist) combined with a priv escalation > attack. If they're just after your ssh keys they don't even need priv > escalation. Those attacks don't require targeting Gentoo in > particular. 
> Yep, and it's the sort of thing I was thinking about when I wrote this yesterday: I'm sitting here writing R code. I do it in R-Studio. How do I know that every bit of code I run in that tool isn't being sent out to some server? Most likely no one has done an audit of that GUI so I'm trusting that the company isn't nefarious in nature. I use Chrome. How do I know Chrome isn't scanning my local drives and sending stuff somewhere? I don't. In the limit, how would I even know if the Linux kernel was doing this? I got source through emerge, built code using gcc, installed it by hand, but I don't know what's really there and never will. I suspect the kernel is likely one of the safer things on my box. In the news yesterday was this story about some pedophile sending child porn using GMail and then getting arrested because Google scans 'certain' attachments for known hashes. Well, that's the public story (so far) but it seems to me that Google isn't likely creating those hashes but getting them from the FBI, but the point is it's all being watched. I think one way you might not be as John Le Carre-oriented as me is that if I was the NSA and wanted inside of Linux (or M$FT or Apple) in general, then I would simply pay people to be inside of those entities and to do my bidding. Basic spycraft. Those folks would already be in the kernel development area, or in KDE, or in the facilities that host the code, or where ever making whatever changes they want. They would have already hacked how iOS does signing, or M$FT does updates, etc. When it comes to security, choose whatever type you want, but how do I as a user know that my sha-1 or pgp or whatever is what the developers thought they were making publicly available. I don't and probably never will. > If your goal is to be safe from "the NSA" It's not. Nor do I think I'll ever know if I am so I have to assume I'm not. Life in the modern era... <SNIP> > > In the early post-Snowden days I was more paranoid, but these days > I've basically given up worrying about the NSA. Similar for me, although reading this book, or watching the 2-episode Frontline story, or (fill in whatever) raises the question, but more in a general sense. I'm far less worried about the NSA and more worried about things like general hackers after financial info or people looking for code I'm writing. Thanks for all the info, and thanks to Duncan also who I will write more too when I've checked out all the technical stuff he posted. Cheers, Mark [-- Attachment #2: Type: text/html, Size: 7039 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-05 17:50 ` Mark Knecht @ 2014-08-05 20:36 ` Frank Peters 2014-08-05 23:20 ` [gentoo-amd64] " Duncan 2014-08-07 15:36 ` [gentoo-amd64] " Max Cizauskas 1 sibling, 1 reply; 17+ messages in thread From: Frank Peters @ 2014-08-05 20:36 UTC (permalink / raw To: gentoo-amd64 On Tue, 5 Aug 2014 10:50:35 -0700 Mark Knecht <markknecht@gmail.com> wrote: > > I use Chrome. How do I know Chrome isn't scanning my local drives > and sending stuff somewhere? I don't. > It wouldn't have to scan your local drives. It would only have to scan the very few directories named "MY DOCUMENTS" and "MY VIDEOS" and "MY EMAIL" which have conveniently been established by the omnipotent and omniscient desktop environment. Within these universal and standardized storage areas can be found everything that snooping software would need to find. I am only being partly facetious. This does represent the trend. We have standardized locations that are shared across many different programs. But the programs aren't really different because they are produced by the same desktop conglomerate or because they must employ the toolkits and widgets of said conglomerate. The job of the NSA is getting easier. Those terrorist documents will no longer be buried within terabytes of disjoint hard drive space. They will all be nicely tucked into an "ALL DOCUMENTS ARE HERE" standardized directory that nobody had better modify because the entire system will crash. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-05 20:36 ` Frank Peters @ 2014-08-05 23:20 ` Duncan 2014-08-06 12:14 ` james.a.elian 2014-08-06 12:14 ` james.a.elian 0 siblings, 2 replies; 17+ messages in thread From: Duncan @ 2014-08-05 23:20 UTC (permalink / raw To: gentoo-amd64 Frank Peters posted on Tue, 05 Aug 2014 16:36:57 -0400 as excerpted: > It wouldn't have to scan your local drives. It would only have to scan > the very few directories named "MY DOCUMENTS" and "MY VIDEOS" and "MY > EMAIL" which have conveniently been established by the omnipotent and > omniscient desktop environment. Within these universal and standardized > storage areas can be found everything that snooping software would need > to find. Hmm... Some people (me) don't use those standardized locations. I have a dedicated media partition -- large, still on spinning rust when most of the system in terms of filenames (but not size) is on SSD, and where it's mounted isn't standard and is unlikely to /be/ standard, simply because I have my own rather nonconformist ideas of where I want stuff located and how it should be organized. OTOH, consider ~/.thumbnails/. Somebody already mentioned that google case and the hashes they apparently scan for. ~/.thumbnails will normally have thumbnails for anything in the system visited by normal graphics programs, including both still images and video, and I think pdf too unless that's always generated dynamically as is the case with txt files, via various video-thumbnail addons. Those thumbnails are all going to be standardized to one of a few standard sizes, and can either be used effectively as (large) hashes directly, or smaller hashes of them could be generated... Tho some images programs (gwenview) have an option to wipe the thumbnails dir when they're shutdown, but given the time creating those thumbnails on any reasonably large collection takes, most people aren't going to want to enable wiping... Meanwhile, one of the things that has come out is that the NSA effectively already considers anyone running a Linux desktop a radical, likely on their watch-list already, just as is anyone running TOR, or even simply visiting the TOR site or an article linking to them. I guess I must be on their list several times over, what with the sigs I use, etc, the security/privacy-related articles I read, the OS I run, and the various lists I participate on... -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-05 23:20 ` [gentoo-amd64] " Duncan @ 2014-08-06 12:14 ` james.a.elian 2014-08-06 12:14 ` james.a.elian 1 sibling, 0 replies; 17+ messages in thread From: james.a.elian @ 2014-08-06 12:14 UTC (permalink / raw To: gentoo-amd64 E Sent via BlackBerry from Vodafone Romania -----Original Message----- From: Duncan <1i5t5.duncan@cox.net> Date: Tue, 5 Aug 2014 23:20:26 To: <gentoo-amd64@lists.gentoo.org> Reply-to: gentoo-amd64@lists.gentoo.org Subject: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) Frank Peters posted on Tue, 05 Aug 2014 16:36:57 -0400 as excerpted: > It wouldn't have to scan your local drives. It would only have to scan > the very few directories named "MY DOCUMENTS" and "MY VIDEOS" and "MY > EMAIL" which have conveniently been established by the omnipotent and > omniscient desktop environment. Within these universal and standardized > storage areas can be found everything that snooping software would need > to find. Hmm... Some people (me) don't use those standardized locations. I have a dedicated media partition -- large, still on spinning rust when most of the system in terms of filenames (but not size) is on SSD, and where it's mounted isn't standard and is unlikely to /be/ standard, simply because I have my own rather nonconformist ideas of where I want stuff located and how it should be organized. OTOH, consider ~/.thumbnails/. Somebody already mentioned that google case and the hashes they apparently scan for. ~/.thumbnails will normally have thumbnails for anything in the system visited by normal graphics programs, including both still images and video, and I think pdf too unless that's always generated dynamically as is the case with txt files, via various video-thumbnail addons. Those thumbnails are all going to be standardized to one of a few standard sizes, and can either be used effectively as (large) hashes directly, or smaller hashes of them could be generated... Tho some images programs (gwenview) have an option to wipe the thumbnails dir when they're shutdown, but given the time creating those thumbnails on any reasonably large collection takes, most people aren't going to want to enable wiping... Meanwhile, one of the things that has come out is that the NSA effectively already considers anyone running a Linux desktop a radical, likely on their watch-list already, just as is anyone running TOR, or even simply visiting the TOR site or an article linking to them. I guess I must be on their list several times over, what with the sigs I use, etc, the security/privacy-related articles I read, the OS I run, and the various lists I participate on... -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-05 23:20 ` [gentoo-amd64] " Duncan 2014-08-06 12:14 ` james.a.elian @ 2014-08-06 12:14 ` james.a.elian 1 sibling, 0 replies; 17+ messages in thread From: james.a.elian @ 2014-08-06 12:14 UTC (permalink / raw To: gentoo-amd64 E Sent via BlackBerry from Vodafone Romania -----Original Message----- From: Duncan <1i5t5.duncan@cox.net> Date: Tue, 5 Aug 2014 23:20:26 To: <gentoo-amd64@lists.gentoo.org> Reply-to: gentoo-amd64@lists.gentoo.org Subject: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) Frank Peters posted on Tue, 05 Aug 2014 16:36:57 -0400 as excerpted: > It wouldn't have to scan your local drives. It would only have to scan > the very few directories named "MY DOCUMENTS" and "MY VIDEOS" and "MY > EMAIL" which have conveniently been established by the omnipotent and > omniscient desktop environment. Within these universal and standardized > storage areas can be found everything that snooping software would need > to find. Hmm... Some people (me) don't use those standardized locations. I have a dedicated media partition -- large, still on spinning rust when most of the system in terms of filenames (but not size) is on SSD, and where it's mounted isn't standard and is unlikely to /be/ standard, simply because I have my own rather nonconformist ideas of where I want stuff located and how it should be organized. OTOH, consider ~/.thumbnails/. Somebody already mentioned that google case and the hashes they apparently scan for. ~/.thumbnails will normally have thumbnails for anything in the system visited by normal graphics programs, including both still images and video, and I think pdf too unless that's always generated dynamically as is the case with txt files, via various video-thumbnail addons. Those thumbnails are all going to be standardized to one of a few standard sizes, and can either be used effectively as (large) hashes directly, or smaller hashes of them could be generated... Tho some images programs (gwenview) have an option to wipe the thumbnails dir when they're shutdown, but given the time creating those thumbnails on any reasonably large collection takes, most people aren't going to want to enable wiping... Meanwhile, one of the things that has come out is that the NSA effectively already considers anyone running a Linux desktop a radical, likely on their watch-list already, just as is anyone running TOR, or even simply visiting the TOR site or an article linking to them. I guess I must be on their list several times over, what with the sigs I use, etc, the security/privacy-related articles I read, the OS I run, and the various lists I participate on... -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-05 17:50 ` Mark Knecht 2014-08-05 20:36 ` Frank Peters @ 2014-08-07 15:36 ` Max Cizauskas 2014-08-07 16:06 ` Lie Ryan 1 sibling, 1 reply; 17+ messages in thread From: Max Cizauskas @ 2014-08-07 15:36 UTC (permalink / raw To: gentoo-amd64 Hello all, I've been very interested in this topic myself, so I'll pile on my question after answering one of Mark's <Snip> On 05/08/2014 1:50 PM, Mark Knecht wrote: > I'm sitting here writing R code. I do it in R-Studio. How do I > know that every bit of code I run in that tool isn't being sent out to > some > server? Most likely no one has done an audit of that GUI so I'm trusting > that the company isn't nefarious in nature. > > I use Chrome. How do I know Chrome isn't scanning my local drives > and sending stuff somewhere? I don't. > > In the limit, how would I even know if the Linux kernel was doing this? I > got source through emerge, built code using gcc, installed it by hand, > but I don't know what's really there and never will. I suspect the kernel > is likely one of the safer things on my box. > The answer to most things security related seems to be independent verification. If you're going to be the person to do that verification because you don't trust others to do it or can't find proof that it's been done, then there are two factors at play; time and money. Where you're only running your own traffic through your system (unlike Duncan's TOR example) this is relatively easy and cheap to accomplish. For ~$100 you can buy a consumer grade switch with a configurable mirroring port which will effectively passively sniff all the traffic going through the switch. You then connect this mirrored port to a spare junker computer running optimally a different distro of linux like Security Onion or anything else with TCPDump capturing full packet captures which you can do analytics on. I do the same for my home network to detect compromised hosts and to see if I'm under attack for any reason. Things I find useful for getting a finger on the pulse are: - DNS Query monitoring to see who my home network is reaching out to - GeoIPLookup mappings against bandwidth usage to see if lots of data is being slurped out of my environment - BroIDS, Snorby and Squert (security onion suite of tools) for at a glance view of things going wrong and the ability to dig into events quickly My question is what kind of independent validation, or even peer review, is done over the core of Gentoo? Now that new users are being pushed to use the Stage3 tarball and genkernel, is seems to me that much of the core of the Gentoo system is a "just trust me" package. What I love about the Stage 1 approach is you get all the benefits of compiling the system as you go, essentially from scratch and customized for your system, and all the benefits of the scrutiny Duncan mentioned applying to ebuilds is applied. There is much more control in the hands of the person using Stage 1, and it's a smaller footprint for someone to independently validate malicious code didn't get introduced into it. Should someone have been manipulated to put something malicious into the stage3 tarball it could much more easily give a permanent foothold over your system to a malicious 3rd party (think rootkit) then stage 1 would allow. Thanks to anyone who can provide light on the topic, Max ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-07 15:36 ` [gentoo-amd64] " Max Cizauskas @ 2014-08-07 16:06 ` Lie Ryan 2014-08-07 17:20 ` [gentoo-amd64] " Duncan 0 siblings, 1 reply; 17+ messages in thread From: Lie Ryan @ 2014-08-07 16:06 UTC (permalink / raw To: gentoo-amd64 [-- Attachment #1: Type: text/plain, Size: 845 bytes --] With you having to compile thousands of stuffs if you build from stage 1, I doubt that you will be able to verify every single thing you compile and detect if something is actually doing sneaky stuff AND still have the time to enjoy your system. Also, even if you build from stage 1 and manage to verify all the source code, you still need to download a precompiled compiler which could possibly inject the malicious code into the programs it compiles, and which can also inject itself if you try to compile another compiler from source. If there is a single software that is worth a gold mine to inject with malware to gain illicit access to all Linux system, then it would be gcc. Once you infect a compiler, you're invincible. Also, did you apply the same level of scrutiny to your hardware? For the truly paranoid, I recommend unplugging. [-- Attachment #2: Type: text/html, Size: 936 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-07 16:06 ` Lie Ryan @ 2014-08-07 17:20 ` Duncan 2014-08-07 19:38 ` Mark Knecht 0 siblings, 1 reply; 17+ messages in thread From: Duncan @ 2014-08-07 17:20 UTC (permalink / raw To: gentoo-amd64 Lie Ryan posted on Fri, 08 Aug 2014 02:06:14 +1000 as excerpted: > With you having to compile thousands of stuffs if you build from stage > 1, I doubt that you will be able to verify every single thing you > compile and detect if something is actually doing sneaky stuff AND still > have the time to enjoy your system. Also, even if you build from stage 1 > and manage to verify all the source code, you still need to download a > precompiled compiler which could possibly inject the malicious code into > the programs it compiles, and which can also inject itself if you try to > compile another compiler from source. If there is a single software that > is worth a gold mine to inject with malware to gain illicit access to > all Linux system, then it would be gcc. Once you infect a compiler, > you're invincible. Actually, that brings up a good question. The art of compiling is certainly somewhat magic to me tho I guess I somewhat understand the concept in a vague, handwavy way, but... From my understanding, that's one reason why the gcc build is multi-stage and uses simpler (and thus easier to audit) tools such as lex and bison in its bootstrapping process. I'm not actually sure whether gcc actually requires a previous gcc (or other full compiler) to build or not, but I do know it goes to quite some lengths to bootstrap in multiple stages, building things up from the simple to the complex as it goes and testing each stage in the process so that if something goes wrong, there's some idea /where/ it went wrong. Clearly one major reason for that is proving functionality at each step such that if the process goes wrong, there's some place to start as to why and how, but it certainly doesn't hurt in helping to prove or at least somewhat establish the basic security situation either, tho as we've already established, it's basically impossible to prove both the hardware and the software back thru all the multiple generations. Of course the simpler tools, lex, bison, etc, must have been built from something, but because they /are/ simpler, they're also easier to audit and prove basic functionality, including disassembly and analysis of individual machine instructions for a fuller audit. So anyway, to the gcc experts that know, and to non-gcc CS folks who have actually built their own simple compilers and can at least address the concept, is a previous gcc or other full compiler actually required to build a new gcc, or does it sufficiently bootstrap itself from the more basic tools such that unlike most code, it doesn't actually need a full compiler to build and reasonably optimize at all? That's a question I've had brewing in the back of my mind for some time, and this seemed the perfect opportunity to ask it. =:^) Meanwhile, I suppose it must be possible at least at some level, else how would new hardware archs come to be supported. Gotta start /somewhere/ on the toolchain, and "simpler" stuff like lex and bison can I believe run on a previous arch, generating the basic executable building blocks that ultimately become the first executable code actually run by the new target arch. 
And of course gcc has long been one of the most widely arch-supporting compilers, precisely because it /is/ open source and /is/ designed to be bootstrapped in stages like that. I guess clang/llvm is giving gcc some competition in that area now, in part because it's more modern and modular, and in part because, unlike gcc, it /can/ legally be taken private and supplied to others without offering sources (and some companies are evil that way). But gcc's the one with the long history in that area, and given that history I'd guess it'll be some time before clang/llvm catches up, even if it's getting most of the new platforms right now, tho I've no idea whether that's actually the case.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman

^ permalink raw reply [flat|nested] 17+ messages in thread
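For what it's worth, the staged self-build Duncan describes is visible in gcc's normal build procedure: whatever compiler is already on the system builds only stage 1, stage 1 rebuilds the same sources as stage 2, stage 2 rebuilds them again as stage 3, and the build then insists that stages 2 and 3 come out identical before anything is installed. A rough sketch of that sequence (source path, version and language list are illustrative):

  mkdir obj && cd obj
  ../gcc-4.7.3/configure --enable-bootstrap --enable-languages=c,c++
  make bootstrap   # stage1: built by the existing system compiler
                   # stage2: same sources rebuilt by stage1
                   # stage3: same sources rebuilt by stage2
                   # stage2 and stage3 object files are then compared,
                   # and any mismatch aborts the build
  make install

The comparison proves that stage 2 and stage 3 behave identically as compilers, which is great for catching miscompilation; it does not, by itself, rule out a Thompson-style compromise in whatever compiler built stage 1.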
* Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) 2014-08-07 17:20 ` [gentoo-amd64] " Duncan @ 2014-08-07 19:38 ` Mark Knecht 0 siblings, 0 replies; 17+ messages in thread From: Mark Knecht @ 2014-08-07 19:38 UTC (permalink / raw To: Gentoo AMD64 On Thu, Aug 7, 2014 at 10:20 AM, Duncan <1i5t5.duncan@cox.net> wrote: > Lie Ryan posted on Fri, 08 Aug 2014 02:06:14 +1000 as excerpted: > >> With you having to compile thousands of stuffs if you build from stage >> 1, I doubt that you will be able to verify every single thing you >> compile and detect if something is actually doing sneaky stuff AND still >> have the time to enjoy your system. Also, even if you build from stage 1 >> and manage to verify all the source code, you still need to download a >> precompiled compiler which could possibly inject the malicious code into >> the programs it compiles, and which can also inject itself if you try to >> compile another compiler from source. If there is a single software that >> is worth a gold mine to inject with malware to gain illicit access to >> all Linux system, then it would be gcc. Once you infect a compiler, >> you're invincible. > > Actually, that brings up a good question. The art of compiling is > certainly somewhat magic to me tho I guess I somewhat understand the > concept in a vague, handwavy way, but... <SNIP> > > So anyway, to the gcc experts that know, and to non-gcc CS folks who have > actually built their own simple compilers and can at least address the > concept, is a previous gcc or other full compiler actually required to > build a new gcc, or does it sufficiently bootstrap itself from the more > basic tools such that unlike most code, it doesn't actually need a full > compiler to build and reasonably optimize at all? That's a question I've > had brewing in the back of my mind for some time, and this seemed the > perfect opportunity to ask it. =:^) > And beyond Duncan's question (good question!) if I try to rebuild gcc like it was an empty box using my current machine I see this sort of thing where gcc is about the 350th of 385 packages getting built. It seems to me that _any_ package that has programs running at the same or higher level as emerge could be hacked and control what's actually placed on the machine. It's an endless problem if you cannot trust anything, and for most people, and certainly for me, unverifiable the ways the tools work today. c2RAID6 ~ # emerge -pve gcc These are the packages that would be merged, in order: Calculating dependencies... done! 
[ebuild R ] app-arch/xz-utils-5.0.5-r1 USE="nls threads -static-libs" ABI_X86="(64) (-32) (-x32)" 1,276 kB
[ebuild R ] virtual/libintl-0-r1 ABI_X86="(64) -32 (-x32)" 0 kB
[ebuild R ] app-arch/bzip2-1.0.6-r6 USE="-static -static-libs" ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] dev-libs/expat-2.1.0-r3 USE="unicode -examples -static-libs" ABI_X86="(64) (-32) (-x32)" 550 kB
[ebuild R ] virtual/libiconv-0-r1 ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] dev-lang/python-exec-2.0.1-r1:2 PYTHON_TARGETS="(jython2_5) (jython2_7) (pypy) (python2_7) (python3_2) (python3_3) (-python3_4)" 0 kB
[ebuild R ] sys-devel/gnuconfig-20140212 0 kB
[ebuild R ] media-libs/libogg-1.3.1 USE="-static-libs" ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] app-misc/mime-types-9 16 kB
[ebuild R ] sys-apps/baselayout-2.2 USE="-build" 40 kB
[ebuild R ] sys-devel/gcc-config-1.7.3 15 kB
<SNIP, SNIP, SNIP>
[ebuild R ] media-libs/phonon-4.6.0-r1 USE="gstreamer (-aqua) -debug -pulseaudio -vlc (-zeitgeist)" 275 kB
[ebuild R ] sys-libs/glibc-2.19-r1:2.2 USE="(multilib) -debug -gd (-hardened) -nscd -profile (-selinux) -suid -systemtap -vanilla" 0 kB
[ebuild R ] sys-devel/gcc-4.7.3-r1:4.7 USE="cxx fortran (multilib) nls nptl openmp (-altivec) -awt -doc (-fixed-point) -gcj -go -graphite (-hardened) (-libssp) -mudflap (-multislot) -nopie -nossp -objc -objc++ -objc-gc -regression-test -vanilla" 81,022 kB
[ebuild R ] sys-libs/pam-1.1.8-r2 USE="berkdb cracklib nls -audit -debug -nis (-selinux) {-test} -vim-syntax" ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] dev-db/mysql-5.1.70 USE="community perl ssl -big-tables -cluster -debug -embedded -extraengine -latin1 -max-idx-128 -minimal -pbxt -profiling (-selinux) -static {-test} -xtradb" 24,865 kB
[ebuild R ] sys-devel/llvm-3.3-r3:0/3.3 USE="libffi static-analyzer xml -clang -debug -doc -gold -multitarget -ocaml -python {-test} -udis86" ABI_X86="(64) (-32) (-x32)" PYTHON_TARGETS="python2_7 (-pypy) (-pypy2_0%) (-python2_6%)" VIDEO_CARDS="-radeon" 0 kB
[ebuild R ] media-libs/mesa-10.0.4 USE="classic egl gallium llvm nptl vdpau xvmc -bindist -debug -gbm -gles1 -gles2 -llvm-shared-libs -opencl -openvg -osmesa -pax_kernel -pic -r600-llvm-compiler (-selinux) -wayland -xa" ABI_X86="(64) (-32) (-x32)" VIDEO_CARDS="(-freedreno) -i915 -i965 -ilo -intel -nouveau -r100 -r200 -r300 -r600 -radeon -radeonsi -vmware" 0 kB
[ebuild R ] x11-libs/cairo-1.12.16 USE="X glib opengl svg xcb (-aqua) -debug -directfb -doc (-drm) (-gallium) (-gles2) -legacy-drivers -openvg (-qt4) -static-libs -valgrind -xlib-xcb" 0 kB
[ebuild R ] app-text/poppler-0.24.5:0/44 USE="cairo cxx introspection jpeg jpeg2k lcms png qt4 tiff utils -cjk -curl -debug -doc" 0 kB
[ebuild R ] media-libs/harfbuzz-0.9.28:0/0.9.18 USE="cairo glib graphite introspection truetype -icu -static-libs {-test}" ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] x11-libs/pango-1.36.5 USE="X introspection -debug" ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] x11-libs/gtk+-2.24.24:2 USE="introspection xinerama (-aqua) -cups -debug -examples {-test} -vim-syntax" ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] x11-libs/gtk+-3.12.2:3 USE="X introspection xinerama (-aqua) -cloudprint -colord -cups -debug -examples {-test} -vim-syntax -wayland" 0 kB
[ebuild R ] dev-db/libiodbc-3.52.7 USE="gtk" 1,015 kB
[ebuild R ] app-crypt/pinentry-0.8.2 USE="gtk ncurses qt4 -caps -static" 419 kB
[ebuild R ] dev-java/icedtea-bin-6.1.13.3-r3:6 USE="X alsa -cjk -cups -doc -examples -nsplugin (-selinux) -source -webstart" 0 kB
[ebuild R ] dev-libs/soprano-2.9.4 USE="dbus raptor redland virtuoso -debug -doc {-test}" 1,913 kB
[ebuild R ] app-crypt/gnupg-2.0.25 USE="bzip2 ldap nls readline usb -adns -doc -mta (-selinux) -smartcard -static" 0 kB
[ebuild R ] gnome-extra/polkit-gnome-0.105 304 kB
[ebuild R ] kde-base/kdelibs-4.12.5-r2:4/4.12 USE="acl alsa bzip2 fam handbook jpeg2k mmx nls opengl (policykit) semantic-desktop spell sse sse2 ssl udev udisks upower -3dnow (-altivec) (-aqua) -debug -doc -kerberos -lzma -openexr {-test} -zeroconf" 0 kB
[ebuild R ] sys-auth/polkit-kde-agent-0.99.0-r1:4 USE="(-aqua) -debug" LINGUAS="-ca -ca@valencia -cs -da -de -en_GB -eo -es -et -fi -fr -ga -gl -hr -hu -is -it -ja -km -lt -mai -ms -nb -nds -nl -pa -pt -pt_BR -ro -ru -sk -sr -sr@ijekavian -sr@ijekavianlatin -sr@latin -sv -th -tr -uk -zh_TW" 34 kB
[ebuild R ] kde-base/nepomuk-core-4.12.5:4/4.12 USE="exif pdf (-aqua) -debug -epub -ffmpeg -taglib" 0 kB
[ebuild R ] kde-base/katepart-4.12.5:4/4.12 USE="handbook (-aqua) -debug" 0 kB
[ebuild R ] kde-base/kdesu-4.12.5:4/4.12 USE="handbook (-aqua) -debug" 0 kB
[ebuild R ] net-libs/libproxy-0.4.11-r2 USE="kde -gnome -mono -networkmanager -perl -python -spidermonkey {-test} -webkit" ABI_X86="(64) (-32) (-x32)" PYTHON_TARGETS="python2_7" 0 kB
[ebuild R ] kde-base/nepomuk-widgets-4.12.5:4/4.12 USE="(-aqua) -debug" 0 kB
[ebuild R ] kde-base/khelpcenter-4.12.5:4/4.12 USE="(-aqua) -debug" 0 kB
[ebuild R ] net-libs/glib-networking-2.40.1-r1 USE="gnome libproxy ssl -smartcard {-test}" ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] net-libs/libsoup-2.46.0-r1:2.4 USE="introspection ssl -debug -samba {-test}" ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] media-plugins/gst-plugins-soup-0.10.31-r1:0.10 ABI_X86="(64) (-32) (-x32)" 0 kB
[ebuild R ] media-libs/phonon-gstreamer-4.6.3 USE="alsa network -debug" 71 kB

Total: 385 packages (385 reinstalls), Size of downloads: 355,030 kB

c2RAID6 ~ #

^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2014-08-09 1:38 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-04 22:04 [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) Mark Knecht
2014-08-05 5:52 ` [gentoo-amd64] " Duncan
2014-08-05 18:50 ` Mark Knecht
2014-08-06 21:33 ` Mark Knecht
2014-08-07 0:58 ` Duncan
2014-08-07 18:16 ` Mark Knecht
2014-08-07 19:53 ` Duncan
2014-08-07 21:18 ` Duncan
2014-08-08 18:34 ` Mark Knecht
2014-08-09 1:38 ` Duncan
2014-08-05 19:16 ` [gentoo-amd64] " Frank Peters
2014-08-05 19:57 ` Rich Freeman
[not found] <46751df7496f4e4f97fb23e10fc9f5b4@mail10.futurewins.com>
2014-08-05 11:36 ` Rich Freeman
2014-08-05 17:50 ` Mark Knecht
2014-08-05 20:36 ` Frank Peters
2014-08-05 23:20 ` [gentoo-amd64] " Duncan
2014-08-06 12:14 ` james.a.elian
2014-08-06 12:14 ` james.a.elian
2014-08-07 15:36 ` [gentoo-amd64] " Max Cizauskas
2014-08-07 16:06 ` Lie Ryan
2014-08-07 17:20 ` [gentoo-amd64] " Duncan
2014-08-07 19:38 ` Mark Knecht