Date: Sun, 27 Nov 2011 15:16:33 -0500
Message-ID: <CA+czFiBSDrF-gSu_HYr4p84CCK+DaL5DgOJpr9m5z4a_7_58oQ@mail.gmail.com>
Subject: Re: [gentoo-user] emerge -j, make -j and make -l
From: Michael Mol <mikemol@gmail.com>
To: gentoo-user@lists.gentoo.org

On Sun, Nov 27, 2011 at 12:56 PM, Pandu Poluan <pandu@poluan.info> wrote:
> On Nov 27, 2011 5:12 PM, "Michael Mol" <mikemol@gmail.com> wrote:

[snip]

>
> Here's my experience:
>
> I always experience emerge failures on my Gentoo VMs if I use MAKEOPTS=-j>3.
> Not all packages, but many. Including, IIRC, glibc and gcc.

In my barebones 177-package state, I didn't get any build failures
from parallel building, either via emerge -j or make -j. I did get one
failure when I went to install X, but it built fine on the second
attempt.

Parallel operations are finicky things; if the dependencies between
steps aren't declared correctly, things can work fine most of the
time, and then a race between one make recipe and another (or perhaps
between one ebuild and another; a revdep-rebuild afterward might not
be a bad CYA) makes something fail, just this one time.

My day job is C++ on Windows[1], and we do a *lot* with multithreaded
code. Race conditions are a PITA; you might not be able to reproduce a
race-induced failure on any of the workstations or test systems you
have, but then have it crop up consistently on a customer's system.
The same principles can and will apply with things like parallel make
and parallel emerge. I've even seen it happen in VS2005 and VS2008
parallel builds.
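
To make the "works most of the time" part concrete, here's a toy
sketch in Python. It is not how make or Portage schedule anything; the
target names and both tables are invented for illustration. The point
is only that a rule with an undeclared input leaves the scheduler free
to pick an order that breaks, while the serial order happens to work.

```python
# Toy model of a build with one undeclared dependency.
# All target names here are hypothetical.

# What each target really reads while it builds.
ACTUAL_INPUTS = {
    "config.h": set(),
    "main.o": {"config.h"},
    "app": {"main.o"},
}

# What the (buggy) build rules declare as prerequisites.
DECLARED_PREREQS = {
    "config.h": set(),
    "main.o": set(),  # bug: config.h is missing here
    "app": {"main.o", "config.h"},
}


def order_is_allowed(order):
    """True if every target comes after all of its declared prerequisites."""
    seen = set()
    for target in order:
        if not DECLARED_PREREQS[target] <= seen:
            return False
        seen.add(target)
    return True


def build(order):
    """Run targets in order; return the first target that breaks, or None."""
    built = set()
    for target in order:
        if not ACTUAL_INPUTS[target] <= built:
            return target  # a real input does not exist yet
        built.add(target)
    return None


serial = ["config.h", "main.o", "app"]
unlucky = ["main.o", "config.h", "app"]
print(order_is_allowed(serial), build(serial))    # True None
print(order_is_allowed(unlucky), build(unlucky))  # True main.o
```

Both orders are legal as far as the declared rules go, but only one of
them builds. With -j1 you always get the first; with -jN you get
whichever one the timing hands you.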

> This happens even if I make sure that there's just one emerge job being
> done. And this happens even if I allocate more vCPUs than -j, on VMware and
> XenServer alike.

FWIW, I've been running with MAKEOPTS=-j10 on my Phenom 9650 for over
a year. It's very rare that something breaks due to the parallel
build. I think it's happened perhaps three times, and each time was
resolvable with a retry. YMMV, of course; race conditions are finicky.

> I don't know where the 'blame' lies, but I've found myself standardizing on
> MAKEOPTS=-j3, and PORTAGE_DEFAULT_OPTS="--jobs
> --load-average=<1.6*num_of_vCPU>"
>
> (Yes, no explicit number of jobs. The newer portages are smart enough to
> keep starting new jobs until the load number is reached)

Sweet; I didn't know about Portage's --load-average; I'll definitely
switch to that instead of -j. Load-driven make plus load-driven
portage should work beautifully on my system.
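
As I understand the load-driven mode, it boils down to a check like the
one below. This is a rough Python sketch of the rule GNU make documents
for -l (start no new job if others are already running and the load
average is at least the limit); I'm assuming Portage's --load-average
behaves the same way, per your description. The function name and
parameters are mine, not anything from make or Portage.

```python
import os


def may_start_job(running_jobs, load_limit, current_load=None):
    """Decide whether a scheduler should launch one more job.

    Hypothetical helper; mirrors the documented make -l rule.
    """
    if current_load is None:
        # 1-minute load average; os.getloadavg() is Unix-only.
        current_load = os.getloadavg()[0]
    if running_jobs == 0:
        # Always keep one job going, or the build never progresses.
        return True
    return current_load < load_limit


print(may_start_job(running_jobs=3, load_limit=12.8, current_load=5.0))   # True
print(may_start_job(running_jobs=3, load_limit=12.8, current_load=12.8))  # False
print(may_start_job(running_jobs=0, load_limit=12.8, current_load=50.0))  # True
```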

I'll steal your 1.6 factor, and give (N = number of cores):
    MAKEOPTS="-j<2*N> -l<1.6*N>"
    PORTAGE_DEFAULT_OPTS="--jobs --load-average=<1.6*N>"
a try.
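
For the arithmetic, a small Python sketch that fills in those
placeholders for a given core count. The helper name and its defaults
(2 jobs per core, load limit of 1.6 per core) are just the factors from
this thread, not anything Portage ships; the output is meant to be
pasted into make.conf by hand.

```python
def parallel_settings(n_cpus, job_factor=2, load_factor=1.6):
    """Return make.conf lines for n_cpus cores (hypothetical helper)."""
    jobs = n_cpus * job_factor
    load = round(n_cpus * load_factor, 1)
    makeopts = f'MAKEOPTS="-j{jobs} -l{load}"'
    portage = f'PORTAGE_DEFAULT_OPTS="--jobs --load-average={load}"'
    return makeopts, portage


for line in parallel_settings(8):
    print(line)
# MAKEOPTS="-j16 -l12.8"
# PORTAGE_DEFAULT_OPTS="--jobs --load-average=12.8"
```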

How does that interact with distcc, by the way? I've got two of these
octo-core Xeon boxes, and I've still got my Phenom 9650--distcc on my
home network should become very, very nice. And this box is consuming
255W at the wall with monitor off, 326W with monitor on. That's not
bad. Though perhaps I should move to an apartment where heat isn't
free...

[1] Well, for most of this year, my task list has been more
PHP-oriented, but I'm still on tap for our C++ work.

-- 
:wq