From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 28 Nov 2011 00:56:17 +0700
From: Pandu Poluan
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] emerge -j, make -j and make -l
References: <201111270927.57294.michaelkintzios@gmail.com>
List-Id: Gentoo Linux mail
Reply-to: gentoo-user@lists.gentoo.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0015174bf02c539fe904b2bb1ae6

--0015174bf02c539fe904b2bb1ae6
Content-Type: text/html; charset=UTF-8


On Nov 27, 2011 5:12 PM, "Michael Mol" <mikemol@gmail.com> wrote:
>
> I figure that the optimal number of simultaneous CPU-consuming
> processes is going to be the number of CPU cores, plus enough to keep
> the CPU occupied while others are blocked on I/O. That's the same
> reasoning that drives the selection of a -j number, really.
>
> If I read make's man page correctly, -l acts as a threshold, choosing
> not to spawn an additional child process if the system load average is
> above a certain value. Since system load is a count of actively running
> and ready-to-run processes, you want it to be very close to your
> number of logical cores[1].
>
> Since it's going to be a spot decision for Make as to whether or not
> to spawn another child (if it hits its limit, it's not going to check
> again until after one of its children returns), there will be many
> race cases where the load average is high when it looks, but some
> other processes will return shortly afterward.[2] That means adding a
> process or two for a fudge factor.
>
> That's a lot of guess, though, and it still comes down to guess-and-check.
>
> emerge -j8 @world # MAKEOPTS="-j16 -l10"
>
> Was the first combination I tried. This completed in 89 minutes.
>
> emerge -j8 @world # MAKEOPTS="-j16 -l8"
>
> Was the second. This took significantly longer.
>
> I haven't tried higher than -l10; I needed this box to be able to
> do things, which meant installing more software. I've gone from 177
> packages to 466.
>
> [1] I don't have a hyperthreading system available, but I suspect that
> this is also going to be true of logical cores; it's my understanding
> that the overhead from overcommitting CPU comes primarily from context
> switching between processes, and hyperthreading adds CPU hardware
> specifically to reduce the need to context-switch in splitting
> physical CPU resources between threads/processes. So while you'd lose
> a little speed for an individual thread, you would gain it back in
> aggregate over both threads.
>
> [2] There would also be cases where the load average is low, such as
> if a Make recipe calls for a significant bit of I/O before it consumes
> a great deal of CPU, but a simple 7200rpm SATA disk appears to be
> sufficiently fast that this case is less frequent.
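
The threshold behaviour described in the quoted message can be sketched as a one-off check of the load average. This is only an illustration: below_limit is a hypothetical helper name, /proc/loadavg is Linux-specific, and the limit of 10 is taken from the -l10 run quoted in the message. GNU make does its own load sampling internally, so this shows the decision rule, not make's actual code.

```shell
#!/bin/sh
# Sketch only: below_limit is a hypothetical helper, not part of make.

# Succeed (exit 0) when the load value in $1 is below the limit in $2.
below_limit() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load < limit) }'
}

# The first field of /proc/loadavg is the 1-minute load average (Linux only).
read -r load1 _ < /proc/loadavg

# Mirror the spot decision: start another job only while under the limit.
if below_limit "$load1" 10; then
    echo "load $load1 is under 10: make -l10 could start another job"
else
    echo "load $load1 is at or above 10: make -l10 would hold off"
fi
```

Because the check happens only at the moment a job slot frees up, a brief spike in load can keep a job from starting even though the load drops again shortly afterward, which is the reason the quoted message gives for setting the limit slightly above the core count.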

Here's my experience:

I always get emerge failures on my Gentoo VMs if I set MAKEOPTS to anything higher than -j3. Not every package fails, but many do, including, IIRC, glibc and gcc.

This happens even when I make sure that only one emerge job is running, and even when I allocate more vCPUs than the -j value, on VMware and XenServer alike.

I don't know where the 'blame' lies, but I've found myself standardizing on MAKEOPTS=-j3 and EMERGE_DEFAULT_OPTS="--jobs --load-average=<1.6*num_of_vCPU>"

(Yes, no explicit number of jobs. Newer Portage versions are smart enough to keep starting new jobs until the load limit is reached.)
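
As a rough sketch, the settings above could be generated for a given VM like this. It assumes the emerge options belong in Portage's EMERGE_DEFAULT_OPTS variable in make.conf, and that nproc (GNU coreutils) reports the vCPU count; load_limit and print_conf are hypothetical helper names used only for illustration.

```shell
#!/bin/sh
# Sketch only: load_limit and print_conf are hypothetical helper names.

# Multiply a vCPU count by 1.6, the factor used in the message.
load_limit() {
    awk -v n="$1" 'BEGIN { printf "%.1f", n * 1.6 }'
}

# Print make.conf lines for the vCPU count given in $1.
print_conf() {
    printf 'MAKEOPTS="-j3"\n'
    printf 'EMERGE_DEFAULT_OPTS="--jobs --load-average=%s"\n' "$(load_limit "$1")"
}

# nproc is an assumption about the environment; fall back to 1 if it is missing.
print_conf "$(nproc 2>/dev/null || echo 1)"
```

The output is meant to be reviewed and pasted into /etc/portage/make.conf by hand, not appended automatically.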

Rgds,

--0015174bf02c539fe904b2bb1ae6--