Date: Sat, 2 Jun 2012 16:59:02 -0700
From: Brian Harring
To: Mike Frysinger
Cc: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash
Message-ID: <20120602235902.GC9296@localhost>
In-Reply-To: <201206011841.23302.vapier@gentoo.org>

On Fri, Jun 01, 2012 at 06:41:22PM -0400, Mike Frysinger wrote:
> # @FUNCTION: multijob_post_fork
> # @DESCRIPTION:
> # You must call this in the parent process after forking a child process.
> # If the parallel limit has been hit, it will wait for one to finish and
> # return the child's exit status.
> multijob_post_fork() {
>     [[ $# -eq 0 ]] || die "${FUNCNAME} takes no arguments"
>
>     : $(( ++mj_num_jobs ))
>     if [[ ${mj_num_jobs} -ge ${mj_max_jobs} ]] ; then
>         multijob_finish_one
>     fi
>     return $?
> }

Minor note: the design of this (fork, then check) means that when a job
finishes, we won't be ready with more work. Given a fast
job-identification step (the main thread) and slower job execution (the
backgrounded work), we'll never exceed #cores of parallelism, but we
won't sustain that level either, potentially leaving some idle cycles
on the floor.

Realistically, the main thread (whatever invokes post_fork) is *likely*
to be doing only minor work: mostly poking about figuring out the next
task/arguments to submit to the pool.
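(For reference, the consumption pattern I have in mind is roughly this;
multijob_init/multijob_finish per the rest of your proposal, child-side
status plumbing elided, and the worker function is a made-up example:)

    multijob_init
    for f in "${files[@]}" ; do
        # main thread: cheap bookkeeping, work out the next task
        ( compress_file "${f}" ) &
        multijob_post_fork || die "parallel job failed"
    done
    multijob_finish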
That bookkeeping isn't likely to amount to a full core's worth of work.
The original form of this was designed around the assumption that the
main thread is light and the backgrounded jobs aren't, so it effectively
behaved like make -j with #cores+1 jobs: #cores background jobs running
while the main thread carries on getting the next job ready. Once that
job was ready, it would block waiting for a slot to open, then submit
the job the moment it had reclaimed one.

On the surface it's a minor difference, but having the next job
immediately ready to fire makes it easier to saturate the cores.
Unfortunately, that also changes your API a bit; your call.
~harring
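PS: a rough sketch of the acquire-before-fork shape, in case it's
useful. This is not your API; the function name is made up, and it
assumes multijob_finish_one decrements mj_num_jobs when it reaps a
child:

    # Block for a free slot *before* forking, so the prepared job
    # launches the instant capacity opens up.
    multijob_acquire_slot() {
        while [[ ${mj_num_jobs} -ge ${mj_max_jobs} ]] ; do
            multijob_finish_one || return $?
        done
        : $(( ++mj_num_jobs ))
    }

    multijob_init
    for task in "${tasks[@]}" ; do
        args=$(figure_out_args "${task}")   # prep overlaps with running jobs
        multijob_acquire_slot || die "parallel job failed"
        ( run_one "${args}" ) &             # hypothetical worker
    done
    multijob_finish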