From: Brian Harring <ferringb@gmail.com>
To: Michał Górny <mgorny@gentoo.org>
Cc: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] multiprocessing.eclass: doing parallel work in bash
Date: Sat, 2 Jun 2012 16:47:26 -0700
Message-ID: <20120602234726.GB9296@localhost>
In-Reply-To: <4FCA989E.3050307@gentoo.org>
On Sat, Jun 02, 2012 at 03:50:06PM -0700, Zac Medico wrote:
> On 06/02/2012 02:31 PM, Michał Górny wrote:
> > On Sat, 2 Jun 2012 15:54:03 -0400
> > Mike Frysinger <vapier@gentoo.org> wrote:
> >
> >> # @FUNCTION: redirect_alloc_fd
> >> # @USAGE: <var> <file> [redirection]
> >> # @DESCRIPTION:
> >
> > (...and a lot of code)
> >
> > I may be wrong but wouldn't it be simpler to just stick with a named
> > pipe here? Well, at first glance you wouldn't be able to read exactly
> > one result at a time but is it actually useful?
>
> I'm pretty sure that the pipe has to remain constantly open in read mode
> (which can only be done by assigning it a file descriptor). Otherwise,
> there's a race condition that can occur, where a write is lost because
> it's written just before the reader closes the pipe.
There isn't a race; write side, it'll block once it exceeds pipe buf
size; read side, bash's read functionality is explicitly byte by byte
reads to avoid consuming data it doesn't need.
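
The read-side behaviour described above can be checked with a minimal sketch. It assumes bash 4.1+ (for the `{fd}<>` automatic fd allocation) and Linux-style semantics for opening a fifo read/write; the variable names are illustrative and are not taken from the eclass.

```shell
#!/bin/bash
# Sketch: two results are queued on a fifo held open on one fd, and each
# `read` consumes exactly one line, leaving the rest for the next call.
tmpdir=$(mktemp -d)
mkfifo "$tmpdir/results"

# Open read/write so the open itself does not block and the pipe never
# sees EOF while we hold the fd (assumption: bash 4.1+, Linux semantics).
exec {fd}<>"$tmpdir/results"
rm -rf "$tmpdir"            # the open fd keeps the pipe alive

printf '%s\n' first second >&"$fd"

read -r -u "$fd" a          # takes only "first"
read -r -u "$fd" b          # "second" is still there to be read

exec {fd}>&-
echo "a=$a b=$b"
```

Because the input is a pipe and cannot be seeked back, bash reads it one byte at a time, so the first `read` stops right after the first newline.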
That said, mgorny's suggestion ignores that the code is already
pointed at a fifo. Presumably he's suggesting "just open it every time
you need to fuck with it"... which, sure, 'cept that complicates the
read side: either you find a free fd, open the fifo on it, read, then
close it, or you abuse cat or $(<) to pull the results and make the
reclaim code handle multiple results in a single shot.
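
For comparison, here is a rough sketch of the "reopen it every time" variant using `$(<)`. It is only an illustration under assumed names (`batch`, `reclaimed`, `failed`), not code from the eclass: since `$(<fifo)` reads until every writer has closed, the reclaim side gets a batch and has to loop over it.

```shell
#!/bin/bash
# Sketch: pull results with $(<fifo) instead of a persistent fd.
tmpdir=$(mktemp -d)
fifo=$tmpdir/results
mkfifo "$fifo"

# One background writer reports three exit statuses through a single open.
{ echo 0; echo 1; echo 0; } > "$fifo" &

# Blocks until a writer opens the fifo, then reads until all writers close.
batch=$(<"$fifo")
wait

reclaimed=0
failed=0
while read -r status; do
    reclaimed=$((reclaimed + 1))
    if [ "$status" -ne 0 ]; then
        failed=$((failed + 1))
    fi
done <<< "$batch"

rm -rf "$tmpdir"
echo "reclaimed=$reclaimed failed=$failed"
```

With several independent writers the reader can also hit EOF between two of them, so a real implementation would have to reopen and retry, which is the extra complexity being objected to.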
Frankly, I don't see the point in doing that. The code isn't that
complex, and we *need* the overhead of this to be minimal;
the hand off/reclaim is effectively the bottleneck for scaling.
If the jobs you've backgrounded take a second apiece, it matters less;
if they're quick little bursts of activity, the scaling *will* be
limited by how fast we can blast off/reclaim jobs. Keep in mind that
the main process has to go find more work to queue up between the
reclaims, thus this matters more than you'd think.
Either way, that limit varies depending on the time required for each job
vs the # of cores; that said, run code like this on a 48-core box and you
see it start becoming an actual bottleneck (which is why I came up
with this hacky bash semaphore).
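
A fifo-backed semaphore of the kind discussed in this thread can be sketched as follows. This is a minimal illustration, not the eclass code: `max_jobs`, `run_job`, and `reclaim_one` are hypothetical names, and it assumes bash 4.1+ and Linux-style read/write opens on a fifo.

```shell
#!/bin/bash
# Sketch: cap the number of background jobs, reclaiming one result per read.
max_jobs=2
running=0
finished=0
failures=0

tmpdir=$(mktemp -d)
mkfifo "$tmpdir/sem"
exec {sem_fd}<>"$tmpdir/sem"   # kept open for the whole run
rm -rf "$tmpdir"

reclaim_one() {
    local status
    read -r -u "$sem_fd" status        # blocks until some job reports
    running=$((running - 1))
    finished=$((finished + 1))
    if [ "$status" -ne 0 ]; then
        failures=$((failures + 1))
    fi
}

run_job() {
    # Hand-off: wait for a free slot, then background the command.
    while [ "$running" -ge "$max_jobs" ]; do
        reclaim_one
    done
    ( status=0; "$@" || status=$?; echo "$status" >&"$sem_fd" ) &
    running=$((running + 1))
}

for _ in 1 2 3 4 5; do
    run_job true
done
run_job false

# Drain whatever is still in flight.
while [ "$running" -gt 0 ]; do
    reclaim_one
done
exec {sem_fd}>&-
echo "finished=$finished failures=$failures"
```

Each hand-off costs a fork and each reclaim costs one `read` on an already open fd, which is why keeping the fd open matters when the jobs themselves are short.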
~harring