* [gentoo-portage-dev] blech... (multijob/multiprocessing work-around for cygwin)
@ 2012-09-24 10:25 Gregory M. Turner
2012-09-24 16:08 ` Zac Medico
2012-09-24 17:05 ` Mike Frysinger
0 siblings, 2 replies; 5+ messages in thread
From: Gregory M. Turner @ 2012-09-24 10:25 UTC (permalink / raw
To: gentoo-portage-dev
On cygwin, there is a problem with bi-directional pipe support in bash.
I used to solve this with an ugly reversion in portage and an
ultra-simple stubbification patch for multiprocessing.eclass (both
serialized everything).
However, this really sucked for numerous reasons, including the obvious
one: it makes stuff slow as hell.
So I've reworked everything to use uni-directional named pipes.
In portage I have:
diff --git a/bin/helper-functions.sh b/bin/helper-functions.sh
index c7400fa..87f3120 100644
--- a/bin/helper-functions.sh
+++ b/bin/helper-functions.sh
@@ -7,6 +7,36 @@
source "${PORTAGE_BIN_PATH:-/usr/lib/portage/bin}"/isolated-functions.sh
+# try real hard to figure out if this is a cygwin host; cache results.
+this_host_is_cygwin() {
+ if [[ -n ${_this_host_is_cygwin} ]] ; then
+ return $_this_host_is_cygwin
+ fi
+ [[ -x ${EPREFIX}/usr/bin/uname ]] && \
+ [[ $( ${EPREFIX}/usr/bin/uname -o 2>/dev/null ) == Cygwin* ]] && \
+ export _this_host_is_cygwin=0 && return 0
+ [[ -x /usr/bin/uname ]] && \
+ [[ $( /usr/bin/uname -o 2>/dev/null ) == Cygwin* ]] && \
+ export _this_host_is_cygwin=0 && return 0
+ [[ -x /bin/uname ]] && \
+ [[ $( /bin/uname -o 2>/dev/null ) == Cygwin* ]] && \
+ export _this_host_is_cygwin=0 && return 0
+ # hail-mary before we resort to envvars
+ [[ $( uname -o 2>/dev/null ) == Cygwin* ]] && \
+ export _this_host_is_cygwin=0 && return 0
+
+ [[ -n ${CHOST} ]] && \
+ [[ ${CHOST} == *-cygwin* ]] && \
+ export _this_host_is_cygwin=0 && return 0
+ [[ -n ${CTARGET} ]] && \
+ [[ ${CTARGET} == *-cygwin* ]] && \
+ export _this_host_is_cygwin=0 && return 0
+
+ # either it ain't cygwin or something is very broken.
+ export _this_host_is_cygwin=1
+ return 1
+}
+
#
# API functions for doing parallel processing
#
@@ -19,25 +49,51 @@ numjobs() {
multijob_init() {
# Setup a pipe for children to write their pids to when they finish.
- mj_control_pipe=$(mktemp -t multijob.XXXXXX)
- rm "${mj_control_pipe}"
- mkfifo "${mj_control_pipe}"
- redirect_alloc_fd mj_control_fd "${mj_control_pipe}"
+ export mj_control_pipe=$(mktemp -t multijob.XXXXXX)
rm -f "${mj_control_pipe}"
+ mkfifo "${mj_control_pipe}"
+
+ if ! this_host_is_cygwin ; then
+ redirect_alloc_fd mj_control_fd "${mj_control_pipe}"
+ rm -f "${mj_control_pipe}"
+ fi
# See how many children we can fork based on the user's settings.
mj_max_jobs=$(numjobs)
mj_num_jobs=0
}
+# make sure someone called multijob_init
+multijob_assert() {
+ if this_host_is_cygwin ; then
+ [[ -z ${mj_control_pipe} ]] && \
+ die "multijob initialization required"
+ [[ $( file -b "${mj_control_pipe}" ) != fifo* ]] && \
+ die "multijob fifo gone"
+ else
+ [[ -z ${mj_control_fd} ]] && \
+ die "multijob initialization required"
+ fi
+}
+
multijob_child_init() {
- trap 'echo ${BASHPID} $? >&'${mj_control_fd} EXIT
+ multijob_assert
+ if this_host_is_cygwin ; then
+ trap 'echo ${BASHPID} $? >'${mj_control_pipe} EXIT
+ else
+ trap 'echo ${BASHPID} $? >&'${mj_control_fd} EXIT
+ fi
trap 'exit 1' INT TERM
}
multijob_finish_one() {
local pid ret
- read -r -u ${mj_control_fd} pid ret
+ multijob_assert
+ if this_host_is_cygwin ; then
+ read -r pid ret < ${mj_control_pipe}
+ else
+ read -r -u ${mj_control_fd} pid ret
+ fi
: $(( --mj_num_jobs ))
return ${ret}
}
@@ -70,6 +126,9 @@ multijob_post_fork() {
redirect_alloc_fd() {
local var=$1 file=$2 redir=${3:-"<>"}
+ [[ "${redir}" == "<>" ]] && this_host_is_cygwin && \
+ die "Cygwin bash has broken <> bidirectional redirection support."
+
if [[ $(( (BASH_VERSINFO[0] << 8) + BASH_VERSINFO[1] )) -ge $(( (4 <<
8) + 1 )) ]] ; then
# Newer bash provides this functionality.
eval "exec {${var}}${redir}'${file}'"
--
(Here's hoping Thunderbird didn't reflow the above; apologies in advance
if it did)
In multiprocessing.eclass, I do something fairly similar, also relying
on named pipes rather than file-descriptor cloning.
I was wondering if anyone in the know could comment on how correct or
incorrect the above actually is? For example, is either portage
multijob or mutliprocessing.eclass expected to work if someone nests
multijob_init's (because I'm pretty sure mine won't -- FTR, this seems
pretty naughty, given that both get their parallelism defaults from
MAKEOPTS)?
Finally, to be clear, I'm not wanting this to go in to git or
anything... at least, certainly not as of now. Just seeking
constructive criticism so my code isn't horribly broken, since there
aren't a ton of multi{jobs,processing} clients, ATM, for me to test against.
-gmt
Postscript: status of this bug in cygwin, for anyone who may be
interested/googling:
I've tried to look into this, just a little bit, because if I run an
ultra-simple bidirectional pipe test in C, it almost kinda-sorta works
(although some weird pump-priming seems to be required ... blah).
bash's code for this doesn't exactly make for light reading, nor does
cygwin's named pipe code, although the latter is at least not mixed up
with a bunch of parser/generator code. Given that I'm not even sure the
problem isn't already fixed upstream, it didn't seem worth the effort.
The matter has been mentioned on the cygwin mailing list, and cgf,
cygwin lead dev and implementation SME, has stated he doesn't want the
named-pipe code churning for now. So, unless the fix is already
in-repo, but unreleased (my over-under on this would be maybe 3:1
against; I'm too lazy to check), it's likely to stay this way for a while.
My personal game-plan is: just wait until the next cygwin core release
and see if it's fixed. Until then, I have what seem to be tolerable
work-arounds for my overlay, as described above. If the next release
fails to resolve the problem, then I might revisit and see if I can't
find a solution, or at least a work-around in bash.
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [gentoo-portage-dev] blech... (multijob/multiprocessing work-around for cygwin)
2012-09-24 10:25 [gentoo-portage-dev] blech... (multijob/multiprocessing work-around for cygwin) Gregory M. Turner
@ 2012-09-24 16:08 ` Zac Medico
2012-09-24 19:17 ` Gregory M. Turner
2012-09-24 17:05 ` Mike Frysinger
1 sibling, 1 reply; 5+ messages in thread
From: Zac Medico @ 2012-09-24 16:08 UTC (permalink / raw
To: gentoo-portage-dev
On 09/24/2012 03:25 AM, Gregory M. Turner wrote:
> +# try real hard to figure out if this is a cygwin host; cache results.
> +this_host_is_cygwin() {
> + if [[ -n ${_this_host_is_cygwin} ]] ; then
> + return $_this_host_is_cygwin
> + fi
> + [[ -x ${EPREFIX}/usr/bin/uname ]] && \
> + [[ $( ${EPREFIX}/usr/bin/uname -o 2>/dev/null ) == Cygwin* ]] && \
> + export _this_host_is_cygwin=0 && return 0
You could probably just assume that uname is in $PATH and remove all the
other tests.
> multijob_finish_one() {
> local pid ret
> - read -r -u ${mj_control_fd} pid ret
> + multijob_assert
> + if this_host_is_cygwin ; then
> + read -r pid ret < ${mj_control_pipe}
> + else
> + read -r -u ${mj_control_fd} pid ret
> + fi
> : $(( --mj_num_jobs ))
> return ${ret}
> }
Since you re-open the pipe for each read, there's a race condition which
I've explained here:
http://permalink.gmane.org/gmane.linux.gentoo.devel/77528
--
Thanks,
Zac
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [gentoo-portage-dev] blech... (multijob/multiprocessing work-around for cygwin)
2012-09-24 10:25 [gentoo-portage-dev] blech... (multijob/multiprocessing work-around for cygwin) Gregory M. Turner
2012-09-24 16:08 ` Zac Medico
@ 2012-09-24 17:05 ` Mike Frysinger
2012-09-25 0:35 ` Brian Harring
1 sibling, 1 reply; 5+ messages in thread
From: Mike Frysinger @ 2012-09-24 17:05 UTC (permalink / raw
To: gentoo-portage-dev; +Cc: Gregory M. Turner
[-- Attachment #1: Type: Text/Plain, Size: 706 bytes --]
On Monday 24 September 2012 06:25:35 Gregory M. Turner wrote:
> On cygwin, there is a problem with bi-directional pipe support in bash.
>
> I used to solve this with an ugly reversion in portage and an
> ultra-simple stubbification patch for multiprocessing.eclass (both
> serialized everything).
>
> However, this really sucked for numerous reasons, including the obvious
> one: it makes stuff slow as hell.
if cygwin sucks, it doesn't get parallel jobs. add stubs to the end of the
file to disable parallel support.
if is_cygwin ; then
numjobs() { return 1; }
multijob_init() { return 0; }
...etc...
fi
interleaving the cygwin logic otherwise makes it unmaintainable
-mike
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [gentoo-portage-dev] blech... (multijob/multiprocessing work-around for cygwin)
2012-09-24 16:08 ` Zac Medico
@ 2012-09-24 19:17 ` Gregory M. Turner
0 siblings, 0 replies; 5+ messages in thread
From: Gregory M. Turner @ 2012-09-24 19:17 UTC (permalink / raw
To: gentoo-portage-dev
On 9/24/2012 9:08 AM, Zac Medico wrote:
> On 09/24/2012 03:25 AM, Gregory M. Turner wrote:
>> multijob_finish_one() {
>> local pid ret
>> - read -r -u ${mj_control_fd} pid ret
>> + multijob_assert
>> + if this_host_is_cygwin ; then
>> + read -r pid ret < ${mj_control_pipe}
>> + else
>> + read -r -u ${mj_control_fd} pid ret
>> + fi
>> : $(( --mj_num_jobs ))
>> return ${ret}
>> }
> Since you re-open the pipe for each read, there's a race condition which
> I've explained here:
>
> http://permalink.gmane.org/gmane.linux.gentoo.devel/77528
Gross, thanks for this pointer, Zac. Plenty of solutions one could
concoct, but no point dumping effort into this black-hole when I can
just wait for somebody else to get fed up and fix the underlying problem
for me :)
Guess it's back to stubs, for now.
-gmt
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [gentoo-portage-dev] blech... (multijob/multiprocessing work-around for cygwin)
2012-09-24 17:05 ` Mike Frysinger
@ 2012-09-25 0:35 ` Brian Harring
0 siblings, 0 replies; 5+ messages in thread
From: Brian Harring @ 2012-09-25 0:35 UTC (permalink / raw
To: gentoo-portage-dev; +Cc: Gregory M. Turner
On Mon, Sep 24, 2012 at 01:05:35PM -0400, Mike Frysinger wrote:
> On Monday 24 September 2012 06:25:35 Gregory M. Turner wrote:
> > On cygwin, there is a problem with bi-directional pipe support in bash.
> >
> > I used to solve this with an ugly reversion in portage and an
> > ultra-simple stubbification patch for multiprocessing.eclass (both
> > serialized everything).
> >
> > However, this really sucked for numerous reasons, including the obvious
> > one: it makes stuff slow as hell.
>
> if cygwin sucks, it doesn't get parallel jobs. add stubs to the end of the
> file to disable parallel support.
>
> if is_cygwin ; then
> numjobs() { return 1; }
> multijob_init() { return 0; }
> ...etc...
> fi
>
> interleaving the cygwin logic otherwise makes it unmaintainable
Seconded; if cygwin environment sucks for this, just suppress the
parallelization for it. Better to be working then not, plus I doubt
cygwin users are going to notice a huge diff in performance. :)
~harring
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2012-09-25 3:15 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-09-24 10:25 [gentoo-portage-dev] blech... (multijob/multiprocessing work-around for cygwin) Gregory M. Turner
2012-09-24 16:08 ` Zac Medico
2012-09-24 19:17 ` Gregory M. Turner
2012-09-24 17:05 ` Mike Frysinger
2012-09-25 0:35 ` Brian Harring
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox