public inbox for gentoo-dev@lists.gentoo.org
* Re: [gentoo-dev]  Re: Re: Default src_install for EAPI-2 or following EAPI
From: Vaeth @ 2008-09-21 22:11 UTC
  To: gentoo-dev

Steve Long wrote:

> Vaeth wrote:
>
> > let me remark that the more clever way to this is
> > 
> >   [ -n "${DOCS}" ] && eval "dodoc ${DOCS}"
> >
> eval is _not_ clever. Try: /msg greybot eval
> ..or check http://wooledge.org:8000/BashFAQ/048

This is not at all related to my remark:
We were speaking about the variable DOCS, which is supposed to be
defined by a package author, not by an unreliable source.
Of course, unreliable data here may allow execution of arbitrary code,
but the package author can execute what he wants anyway.

> > This way, people can simply quote as they like:
> > 
> > DOCS="'filename with spaces' filename_without_space doc/*"
> >
> Yeuch.

Well: DOCS=('filename with spaces' filename_without_space doc/*)
I cannot see much difference: ( ) vs. " " is visually, IMHO, not worth
arguing about, but the former works only in bash, the latter
practically everywhere (and so shell programmers should be used to the
latter notation anyway).
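
To make the string form concrete, here is a minimal sketch of the
eval-based variant using only POSIX constructs. show_args is a
hypothetical stand-in for dodoc (it is not part of portage); it only
prints each argument it receives, so the word splitting becomes visible.

```bash
# show_args: hypothetical stand-in for dodoc; prints one argument per line.
show_args() {
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

DOCS="'filename with spaces' filename_without_space"

# Plain unquoted expansion: split on whitespace, quote characters stay literal.
plain=$(show_args ${DOCS})

# eval re-parses the string as shell code, so the inner quotes group words.
evaled=$([ -n "${DOCS}" ] && eval "show_args ${DOCS}")

printf 'plain:\n%s\n\neval:\n%s\n' "$plain" "$evaled"
```

The plain expansion yields four arguments, the eval'ed one yields two.
The price is that everything in DOCS is executed as shell code, so it
has to come from a trusted author, as argued above.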

> > or also
> > 
> > DOCS="just_one_filename_without_special_characters"
> >
> You don't need quotes there.

This is true, but I wanted to show the way most people will use it.

> > or also - when Push from /usr/bin/functions-eix.sh is used
> > (which might be implemented simpler without using other functions):
> >
> > Push DOCS 'filename with spaces' filename_without_space "${S}"/doc/*
> >
> Or just do DOCS+=(foo/* someFile 'some other File') at any point.

So the difference is saving two tokens. Is this worth cementing a
bash dependency forever in many scripts?

> BASH arrays will cope with *any* character apart from NUL, which isn't
> allowed in filenames. Can you _guarantee_ the same?

Yes, Push does _guarantee_ the same. It is actually rather simple to
implement: it puts each argument in '...', separated by spaces,
but first replaces every ' in the argument by '\'' (the last part is a bit
tricky to do in POSIX [although not really hard - only in functions-eix.sh
this is lengthy, because a more general replacement function is used
there]. For the time being, I would not even argue against implementing
Push in a sourced script in bash: this is only one place to change if one
wants more compatibility later on).
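
The quoting scheme described here can be sketched in a few lines of
POSIX sh. This is not the code from functions-eix.sh; quote_arg and
push_quoted are hypothetical names chosen for illustration, and the
helper variables are prefixed only because POSIX sh has no local
variables.

```bash
# quote_arg (hypothetical): print $1 wrapped in single quotes, with every
# embedded ' rewritten as '\'' (close quote, escaped quote, reopen quote).
quote_arg() {
    _qa_rest=$1
    _qa_out=
    while :; do
        case $_qa_rest in
            *"'"*)
                # Copy the text before the first ', then append '\''.
                _qa_out=$_qa_out${_qa_rest%%"'"*}"'\\''"
                _qa_rest=${_qa_rest#*"'"}
                ;;
            *)
                break
                ;;
        esac
    done
    printf "'%s'" "$_qa_out$_qa_rest"
}

# push_quoted (hypothetical): append the quoted arguments to the variable
# named by $1, separated by spaces, so the result is safe to eval later.
push_quoted() {
    _pq_var=$1
    shift
    for _pq_arg in "$@"; do
        eval "_pq_cur=\${$_pq_var}"
        _pq_new=$(quote_arg "$_pq_arg")
        eval "$_pq_var=\"\${_pq_cur:+\$_pq_cur }\$_pq_new\""
    done
}

DOCS=
push_quoted DOCS 'filename with spaces' "Vaeth's \"Latest\" Hits"
printf '%s\n' "$DOCS"

# Round trip: eval turns the string back into the original arguments.
eval "set -- $DOCS"
printf '%s\n' "$#" "$1" "$2"
```

Because the only character that needs special treatment inside '...' is
' itself, the round trip through eval returns exactly the arguments that
were pushed, whatever other characters they contain.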

> For instance, what if some crazy designer puts a file called:
> Vaeth's "Latest" Hits
> ..in that doc dir

no problem at all.

> Ebuilds require BASH; get over it.

My remark concerning arrays was meant to be general, not specific for
ebuilds/portage only (although I couldn't find a passage in the bible
where god claimed that ebuilds have to require bash. Actually, 99% of
the ebuilds would not need bash if they were modified in completely
trivial ways (for the remaining 1% it would need a bit more work)).
If one encourages people to write compatible ebuilds, maybe a change is
realistic even for portage some day (although I am completely aware
that this is not a reasonable project for the near future).

> BASH is as portable as GNU make is, and you clearly have no issue
> depending on that, and Python or C++.

Do you know which shell might be preferable in 5 years or 10 years?
Bets are good that that shell will at least support POSIX;
bets are worse that it will support the bash-specific
treatment of arrays.

> BTW, POSIX sh doesn't need ${DOCS} or ${S} either, you're just wasting
> characters.

Yes, but that's the gentoo-recommended way to write variables -
no need to change the style just for changing it.

> > the array-less solution is also much simpler to 
> > implement, easy to understand from the source, and clearer in usage.
>
> Not to me it's not, it looks awful, to read and to type, as well as being
> fragile.

Yes, typing two more symbols is a nightmare :)
"Fragile" is not the case as I showed above.

> Furthermore you're bringing eval into the script new people are going to
> look at to learn from (it's core functionality, fulfilling a basic task)

So why should people learn bashisms instead of compatible shell
programming? Eval is not bad in itself. It is only bad if used on
uncontrolled data, which I never suggested. (Actually, a warning in a
portage manpage might be didactically more effective concerning the
dangers of the uncontrolled usage of eval than many shell introductions.)

> >> if isArr DOCS; then
> >>    ((${#DOCS[@]})) && dodoc "${DOCS[@]}"
> >> else [[ $DOCS ]] && dodoc $DOCS
> >> fi
> > 
> > are just awful.
> 
> Actually if you factor out that isArr is a utility function (exactly like
> Push) that code is very easy to follow

Maybe my explanation was unclear here: I am not speaking about the code.
I am speaking about the way it behaves.
  DOCS='"a b"'   -> two files `"a' and `b"'
  DOCS=('"a b"') -> one file `"a b"'
this is just creating confusion by special cases.
If you say instead the argument is eval'ed, everybody who knows any shell
knows what is going on and that you have to quote correspondingly.
And the case distinction is necessary, since for arrays you cannot
shortcut (i.e. you can _never_ avoid the ( ) part) - for variables
you can (as you mentioned, in most cases you can even avoid the " " part).
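
The two cases listed above can be reproduced with a short bash sketch.
show_args is again a hypothetical stand-in for dodoc that prints the
argument count followed by each argument; the two calls mirror the two
branches of the quoted isArr code.

```bash
# show_args: hypothetical stand-in for dodoc; prints the argument count
# followed by each argument in angle brackets.
show_args() {
    printf '%d:' "$#"
    printf ' <%s>' "$@"
    printf '\n'
}

# Scalar branch: unquoted expansion splits on whitespace, quotes stay literal.
DOCS='"a b"'
scalar_result=$(show_args $DOCS)        # 2: <"a> <b">

# Array branch: each element is passed through as exactly one argument.
DOCS=('"a b"')
array_result=$(show_args "${DOCS[@]}")  # 1: <"a b">

printf '%s\n%s\n' "$scalar_result" "$array_result"
```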

> I'm willing to bet your sh scripts aren't really as portable as you think.
> If you want to see how portable sh is done, read:
> http://sources.redhat.com/autobook/autobook/autobook_210.html#SEC210
> (all of it) and then try to persuade us that we should be writing ebuilds
> like that.

This is an old rhetorical trick (I don't know its name in English):
You impute that I claimed things which I never said - of course, then it
is easy for you to prove that these things are wrong.
I _never_ suggested using stone-age code for ebuilds (I did more
for the eix scripts, and I think that I have meanwhile succeeded for all
architectures supported by gentoo, but I never suggested this for
everybody. BTW: Even for these architectures only very few differences
from POSIX arose - the really old shells which do not even have
functions, or which have other odd bugs, seem to have really gone extinct.
But this is a different topic).

However, I strongly suggest avoiding bashisms unless absolutely
necessary and reasonable. There are scripts where this is reasonable,
but far too many scripts which use them do not belong to this category.
Using arrays to pass parameters is one of the cases of unnecessary usage
(although this is not widely known - that's the main reason why I posted
the remark).




* [gentoo-dev]  Re: Re: Re: Default src_install for EAPI-2 or following EAPI
From: Steve Long @ 2008-09-22  0:35 UTC
  To: gentoo-dev

[Sorry for length]
Vaeth wrote:

> Steve Long wrote:
> 
>> Vaeth wrote:
>>
>> > let me remark that the more clever way to this is
>> > 
>> >   [ -n "${DOCS}" ] && eval "dodoc ${DOCS}"
>> >
>> eval is _not_ clever. Try: /msg greybot eval
>> ..or check http://wooledge.org:8000/BashFAQ/048
> 
> This is not at all related with my remark:
> We were speaking about the variable DOCS, which is supposed to be
> defined by a package author, not by an unreliable source.
> Of course, unreliable data here may allow execution of arbitrary code,
> but the package author can execute what he wants anyway.
>
My point wasn't about security so much as the fact that the author has to
worry about how the filenames will be interpreted. You state that
saying "it will be eval'ed" is enough. I disagree, as it makes it trickier
to handle.
 
>> > This way, people can simply quote as they like:
>> > 
>> > DOCS="'filename with spaces' filename_without_space doc/*"
>> >
>> Yeuch.
> 
> Well: DOCS=('filename with spaces' filename_without_space doc/*)
> I cannot see much difference: ( ) vs. " " would optically IMHO not be a
> reason to discuss, but the former works only in bash, the latter
> practically everyhwere (and so shell programmers should be used to the
> latter notation anyway).
>
That's the thing though; most Gentoo devs don't appear to be shell
programmers, and certainly not POSIX sh ones. BASH is simply much more
convenient to work with, especially if you are used to another language
(that has arrays for example.) That convenience adds up to saved time and
cleaner code.

Again, your formulation only works with eval. It doesn't work easily as a
generic thing; it requires thinking about, mental effort from devs who are
already overstretched. I guess it comes down to the debate over saving
programmer time vs CPU time.
 
>> > or also
>> > 
>> > DOCS="just_one_filename_without_special_characters"
>> >
>> You don't need quotes there.
> 
> This is true, but I wanted to show the way most people will use it.
>
Sure, but people should also be learning when quotes are needed and when
not; that is fundamental to shell-scripting after all?
 
>> > or also - when Push from /usr/bin/functions-eix.sh is used
>> > (which might be implemented simpler without using other functions):
>> >
>> > Push DOCS 'filename with spaces' filename_without_space "${S}"/doc/*
>> >
>> Or just do DOCS+=(foo/* someFile 'some other File') at any point.
> 
> So the difference is saving two tokens. Is this worth to cement
> bash-dependency forever in many scripts?
>
No, my point was that it's part of basic BASH syntax, so anyone looking at
it who knows BASH knows exactly what it does, without having to dig through
an eclass or the like to make sure. It's cleaner to work with in the lib
code too.
 
>> BASH arrays will cope with *any* character apart from NUL, which isn't
>> allowed in filenames. Can you _guarantee_ the same?
> 
> Yes, Push does _guarantee_ the same. It is actually rather simple to
> implement: It puts its argument in '...', separated by spaces,
> but replaces ' in the arguments before by '\'' (the last part is a bit
> tricky to do in POSIX [although not really hard - only in functions-eix.sh
> this is lengthy, because a more general replacement function is used
> there]. For the time being, I would not even argue against implementing
> Push in a sourced script in bash: This is only one place to change if one
> wants more compatibility later on).
>
Cool, I've seen that trick in makefiles (kernel uses it for echoing cmds
iirc.) If you're stuck with a shell that only implements a "stone-age"
standard, designed to allow a base common-denominator 15 or 20 years ago,
fair enough ;p
 
>> Ebuilds require BASH; get over it.
> 
> My remark concerning arrays was meant to be general, not specific for
> ebuilds/portage only (although I couldn't find a passage in the bible
> where god claimed that ebuilds have to require bash.
Yes, hyperbole aside: ebuilds have been built on BASH from the start.

> Actually, 99% of 
> the ebuilds would not need bash, if they would be modified in a completely
> trivial ways (for the remaining 1% it would need a bit more work)).
> If one encourages people to write ebuilds compatible, maybe even for
> portage some day a change is realistic (although I am completely aware
> that this is not a reasonable project for the near future).
>
The thing is those changes make the code harder to read and maintain, which
matters for the target scripters. It's important to be able to look at the
script and tell what it does quickly; it's also important to be able to
write and update it quickly and relatively easily.
 
>> BASH is as portable as GNU make is, and you clearly have no issue
>> depending on that, and Python or C++.
> 
> Do you know which shell might be preferrable in 5 years or 10 years?
> Bets are good that thos shell will at least support POSIX;
> bets are worse that this shell will support the bash-specific
> treatment of arrays.
>
ksh, zsh and bash all have arrays. Since POSIX came along, development has
moved forward (remember), and most next-generation shells (ie not those
aiming for the embedded space, but for general use) have included arrays.

Put it another way: do you believe the GNU shell in 5 or 10 years time will
not support arrays?
 
>> BTW, POSIX sh doesn't need ${DOCS} or ${S} either, you're just wasting
>> characters.
> 
> Yes, but that's the gentoo-recommended way to write variables -
> no need to change the style just for changing it.
>
Well OK, but imo no need to use it, since repoman deals fine with variables
without braces. Changing the style to make it easier to work with strikes
me as a good idea. (Especially when so many beginners think it means you
don't have to quote; it's just a distraction from learning what really
matters.)
 
>> > the array-less solution is also much simpler to
>> > implement, easy to understand from the source, and clearer in usage.
>>
>> Not to me it's not, it looks awful, to read and to type, as well as being
>> fragile.
> 
> Yes, two symbols to type more is a nightmare :)
> "Fragile" is not the case as I showed above.
>
Again, it's not the two symbols. It's having to parse or write that string.
 
>> Furthermore you're bringing eval into the script new people are going to
>> look at to learn from (it's core functionality, fulfilling a basic task)
> 
> So why should people learn bashisms instead of compatible shell
> programming?

Precisely because bashisms are the features that have been added by people
who really know Unix to make their system administration easier. These are
the people who really know scripting in an environment where the scarcest
resource is human time.

>> Actually if you factor out that isArr is a utility function (exactly like
>> Push) that code is very easy to follow
> 
> Maybe my explanation was unclear here: I am not speaking about the code.
> I am speaking about the way it behaves.
>   DOCS='"a b"'   -> two files `"a' and `b"'
>   DOCS=('"a b"') -> one file `"a b"'
> this is just creating confusion by special cases.

No, it's providing two ways to specify a config variable. One is the
backward compatible manner, so that old ebuilds won't break, and people can
continue to use the method they're used to for simple things. The other is
the way for the ebuild author to specify more complex cases.

I know for a fact that users like having both. It's providing mechanism, and
not enforcing policy. "You must make sure your variables are in a fit state
to be eval'ed" is the opposite; it both takes away an option and restricts
what the user can easily do.

And you said yourself above you couldn't see much difference (although the
BASH version is a bit cleaner.) All I'll say is the BASH arrays mean you
always know what you're quoting; if you use 'a b' it's always one
parameter, exactly like all the other quoting you do.

> If you say instead the argument is eval'ed, everybody who knows any shell
> knows what is going on and that you have to quote correspondingly.
> And the case distinction is necessary, since for arrays you cannot
> shortcut (i.e. you can _never_ avoid the ( ) part) - for variables
> you can (as you mentioned, in most cases you can even avoid the " " part).
>
You can only avoid quotes (and I prefer '' unless I want variable expansion)
when it's a single token with no characters like < > ( ) & | or ; which
affect tokenisation (a $ obviously affects things too). The glob characters
[ ? and * don't actually matter, since pathname expansion doesn't happen in
assignment.
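
The point about pathname expansion can be checked with a small sketch.
It assumes mktemp -d is available and uses a scratch directory with two
hypothetical files, a.txt and b.txt.

```bash
prev=$PWD
workdir=$(mktemp -d) || exit 1
touch "$workdir/a.txt" "$workdir/b.txt"
cd "$workdir" || exit 1

DOCS=*.txt            # assignment: the pattern is stored literally
literal=$DOCS
set -- $DOCS          # unquoted use: only now is the pattern expanded
count=$#

cd "$prev" && rm -rf "$workdir"
printf 'stored: %s\nexpanded to %s files\n' "$literal" "$count"
```

The variable holds the literal string *.txt; the two files only appear
once the value is used unquoted in a command.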

You can say "everyone knows what is going on" but beginners simply don't,
and even advanced sh scripters sometimes get their eval strings wrong.
Devoting the extra headspace when you're just trying to get a bug fixed, or
your first ebuild written, is just counter to maintaining a distribution
imo.

As for the case distinction, the ebuild author or maintainer doesn't need to
make it. It's only relevant for an eclass or base function which actually
handles the variable in question, either using it to carry out a task for
the ebuild author, or manipulating it.

It would be easy enough to convert it to array once after sourcing the
ebuild so that all functions could rely on it being an array, if that's
desired, so that the test would only be run once. Granted it would be a bit
more complex if it had to operate on a list of those variables, but it
wouldn't need eval, since BASH has syntax designed to obviate the need for
eval in nearly all cases.
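
A minimal bash sketch of that one-off conversion follows. Neither
function is existing portage or eclass code: is_array is one possible
way to write the isArr helper mentioned earlier, and normalize_docs is a
hypothetical name for the conversion step.

```bash
# is_array (hypothetical): succeed if the named variable is an indexed array.
is_array() {
    [[ $(declare -p "$1" 2>/dev/null) == "declare -a"* ]]
}

# normalize_docs (hypothetical): convert a scalar DOCS to an array, once,
# so that later code can always use "${DOCS[@]}".
normalize_docs() {
    if ! is_array DOCS; then
        # Deliberately unquoted: a legacy scalar is split on whitespace
        # (and globbed), as it would be in a plain "dodoc $DOCS" call.
        DOCS=( $DOCS )
    fi
}

DOCS='README ChangeLog'
normalize_docs
printf '%d entries, first is %s\n' "${#DOCS[@]}" "${DOCS[0]}"
```

After this step a caller would only need the array form, for example
((${#DOCS[@]})) && dodoc "${DOCS[@]}", and no eval is involved.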

>> I'm willing to bet your sh scripts aren't really as portable as you
>> think. If you want to see how portable sh is done, read:
>> http://sources.redhat.com/autobook/autobook/autobook_210.html#SEC210
>> (all of it) and then try to persuade us that we should be writing ebuilds
>> like that.
> 
> This is an old rhetorical trick (I don't know its name in English):
> You impute that I claimed things which I never said - of course, then it
> is easy for you to prove that these things are wrong.
What, like saying my point was only about saving two tokens?

> I _never_ suggested to use code from stone-age for ebuilds
You did as far as I am concerned.

> (I did more 
> for the eix scripts, and I think that I succeeded meanwhile for all
> architectures supported by gentoo, but I did never suggest this for
> everybody.
I see; so you, a competent and knowledgeable sh scripter, are not even sure
whether your sh code works on every arch supported by Gentoo? Whereas BASH
is running on every single one of those and clearly ebuilds run on all of
them, or they wouldn't be supported. That reinforces my point about BASH
portability, which was actually why I posted the link to that doc.

> BTW: Even for these architectures only very few differences 
> from POSIX arose - these really old shells which do not have even
> functions or other odd bugs seem to have really extinct. But this is a
> different topic).
> 
> However, I strongly suggest to avoid bashisms unless absolutely
> necessary and reasonable. There are scripts where this is reasonable,
> but far too many scripts which use it do not belong to this category.

You seem to be mixing up reasonable and necessary in the last sentence. Granted
ebuilds don't need bashisms in many cases; many could indeed be rewritten
to only use sh. Nonetheless, it's not about getting absolutely the most
efficient use of the processor, but about making it easy for people to
write and maintain ebuilds and eclasses.

Given things like the awkwardness and loss of flexibility[1] in only using [
it's entirely reasonable to specify that Gentoo ebuilds use BASH.
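
As a small illustration of the kind of difference meant here (a sketch
only, with a made-up filename; the linked FAQ covers the details), the
same check written with [[ and with POSIX constructs might look like:

```bash
file='name with spaces.txt'

# [[ ]] does not word-split its operands and can match patterns directly.
if [[ $file == *.txt ]]; then
    bash_result=match
fi

# With [ ] the operand must be quoted, and pattern matching needs case.
if [ -n "$file" ]; then
    case $file in
        *.txt) posix_result=match ;;
    esac
fi

printf '%s %s\n' "$bash_result" "$posix_result"
```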

> Using arrays to pass parameters is one of the cases of unnecessary usage
> (although this is not widely known - that's that main reason why I posted
> the remark).

Thanks for the discussion, although I do feel we're covering old ground.[2]
Given that ebuilds need BASH, have always needed BASH, and will continue to
do so, can we get on with actually using BASH and not BASHiSH?

[1] http://wooledge.org:8000/BashFAQ/031
[2] http://thread.gmane.org/gmane.linux.gentoo.devel/52102






* [gentoo-dev]  Re: Default src_install for EAPI-2 or following EAPI
From: Duncan @ 2008-09-22  8:25 UTC
  To: gentoo-dev

Steve Long <slong@rathaus.eclipse.co.uk> posted
gb6pop$997$1@ger.gmane.org, excerpted below, on  Mon, 22 Sep 2008 01:35:57
+0100:

>> This is an old rhetorical trick (I don't know its name in English): You
>> impute that I claimed things which I never said - of course, then it is
>> easy for you to prove that these things are wrong.
> What, like saying my point was only about saving two tokens?
> 
>> I _never_ suggested to use code from stone-age for ebuilds
> You did as far as I am concerned.

Careful please, both of you.  This bit looks like it could be headed 
personal, and I don't believe that's in the interest of anyone.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman





* [gentoo-dev]  [project] Re: Default src_install for EAPI-2 or following EAPI
From: Steve Long @ 2008-09-24  1:13 UTC
  To: gentoo-dev; +Cc: gentoo-project

Duncan wrote:

> Steve Long <slong@rathaus.eclipse.co.uk> posted
> gb6pop$997$1@ger.gmane.org, excerpted below, on  Mon, 22 Sep 2008 01:35:57
> +0100:
> 
>>> This is an old rhetorical trick (I don't know its name in English): You
>>> impute that I claimed things which I never said - of course, then it is
>>> easy for you to prove that these things are wrong.
>> What, like saying my point was only about saving two tokens?
>> 
>>> I _never_ suggested to use code from stone-age for ebuilds
>> You did as far as I am concerned.
> 
> Careful please, both of you.  This bit looks like it could be headed
> personal, and I don't believe that's in the interest of anyone.
> 
Eh, I feel that's slightly taken out of context, in that we both smiled at
each other during the course of the discussion, and I wouldn't have made
such a long post if I hadn't both respected Vaeth and thought it was a
technical discussion. (Now that we have project, I'm cross-posting this
for those who only follow dev; please follow up to project if you wish to
comment further.) I did thank him at the end too, to keep it civil.

Having said that, you're right that it could be read like that, and I
applaud your acting to stop any suggestion of it.

Guess I was being a bit Germanic ;-)






* [gentoo-dev]  Re: [project] Re: Default src_install for EAPI-2 or following EAPI
From: Duncan @ 2008-09-24  6:42 UTC
  To: gentoo-dev; +Cc: gentoo-project

Steve Long <slong@rathaus.eclipse.co.uk> posted
gbc4n1$2g1$1@ger.gmane.org, excerpted below, on  Wed, 24 Sep 2008 02:13:12
+0100:

>  we both smiled at each other

I missed that.  Thanks.  "Over and out."

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



