public inbox for gentoo-pms@lists.gentoo.org
 help / color / mirror / Atom feed
* [gentoo-pms] [PATCH] EAPI must be at least a single char.
@ 2012-09-30  8:28 Brian Harring
  2012-09-30  9:13 ` Ulrich Mueller
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Harring @ 2012-09-30  8:28 UTC (permalink / raw
  To: gentoo-pms; +Cc: Brian Harring

The example regex allowed for
EAPI=

Which isn't used, alloewd, nor desired; at best, that's short hand
for EAPI=0.
---
 ebuild-vars.tex | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ebuild-vars.tex b/ebuild-vars.tex
index 53de53d..ec0d84e 100644
--- a/ebuild-vars.tex
+++ b/ebuild-vars.tex
@@ -139,7 +139,7 @@ once. The assignment must not be preceded by any lines other than blank lines or
 with optional whitespace (spaces or tabs) followed by a \t{\#} character, and the line containing
 the assignment statement must match the following regular expression:
 \begin{verbatim}
-^[ \t]*EAPI=(['"]?)([A-Za-z0-9+_.-]*)\1[ \t]*([ \t]#.*)?$
+^[ \t]*EAPI=(['"]?)([A-Za-z0-9+_.-]+)\1[ \t]*([ \t]#.*)?$
 \end{verbatim}
 
 The package manager must determine the EAPI of an ebuild by parsing its first non-blank and
-- 
1.7.12



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [gentoo-pms] [PATCH] EAPI must be at least a single char.
  2012-09-30  8:28 [gentoo-pms] [PATCH] EAPI must be at least a single char Brian Harring
@ 2012-09-30  9:13 ` Ulrich Mueller
  2012-09-30 19:44   ` Brian Harring
  0 siblings, 1 reply; 10+ messages in thread
From: Ulrich Mueller @ 2012-09-30  9:13 UTC (permalink / raw
  To: gentoo-pms; +Cc: Brian Harring

>>>>> On Sun, 30 Sep 2012, Brian Harring wrote:

> The example regex allowed for
> EAPI=

> Which isn't used, alloewd, nor desired; at best, that's short hand
> for EAPI=0.

The regexp was never meant to match exactly what is allowed.
It matches a superset, and if the matched substring turns out to be an
illegal or unknown EAPI, then the PM can catch it after parsing.

Otherwise, an empty EAPI is already forbidden by section 3.1
"Restrictions upon Names".

Ulrich


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [gentoo-pms] [PATCH] EAPI must be at least a single char.
  2012-09-30  9:13 ` Ulrich Mueller
@ 2012-09-30 19:44   ` Brian Harring
  2012-09-30 19:52     ` Zac Medico
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Harring @ 2012-09-30 19:44 UTC (permalink / raw
  To: Ulrich Mueller; +Cc: gentoo-pms

On Sun, Sep 30, 2012 at 11:13:31AM +0200, Ulrich Mueller wrote:
> >>>>> On Sun, 30 Sep 2012, Brian Harring wrote:
> 
> > The example regex allowed for
> > EAPI=
> 
> > Which isn't used, alloewd, nor desired; at best, that's short hand
> > for EAPI=0.
> 
> The regexp was never meant to match exactly what is allowed.
> It matches a superset, and if the matched substring turns out to be an
> illegal or unknown EAPI, then the PM can catch it after parsing.
>
> Otherwise, an empty EAPI is already forbidden by section 3.1
> "Restrictions upon Names".

Going by the fact this sort of scenario has come up before for me, I 
guess I just do things differently.

The rules are such that "either an EAPI is matched, or EAPI=0 if the 
first non blank/non comment isn't a matching EAPI assignment".

You want the rules to be left as:

"Here's this regex you use to find the EAPI.  The actual rules of the 
EAPI value are off in another part of the doc that we're not going to 
refer to, but you need to make sure it still is valid despite this 
regex allowing invalid EAPI's in."

Respectfully, that is daft since it's going to result in anyone who 
tries to write a PM, likely fucking that up.  If what you were saying 
was the actual intention behind it, that assignment would've just been 
along the lines of EAPI=("[^"]*"|'[^']*'|[^\t ]); aka "here's how you 
grab what looks like an EAPI assignment".

Either way, I strongly disagree with your counterargument of leaving 
the spec as it is.

The following is saner:

^[ \t]*EAPI=(['"]?)([A-Za-z0-9_][A-Za-z0-9+_.-]*)\1[ \t]*([ \t]#.*)?$

And the following multiline match covers the rules in full (ie, ready 
to go meaning people aren't likely to fuck it up):

r"(?:^[ \t]*(?:#[^\n]*)?(?:$|\n))*^[ \t]*EAPI=(['\"]?)([A-Za-z0-9][A-Za-z0-9+_.-]*)\1"

Roughly, if the spec is left the way you intend it,

EAPI=-1 will get picked up, then it's reliant on the PM to spot 
it/explode; which in a quick check of portages source, it doesn't 
do.

Now, if a QA tool wants to use a looser version to spot EAPI 
assignments of an invalid EAPI, have at it; but the core spec's "find 
an eapi via this" should block invalid EAPI's from being picked up, 
period.

~harring


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [gentoo-pms] [PATCH] EAPI must be at least a single char.
  2012-09-30 19:44   ` Brian Harring
@ 2012-09-30 19:52     ` Zac Medico
  2012-09-30 22:10       ` Ulrich Mueller
  2012-09-30 22:22       ` Brian Harring
  0 siblings, 2 replies; 10+ messages in thread
From: Zac Medico @ 2012-09-30 19:52 UTC (permalink / raw
  To: gentoo-pms

On 09/30/2012 12:44 PM, Brian Harring wrote:
> tries to write a PM, likely fucking that up.  If what you were saying 
> was the actual intention behind it, that assignment would've just been 
> along the lines of EAPI=("[^"]*"|'[^']*'|[^\t ]); aka "here's how you 
> grab what looks like an EAPI assignment".

I would have preferred a regex that just matches any assignment like
this, but didn't feel like bikeshedding it, since the one that's
currently in the spec works in practice.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [gentoo-pms] [PATCH] EAPI must be at least a single char.
  2012-09-30 19:52     ` Zac Medico
@ 2012-09-30 22:10       ` Ulrich Mueller
  2012-10-01  0:48         ` Zac Medico
  2012-09-30 22:22       ` Brian Harring
  1 sibling, 1 reply; 10+ messages in thread
From: Ulrich Mueller @ 2012-09-30 22:10 UTC (permalink / raw
  To: gentoo-pms

>>>>> On Sun, 30 Sep 2012, Zac Medico wrote:

>> tries to write a PM, likely fucking that up.  If what you were
>> saying was the actual intention behind it, that assignment would've
>> just been along the lines of EAPI=("[^"]*"|'[^']*'|[^\t ]); aka
>> "here's how you grab what looks like an EAPI assignment".

> I would have preferred a regex that just matches any assignment like
> this, but didn't feel like bikeshedding it,

I doesn't really matter what the regexp is, as long as it matches a
superset of the allowed EAPI syntax. Unknown or invalid EAPIs that the
regexp doesn't match will still be rejected, because the results from
parsing and from sourcing the ebuild don't agree.

And it is practically impossible to find a regexp that would match all
valid bash assignments. You would have to handle backslash escapes,
dollar signs, and whatnot.

> since the one that's currently in the spec works in practice.

Exactly.

Ulrich


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [gentoo-pms] [PATCH] EAPI must be at least a single char.
  2012-09-30 19:52     ` Zac Medico
  2012-09-30 22:10       ` Ulrich Mueller
@ 2012-09-30 22:22       ` Brian Harring
  2012-10-01  0:25         ` Zac Medico
  1 sibling, 1 reply; 10+ messages in thread
From: Brian Harring @ 2012-09-30 22:22 UTC (permalink / raw
  To: gentoo-pms

On Sun, Sep 30, 2012 at 12:52:40PM -0700, Zac Medico wrote:
> On 09/30/2012 12:44 PM, Brian Harring wrote:
> > tries to write a PM, likely fucking that up.  If what you were saying 
> > was the actual intention behind it, that assignment would've just been 
> > along the lines of EAPI=("[^"]*"|'[^']*'|[^\t ]); aka "here's how you 
> > grab what looks like an EAPI assignment".
> 
> I would have preferred a regex that just matches any assignment like
> this, but didn't feel like bikeshedding it, since the one that's
> currently in the spec works in practice.

Counter point; portage doesn't actually enforce the rules of a valid 
EAPI name; correct me if I'm wrong obviously, but in checking the 
source, didn't see any such validation.

If the regex were as I suggested, that would be a non issue and we'd 
have *guranteed* EAPI name compliance- else it wouldn't match the 
invalid EAPI, and would stop looking at that line (falling back to 
EAPI=0).

Basically, I'd like y'all to spell out the actual gains of having it 
loose like this, especailly in light of the fact the majority PM, via 
relying on that alone, doesn't do EAPI value enforcement.

If we used the regex I suggested in the second email, this issue goes 
away, and we remove a potential landmine.

Clarify to me why that landmine should be left in place.

~harring


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [gentoo-pms] [PATCH] EAPI must be at least a single char.
  2012-09-30 22:22       ` Brian Harring
@ 2012-10-01  0:25         ` Zac Medico
  2012-10-01  7:15           ` Ulrich Mueller
  0 siblings, 1 reply; 10+ messages in thread
From: Zac Medico @ 2012-10-01  0:25 UTC (permalink / raw
  To: gentoo-pms

On 09/30/2012 03:22 PM, Brian Harring wrote:
> On Sun, Sep 30, 2012 at 12:52:40PM -0700, Zac Medico wrote:
>> On 09/30/2012 12:44 PM, Brian Harring wrote:
>>> tries to write a PM, likely fucking that up.  If what you were saying 
>>> was the actual intention behind it, that assignment would've just been 
>>> along the lines of EAPI=("[^"]*"|'[^']*'|[^\t ]); aka "here's how you 
>>> grab what looks like an EAPI assignment".
>>
>> I would have preferred a regex that just matches any assignment like
>> this, but didn't feel like bikeshedding it, since the one that's
>> currently in the spec works in practice.
> 
> Counter point; portage doesn't actually enforce the rules of a valid 
> EAPI name; correct me if I'm wrong obviously, but in checking the 
> source, didn't see any such validation.

It doesn't need to check that EAPI names are valid. It only needs to
check that they are supported, so that's what it does.

> If the regex were as I suggested, that would be a non issue and we'd 
> have *guranteed* EAPI name compliance- else it wouldn't match the 
> invalid EAPI, and would stop looking at that line (falling back to 
> EAPI=0).
> 
> Basically, I'd like y'all to spell out the actual gains of having it 
> loose like this, especailly in light of the fact the majority PM, via 
> relying on that alone, doesn't do EAPI value enforcement.

It's enforced by the fact that all supported EAPIs happen to have valid
EAPI values.

> If we used the regex I suggested in the second email, this issue goes 
> away, and we remove a potential landmine.
> 
> Clarify to me why that landmine should be left in place.

The current spec works in practice, and that's all I really care about.
Still, I'd be in favor of changing the regex to match practically
anything that looks like an EAPI assignment, regardless of the validity
of the EAPI value.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [gentoo-pms] [PATCH] EAPI must be at least a single char.
  2012-09-30 22:10       ` Ulrich Mueller
@ 2012-10-01  0:48         ` Zac Medico
  0 siblings, 0 replies; 10+ messages in thread
From: Zac Medico @ 2012-10-01  0:48 UTC (permalink / raw
  To: gentoo-pms

On 09/30/2012 03:10 PM, Ulrich Mueller wrote:
>>>>>> On Sun, 30 Sep 2012, Zac Medico wrote:
> 
>>> tries to write a PM, likely fucking that up.  If what you were
>>> saying was the actual intention behind it, that assignment would've
>>> just been along the lines of EAPI=("[^"]*"|'[^']*'|[^\t ]); aka
>>> "here's how you grab what looks like an EAPI assignment".
> 
>> I would have preferred a regex that just matches any assignment like
>> this, but didn't feel like bikeshedding it,
> 
> I doesn't really matter what the regexp is, as long as it matches a
> superset of the allowed EAPI syntax. Unknown or invalid EAPIs that the
> regexp doesn't match will still be rejected, because the results from
> parsing and from sourcing the ebuild don't agree.
> 
> And it is practically impossible to find a regexp that would match all
> valid bash assignments. You would have to handle backslash escapes,
> dollar signs, and whatnot.

Still, it might make sense to replace ([A-Za-z0-9+_.-]*) with (.*),
since that should work fine in practice. The spec is fine with me as it
is, but I don't think a change like this would hurt.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [gentoo-pms] [PATCH] EAPI must be at least a single char.
  2012-10-01  0:25         ` Zac Medico
@ 2012-10-01  7:15           ` Ulrich Mueller
  2012-10-01 16:33             ` Zac Medico
  0 siblings, 1 reply; 10+ messages in thread
From: Ulrich Mueller @ 2012-10-01  7:15 UTC (permalink / raw
  To: gentoo-pms

>>>>> On Sun, 30 Sep 2012, Zac Medico wrote:

> Still, I'd be in favor of changing the regex to match practically
> anything that looks like an EAPI assignment, regardless of the
> validity of the EAPI value.

I don't believe that this is possible. You'll never be able to match
everyting that it valid bash syntax, other than by parsing it with
bash itself. Which is what we wanted to avoid, in the first place.

(( $'\x45\x41\x50\x49' = 4 + 1 ))

Ulrich


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [gentoo-pms] [PATCH] EAPI must be at least a single char.
  2012-10-01  7:15           ` Ulrich Mueller
@ 2012-10-01 16:33             ` Zac Medico
  0 siblings, 0 replies; 10+ messages in thread
From: Zac Medico @ 2012-10-01 16:33 UTC (permalink / raw
  To: gentoo-pms

On 10/01/2012 12:15 AM, Ulrich Mueller wrote:
>>>>>> On Sun, 30 Sep 2012, Zac Medico wrote:
> 
>> Still, I'd be in favor of changing the regex to match practically
>> anything that looks like an EAPI assignment, regardless of the
>> validity of the EAPI value.
> 
> I don't believe that this is possible. You'll never be able to match
> everyting that it valid bash syntax, other than by parsing it with
> bash itself. Which is what we wanted to avoid, in the first place.
> 
> (( $'\x45\x41\x50\x49' = 4 + 1 ))

The package manager wouldn't need any more knowledge of bash syntax than
it currently has. All of the implementation details would be the same as
they are now. The only difference would be that the regex would return a
match in _more_ cases. I think that this would be preferable to Brian's
suggestion to make the regex fail to match an empty EAPI setting (which
would return a match in _fewer_ cases).
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-10-01 16:33 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-09-30  8:28 [gentoo-pms] [PATCH] EAPI must be at least a single char Brian Harring
2012-09-30  9:13 ` Ulrich Mueller
2012-09-30 19:44   ` Brian Harring
2012-09-30 19:52     ` Zac Medico
2012-09-30 22:10       ` Ulrich Mueller
2012-10-01  0:48         ` Zac Medico
2012-09-30 22:22       ` Brian Harring
2012-10-01  0:25         ` Zac Medico
2012-10-01  7:15           ` Ulrich Mueller
2012-10-01 16:33             ` Zac Medico

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox