From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gentoo-dev+bounces-80994-garchives=archives.gentoo.org@lists.gentoo.org>
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by finch.gentoo.org (Postfix) with ESMTPS id E89B2139694
	for <garchives@archives.gentoo.org>; Fri,  9 Jun 2017 14:16:38 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id 065F6E0DC3;
	Fri,  9 Jun 2017 14:16:29 +0000 (UTC)
Received: from smtp.gentoo.org (dev.gentoo.org [IPv6:2001:470:ea4a:1:5054:ff:fec7:86e4])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by pigeon.gentoo.org (Postfix) with ESMTPS id A2FBFE0B78
	for <gentoo-dev@lists.gentoo.org>; Fri,  9 Jun 2017 14:16:28 +0000 (UTC)
Received: from localhost (unknown [IPv6:2a01:e34:eeaa:6bd0:4ecc:6aff:fe03:1cfc])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	(Authenticated sender: aballier)
	by smtp.gentoo.org (Postfix) with ESMTPSA id 95B66341850
	for <gentoo-dev@lists.gentoo.org>; Fri,  9 Jun 2017 14:16:26 +0000 (UTC)
Date: Fri, 9 Jun 2017 16:16:19 +0200
From: Alexis Ballier <aballier@gentoo.org>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints
 (codename: ENFORCED_USE)
Message-ID: <20170609161619.1b72ad5b@gentoo.org>
In-Reply-To: <1497012847.25475.4.camel@gentoo.org>
References: <1496071993.31087.1.camel@gentoo.org>
	<20170531103819.417c2420@gentoo.org>
	<1496235892.25038.1.camel@gentoo.org>
	<20170531193922.477245bb@gentoo.org>
	<1496257344.25758.1.camel@gentoo.org>
	<20170601105523.08a9234e@gentoo.org>
	<1496352685.30502.4.camel@gentoo.org>
	<20170602132758.50a5f734@gentoo.org>
	<1496411717.29233.5.camel@gentoo.org>
	<20170603130000.4f88fb14@gentoo.org>
	<1496503989.15351.1.camel@gentoo.org>
	<20170603185835.57741ff0@gentoo.org>
	<20170604105938.2b40157f@gentoo.org>
	<20170605095516.0b463432@gentoo.org>
	<1496671825.1230.3.camel@gentoo.org>
	<20170605192433.6238797b@gentoo.org>
	<1496686212.1222.4.camel@gentoo.org>
	<20170606140803.051f8048@gentoo.org>
	<1496770744.1157.1.camel@gentoo.org>
	<20170607101759.7e21f0f6@gentoo.org>
	<1496827679.2129.3.camel@gentoo.org>
	<20170607115654.2a5da5e2@gentoo.org>
	<1496999960.29391.1.camel@gentoo.org>
	<20170609134110.418ae6ac@gentoo.org>
	<1497012847.25475.4.camel@gentoo.org>
Organization: Gentoo
X-Mailer: Claws Mail 3.15.0-dirty (GTK+ 2.24.31; x86_64-pc-linux-gnu)
Precedence: bulk
List-Post: <mailto:gentoo-dev@lists.gentoo.org>
List-Help: <mailto:gentoo-dev+help@lists.gentoo.org>
List-Unsubscribe: <mailto:gentoo-dev+unsubscribe@lists.gentoo.org>
List-Subscribe: <mailto:gentoo-dev+subscribe@lists.gentoo.org>
List-Id: Gentoo Linux mail <gentoo-dev.gentoo.org>
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Archives-Salt: c909b65e-32bf-4c73-9afc-4a5b935d2de8
X-Archives-Hash: 537aeb23eb801d4ce190e57f8695406d

On Fri, 09 Jun 2017 14:54:07 +0200
Micha=C5=82 G=C3=B3rny <mgorny@gentoo.org> wrote:

> On pi=C4=85, 2017-06-09 at 13:41 +0200, Alexis Ballier wrote:
> > On Fri, 09 Jun 2017 11:19:20 +0200
> > Micha=C5=82 G=C3=B3rny <mgorny@gentoo.org> wrote:
> >  =20
> > > On =C5=9Bro, 2017-06-07 at 11:56 +0200, Alexis Ballier wrote: =20
> > > > On Wed, 07 Jun 2017 11:27:59 +0200
> > > > Micha=C5=82 G=C3=B3rny <mgorny@gentoo.org> wrote:
> > > >    =20
> > > > > On =C5=9Bro, 2017-06-07 at 10:17 +0200, Alexis Ballier wrote:   =
=20
> > > > > > > Also, do I presume correctly that for all supported cases
> > > > > > > (i.e. those which your nsolve does not reject), solve and
> > > > > > > nsolve are compatible?    =20
> > > > > >=20
> > > > > > Not sure what you mean here. nsolve does not solve
> > > > > > anything, it just validates REQUIRED_USE so that it is
> > > > > > guaranteed it can be solved.     =20
> > > > >=20
> > > > > What I mean is whether you can guarantee that:
> > > > >=20
> > > > > a. for every X that nsolve(X) =3D=3D ok, solve() will be able to
> > > > > find a valid solution,   =20
> > > >=20
> > > > yes
> > > >    =20
> > > > > b. for every X that solve() can solve reliably, nsolve(X) =3D=3D
> > > > > ok.   =20
> > > >=20
> > > > no and that's not really possible   =20
> > >=20
> > > I thought so. To expand it a little, could you confirm whether I
> > > correctly presume that:
> > >=20
> > > a. for all 'good' results, the plain iterative solve() should be
> > > able to find a solution with a single iteration? =20
> >=20
> > yes that's the point of it
> >  =20
> > > b. for all 'need toposort' results, the solve() should be able to
> > > find a solution with n>=3D1 iterations? =20
> >=20
> > yes; though n is only bounded by the # of clauses (expanded as
> > implications) while it can be 1 if reorderer properly; I wouldn't
> > bother doing several iterations and just reject that at repoman
> > side since it's easily solved =20
>=20
> I would prefer not having Portage fail randomly on local ebuilds where
> the cost of multiple iterations is practically zero, and certainly
> it's not worth the effort to ensure a particular ordering.


Yep, makes sense.

It makes it harder to guess though: "a? ( b ) !b? ( a )" will
enable both a and b if USE=3D-*. This is not obvious to me from a single
read of the constraint. I need to tell myself "Oh, I've now enabled 'a'
and I need to recheck the first clause.".

I was thinking more from a spec perspective where I would definitely
not want to mandate PM to do fixpoint(solve) or having to detect a
cycle before failing. Also, I believe that forcing a single pass solver
will help in ensuring that the solver solves it the way the developer
writing the constraint meant it.


[...]
> > > > > > > Implication(useflag, consequence) ->     =20
> > > > > >=20
> > > > > >                  if not input[useflag]: raise
> > > > > > "impossible"     =20
> > > > >=20
> > > > > Why impossible? Unless I'm missing something, it's false
> > > > > already.   =20
> > > >=20
> > > > 'foo? bar' is always true if foo is false; so it's impossible to
> > > > make it false   =20
> > >=20
> > > Yes, you are correct. I was actually thinking of 'if LHS is
> > > false, we do not enforce RHS'. =20
> >=20
> > I'm wrong actually. It can be falsified by setting foo to True and
> > bar to False. More on it below. =20
>=20
> Well, I'm not sure if it can still plainly apply here but the generic
> contract was that in implication clauses only RHS is mutable.


Yeah, that's basically what I inferred from trying to figure out the
meaning out of it later on.

[...]
> > > > > >=20
> > > > > > Note how the above favors leftmost in all cases. If it
> > > > > > needs to change something, it always tries to leave the
> > > > > > leftmost untouched. Note also that it processes everything
> > > > > > left to right (the AllOf case where
> > > > > > REQUIRED_USE=3D"AllOf(list of clauses)" ).     =20
> > > > >=20
> > > > > You need to be able to reorder the clauses to handle use.force
> > > > > and use.mask.   =20
> > > >=20
> > > > Not sure if reorder is the best way. It sure works, but maybe
> > > > we'd want a repoman error if e.g. 'foo? ( bar )' is in
> > > > REQUIRED_USE, bar is masked but not foo. That'd be a matter of
> > > > eliminating the constants in the ast and if we get 'false' for
> > > > a profile we error out.   =20
> > >=20
> > > Yes, we want that. It makes sense for pure implications. For n-ary
> > > thingies, it's harder than that and I'd rather not require
> > > developers to figure out a specific order to make things work.
> > >=20
> > > Think of the following:
> > >=20
> > >   || ( X Y )
> > >=20
> > > with X being masked on profile A, Y being masked on profile B. You
> > > just can't get it right then.
> > >=20
> > > Of course, this is a bit unrealistic but I think a case when a
> > > preferred (newer) provider is masked on some architectures is
> > > realistic. I don't think we want to prefer a worse solution (or go
> > > back to applying package.use workarounds) because of fringe
> > > arches.
> > >=20
> > > Another question is whether we could solve this specific problem
> > > via applying some kind of 'if forced/masked' logic into the AST. =20
> >=20
> > I think we're having communication interferences here :) That's
> > exactly what I'm talking about.
> >=20
> > More specifically:
> >=20
> > For each profile:
> > 	formula=3Dreplace_masks_and_force_by_constants(required_use)
> > 	simplify(formula)
> > 	nsolve(formula)
> >=20
> > In your example above, we'd call 'nsolve("|| ( X )")' and
> > 'nsolve("|| ( Y )")' (or even simpler, depending on how simplify()
> > is defined). If both X and Y are masked on a profile, then that'd
> > reduce to 'nsolve("False")' which would rant. =20
>=20
> So you're talking about reducing prior to transforming? Yes, that'd
> work. As I mentioned in one of the first mails wrt my reference
> implementation, I've used reordering (stable sort) instead of reducing
> since it was simpler.
>=20
> If you reduce (simplify), you need to account for special cases like
> getting '|| ()' etc. If you reorder only, things just fail the normal
> way.

While the reordering idea seems nice as it factors both user
preference and masks, the problem with reordering is that nothing
guarantees that the solver won't try to enable a masked flag. We'd have
to deal with that somehow.
I think reordering should be kept for user preferences
(soft-enable/soft-disable) while masks for hard-no's or hard-yes'es.


Be careful with reordering though:
'^^ ( a b ) b? ( a )' can be solved in one pass.
(it disables b if both are set and enables a if none are set)

while:
'^^ ( b a ) b? ( a )' loops
(if both are set it disables 'a' for the 1st clause but then enables it
for the 2nd)

This is not checked by nsolve().

> > I think that for solve(), reordering is basically equivalent since
> > we always favour leftmost. However, we have to somehow guarantee
> > that solve() will never try to enable a masked flag. This is
> > guaranteed by the above checks on profiles if enforced by repoman.
> > If we apply the same logic for solve that'd basically be built-in
> > (masked/forced flags wont appear in solve()'s input anymore), which
> > I believe is better. Keep in mind that users can mask and force
> > some useflags, so the repoman check won't catch that.
> >=20
> >  =20
> > > > > > So, the very first thing to do is to agree that the above
> > > > > > solver (the trueify function) is what we want to implement
> > > > > > and set this in stone. There's no point in implementing a
> > > > > > proper requse checker if the algorithm is meant to change.
> > > > > > Having a formal definition will also be necessary to
> > > > > > mandate that in future EAPIs.
> > > > > >=20
> > > > > > Then, and only then, we'd need to have the above solver
> > > > > > implemented into portage (hidden under a FEATURES) and
> > > > > > import my nsolve into repoman (after due cleanup).
> > > > > >      =20
> > > > >=20
> > > > > Yes, that's my goal. However, before we can set the algorithm
> > > > > in stone we need to verify that it will work in all of the
> > > > > supported cases.   =20
> > > >=20
> > > > Yep, that's the point of nsolve/classify :)
> > > >    =20
> > > > > Preferably it should also be as simple as possible to avoid
> > > > > putting too much complexity in the spec.   =20
> > > >=20
> > > > Yes; not sure if anything simpler than the above can be achieved
> > > > though.  =20
> > >=20
> > > Actually, I think we should roughly proceed in the following way:
> > >=20
> > > 1. Design the solve() algorithm.
> > >=20
> > > 1a. Test solve() on isolated test cases.
> > >=20
> > > 1b. Integrate solve() into Portage (as optional feature) and
> > > encourage wider testing.
> > >=20
> > > 2. Design the nsolve() verification based on solve().
> > >=20
> > > 2a. Test nsolve() separately.
> > >=20
> > > 2b. Start fixing simple cases that do not bring any controversy
> > > (like most of the reorderings).
> > >=20
> > > 2c. Integrate nsolve() into repoman.
> > >=20
> > > 3. Spec the whole thing and decide how to go next.
> > >=20
> > >=20
> > > Assuming that 1* and 2* is to be done simultaneously, and each
> > > subpoint implies that if we hit any walls, we can go back to a
> > > previous point and fix the design.
> > >=20
> > > I think we're mostly past 1a and 2a now. We should try to think a
> > > bit more about the corner cases right now, and when we're done
> > > with that, the next step would be to:
> > >=20
> > > A. start fixing some cases ('need topo sort' at least?) to improve
> > > testing area for the implementation,
> > >=20
> > > B. integrate solve() into Portage,
> > >=20
> > > C. maybe optionally integrate nsolve() into repoman.
> > >=20
> > > However, I don't want to set the spec into stone until we have
> > > some live testing via Portage with FEATURES=3Dsolve-required-use or
> > > alike. For repoman, we can add the verification optionally (i.e.
> > > with the above feature or via a separate command-line option) but
> > > I don't think we should enable it by default until we verify it.
> > >  =20
> >=20
> > Yes, I agree. Since solve() and nsolve() are very related we should
> > carefully think how to organize the code then in order to avoid
> > duplication or them getting out of sync.
> >=20
> > As for enabling it into repoman, I'm not sure which is best. Safest
> > way is obviously as you suggest, but a warning is not a blocker, so
> > enabling it by default can have a nice effect of receiving hate
> > mails from angry developers if something is wrong.
> >  =20
>=20
> There's wisdom in this as well. I'm just a little bit worried that
> we're first make developers change their REQUIRED_USE, and then have
> to change the algorithm anyway ;-).


Yeah, I wouldn't enable it without some good testing on the whole tree
and starting to deal with the annoying cases. But then I'd assume it'd
be good.

Alexis.