From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gentoo-dev+bounces-80857-garchives=archives.gentoo.org@lists.gentoo.org>
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by finch.gentoo.org (Postfix) with ESMTPS id E151C139694
	for <garchives@archives.gentoo.org>; Wed, 31 May 2017 08:38:36 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id 01EB7E0D20;
	Wed, 31 May 2017 08:38:27 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [IPv6:2001:470:ea4a:1:5054:ff:fec7:86e4])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by pigeon.gentoo.org (Postfix) with ESMTPS id 9C580E0BFA
	for <gentoo-dev@lists.gentoo.org>; Wed, 31 May 2017 08:38:26 +0000 (UTC)
Received: from localhost (unknown [IPv6:2a01:e34:eeaa:6bd0:4ecc:6aff:fe03:1cfc])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	(Authenticated sender: aballier)
	by smtp.gentoo.org (Postfix) with ESMTPSA id 5CADA341738
	for <gentoo-dev@lists.gentoo.org>; Wed, 31 May 2017 08:38:25 +0000 (UTC)
Date: Wed, 31 May 2017 10:38:19 +0200
From: Alexis Ballier <aballier@gentoo.org>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] [RFC] Forced/automatic USE flag constraints
 (codename: ENFORCED_USE)
Message-ID: <20170531103819.417c2420@gentoo.org>
In-Reply-To: <1496217792.1164.5.camel@gentoo.org>
References: <1496071993.31087.1.camel@gentoo.org>
	<20170529200037.2559f80a@gentoo.org>
	<1496093035.12795.3.camel@gentoo.org>
	<20170530094245.40e1cf64@gentoo.org>
	<20170530092245.681d4aeb@snowblower>
	<20170530104654.31b89e10@gentoo.org>
	<20170530095607.1adbc0b8@snowblower>
	<20170530112518.65b4f9e9@gentoo.org>
	<22829.24276.295.969060@a1i15.kph.uni-mainz.de>
	<1496154812.1238.5.camel@gentoo.org>
	<20170530173340.0b575526@gentoo.org>
	<1496167898.1335.1.camel@gentoo.org>
	<20170530204614.61e8e42c@gentoo.org>
	<1496213717.1164.1.camel@gentoo.org>
	<20170531093257.23b66f88@gentoo.org>
	<1496217792.1164.5.camel@gentoo.org>
Organization: Gentoo
X-Mailer: Claws Mail 3.15.0-dirty (GTK+ 2.24.31; x86_64-pc-linux-gnu)
Precedence: bulk
List-Post: <mailto:gentoo-dev@lists.gentoo.org>
List-Help: <mailto:gentoo-dev+help@lists.gentoo.org>
List-Unsubscribe: <mailto:gentoo-dev+unsubscribe@lists.gentoo.org>
List-Subscribe: <mailto:gentoo-dev+subscribe@lists.gentoo.org>
List-Id: Gentoo Linux mail <gentoo-dev.gentoo.org>
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Archives-Salt: 567b2cfa-2878-4f52-b12e-b4c953db09ae
X-Archives-Hash: e7527e1e16df9e517d01140b0f0ff0c1

On Wed, 31 May 2017 10:03:12 +0200
Micha=C5=82 G=C3=B3rny <mgorny@gentoo.org> wrote:

> On =C5=9Bro, 2017-05-31 at 09:32 +0200, Alexis Ballier wrote:
> > On Wed, 31 May 2017 08:55:17 +0200
> > Micha=C5=82 G=C3=B3rny <mgorny@gentoo.org> wrote:
> >  =20
> > > On wto, 2017-05-30 at 20:46 +0200, Alexis Ballier wrote: =20
> > > > On Tue, 30 May 2017 20:11:38 +0200
> > > > Micha=C5=82 G=C3=B3rny <mgorny@gentoo.org> wrote:
> > > > [...]   =20
> > > > > > > Of course, we could just validate all the possible cases
> > > > > > > via repoman, and reject the ebuild if there's at least one
> > > > > > > conflict between them. Not sure how to express that
> > > > > > > properly in the spec though. Not sure how it would work
> > > > > > > practically.     =20
> > > > > >=20
> > > > > > Adding a 2^n check to repoman isn't gonna work well.
> > > > > >      =20
> > > > >=20
> > > > > I'm not saying it's the most optimal algorithm of verifying
> > > > > the correctness of the constraints. It's just the one that's
> > > > > quite obvious -- relatively simple and reliable. If someone
> > > > > can come up with something better that covers at least the
> > > > > most common cases, I'm all for it.
> > > > >=20
> > > > > That said, this wouldn't be that much of a problem if we can
> > > > > keep the n low. For a start, we can rule out all flags that
> > > > > don't appear in REQUIRED_USE at all. Furthermore, I think we
> > > > > could also ignore the constraints for flags that don't appear
> > > > > there at least 'twice', and so on.   =20
> > > >=20
> > > > :)
> > > >=20
> > > > You're applying classical techniques to lower the size of a SAT
> > > > instance so that your exponential algorithm does not explode,
> > > > but it's still hard.
> > > >=20
> > > > I'm not sure what you want: If you want to detect that there is
> > > > an impossible constraint, well, ebuild writer will notice soon
> > > > enough when testing it. If you want to detect that there is a
> > > > way to have a conflict between useflags, then there will be
> > > > valid cases where this will happen.
> > > >=20
> > > > That said, assuming we have REQUIRED_USE in CNF form, its
> > > > negation is in DNF form. Solving SAT on DNF formulas is easy
> > > > (linear I think), and this would give you an assignment of
> > > > useflags triggering an impossible constraint. e.g. 'foo? ( bar
> > > > )' with USE=3D'foo -bar' in make.conf. This could be used to
> > > > trigger a repoman warning but basically every single ebuild
> > > > would trigger those.   =20
> > >=20
> > > Not sure if we understand each other.
> > >=20
> > > I'd like the constraints to be plain straightforward, to the
> > > point of having only one acceptable solution. No special
> > > Portage-style algorithms that attempt to provide a reasonable
> > > solution to unreasonable input, resulting in horrible solutions
> > > that need 20 more hacks every few months. =20
> >=20
> > Yes, we definitely agree here. For that, you need to kill the SAT
> > solver and define (spec) what is the straightforward solution, aka a
> > deterministic and straightforward (*) algorithm. Otherwise, you fall
> > into the problems Ciaran explained in an earlier email.
> >=20
> >=20
> > (*) It's better for the algorithm to be simple enough so that
> > REQUIRED_USE can be written easily for achieving a given behavior.
> >=20
> >  =20
> > > For example:
> > >=20
> > >   foo? ( bar )
> > >=20
> > > would mean 'if you have USE=3Dfoo, then USE=3Dbar is enabled as
> > > well'. Not 'find some random solution which satisfies this'. In
> > > other words, here changing USE=3Dfoo into USE=3D-foo is not an
> > > acceptable solution. =20
> >=20
> >=20
> > What if I specifically set USE=3D-bar in make.conf ? Do we really
> > want PM to override that without telling me ? =20
>=20
> Yes. Unless you specifically and explicitly disable that (globally or
> for USE=3Dbar), in which case the PM would just reject to proceed.


Then could you please explain how to get the list of useflags the
solver is allowed to toggle ? Preferably in a PMS-friendly way (aka no
USE=3Dfoo emerge). It's not clear to me what this would be and is
mandatory for determinism.

Note that most definitions are acceptable, but there must be one.

> > I believe that, from the ebuild POV, the ternary useflag model is
> > more appropriate: You have a whole bunch of ways to specify
> > useflags with portage (make.conf, package.use, profiles, command
> > line, ...). From the ebuild those are collapsed into 'user input'.
> > You only have IUSE (with its defaults) and that's what the
> > auto-solver should play with: those are the flags that can be
> > toggled. =20
>=20
> I see your point. However, it's merely a preference problem and we
> really don't want the constraints to become ternary and/or PM try to
> force the reverse solutions. That's an easy way to lose
> predictability.

They're not ternary anymore after processing ebuild: IUSE=3D"foo +bar"
means instantiate foo as -foo if not specified, and bar as +bar.
The way I see it, ternary model is useful in the sense that the 3rd
undefined state is what the solver can toggle, the others are fixed
(either by user input or e.g. use.mask).

Basically, I see automatic solving of REQUIRED_USE as dynamic IUSE
defaults. But see above, this might not be the best way.

> Besides, I should point out that USE_EXPAND in make.profile
> and make.conf are strictly binary.

Last I checked IUSE=3D"+use_expand_foo use_expand_bar" worked just as
useflags.

> Furthermore, the ternary idea
> starts becoming blurry when you have to deal with profiles that you
> explicitly want to disable in user configuration.

I'm not following you here.

> > > Now, this also means that every constraint that can't be solved in
> > > this easy fashion is invalid. We want to detect that, and warn the
> > > developer. Some of those could be tricky. Simple example:
> > >=20
> > >   foo? ( baz ) bar? ( !baz )
> > >=20
> > > This one is invalid because USE=3D'foo bar' requires both 'baz' and
> > > '!baz' as a solution. Remember that we don't want to do any
> > > changes besides what's explicitly written there, no guessing. =20
> >=20
> > Besides that, what makes it invalid ? =20
>=20
> What makes it invalid is that you can't solve it in a predictable way.

You can fail in a predictable way and ebuild writer can adjust it to
avoid that. Again, you *need* to process the constraints in order. '!a?
( b ) !b? ( a )' is not deterministic when none of a and b are
enabled otherwise.

You'll get a message like:
"
 The constraint bar? ( !baz )' is violated.
    bar is enabled because $reason (say, make.conf)
    baz is enabled because of the constraint 'foo? ( baz )'
			      foo is enabled because $reason
"

> >  How is it more invalid than '?? ( foo bar )' ? =20
>=20
> This would go into:
>=20
>   foo? ( !bar )

Just to be clear: Are you suggesting banning '??' from the syntax or
simply an internal rewrite for the solver ?

> which gives a single predictable solution:

Then ebuild writer should not write 'foo? ( baz ) bar? ( !baz )' but
rather: 'foo? ( !bar baz ) bar? ( !baz )', which should cover more
cases. With USE=3D"foo bar", the message would then be:
"
  The constraint 'foo? ( !bar )' is violated.
      foo is enabled because $reason
      bar is enabled because $reason
"
which is similar to the above except there's one less step for
explaining the reasons. It's not dramatic but it is, indeed, desirable
to have simple & clear reasons. I'd say that's more to the argument for
specifying completely how to solve that and leave those small
improvements to ebuild writers and/or QA.

[...]
> > > However, the
> > > following should be valid:
> > >=20
> > >   foo? ( baz ) bar? ( !foo !baz )
> > >=20
> > > Because now we clearly indicate that USE=3Dbar disables USE=3Dfoo,
> > > and therefore makes the first constraint inapplicable. It clearly
> > > indicates course of action for all combinations: =20
> >=20
> > Ok, I now think you're aiming for giving full power to the solver,
> > overriding user inputs if necessary. Before going further, I think
> > we should first agree on what are the useflags such a solver can
> > toggle. I'm not sure 'USE=3Dfoo emerge blah' should disable foo
> > instead of failing for example.
> >  =20
>=20
> As I said, it's a matter of configuration to decide which flags should
> be touched, and which not. Of course if we find that necessary, we may
> go into defining a specific set in the profiles or metadata.
>=20
> However, I would rather focus on getting a PoC solver and checker
> first, and play with existing constraints to see how it all works.

The solver seems on good tracks, at least from the algorithmic POV. The
checker, however, is not clear at all to me. The main reason is that to
determine if the solver will be able to solve it, it needs to know what
the solver can toggle and what not.

Alexis.