public inbox for gentoo-dev@lists.gentoo.org
 help / color / mirror / Atom feed
* [gentoo-dev] Regarding my final year thesis
@ 2014-11-06 12:49 Harsh Bhatt
  2014-11-06 13:25 ` Jauhien Piatlicki
  0 siblings, 1 reply; 74+ messages in thread
From: Harsh Bhatt @ 2014-11-06 12:49 UTC (permalink / raw
  To: gentoo-dev@lists.gentoo.org

[-- Attachment #1: Type: text/plain, Size: 388 bytes --]

I an Applied Maths student, currently in my final year. In my last 6 months i need to do a thesis something related to Mathematics as i am a Maths student. I have been using gentoo for quite a long time so was thinking to do something related to gentoo. Give me suggestion of what can be done. Anything related to  modeling, simulation or Discreet Mathematics would be a better choice.

[-- Attachment #2: Type: text/html, Size: 715 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2014-11-06 12:49 [gentoo-dev] Regarding my final year thesis Harsh Bhatt
@ 2014-11-06 13:25 ` Jauhien Piatlicki
  2014-11-06 13:43   ` Ciaran McCreesh
  2014-11-06 21:28   ` [gentoo-dev] Regarding my final year thesis Jeroen Roovers
  0 siblings, 2 replies; 74+ messages in thread
From: Jauhien Piatlicki @ 2014-11-06 13:25 UTC (permalink / raw
  To: gentoo-dev, Harsh Bhatt

[-- Attachment #1: Type: text/plain, Size: 807 bytes --]

Hi,

Mathematics you said? That's nice. You can, for example, redesign our
portage's dependency solving algorithm, as it is quite slow at the
moment. ) I do not know what it does have inside right now, but using
SAT solver can be a good idea (there is a successful example already:
https://en.opensuse.org/openSUSE:Libzypp_satsolver)

--
Regards,
Jauhien

On 11/06/2014 01:49 PM, Harsh Bhatt wrote:
> I an Applied Maths student, currently in my final year. In my last 6 months i need to do a thesis something related to Mathematics as i am a Maths student. I have been using gentoo for quite a long time so was thinking to do something related to gentoo. Give me suggestion of what can be done. Anything related to  modeling, simulation or Discreet Mathematics would be a better choice.
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2014-11-06 13:25 ` Jauhien Piatlicki
@ 2014-11-06 13:43   ` Ciaran McCreesh
  2014-11-06 15:18     ` Ian Stakenvicius
  2014-11-07  9:42     ` [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis) Jauhien Piatlicki
  2014-11-06 21:28   ` [gentoo-dev] Regarding my final year thesis Jeroen Roovers
  1 sibling, 2 replies; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-06 13:43 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 2212 bytes --]

On Thu, 06 Nov 2014 14:25:46 +0100
Jauhien Piatlicki <jauhien@gentoo.org> wrote:
> Mathematics you said? That's nice. You can, for example, redesign our
> portage's dependency solving algorithm, as it is quite slow at the
> moment. ) I do not know what it does have inside right now, but using
> SAT solver can be a good idea (there is a successful example already:
> https://en.opensuse.org/openSUSE:Libzypp_satsolver)

A SAT encoding for dependency resolution is a *terrible* idea, for all
kinds of reasons (some of which are Gentoo-specific, and some of which
are not).

* The model would be full of implication constraints, which perform
terribly under unit propagation.

* You can't get decent human-readable explanations of failure out of SAT
solvers.

* You're not just trying to find a correct resolution. You're trying to
find an optimal resolution, with respect to some very difficult
criteria. For example, you don't want to install any extra unrelated
packages. This is very hard to express in SAT if you're going for a
model which preserves consistency.

* Coming up with a legal ordering in SAT is a pain. It's worse if you're
trying to fully solve circular dependencies: if you do, dependency
resolution becomes harder than NP, so you'd at least need a QSAT
solver, not SAT.

* Coming up with a legal resolution isn't always the right thing to do.
Often a legal resolution can be obtained by uninstalling a whole load
of stuff or switching loads of USE flags off. But it's better to give a
good error to the user than to come up with a legal but stupid
resolution. In other words, you *don't* want a complete algorithm, you
want a domain-aware incomplete algorithm.

If you're going to go the toolkit route, you should be using a CP
solver, not a SAT solver. But even then you'd be better off making some
changes and not using plain old MAC, so you're back to writing the
algorithms yourself.

What you need is for someone who understands CP and SAT to write a
resolver using algorithms inspired by how CP and SAT solvers work, but
not just blindly copying them. Doing this well is at least a full year
Masters level project...

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2014-11-06 13:43   ` Ciaran McCreesh
@ 2014-11-06 15:18     ` Ian Stakenvicius
  2014-11-06 15:26       ` Ciaran McCreesh
  2014-11-07  9:42     ` [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis) Jauhien Piatlicki
  1 sibling, 1 reply; 74+ messages in thread
From: Ian Stakenvicius @ 2014-11-06 15:18 UTC (permalink / raw
  To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 06/11/14 08:43 AM, Ciaran McCreesh wrote:
> On Thu, 06 Nov 2014 14:25:46 +0100 Jauhien Piatlicki
> <jauhien@gentoo.org> wrote:
>> Mathematics you said? That's nice. You can, for example, redesign
>> our portage's dependency solving algorithm, as it is quite slow
>> at the moment. ) I do not know what it does have inside right
>> now, but using SAT solver can be a good idea (there is a
>> successful example already: 
>> https://en.opensuse.org/openSUSE:Libzypp_satsolver)
> 
> A SAT encoding for dependency resolution is a *terrible* idea, for
> all kinds of reasons (some of which are Gentoo-specific, and some
> of which are not).
> 
> [ Snip! ]
> 
> What you need is for someone who understands CP and SAT to write a 
> resolver using algorithms inspired by how CP and SAT solvers work,
> but not just blindly copying them. Doing this well is at least a
> full year Masters level project...
> 


...well, if this is an undergrad project, he could start with the SAT
solver and then do what you recommend for his Masters' .. :)


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iF4EAREIAAYFAlRbkToACgkQ2ugaI38ACPBwYwEAtrXJFaVlf4WSv7eV8N+vX6T9
VFq56sh59LmeJ6+UMJcA/33trhsYdNAoRe6i/RWIIRQw8zyS37lIo6I9bLA7TEPg
=7kZS
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2014-11-06 15:18     ` Ian Stakenvicius
@ 2014-11-06 15:26       ` Ciaran McCreesh
  0 siblings, 0 replies; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-06 15:26 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 590 bytes --]

On Thu, 06 Nov 2014 10:18:18 -0500
Ian Stakenvicius <axs@gentoo.org> wrote:
> ...well, if this is an undergrad project, he could start with the SAT
> solver and then do what you recommend for his Masters' .. :)

Naah, SAT is doomed. A (bad) vanilla CP model is doable, but in my
experience of students doing these kinds of projects, SAT and IP look
sufficiently "mathsy" to count as a maths project, but if you hand in a
CP model to a mathematician they'll go "I don't understand, you just
wrote down some stuff describing something. This isn't maths!"...

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2014-11-06 13:25 ` Jauhien Piatlicki
  2014-11-06 13:43   ` Ciaran McCreesh
@ 2014-11-06 21:28   ` Jeroen Roovers
  2014-11-07  5:06     ` Harsh Bhatt
  1 sibling, 1 reply; 74+ messages in thread
From: Jeroen Roovers @ 2014-11-06 21:28 UTC (permalink / raw
  To: gentoo-dev

On Thu, 06 Nov 2014 14:25:46 +0100
Jauhien Piatlicki <jauhien@gentoo.org> wrote:

> Mathematics you said? That's nice. You can, for example, redesign our
> portage's dependency solving algorithm

More generally perhaps: do something interesting with the portage tree.
If not as directly useful as fixing dependency, a look at how bits of
the tree changed over time (particularly with regard to
inter-dependencies between the bits) could be much more interesting
than regarding any particular snapshot.

The other huge multidimensional tree we have is the bug tracker
database. Several social science majors have already tried to do
something intelligible with the bug tracker data (and failed in my
opinion) so I am confident that someone who doesn't have that socially
oriented view of networks might be able to come up with more outrageous
and interesting viewpoints on how the bug tracker actually works and how
various bits of it interconnect, or doesn't work and don't connect,
respectively.


     jer


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2014-11-06 21:28   ` [gentoo-dev] Regarding my final year thesis Jeroen Roovers
@ 2014-11-07  5:06     ` Harsh Bhatt
  2014-11-07  9:49       ` Luca Barbato
  0 siblings, 1 reply; 74+ messages in thread
From: Harsh Bhatt @ 2014-11-07  5:06 UTC (permalink / raw
  To: gentoo-dev@lists.gentoo.org

[-- Attachment #1: Type: text/plain, Size: 2167 bytes --]

Thank you Jauhien Piatlicki, Ciaran McCreesh, Ian Stakenvicius, Jeroen Roovers for your detailed replies. 
After reading all the proivded information, I got confused about doing SAT or CP model. Currently i am in 5 th year of Applied Mathematics and i have 6 months of time to complete my work.
> "The other huge multidimensional tree we have is the bug tracker
> database. Several social science majors have already tried to do
> something intelligible with the bug tracker data (and failed in my
> opinion) so I am confident that someone who doesn't have that socially
> oriented view of networks might be able to come up with more outrageous
> and interesting viewpoints on how the bug tracker actually works and how
> various bits of it interconnect, or doesn't work and don't connect,
> respectively."  -- Jeroen Roovers

This idea seems bit interesting, about how the bug tracker works. In this i just need to confirm that how much mathematical aspect can be included. It's a good idea to work on.
Harsh Bhatt

 

     On Friday, 7 November 2014 2:58 AM, Jeroen Roovers <jer@gentoo.org> wrote:
   

 On Thu, 06 Nov 2014 14:25:46 +0100
Jauhien Piatlicki <jauhien@gentoo.org> wrote:

> Mathematics you said? That's nice. You can, for example, redesign our
> portage's dependency solving algorithm

More generally perhaps: do something interesting with the portage tree.
If not as directly useful as fixing dependency, a look at how bits of
the tree changed over time (particularly with regard to
inter-dependencies between the bits) could be much more interesting
than regarding any particular snapshot.

The other huge multidimensional tree we have is the bug tracker
database. Several social science majors have already tried to do
something intelligible with the bug tracker data (and failed in my
opinion) so I am confident that someone who doesn't have that socially
oriented view of networks might be able to come up with more outrageous
and interesting viewpoints on how the bug tracker actually works and how
various bits of it interconnect, or doesn't work and don't connect,
respectively.


    jer



   

[-- Attachment #2: Type: text/html, Size: 6265 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
  2014-11-06 13:43   ` Ciaran McCreesh
  2014-11-06 15:18     ` Ian Stakenvicius
@ 2014-11-07  9:42     ` Jauhien Piatlicki
  2014-11-07 17:36       ` Zac Medico
  2014-11-07 18:07       ` Ciaran McCreesh
  1 sibling, 2 replies; 74+ messages in thread
From: Jauhien Piatlicki @ 2014-11-07  9:42 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 974 bytes --]

Hi,

On 11/06/2014 02:43 PM, Ciaran McCreesh wrote:
> 
> If you're going to go the toolkit route, you should be using a CP
> solver, not a SAT solver. But even then you'd be better off making some
> changes and not using plain old MAC, so you're back to writing the
> algorithms yourself.
> 
> What you need is for someone who understands CP and SAT to write a
> resolver using algorithms inspired by how CP and SAT solvers work, but
> not just blindly copying them. Doing this well is at least a full year
> Masters level project...
> 

Yeah, you are right.

What I am interested in is an overview of what algorithm we are using
now. Do we have any documentation about it? As I really would like to
look at some concise document rather than sources.

Also may be we need to discuss how can we improve it, as at the moment
for me it seems one of the biggest problems with Gentoo. And afaik
paludis does not solve it (or am I wrong?)

--
Jauhien


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2014-11-07  5:06     ` Harsh Bhatt
@ 2014-11-07  9:49       ` Luca Barbato
  2015-01-16 17:30         ` Jan Matejka
  0 siblings, 1 reply; 74+ messages in thread
From: Luca Barbato @ 2014-11-07  9:49 UTC (permalink / raw
  To: gentoo-dev

On 07/11/14 06:06, Harsh Bhatt wrote:

> This idea seems bit interesting, about how the bug tracker works.
> In this i just need to confirm that how much mathematical aspect
> can be included. It's a good idea to work on.


Also make might enjoy improvements.

lu


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
  2014-11-07  9:42     ` [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis) Jauhien Piatlicki
@ 2014-11-07 17:36       ` Zac Medico
  2014-11-07 18:07       ` Ciaran McCreesh
  1 sibling, 0 replies; 74+ messages in thread
From: Zac Medico @ 2014-11-07 17:36 UTC (permalink / raw
  To: gentoo-dev

On 11/07/2014 01:42 AM, Jauhien Piatlicki wrote:
> Hi,
> 
> On 11/06/2014 02:43 PM, Ciaran McCreesh wrote:
>>
>> If you're going to go the toolkit route, you should be using a CP
>> solver, not a SAT solver. But even then you'd be better off making some
>> changes and not using plain old MAC, so you're back to writing the
>> algorithms yourself.
>>
>> What you need is for someone who understands CP and SAT to write a
>> resolver using algorithms inspired by how CP and SAT solvers work, but
>> not just blindly copying them. Doing this well is at least a full year
>> Masters level project...
>>
> 
> Yeah, you are right.
> 
> What I am interested in is an overview of what algorithm we are using
> now. Do we have any documentation about it? As I really would like to
> look at some concise document rather than sources.

If you install sys-apps/portage with USE=doc, it includes this
documentation which gives an overview of the portage's dependency
resolver algorithms:

	http://dev.gentoo.org/~zmedico/portage/doc/pt02.html

> Also may be we need to discuss how can we improve it, as at the moment
> for me it seems one of the biggest problems with Gentoo. And afaik
> paludis does not solve it (or am I wrong?)
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
  2014-11-07  9:42     ` [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis) Jauhien Piatlicki
  2014-11-07 17:36       ` Zac Medico
@ 2014-11-07 18:07       ` Ciaran McCreesh
  2014-11-07 18:11         ` Jauhien Piatlicki
  1 sibling, 1 reply; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-07 18:07 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 453 bytes --]

On Fri, 07 Nov 2014 10:42:39 +0100
Jauhien Piatlicki <jauhien@gentoo.org> wrote:
> Also may be we need to discuss how can we improve it, as at the moment
> for me it seems one of the biggest problems with Gentoo. And afaik
> paludis does not solve it (or am I wrong?)

Paludis solves it. However, Paludis will only ever produce a correct
resolution, which can be a problem since ebuild dependencies are
often garbage...

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
  2014-11-07 18:07       ` Ciaran McCreesh
@ 2014-11-07 18:11         ` Jauhien Piatlicki
  2014-11-07 18:30           ` Ciaran McCreesh
  0 siblings, 1 reply; 74+ messages in thread
From: Jauhien Piatlicki @ 2014-11-07 18:11 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 687 bytes --]

On 11/07/2014 07:07 PM, Ciaran McCreesh wrote:
> On Fri, 07 Nov 2014 10:42:39 +0100
> Jauhien Piatlicki <jauhien@gentoo.org> wrote:
>> Also may be we need to discuss how can we improve it, as at the moment
>> for me it seems one of the biggest problems with Gentoo. And afaik
>> paludis does not solve it (or am I wrong?)
> 
> Paludis solves it. However, Paludis will only ever produce a correct
> resolution, which can be a problem since ebuild dependencies are
> often garbage...
> 

Then the same question for you: where can one read about the algorithm
Paludis uses?

And, again, I have herd (did not try myself) that Paludis is as slow as
Portage.

--
Jauhien


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
  2014-11-07 18:11         ` Jauhien Piatlicki
@ 2014-11-07 18:30           ` Ciaran McCreesh
  2014-11-07 18:54             ` [gentoo-dev] Portage dependency solving algorithm Matthias Maier
  0 siblings, 1 reply; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-07 18:30 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1279 bytes --]

On Fri, 07 Nov 2014 19:11:04 +0100
Jauhien Piatlicki <jauhien@gentoo.org> wrote:
> Then the same question for you: where can one read about the algorithm
> Paludis uses?

It's basically a two stage process: simple constraint solving using
value ordering heuristics to enforce "don't do unnecessary work", then
ordering (which is not quite a graph process, and which is not as
simple as a topological sort, because the tree is full of circular
dependencies).

But the interesting question isn't "what's the algorithm?", it's
"what's the model?". That's where the complexity lies: figuring out how
to turn *DEPEND specifications into constraints is an utter pain, and
it isn't clean or easily understandable. The primary reason is ||
dependencies: developers like to write not-really-correct and utterly
unobvious dependency strings rather than asking for new syntax so they
can just say what they mean...

> And, again, I have herd (did not try myself) that Paludis is as slow
> as Portage.

Well, you're not comparing like with like. Paludis with "everything
turned off" does more than Portage with "everything turned on". If all
you're looking for is the wrong answer as fast as possible, there are
easier ways of getting it...

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 18:30           ` Ciaran McCreesh
@ 2014-11-07 18:54             ` Matthias Maier
  2014-11-07 19:08               ` hasufell
  2014-11-07 19:21               ` Ciaran McCreesh
  0 siblings, 2 replies; 74+ messages in thread
From: Matthias Maier @ 2014-11-07 18:54 UTC (permalink / raw
  To: gentoo-dev

Am 07. Nov 2014, 19:30 schrieb Ciaran McCreesh <ciaran.mccreesh@googlemail.com>:

> On Fri, 07 Nov 2014 19:11:04 +0100
> Jauhien Piatlicki <jauhien@gentoo.org> wrote:
>> Then the same question for you: where can one read about the algorithm
>> Paludis uses?
>
> It's basically a two stage process: simple constraint solving using
> value ordering heuristics to enforce "don't do unnecessary work", then
> ordering (which is not quite a graph process, and which is not as
> simple as a topological sort, because the tree is full of circular
> dependencies).
>
> But the interesting question isn't "what's the algorithm?", it's
> "what's the model?". That's where the complexity lies: figuring out how
> to turn *DEPEND specifications into constraints is an utter pain, and
> it isn't clean or easily understandable. The primary reason is ||
> dependencies: developers like to write not-really-correct and utterly
> unobvious dependency strings rather than asking for new syntax so they
> can just say what they mean...


Currently, for portage just to decide that nothing has to be done on my
machine takes around 1 minute.

What is in your opinion the main reason for this? And how can we knock
this down to reasonable speed?

 - Is our dependency model that more complex than the problem resolvers
   of other package managers for other distributions solve?

 - Is it the algorithm that is implemented for the dependency model?

 - Is it its implementation?


>> And, again, I have herd (did not try myself) that Paludis is as slow
>> as Portage.
>
> Well, you're not comparing like with like. Paludis with "everything
> turned off" does more than Portage with "everything turned on". If all
> you're looking for is the wrong answer as fast as possible, there are
> easier ways of getting it...

The last time I compared the resolver speed of portage and paludis both
needed almost the same time.

Do you have a speed comparison with a similar feature set of both? (Or,
alternatively, the speedup one gains by tuning paludis to be as fast as
possible).

Best,
Matthias


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 18:54             ` [gentoo-dev] Portage dependency solving algorithm Matthias Maier
@ 2014-11-07 19:08               ` hasufell
  2014-11-07 19:56                 ` Jauhien Piatlicki
                                   ` (3 more replies)
  2014-11-07 19:21               ` Ciaran McCreesh
  1 sibling, 4 replies; 74+ messages in thread
From: hasufell @ 2014-11-07 19:08 UTC (permalink / raw
  To: gentoo-dev

On 11/07/2014 07:54 PM, Matthias Maier wrote:
>> Well, you're not comparing like with like. Paludis with "everything
>> turned off" does more than Portage with "everything turned on". If all
>> you're looking for is the wrong answer as fast as possible, there are
>> easier ways of getting it...
> 
> The last time I compared the resolver speed of portage and paludis both
> needed almost the same time.
> 
> Do you have a speed comparison with a similar feature set of both? (Or,
> alternatively, the speedup one gains by tuning paludis to be as fast as
> possible).
> 

I think you didn't get the idea: it doesn't make much sense to compare
the speed if the correctness differs.

Also, I don't understand these discussions. The time dependency
resolving takes is marginal compared to the whole update process, no
matter what PM you use.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 18:54             ` [gentoo-dev] Portage dependency solving algorithm Matthias Maier
  2014-11-07 19:08               ` hasufell
@ 2014-11-07 19:21               ` Ciaran McCreesh
  2014-11-07 19:57                 ` Jauhien Piatlicki
                                   ` (2 more replies)
  1 sibling, 3 replies; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-07 19:21 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 2231 bytes --]

On Fri, 07 Nov 2014 19:54:08 +0100
Matthias Maier <tamiko@gentoo.org> wrote:
> Currently, for portage just to decide that nothing has to be done on
> my machine takes around 1 minute.

Are you running with or without metadata cache? If you're running
without, it's going to be slow independently of the resolution
algorithm... If you're not:

>  - Is our dependency model that more complex than the problem
> resolvers of other package managers for other distributions solve?

Yes, massively so.

>  - Is it the algorithm that is implemented for the dependency model?

Also a contributing factor, for certain cases. You may see Portage doing
a lot of backtracking sometimes. There's a much better typical-case
algorithm for this.

>  - Is it its implementation?

Also a factor.

The main issue, though, is that getting a "good" resolution out of
crappy data is extremely difficult. There's the Babbage quote:

| On two occasions I have been asked, — "Pray, Mr. Babbage, if you put
| into the machine wrong figures, will the right answers come out?" In
| one case a member of the Upper, and in the other a member of the
| Lower, House put this question. I am not able rightly to apprehend
| the kind of confusion of ideas that could provoke such a question.

Yet this is *exactly* what a dependency resolver has to do for Gentoo,
and it's why dependency resolvers are so complicated.

(For comparison, Paludis on Exherbo will run an order of magnitude
faster for the same set of installed packages, simply because on
Exherbo the input is correct.)

> > Well, you're not comparing like with like. Paludis with "everything
> > turned off" does more than Portage with "everything turned on". If
> > all you're looking for is the wrong answer as fast as possible,
> > there are easier ways of getting it...
> 
> The last time I compared the resolver speed of portage and paludis
> both needed almost the same time.

To do different things, though. Portage doesn't have a "produce a
correct resolution" switch. Paludis doesn't (really) have a "produce an
illegal resolution" switch. (Again, assuming you have metadata cache.
If you don't, whole other story.)

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 19:08               ` hasufell
@ 2014-11-07 19:56                 ` Jauhien Piatlicki
  2014-11-07 20:44                   ` hasufell
  2014-11-08  8:29                 ` vivo75
                                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 74+ messages in thread
From: Jauhien Piatlicki @ 2014-11-07 19:56 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1222 bytes --]

On 11/07/2014 08:08 PM, hasufell wrote:
> On 11/07/2014 07:54 PM, Matthias Maier wrote:
>>> Well, you're not comparing like with like. Paludis with "everything
>>> turned off" does more than Portage with "everything turned on". If all
>>> you're looking for is the wrong answer as fast as possible, there are
>>> easier ways of getting it...
>>
>> The last time I compared the resolver speed of portage and paludis both
>> needed almost the same time.
>>
>> Do you have a speed comparison with a similar feature set of both? (Or,
>> alternatively, the speedup one gains by tuning paludis to be as fast as
>> possible).
>>
> 
> I think you didn't get the idea: it doesn't make much sense to compare
> the speed if the correctness differs.
> 
> Also, I don't understand these discussions. The time dependency
> resolving takes is marginal compared to the whole update process, no
> matter what PM you use.
> 

When it compiles in background after all dependencies was solved, it
needs no user intervention. But when I need to solve some blocks or do
some tests during maintaining work, the dependency solving time is what
I care about, as I need to wait for it and then investigate the results.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 19:21               ` Ciaran McCreesh
@ 2014-11-07 19:57                 ` Jauhien Piatlicki
  2014-11-08 13:40                   ` Ciaran McCreesh
  2014-11-07 20:06                 ` Matthias Maier
  2014-11-08 10:56                 ` Andrew Savchenko
  2 siblings, 1 reply; 74+ messages in thread
From: Jauhien Piatlicki @ 2014-11-07 19:57 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 971 bytes --]


On 11/07/2014 08:21 PM, Ciaran McCreesh wrote:

> The main issue, though, is that getting a "good" resolution out of
> crappy data is extremely difficult. There's the Babbage quote:
> 
> | On two occasions I have been asked, — "Pray, Mr. Babbage, if you put
> | into the machine wrong figures, will the right answers come out?" In
> | one case a member of the Upper, and in the other a member of the
> | Lower, House put this question. I am not able rightly to apprehend
> | the kind of confusion of ideas that could provoke such a question.
> 
> Yet this is *exactly* what a dependency resolver has to do for Gentoo,
> and it's why dependency resolvers are so complicated.
> 
> (For comparison, Paludis on Exherbo will run an order of magnitude
> faster for the same set of installed packages, simply because on
> Exherbo the input is correct.)
> 

What;s wrong with input? PMS itself or how do maintainers write ebuilds?
Could you explain?



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 19:21               ` Ciaran McCreesh
  2014-11-07 19:57                 ` Jauhien Piatlicki
@ 2014-11-07 20:06                 ` Matthias Maier
  2014-11-08 10:56                 ` Andrew Savchenko
  2 siblings, 0 replies; 74+ messages in thread
From: Matthias Maier @ 2014-11-07 20:06 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 835 bytes --]


Am 07. Nov 2014, 20:21 schrieb Ciaran McCreesh <ciaran.mccreesh@googlemail.com>:

> On Fri, 07 Nov 2014 19:54:08 +0100
> Matthias Maier <tamiko@gentoo.org> wrote:

>> Currently, for portage just to decide that nothing has to be done on
>> my machine takes around 1 minute.
>
> Are you running with or without metadata cache? If you're running
> without, it's going to be slow independently of the resolution
> algorithm...

Yes, I run with metadata cache. Without, ... well I never waited for it
to finish.

> [...]

Thank you very much for the detailed explanation. This helped a lot :-]

>
> (For comparison, Paludis on Exherbo will run an order of magnitude
> faster for the same set of installed packages, simply because on
> Exherbo the input is correct.)
>

This might be a problem that we can tackle, though...

Best,
Matthias

[-- Attachment #2: Type: application/pgp-signature, Size: 820 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 19:56                 ` Jauhien Piatlicki
@ 2014-11-07 20:44                   ` hasufell
  2014-11-07 20:55                     ` Jauhien Piatlicki
  0 siblings, 1 reply; 74+ messages in thread
From: hasufell @ 2014-11-07 20:44 UTC (permalink / raw
  To: gentoo-dev

On 11/07/2014 08:56 PM, Jauhien Piatlicki wrote:
>>
>> I think you didn't get the idea: it doesn't make much sense to compare
>> the speed if the correctness differs.
>>
>> Also, I don't understand these discussions. The time dependency
>> resolving takes is marginal compared to the whole update process, no
>> matter what PM you use.
>>
> 
> When it compiles in background after all dependencies was solved, it
> needs no user intervention. But when I need to solve some blocks or do
> some tests during maintaining work, the dependency solving time is what
> I care about, as I need to wait for it and then investigate the results.
> 

I see, however... I prefer to have a correct answer instead of an
incorrect one, even if the correct one takes longer.

That goes _especially_ for testing and maintaining work.

Every time people compare portage to paludis I read stuff like "but
paludis is slower". That is incomplete information to put it diplomatic.

Do you really care so much about speed that you don't mind wrong results?


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 20:44                   ` hasufell
@ 2014-11-07 20:55                     ` Jauhien Piatlicki
  2014-11-07 21:04                       ` hasufell
  0 siblings, 1 reply; 74+ messages in thread
From: Jauhien Piatlicki @ 2014-11-07 20:55 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 694 bytes --]

07.11.14 21:44, hasufell написав(ла):
> On 11/07/2014 08:56 PM, Jauhien Piatlicki wrote:
> 
> Every time people compare portage to paludis I read stuff like "but
> paludis is slower". That is incomplete information to put it diplomatic.
> 
> Do you really care so much about speed that you don't mind wrong results?
> 

My original question was about Portage being too slow. And Paludis came out just as an alternative.

And I would like to see a detailed discussion about what's wrong from the point of view of correctness with:

1. PMS

2. ebuilds in tree

3. Portage dependency solving

Was this discussed somewhere? Could you point me there?

--
Jauhien



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 20:55                     ` Jauhien Piatlicki
@ 2014-11-07 21:04                       ` hasufell
  2014-11-07 21:41                         ` Zac Medico
  2014-11-08 13:40                         ` Ciaran McCreesh
  0 siblings, 2 replies; 74+ messages in thread
From: hasufell @ 2014-11-07 21:04 UTC (permalink / raw
  To: gentoo-dev

On 11/07/2014 09:55 PM, Jauhien Piatlicki wrote:
> 07.11.14 21:44, hasufell написав(ла):
>> On 11/07/2014 08:56 PM, Jauhien Piatlicki wrote:
>>
>> Every time people compare portage to paludis I read stuff like "but
>> paludis is slower". That is incomplete information to put it diplomatic.
>>
>> Do you really care so much about speed that you don't mind wrong results?
>>
> 
> My original question was about Portage being too slow. And Paludis came out just as an alternative.
> 
> And I would like to see a detailed discussion about what's wrong from the point of view of correctness with:
> 
> 1. PMS
> 
> 2. ebuilds in tree
> 
> 3. Portage dependency solving
> 
> Was this discussed somewhere? Could you point me there?
> 

The first thing that comes to my mind is dynamic dependencies. They are
wrong and that has been discussed recently on this ML.

If you have ever switched from portage to paludis on a full-grown
system, then you know how much bad data and missing
updates/blockers/dependencies are hidden.

However, it seems that this issue is being addressed by the portage
team, afair.

Next thing that comes to my mind is: indeterministic results. I'v had
LOTS of them with portage. You run an emerge, abort. You run it again...
and woosh, different result.
I'v not hit that case yet with paludis, unless I ran it with different
configuration options.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 21:04                       ` hasufell
@ 2014-11-07 21:41                         ` Zac Medico
  2014-11-08 13:40                         ` Ciaran McCreesh
  1 sibling, 0 replies; 74+ messages in thread
From: Zac Medico @ 2014-11-07 21:41 UTC (permalink / raw
  To: gentoo-dev

On 11/07/2014 01:04 PM, hasufell wrote:
> Next thing that comes to my mind is: indeterministic results. I'v had
> LOTS of them with portage. You run an emerge, abort. You run it again...
> and woosh, different result.

This is a result of the solution space being quite large, combined with
hash randomization (and possibly some other forms of randomization). You
will probably notice this sort of randomization more for failed
dependency calculations than for successful dependency calculations.
Successful dependency calculations will almost always result in
reproducible results.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 19:08               ` hasufell
  2014-11-07 19:56                 ` Jauhien Piatlicki
@ 2014-11-08  8:29                 ` vivo75
  2014-11-08 13:35                   ` Ciaran McCreesh
  2014-11-08 11:10                 ` Andrew Savchenko
  2014-11-08 13:24                 ` Patrick Lauer
  3 siblings, 1 reply; 74+ messages in thread
From: vivo75 @ 2014-11-08  8:29 UTC (permalink / raw
  To: gentoo-dev

Il 07/11/2014 20:08, hasufell ha scritto:
> Also, I don't understand these discussions. The time dependency
> resolving takes is marginal compared to the whole update process, no
> matter what PM you use.
"The time dependency resolving takes is marginal compared to the whole
update process "
^^^ this is an utter lie for frequent updates





^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 19:21               ` Ciaran McCreesh
  2014-11-07 19:57                 ` Jauhien Piatlicki
  2014-11-07 20:06                 ` Matthias Maier
@ 2014-11-08 10:56                 ` Andrew Savchenko
  2014-11-08 20:59                   ` [gentoo-dev] " Duncan
       [not found]                   ` <20141108133520.6ec790fe@googlemail.com>
  2 siblings, 2 replies; 74+ messages in thread
From: Andrew Savchenko @ 2014-11-08 10:56 UTC (permalink / raw
  To: gentoo-dev; +Cc: Ciaran McCreesh

[-- Attachment #1: Type: text/plain, Size: 598 bytes --]

On Fri, 7 Nov 2014 19:21:01 +0000 Ciaran McCreesh wrote:
> On Fri, 07 Nov 2014 19:54:08 +0100
> Matthias Maier <tamiko@gentoo.org> wrote:
> > Currently, for portage just to decide that nothing has to be done on
> > my machine takes around 1 minute.
> 
> Are you running with or without metadata cache? If you're running
> without, it's going to be slow independently of the resolution
> algorithm...

Wanna ~20-30 minutes with sqlite metadata cache enabled?
Try on Intel Atom N270 with 2800 packages installed.
Dependency resolution is utterly slow.

Best regards,
Andrew Savchenko

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 19:08               ` hasufell
  2014-11-07 19:56                 ` Jauhien Piatlicki
  2014-11-08  8:29                 ` vivo75
@ 2014-11-08 11:10                 ` Andrew Savchenko
  2014-11-08 13:24                 ` Patrick Lauer
  3 siblings, 0 replies; 74+ messages in thread
From: Andrew Savchenko @ 2014-11-08 11:10 UTC (permalink / raw
  To: gentoo-dev; +Cc: hasufell

[-- Attachment #1: Type: text/plain, Size: 1916 bytes --]

On Fri, 07 Nov 2014 20:08:12 +0100 hasufell wrote:
> On 11/07/2014 07:54 PM, Matthias Maier wrote:
> >> Well, you're not comparing like with like. Paludis with "everything
> >> turned off" does more than Portage with "everything turned on". If all
> >> you're looking for is the wrong answer as fast as possible, there are
> >> easier ways of getting it...
> > 
> > The last time I compared the resolver speed of portage and paludis both
> > needed almost the same time.
> > 
> > Do you have a speed comparison with a similar feature set of both? (Or,
> > alternatively, the speedup one gains by tuning paludis to be as fast as
> > possible).
> > 
> 
> I think you didn't get the idea: it doesn't make much sense to compare
> the speed if the correctness differs.
> 
> Also, I don't understand these discussions. The time dependency
> resolving takes is marginal compared to the whole update process, no
> matter what PM you use.

Time required for user's attention differs from time required for
CPU to work. During main update process portage just works for a
few days (and then I have to deal with failed packages, but that's
another story).

As for dependency resolution during world update, this process is
usually iterative: portage failes due to some unsatisfied USE
flags, blocks, circular deps and so on. After resolving one of this
problems emerge needs to be run one more time and so on. On old
hardware iteration takes up to an hour (even with sqlite cache! 1
hour of CPU time), so a day or sometimes even two needed just to
start actual build process! Now take into account that not always
it is possible to spend all day just to start package's building,
so world updates spans to weaks...

With 10x times faster deps resolver it will be possible to handle
all these issues in a few hours and let it contiue update on its
own.

Best regards,
Andrew Savchenko

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 19:08               ` hasufell
                                   ` (2 preceding siblings ...)
  2014-11-08 11:10                 ` Andrew Savchenko
@ 2014-11-08 13:24                 ` Patrick Lauer
  2014-11-08 14:48                   ` hasufell
  3 siblings, 1 reply; 74+ messages in thread
From: Patrick Lauer @ 2014-11-08 13:24 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 03:08 AM, hasufell wrote:
> On 11/07/2014 07:54 PM, Matthias Maier wrote:
>>> Well, you're not comparing like with like. Paludis with "everything
>>> turned off" does more than Portage with "everything turned on". If all
>>> you're looking for is the wrong answer as fast as possible, there are
>>> easier ways of getting it...
>>
>> The last time I compared the resolver speed of portage and paludis both
>> needed almost the same time.
>>
>> Do you have a speed comparison with a similar feature set of both? (Or,
>> alternatively, the speedup one gains by tuning paludis to be as fast as
>> possible).
>>
> 
> I think you didn't get the idea: it doesn't make much sense to compare
> the speed if the correctness differs.
> 
> Also, I don't understand these discussions. The time dependency
> resolving takes is marginal compared to the whole update process, no
> matter what PM you use.
> 
*ahem*

On my old notebook, which luckily suicided thanks to Lenovo's built in
obsolete device detection ...

emerge -auNDv world took up to 35 minutes

So, if something like RUBY_TARGETS or a random useflag changes, it takes
me literally DAYS to figure out a valid solution where portage can
figure out an upgrade path.

No, it's not marginal.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08  8:29                 ` vivo75
@ 2014-11-08 13:35                   ` Ciaran McCreesh
  2014-11-08 14:08                     ` vivo75
  0 siblings, 1 reply; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-08 13:35 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 291 bytes --]

On Sat, 08 Nov 2014 09:29:52 +0100
"vivo75@gmail.com" <vivo75@gmail.com> wrote:
> "The time dependency resolving takes is marginal compared to the whole
> update process "
> ^^^ this is an utter lie for frequent updates

Uh, how long are your resolves taking?

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 19:57                 ` Jauhien Piatlicki
@ 2014-11-08 13:40                   ` Ciaran McCreesh
  2014-11-08 18:11                     ` Hilco Wijbenga
  0 siblings, 1 reply; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-08 13:40 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1109 bytes --]

On Fri, 07 Nov 2014 20:57:41 +0100
Jauhien Piatlicki <jauhien@gentoo.org> wrote:
> What;s wrong with input? PMS itself or how do maintainers write
> ebuilds? Could you explain?

A mixture of both. Gentoo developers like writing eclasses that write
unnecessarily clever, highly messy and technically incorrect dependency
strings (see how Perl and Ruby are done, for prime examples). Doing
this kind of thing well requires support from PMS, so developers can
express what they want to say directly rather than via some convoluted
mess of nested ||s, []s, slot abuse and faked range dependencies.
However, it's currently culturally more acceptable to try to make
yourself look clever by writing the new "world's most convoluted family
of eclasses", so developers aren't asking for the features they need.

In a way, this brings us back to SAT and CNF. Although you *can* encode
this kind of thing in SAT (or rather, in QSAT...), the encoding is
utterly opaque and doesn't lend itself to a good algorithm. The
dependencies some developers are writing are nearly as bad.

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-07 21:04                       ` hasufell
  2014-11-07 21:41                         ` Zac Medico
@ 2014-11-08 13:40                         ` Ciaran McCreesh
  1 sibling, 0 replies; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-08 13:40 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 459 bytes --]

On Fri, 07 Nov 2014 22:04:57 +0100
hasufell <hasufell@gentoo.org> wrote:
> Next thing that comes to my mind is: indeterministic results. I'v had
> LOTS of them with portage. You run an emerge, abort. You run it
> again... and woosh, different result.
> I'v not hit that case yet with paludis, unless I ran it with different
> configuration options.

You won't hit this with Paludis if you don't change anything between
runs.

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 13:35                   ` Ciaran McCreesh
@ 2014-11-08 14:08                     ` vivo75
  0 siblings, 0 replies; 74+ messages in thread
From: vivo75 @ 2014-11-08 14:08 UTC (permalink / raw
  To: gentoo-dev

Il 08/11/2014 14:35, Ciaran McCreesh ha scritto:
> On Sat, 08 Nov 2014 09:29:52 +0100
> "vivo75@gmail.com" <vivo75@gmail.com> wrote:
>> "The time dependency resolving takes is marginal compared to the whole
>> update process "
>> ^^^ this is an utter lie for frequent updates
> Uh, how long are your resolves taking?
>
I've updated the system half hour ago it's empty, but this system can
compile and install a good number of packages in 3 minutes.
Obviously this is without tuning the filesystem, with rotational disks
_and_ 3 overlays (which slow down a lot) plus it has more than 2000
packages installed
So I would not call this marginal at all. Not even if it was one minute,
for attended upgrades is annoying.

gentoo ~ # /usr/bin/time --verbose     emerge -uDpN @world

These are the packages that would be merged, in order:

Calculating dependencies... done!
        Command being timed: "emerge -uDpN @world"
        User time (seconds): 178.45
        System time (seconds): 1.58
        Percent of CPU this job got: 100%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 2:59.88
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 2470064
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 168879
        Voluntary context switches: 11
        Involuntary context switches: 1568
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0





^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 13:24                 ` Patrick Lauer
@ 2014-11-08 14:48                   ` hasufell
  2014-11-08 23:33                     ` Patrick Lauer
  0 siblings, 1 reply; 74+ messages in thread
From: hasufell @ 2014-11-08 14:48 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 02:24 PM, Patrick Lauer wrote:
> On 11/08/2014 03:08 AM, hasufell wrote:
>> On 11/07/2014 07:54 PM, Matthias Maier wrote:
>>>> Well, you're not comparing like with like. Paludis with "everything
>>>> turned off" does more than Portage with "everything turned on". If all
>>>> you're looking for is the wrong answer as fast as possible, there are
>>>> easier ways of getting it...
>>>
>>> The last time I compared the resolver speed of portage and paludis both
>>> needed almost the same time.
>>>
>>> Do you have a speed comparison with a similar feature set of both? (Or,
>>> alternatively, the speedup one gains by tuning paludis to be as fast as
>>> possible).
>>>
>>
>> I think you didn't get the idea: it doesn't make much sense to compare
>> the speed if the correctness differs.
>>
>> Also, I don't understand these discussions. The time dependency
>> resolving takes is marginal compared to the whole update process, no
>> matter what PM you use.
>>
> *ahem*
> 
> On my old notebook, which luckily suicided thanks to Lenovo's built in
> obsolete device detection ...
> 
> emerge -auNDv world took up to 35 minutes
> 
> So, if something like RUBY_TARGETS or a random useflag changes, it takes
> me literally DAYS to figure out a valid solution where portage can
> figure out an upgrade path.
> 
> No, it's not marginal.
> 

So we are back to the initial discussion: fix the input instead of
making the resolver even worse. You can't have both bad input and a fast
_correct_ resolver.

So you choose between "good enough results which may not be what you
want" and "pretty good results with lots of things to figure out
yourself, because of the input and because the resolver does not make
random assumptions about what you want".

Both solutions s**k, tbh.

I'd just like people to look at the whole picture and don't keep PM
discussions narrow-minded by all this NIH crap.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 13:40                   ` Ciaran McCreesh
@ 2014-11-08 18:11                     ` Hilco Wijbenga
  2014-11-08 18:26                       ` Ciaran McCreesh
  0 siblings, 1 reply; 74+ messages in thread
From: Hilco Wijbenga @ 2014-11-08 18:11 UTC (permalink / raw
  To: Gentoo Dev

On 8 November 2014 05:40, Ciaran McCreesh
<ciaran.mccreesh@googlemail.com> wrote:
> On Fri, 07 Nov 2014 20:57:41 +0100
> Jauhien Piatlicki <jauhien@gentoo.org> wrote:
>> What;s wrong with input? PMS itself or how do maintainers write
>> ebuilds? Could you explain?
>
> A mixture of both. Gentoo developers like writing eclasses that write
> unnecessarily clever, highly messy and technically incorrect dependency
> strings (see how Perl and Ruby are done, for prime examples). Doing
> this kind of thing well requires support from PMS, so developers can
> express what they want to say directly rather than via some convoluted
> mess of nested ||s, []s, slot abuse and faked range dependencies.
> However, it's currently culturally more acceptable to try to make
> yourself look clever by writing the new "world's most convoluted family
> of eclasses", so developers aren't asking for the features they need.
>
> In a way, this brings us back to SAT and CNF. Although you *can* encode
> this kind of thing in SAT (or rather, in QSAT...), the encoding is
> utterly opaque and doesn't lend itself to a good algorithm. The
> dependencies some developers are writing are nearly as bad.

So would it make sense then to move to a more declarative ebuild
model? Or just a "better" model?

Both Portage and Paludis at some point figure out what they think is
needed to install an ebuild. At that point, they could emit an ebuild
in a "new and improved" model? I have no doubt that this scheme would
fail utterly for many of the more complex/convoluted ebuilds but if it
worked for, say, 80% then the work to improve the tree would be
drastically reduced.

(I only write the simplest of ebuilds so I'm undoubtedly
oversimplifying but I thought I would throw this idea out there.)


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 18:11                     ` Hilco Wijbenga
@ 2014-11-08 18:26                       ` Ciaran McCreesh
  2014-11-08 19:01                         ` Matthias Dahl
  0 siblings, 1 reply; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-08 18:26 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 445 bytes --]

On Sat, 8 Nov 2014 10:11:13 -0800
Hilco Wijbenga <hilco.wijbenga@gmail.com> wrote:
> So would it make sense then to move to a more declarative ebuild
> model? Or just a "better" model?

No. It would make sense to introduce a cultural change, where
developers stop writing dependencies that seem to work with some
particular version of Portage if you don't look very closely, and start
writing good dependencies.

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 18:26                       ` Ciaran McCreesh
@ 2014-11-08 19:01                         ` Matthias Dahl
  2014-11-08 19:32                           ` hasufell
  2014-11-09 12:34                           ` Ciaran McCreesh
  0 siblings, 2 replies; 74+ messages in thread
From: Matthias Dahl @ 2014-11-08 19:01 UTC (permalink / raw
  To: gentoo-dev

Hello Ciaran...

On 08/11/14 19:26, Ciaran McCreesh wrote:

> No. It would make sense to introduce a cultural change, where
> developers stop writing dependencies that seem to work with some
> particular version of Portage if you don't look very closely, and start
> writing good dependencies.

Sorry to chime in like that but if you don't mind, I'd like to ask for a
real-life example for badly declared dependencies with a few words why
those are bad and how to make them actually better?

Thanks a lot,
Matthias



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 19:01                         ` Matthias Dahl
@ 2014-11-08 19:32                           ` hasufell
  2014-11-08 19:48                             ` hasufell
  2014-11-09 12:34                           ` Ciaran McCreesh
  1 sibling, 1 reply; 74+ messages in thread
From: hasufell @ 2014-11-08 19:32 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 08:01 PM, Matthias Dahl wrote:
> Hello Ciaran...
> 
> On 08/11/14 19:26, Ciaran McCreesh wrote:
> 
>> No. It would make sense to introduce a cultural change, where
>> developers stop writing dependencies that seem to work with some
>> particular version of Portage if you don't look very closely, and start
>> writing good dependencies.
> 
> Sorry to chime in like that but if you don't mind, I'd like to ask for a
> real-life example for badly declared dependencies with a few words why
> those are bad and how to make them actually better?
> 

from dev-haskell/hashtables (note "hashtables" != "hashable"):
|| ( ( >=dev-haskell/hashable-1.1:=[profile?]
       <dev-haskell/hashable-1.2:=[profile?] )
     ( >=dev-haskell/hashable-1.2.1:=[profile?]
       <dev-haskell/hashable-1.3:=[profile?] )
   )

Latest stable version of dev-haskell/hashable is 1.2.1.0.
On a stable system (arch) the paludis dep-solver will try to match the
first group first and realize that there is also a stable version
1.1.2.5 that matches that group. At that point there is a correct
solution, but since that involves downgrading a package, it will require
user-intervention, because it may not be what the user wants.
(this is the easy scenario... if downgrading causes blockers, you get
much more interesting output)

If you want "user friendliness", then such dependencies would require
that the dep-solver tries very very hard to find a solution which *may*
be what the user wants... or not. I don't think that's what "||" were
designed for (if there was a design behind it).

Ofc there is a solution to that, e.g. by using
--favour-matching '>=dev-haskell/hashable-1.2.1'
which explicitly tells the dep-solver what the user wants.


perl virtuals can be similarly interesting.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 19:32                           ` hasufell
@ 2014-11-08 19:48                             ` hasufell
  2014-11-08 21:30                               ` Rich Freeman
  0 siblings, 1 reply; 74+ messages in thread
From: hasufell @ 2014-11-08 19:48 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 08:32 PM, hasufell wrote:
> On 11/08/2014 08:01 PM, Matthias Dahl wrote:
>> Hello Ciaran...
>>
>> On 08/11/14 19:26, Ciaran McCreesh wrote:
>>
>>> No. It would make sense to introduce a cultural change, where
>>> developers stop writing dependencies that seem to work with some
>>> particular version of Portage if you don't look very closely, and start
>>> writing good dependencies.
>>
>> Sorry to chime in like that but if you don't mind, I'd like to ask for a
>> real-life example for badly declared dependencies with a few words why
>> those are bad and how to make them actually better?
>>
> 
> from dev-haskell/hashtables (note "hashtables" != "hashable"):
> || ( ( >=dev-haskell/hashable-1.1:=[profile?]
>        <dev-haskell/hashable-1.2:=[profile?] )
>      ( >=dev-haskell/hashable-1.2.1:=[profile?]
>        <dev-haskell/hashable-1.3:=[profile?] )
>    )
> 
> Latest stable version of dev-haskell/hashable is 1.2.1.0.
> On a stable system (arch) the paludis dep-solver will try to match the
> first group first and realize that there is also a stable version
> 1.1.2.5 that matches that group. At that point there is a correct
> solution, but since that involves downgrading a package, it will require
> user-intervention, because it may not be what the user wants.
> (this is the easy scenario... if downgrading causes blockers, you get
> much more interesting output)
> 

To be more specific... it is assumed that hashable-1.2.1.0 is already
installed. Every time the dep solver runs through those packages without
specifying what you want, you will hit the downgrade-problem.

If there was no version of hashable installed at all, then it will also
require user intervention to permit old versions of a package.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [gentoo-dev] Re: Portage dependency solving algorithm
  2014-11-08 10:56                 ` Andrew Savchenko
@ 2014-11-08 20:59                   ` Duncan
       [not found]                   ` <20141108133520.6ec790fe@googlemail.com>
  1 sibling, 0 replies; 74+ messages in thread
From: Duncan @ 2014-11-08 20:59 UTC (permalink / raw
  To: gentoo-dev

Andrew Savchenko posted on Sat, 08 Nov 2014 13:56:21 +0300 as excerpted:

> Wanna ~20-30 minutes with sqlite metadata cache enabled?
> Try on Intel Atom N270 with 2800 packages installed. Dependency
> resolution is utterly slow.

Hmm.  My netbook's an Atom N270 also.  But I use a build-image chroot on 
my main 6-core AMD bulldozer for building and just rsync over, and I only 
have ~800 packages installed, so I fortunately don't have the N270 speed 
issues to deal with for package resolution and build.

But I've more or less decided to buy a new netbook level x86_64 one of 
these days and shelve the 32-bit-only N270, so I can use pretty much the 
same packages on both the workstation and the (new) netbook.  Maybe then 
I'll actually keep the netbook current instead of letting it go a year or 
more between updates.  It's just a matter of actually doing it.  Maybe 
this Black-Friday or Cyber-Monday...

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 19:48                             ` hasufell
@ 2014-11-08 21:30                               ` Rich Freeman
  2014-11-08 21:47                                 ` hasufell
  2014-11-09 12:40                                 ` Ciaran McCreesh
  0 siblings, 2 replies; 74+ messages in thread
From: Rich Freeman @ 2014-11-08 21:30 UTC (permalink / raw
  To: gentoo-dev

On Sat, Nov 8, 2014 at 2:48 PM, hasufell <hasufell@gentoo.org> wrote:
> On 11/08/2014 08:32 PM, hasufell wrote:
>> On 11/08/2014 08:01 PM, Matthias Dahl wrote:
>>> Hello Ciaran...
>>>
>>> On 08/11/14 19:26, Ciaran McCreesh wrote:
>>>
>>>> No. It would make sense to introduce a cultural change, where
>>>> developers stop writing dependencies that seem to work with some
>>>> particular version of Portage if you don't look very closely, and start
>>>> writing good dependencies.
>>>
>>> Sorry to chime in like that but if you don't mind, I'd like to ask for a
>>> real-life example for badly declared dependencies with a few words why
>>> those are bad and how to make them actually better?
>>>
>>
>> from dev-haskell/hashtables (note "hashtables" != "hashable"):
>> || ( ( >=dev-haskell/hashable-1.1:=[profile?]
>>        <dev-haskell/hashable-1.2:=[profile?] )
>>      ( >=dev-haskell/hashable-1.2.1:=[profile?]
>>        <dev-haskell/hashable-1.3:=[profile?] )
>>    )
>>
>> Latest stable version of dev-haskell/hashable is 1.2.1.0.
>> On a stable system (arch) the paludis dep-solver will try to match the
>> first group first and realize that there is also a stable version
>> 1.1.2.5 that matches that group. At that point there is a correct
>> solution, but since that involves downgrading a package, it will require
>> user-intervention, because it may not be what the user wants.
>> (this is the easy scenario... if downgrading causes blockers, you get
>> much more interesting output)
>>
>
> To be more specific... it is assumed that hashable-1.2.1.0 is already
> installed. Every time the dep solver runs through those packages without
> specifying what you want, you will hit the downgrade-problem.

I'm missing the problem.  The package requires one of two ranges of
hashable versions.  One of them is already installed.  The dependency
is satisfied.

If the user cared which version they had installed they should have to
specify this.  Otherwise the package manager should just assume that
the user doesn't care whether hashable is installed or not - they just
want hashtables installed (or more likely they want something which
needs hashtables installed).

I get that we order || dependencies to hint to the package manager
which dep should be preferred if there is no reason to favor one over
the other.  It shouldn't mean that if you have the second dep
installed that it should try to switch to the first if there are no
conflicts.

In any case, I'm curious as to how you would propose improving such a
dependency?  I definitely see how the syntax could be cleaned up so
that I don't have to poke my eyeballs out, but I don't see what it
would accomplish in terms of dependency resolution (maybe if there was
more use of (sub)slotting and a syntax based on that it might allow
for a different way of searching the dependencies and cut out a few
checks, but I'd have to think about that).

--
Rich


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 21:30                               ` Rich Freeman
@ 2014-11-08 21:47                                 ` hasufell
  2014-11-08 21:52                                   ` Jauhien Piatlicki
  2014-11-09 12:40                                 ` Ciaran McCreesh
  1 sibling, 1 reply; 74+ messages in thread
From: hasufell @ 2014-11-08 21:47 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 10:30 PM, Rich Freeman wrote:
> On Sat, Nov 8, 2014 at 2:48 PM, hasufell <hasufell@gentoo.org> wrote:
>> On 11/08/2014 08:32 PM, hasufell wrote:
>>> On 11/08/2014 08:01 PM, Matthias Dahl wrote:
>>>> Hello Ciaran...
>>>>
>>>> On 08/11/14 19:26, Ciaran McCreesh wrote:
>>>>
>>>>> No. It would make sense to introduce a cultural change, where
>>>>> developers stop writing dependencies that seem to work with some
>>>>> particular version of Portage if you don't look very closely, and start
>>>>> writing good dependencies.
>>>>
>>>> Sorry to chime in like that but if you don't mind, I'd like to ask for a
>>>> real-life example for badly declared dependencies with a few words why
>>>> those are bad and how to make them actually better?
>>>>
>>>
>>> from dev-haskell/hashtables (note "hashtables" != "hashable"):
>>> || ( ( >=dev-haskell/hashable-1.1:=[profile?]
>>>        <dev-haskell/hashable-1.2:=[profile?] )
>>>      ( >=dev-haskell/hashable-1.2.1:=[profile?]
>>>        <dev-haskell/hashable-1.3:=[profile?] )
>>>    )
>>>
>>> Latest stable version of dev-haskell/hashable is 1.2.1.0.
>>> On a stable system (arch) the paludis dep-solver will try to match the
>>> first group first and realize that there is also a stable version
>>> 1.1.2.5 that matches that group. At that point there is a correct
>>> solution, but since that involves downgrading a package, it will require
>>> user-intervention, because it may not be what the user wants.
>>> (this is the easy scenario... if downgrading causes blockers, you get
>>> much more interesting output)
>>>
>>
>> To be more specific... it is assumed that hashable-1.2.1.0 is already
>> installed. Every time the dep solver runs through those packages without
>> specifying what you want, you will hit the downgrade-problem.
> 
> I'm missing the problem.  The package requires one of two ranges of
> hashable versions.  One of them is already installed.  The dependency
> is satisfied.
> 

The one that is installed (1.2.1.0) is *excluded* by the first group,
but there is a valid version that fits instead (1.1.2.5).

That's the point where the assumptions start about what the depstring
means and what the user wants.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 21:47                                 ` hasufell
@ 2014-11-08 21:52                                   ` Jauhien Piatlicki
  2014-11-08 22:05                                     ` hasufell
  2014-11-09 12:41                                     ` Ciaran McCreesh
  0 siblings, 2 replies; 74+ messages in thread
From: Jauhien Piatlicki @ 2014-11-08 21:52 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 2398 bytes --]

08.11.14 22:47, hasufell написав(ла):
> On 11/08/2014 10:30 PM, Rich Freeman wrote:
>> On Sat, Nov 8, 2014 at 2:48 PM, hasufell <hasufell@gentoo.org> wrote:
>>> On 11/08/2014 08:32 PM, hasufell wrote:
>>>>> Sorry to chime in like that but if you don't mind, I'd like to ask for a
>>>>> real-life example for badly declared dependencies with a few words why
>>>>> those are bad and how to make them actually better?
>>>>>
>>>>
>>>> from dev-haskell/hashtables (note "hashtables" != "hashable"):
>>>> || ( ( >=dev-haskell/hashable-1.1:=[profile?]
>>>>        <dev-haskell/hashable-1.2:=[profile?] )
>>>>      ( >=dev-haskell/hashable-1.2.1:=[profile?]
>>>>        <dev-haskell/hashable-1.3:=[profile?] )
>>>>    )
>>>>
>>>> Latest stable version of dev-haskell/hashable is 1.2.1.0.
>>>> On a stable system (arch) the paludis dep-solver will try to match the
>>>> first group first and realize that there is also a stable version
>>>> 1.1.2.5 that matches that group. At that point there is a correct
>>>> solution, but since that involves downgrading a package, it will require
>>>> user-intervention, because it may not be what the user wants.
>>>> (this is the easy scenario... if downgrading causes blockers, you get
>>>> much more interesting output)
>>>>
>>>
>>> To be more specific... it is assumed that hashable-1.2.1.0 is already
>>> installed. Every time the dep solver runs through those packages without
>>> specifying what you want, you will hit the downgrade-problem.
>>
>> I'm missing the problem.  The package requires one of two ranges of
>> hashable versions.  One of them is already installed.  The dependency
>> is satisfied.
>>
> 
> The one that is installed (1.2.1.0) is *excluded* by the first group,
> but there is a valid version that fits instead (1.1.2.5).
> 
> That's the point where the assumptions start about what the depstring
> means and what the user wants.
> 

So the problem is only with intervals? First of all, maintainer would specify higher interval first here and it would solve a problem with possible downgrading. Second, || is rather not for such cases as you've said, so we could ask for a new syntax and solve this problem in the right way in one of the next EAPIs.

Are there any other problems in current model apart from intervals? I would really like to see a list of them all.

--
Jauhien



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 21:52                                   ` Jauhien Piatlicki
@ 2014-11-08 22:05                                     ` hasufell
  2014-11-08 22:39                                       ` Zac Medico
  2014-11-09 12:41                                     ` Ciaran McCreesh
  1 sibling, 1 reply; 74+ messages in thread
From: hasufell @ 2014-11-08 22:05 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 10:52 PM, Jauhien Piatlicki wrote:
> 08.11.14 22:47, hasufell написав(ла):
>> On 11/08/2014 10:30 PM, Rich Freeman wrote:
>>> On Sat, Nov 8, 2014 at 2:48 PM, hasufell <hasufell@gentoo.org> wrote:
>>>> On 11/08/2014 08:32 PM, hasufell wrote:
>>>>>> Sorry to chime in like that but if you don't mind, I'd like to ask for a
>>>>>> real-life example for badly declared dependencies with a few words why
>>>>>> those are bad and how to make them actually better?
>>>>>>
>>>>>
>>>>> from dev-haskell/hashtables (note "hashtables" != "hashable"):
>>>>> || ( ( >=dev-haskell/hashable-1.1:=[profile?]
>>>>>        <dev-haskell/hashable-1.2:=[profile?] )
>>>>>      ( >=dev-haskell/hashable-1.2.1:=[profile?]
>>>>>        <dev-haskell/hashable-1.3:=[profile?] )
>>>>>    )
>>>>>
>>>>> Latest stable version of dev-haskell/hashable is 1.2.1.0.
>>>>> On a stable system (arch) the paludis dep-solver will try to match the
>>>>> first group first and realize that there is also a stable version
>>>>> 1.1.2.5 that matches that group. At that point there is a correct
>>>>> solution, but since that involves downgrading a package, it will require
>>>>> user-intervention, because it may not be what the user wants.
>>>>> (this is the easy scenario... if downgrading causes blockers, you get
>>>>> much more interesting output)
>>>>>
>>>>
>>>> To be more specific... it is assumed that hashable-1.2.1.0 is already
>>>> installed. Every time the dep solver runs through those packages without
>>>> specifying what you want, you will hit the downgrade-problem.
>>>
>>> I'm missing the problem.  The package requires one of two ranges of
>>> hashable versions.  One of them is already installed.  The dependency
>>> is satisfied.
>>>
>>
>> The one that is installed (1.2.1.0) is *excluded* by the first group,
>> but there is a valid version that fits instead (1.1.2.5).
>>
>> That's the point where the assumptions start about what the depstring
>> means and what the user wants.
>>
> 
> So the problem is only with intervals? First of all, maintainer would specify higher interval first here and it would solve a problem with possible downgrading.

I have a feeling that this is an assumption as well. PMS just says this
is an 'any-of' group. There is not a single word about the processing
order of these specs or which one to prefer, in which case some is
better than the other and so on.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 22:05                                     ` hasufell
@ 2014-11-08 22:39                                       ` Zac Medico
  2014-11-08 23:52                                         ` hasufell
  0 siblings, 1 reply; 74+ messages in thread
From: Zac Medico @ 2014-11-08 22:39 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 02:05 PM, hasufell wrote:
> On 11/08/2014 10:52 PM, Jauhien Piatlicki wrote:
>> 08.11.14 22:47, hasufell написав(ла):
>>> On 11/08/2014 10:30 PM, Rich Freeman wrote:
>>>> On Sat, Nov 8, 2014 at 2:48 PM, hasufell <hasufell@gentoo.org> wrote:
>>>>> On 11/08/2014 08:32 PM, hasufell wrote:
>>>>>>> Sorry to chime in like that but if you don't mind, I'd like to ask for a
>>>>>>> real-life example for badly declared dependencies with a few words why
>>>>>>> those are bad and how to make them actually better?
>>>>>>>
>>>>>>
>>>>>> from dev-haskell/hashtables (note "hashtables" != "hashable"):
>>>>>> || ( ( >=dev-haskell/hashable-1.1:=[profile?]
>>>>>>        <dev-haskell/hashable-1.2:=[profile?] )
>>>>>>      ( >=dev-haskell/hashable-1.2.1:=[profile?]
>>>>>>        <dev-haskell/hashable-1.3:=[profile?] )
>>>>>>    )
>>>>>>
>>>>>> Latest stable version of dev-haskell/hashable is 1.2.1.0.
>>>>>> On a stable system (arch) the paludis dep-solver will try to match the
>>>>>> first group first and realize that there is also a stable version
>>>>>> 1.1.2.5 that matches that group. At that point there is a correct
>>>>>> solution, but since that involves downgrading a package, it will require
>>>>>> user-intervention, because it may not be what the user wants.
>>>>>> (this is the easy scenario... if downgrading causes blockers, you get
>>>>>> much more interesting output)
>>>>>>
>>>>>
>>>>> To be more specific... it is assumed that hashable-1.2.1.0 is already
>>>>> installed. Every time the dep solver runs through those packages without
>>>>> specifying what you want, you will hit the downgrade-problem.
>>>>
>>>> I'm missing the problem.  The package requires one of two ranges of
>>>> hashable versions.  One of them is already installed.  The dependency
>>>> is satisfied.
>>>>
>>>
>>> The one that is installed (1.2.1.0) is *excluded* by the first group,
>>> but there is a valid version that fits instead (1.1.2.5).
>>>
>>> That's the point where the assumptions start about what the depstring
>>> means and what the user wants.
>>>
>>
>> So the problem is only with intervals? First of all, maintainer would specify higher interval first here and it would solve a problem with possible downgrading.
> 
> I have a feeling that this is an assumption as well. PMS just says this
> is an 'any-of' group. There is not a single word about the processing
> order of these specs or which one to prefer, in which case some is
> better than the other and so on.

I think the two obvious algorithms are:

A) If the user's resolver parameters request maximum upgrades, then the
resolver should choose the choice that results the most upgrades.

B) If the user's resolver parameters request minimum change, then the
resolver should choose the choice which results in keeping the most
installed packages in place.

For the dev-haskell/hashtables scenario, the choice on the right wins
regardless of whether you're using algorithm A or algorithm B.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 14:48                   ` hasufell
@ 2014-11-08 23:33                     ` Patrick Lauer
  2014-11-08 23:45                       ` Zac Medico
                                         ` (2 more replies)
  0 siblings, 3 replies; 74+ messages in thread
From: Patrick Lauer @ 2014-11-08 23:33 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 10:48 PM, hasufell wrote:
> On 11/08/2014 02:24 PM, Patrick Lauer wrote:
>> On 11/08/2014 03:08 AM, hasufell wrote:
>>> On 11/07/2014 07:54 PM, Matthias Maier wrote:
>>>>> Well, you're not comparing like with like. Paludis with "everything
>>>>> turned off" does more than Portage with "everything turned on". If all
>>>>> you're looking for is the wrong answer as fast as possible, there are
>>>>> easier ways of getting it...
>>>>
>>>> The last time I compared the resolver speed of portage and paludis both
>>>> needed almost the same time.
>>>>
>>>> Do you have a speed comparison with a similar feature set of both? (Or,
>>>> alternatively, the speedup one gains by tuning paludis to be as fast as
>>>> possible).
>>>>
>>>
>>> I think you didn't get the idea: it doesn't make much sense to compare
>>> the speed if the correctness differs.
>>>
>>> Also, I don't understand these discussions. The time dependency
>>> resolving takes is marginal compared to the whole update process, no
>>> matter what PM you use.
>>>
>> *ahem*
>>
>> On my old notebook, which luckily suicided thanks to Lenovo's built in
>> obsolete device detection ...
>>
>> emerge -auNDv world took up to 35 minutes
>>
>> So, if something like RUBY_TARGETS or a random useflag changes, it takes
>> me literally DAYS to figure out a valid solution where portage can
>> figure out an upgrade path.
>>
>> No, it's not marginal.
>>
> 
> So we are back to the initial discussion: fix the input instead of
> making the resolver even worse. You can't have both bad input and a fast
> _correct_ resolver.

... ... .... eeeeeeeeh?

If I'm telling portage to not enable mysql support, and some kde bits
through a transitive dependency need the mysql useflag enabled on
$whateverlib, then portage can't find a valid solution because of my
local decisions.

There's nothing gentoo can do in terms of policy or ebuild changes to
avoid that apart from removing all useflags and making me install debian
instead (just to find a variation of that problem with apt)

> 
> So you choose between "good enough results which may not be what you
> want" and "pretty good results with lots of things to figure out
> yourself, because of the input and because the resolver does not make
> random assumptions about what you want".

We can choose for "code that works reasonably fast" - portage hasn't
gotten any noticeable work on performance in a while, and people have
just piled on more and more features and complexity. There's no reason
that it takes a minute to start up, so let's focus on fixing that
instead of trying to write a dependency resolution algorithm that
assumes the Riemann Hypothesis is correct.
> 
> Both solutions s**k, tbh.
> 
> I'd just like people to look at the whole picture and don't keep PM
> discussions narrow-minded by all this NIH crap.
> 
It's not about NIH, it's about slow code being slow.



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 23:33                     ` Patrick Lauer
@ 2014-11-08 23:45                       ` Zac Medico
  2014-11-09  6:53                         ` Andrew Savchenko
  2014-11-09  0:04                       ` hasufell
  2014-11-09 12:43                       ` Ciaran McCreesh
  2 siblings, 1 reply; 74+ messages in thread
From: Zac Medico @ 2014-11-08 23:45 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 03:33 PM, Patrick Lauer wrote:
> We can choose for "code that works reasonably fast" - portage hasn't
> gotten any noticeable work on performance in a while, and people have
> just piled on more and more features and complexity.

Yes, as one of only 2 people who have worked on the resolver in recent
history, I agree with your statement. There have been lots of features
added (including EAPI 5 sub-slot rebuilds), with not much thought to
performance tuning. It will be interesting to do some profiling and see
what we can improve.

> There's no reason that it takes a minute to start up,

If it takes a minute then it probably means that
/var/cache/edb/vdb_metadata.pickle got deleted. In that case, it has to
reload lots of metadata from /var/db/pkg/*/*/*.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 22:39                                       ` Zac Medico
@ 2014-11-08 23:52                                         ` hasufell
  2014-11-08 23:58                                           ` Zac Medico
  0 siblings, 1 reply; 74+ messages in thread
From: hasufell @ 2014-11-08 23:52 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 11:39 PM, Zac Medico wrote:
> On 11/08/2014 02:05 PM, hasufell wrote:
>> I have a feeling that this is an assumption as well. PMS just says this
>> is an 'any-of' group. There is not a single word about the processing
>> order of these specs or which one to prefer, in which case some is
>> better than the other and so on.
> 
> I think the two obvious algorithms are:
> 
> A) If the user's resolver parameters request maximum upgrades, then the
> resolver should choose the choice that results the most upgrades.
> 

Neither the first nor the second dependency spec group in this example
leads to an upgrade.

> B) If the user's resolver parameters request minimum change, then the
> resolver should choose the choice which results in keeping the most
> installed packages in place.
> 

I don't know of any such switch in portage or paludis (I may be wrong,
please point me to it unless you mean --nodeps). Whether I get minimum
change is a side effect of other choices and hardly predictable, afais, no?


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 23:52                                         ` hasufell
@ 2014-11-08 23:58                                           ` Zac Medico
  0 siblings, 0 replies; 74+ messages in thread
From: Zac Medico @ 2014-11-08 23:58 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 03:52 PM, hasufell wrote:
> On 11/08/2014 11:39 PM, Zac Medico wrote:
>> On 11/08/2014 02:05 PM, hasufell wrote:
>>> I have a feeling that this is an assumption as well. PMS just says this
>>> is an 'any-of' group. There is not a single word about the processing
>>> order of these specs or which one to prefer, in which case some is
>>> better than the other and so on.
>>
>> I think the two obvious algorithms are:
>>
>> A) If the user's resolver parameters request maximum upgrades, then the
>> resolver should choose the choice that results the most upgrades.
>>
> 
> Neither the first nor the second dependency spec group in this example
> leads to an upgrade.

If you consider "not downgrading" equivalent to an upgrade, the then the
second dependency spec wins (the one on the "right").


>> B) If the user's resolver parameters request minimum change, then the
>> resolver should choose the choice which results in keeping the most
>> installed packages in place.
>>
> 
> I don't know of any such switch in portage or paludis (I may be wrong,
> please point me to it unless you mean --nodeps).

For portage, absence of the emerge --update parameter will trigger a
"minimum change" behavior for packages that are not matched by argument
atoms.

> Whether I get minimum
> change is a side effect of other choices and hardly predictable, afais, no?

Consider it a "best effort" algorithm. Some other constraints may
override the minimum change request.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 23:33                     ` Patrick Lauer
  2014-11-08 23:45                       ` Zac Medico
@ 2014-11-09  0:04                       ` hasufell
  2014-11-09  0:08                         ` Patrick Lauer
  2014-11-09 12:43                       ` Ciaran McCreesh
  2 siblings, 1 reply; 74+ messages in thread
From: hasufell @ 2014-11-09  0:04 UTC (permalink / raw
  To: gentoo-dev

On 11/09/2014 12:33 AM, Patrick Lauer wrote:
> It's not about NIH, it's about slow code being slow.
> 

Can't disagree more. It's about wrong results, broken systems,
unreadable error messages, days of figuring out ruby, perl, haskell,
multilib, python blockers, incorrect autounmask suggestions and even
missing files from time to time (heard that other portage users hit that
too)...

I think you are approaching the problem from the wrong angle. These
problems are connected as has already been pointed out.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-09  0:04                       ` hasufell
@ 2014-11-09  0:08                         ` Patrick Lauer
  2014-11-09  0:10                           ` hasufell
  0 siblings, 1 reply; 74+ messages in thread
From: Patrick Lauer @ 2014-11-09  0:08 UTC (permalink / raw
  To: gentoo-dev

On 11/09/2014 08:04 AM, hasufell wrote:
> On 11/09/2014 12:33 AM, Patrick Lauer wrote:
>> It's not about NIH, it's about slow code being slow.
>>
> 
> Can't disagree more. It's about wrong results, broken systems,
> unreadable error messages, days of figuring out ruby, perl, haskell,
> multilib, python blockers, incorrect autounmask suggestions and even
> missing files from time to time (heard that other portage users hit that
> too)...
> 
> I think you are approaching the problem from the wrong angle. These
> problems are connected as has already been pointed out.
> 

So you are saying everything is bad ...

... why are you still here then? :)

(Constructive criticism, it's a art!)


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-09  0:08                         ` Patrick Lauer
@ 2014-11-09  0:10                           ` hasufell
  0 siblings, 0 replies; 74+ messages in thread
From: hasufell @ 2014-11-09  0:10 UTC (permalink / raw
  To: gentoo-dev

On 11/09/2014 01:08 AM, Patrick Lauer wrote:
> On 11/09/2014 08:04 AM, hasufell wrote:
>> On 11/09/2014 12:33 AM, Patrick Lauer wrote:
>>> It's not about NIH, it's about slow code being slow.
>>>
>>
>> Can't disagree more. It's about wrong results, broken systems,
>> unreadable error messages, days of figuring out ruby, perl, haskell,
>> multilib, python blockers, incorrect autounmask suggestions and even
>> missing files from time to time (heard that other portage users hit that
>> too)...
>>
>> I think you are approaching the problem from the wrong angle. These
>> problems are connected as has already been pointed out.
>>
> 
> So you are saying everything is bad ...
> 

no.



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 23:45                       ` Zac Medico
@ 2014-11-09  6:53                         ` Andrew Savchenko
  2014-11-09  8:15                           ` Zac Medico
  0 siblings, 1 reply; 74+ messages in thread
From: Andrew Savchenko @ 2014-11-09  6:53 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1246 bytes --]

On Sat, 08 Nov 2014 15:45:30 -0800 Zac Medico wrote:
> On 11/08/2014 03:33 PM, Patrick Lauer wrote:
> > We can choose for "code that works reasonably fast" - portage hasn't
> > gotten any noticeable work on performance in a while, and people have
> > just piled on more and more features and complexity.
> 
> Yes, as one of only 2 people who have worked on the resolver in recent
> history, I agree with your statement. There have been lots of features
> added (including EAPI 5 sub-slot rebuilds), with not much thought to
> performance tuning. It will be interesting to do some profiling and see
> what we can improve.
> 
> > There's no reason that it takes a minute to start up,
> 
> If it takes a minute then it probably means that
> /var/cache/edb/vdb_metadata.pickle got deleted. In that case, it has to
> reload lots of metadata from /var/db/pkg/*/*/*.

On old hardware it may take dozens of minutes of CPU time. I have
that *.pickle files, I have sqlite metadata cache, I have 100% CPU
usage. It's not an I/O problem. Just take into account that due to
instruction set, Lx cache, frequency and memory speed difference
old CPU-based system may be 20x times slower than recent one.

Best regards,
Andrew Savchenko

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
       [not found]                   ` <20141108133520.6ec790fe@googlemail.com>
@ 2014-11-09  6:57                     ` Andrew Savchenko
  0 siblings, 0 replies; 74+ messages in thread
From: Andrew Savchenko @ 2014-11-09  6:57 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 657 bytes --]

On Sat, 8 Nov 2014 13:35:20 +0000 Ciaran McCreesh wrote:
> On Sat, 8 Nov 2014 13:56:21 +0300
> Andrew Savchenko <bircoph@gmail.com> wrote:
> > Wanna ~20-30 minutes with sqlite metadata cache enabled?
> > Try on Intel Atom N270 with 2800 packages installed.
> > Dependency resolution is utterly slow.
> 
> Well I highly doubt Paludis will be *that* slow...

Last time I give it a try, it was even slower... May be "more
correct", though as far as I understand term "correctness" is
quite vague for current PMS specification. Just add factor ~30 speed
difference compared to recent Core i7 or Xeon or whatever.

Best regards,
Andrew Savchenko

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-09  6:53                         ` Andrew Savchenko
@ 2014-11-09  8:15                           ` Zac Medico
  2014-11-18  3:07                             ` Andrew Savchenko
  0 siblings, 1 reply; 74+ messages in thread
From: Zac Medico @ 2014-11-09  8:15 UTC (permalink / raw
  To: gentoo-dev

On 11/08/2014 10:53 PM, Andrew Savchenko wrote:
> On Sat, 08 Nov 2014 15:45:30 -0800 Zac Medico wrote:
>> On 11/08/2014 03:33 PM, Patrick Lauer wrote:
>>> We can choose for "code that works reasonably fast" - portage hasn't
>>> gotten any noticeable work on performance in a while, and people have
>>> just piled on more and more features and complexity.
>>
>> Yes, as one of only 2 people who have worked on the resolver in recent
>> history, I agree with your statement. There have been lots of features
>> added (including EAPI 5 sub-slot rebuilds), with not much thought to
>> performance tuning. It will be interesting to do some profiling and see
>> what we can improve.
>>
>>> There's no reason that it takes a minute to start up,
>>
>> If it takes a minute then it probably means that
>> /var/cache/edb/vdb_metadata.pickle got deleted. In that case, it has to
>> reload lots of metadata from /var/db/pkg/*/*/*.
> 
> On old hardware it may take dozens of minutes of CPU time. I have
> that *.pickle files, I have sqlite metadata cache, I have 100% CPU
> usage. It's not an I/O problem. Just take into account that due to
> instruction set, Lx cache, frequency and memory speed difference
> old CPU-based system may be 20x times slower than recent one.

It could be useful for us to collect profiling data generated on old
hardware. If you'd like to help, you can use python's cProfile module to
generate statistics for us to analyze. The statistics can be nicely
visualized as a shaded call graph by using gprof2dot and graphviz [1].

[1]
http://grasswiki.osgeo.org/wiki/Tools_for_Python_programming#cProfile_profiling_tool
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 19:01                         ` Matthias Dahl
  2014-11-08 19:32                           ` hasufell
@ 2014-11-09 12:34                           ` Ciaran McCreesh
  1 sibling, 0 replies; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-09 12:34 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 549 bytes --]

On Sat, 08 Nov 2014 20:01:54 +0100
Matthias Dahl <ml_gentoo-lists@binary-island.eu> wrote:
> Sorry to chime in like that but if you don't mind, I'd like to ask
> for a real-life example for badly declared dependencies with a few
> words why those are bad and how to make them actually better?

https://bugs.gentoo.org/show_bug.cgi?id=421497

For example. The key message is that developers aren't "writing what
they mean", they're writing "something that appears to work with
Portage if you don't look very hard".

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 21:30                               ` Rich Freeman
  2014-11-08 21:47                                 ` hasufell
@ 2014-11-09 12:40                                 ` Ciaran McCreesh
  1 sibling, 0 replies; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-09 12:40 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1335 bytes --]

On Sat, 8 Nov 2014 16:30:04 -0500
Rich Freeman <rich0@gentoo.org> wrote:
> if you have the second dep installed

Unfortunately the notion of "installed" is where most of the mess with
|| dependencies comes from...

What about "not installed yet, but will be installed during this
resolution"?

What about "an earlier version is installed, and will be upgraded
during this resolution"?

What about "an earlier version is installed, and we weren't going to
upgrade it, but we could"?

What about "a version is installed, but with the wrong USE flags"?

What about "a version in a different SLOT is installed"?

What about "it's installed, and we want to upgrade it, but selecting
this would lock us to an old version"?

Paludis has a *massive* list of scoring rules for these sorts of things
to try to do "the right thing" most of the time. Unfortunately there
are situations in the tree with identically-expressed dependencies
where doing one thing is the "right" answer in one case, and the other
thing is the "right" answer in the other.

One small step towards sanity is to stop using ( ) and || ( ) when the
intent is to select a single package and give a choice of how it's
installed (even if it means new syntax). The second step is to abolish
pretty much every use of ||.

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 21:52                                   ` Jauhien Piatlicki
  2014-11-08 22:05                                     ` hasufell
@ 2014-11-09 12:41                                     ` Ciaran McCreesh
  1 sibling, 0 replies; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-09 12:41 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 588 bytes --]

On Sat, 08 Nov 2014 22:52:24 +0100
Jauhien Piatlicki <jauhien@gentoo.org> wrote:
> So the problem is only with intervals?

No. Intervals are one of many examples.

> I would really like to see a list of them all.

That could be rather tricky... I doubt anyone has a complete list,
particularly since most of the crazy dependencies more or less work
most of the time. The way to find them is to ask every developer "have
you done something crazy with dependencies?", but that relies upon
developers recognising that trying to be clever is a bad thing.

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-08 23:33                     ` Patrick Lauer
  2014-11-08 23:45                       ` Zac Medico
  2014-11-09  0:04                       ` hasufell
@ 2014-11-09 12:43                       ` Ciaran McCreesh
  2 siblings, 0 replies; 74+ messages in thread
From: Ciaran McCreesh @ 2014-11-09 12:43 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 458 bytes --]

On Sun, 09 Nov 2014 07:33:44 +0800
Patrick Lauer <patrick@gentoo.org> wrote:
> instead of trying to write a dependency resolution algorithm that
> assumes the Riemann Hypothesis is correct.

Thank you for your contribution to this thread. I just realised that if
we assume the Generalised Riemann Hypothesis, we can reduce ordering to
knot equivalence, which would now be in co-NP. Please shower me with
more of your wisdom.

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-09  8:15                           ` Zac Medico
@ 2014-11-18  3:07                             ` Andrew Savchenko
  2014-11-18  3:56                               ` Zac Medico
  0 siblings, 1 reply; 74+ messages in thread
From: Andrew Savchenko @ 2014-11-18  3:07 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 2042 bytes --]

Hello,

On Sun, 09 Nov 2014 00:15:56 -0800 Zac Medico wrote:
> On 11/08/2014 10:53 PM, Andrew Savchenko wrote:
[...]
> > On old hardware it may take dozens of minutes of CPU time. I have
> > that *.pickle files, I have sqlite metadata cache, I have 100% CPU
> > usage. It's not an I/O problem. Just take into account that due to
> > instruction set, Lx cache, frequency and memory speed difference
> > old CPU-based system may be 20x times slower than recent one.
> 
> It could be useful for us to collect profiling data generated on old
> hardware. If you'd like to help, you can use python's cProfile module to
> generate statistics for us to analyze. The statistics can be nicely
> visualized as a shaded call graph by using gprof2dot and graphviz [1].

Sorry for delay, I generated samples on two old hosts.

Tarball contains per host directories. Each one contains:
- pstats file;
- generated pdf with call graphs and timing;
- host-related information:
  * emerge --info
  * /proc/cpuinfo
  * /proc/memnifo

*.pstats and *.pdf filename should describe command unambiguously,
e.g. emerge-pv_python:2.7_python:3.3-python-3.3.5-r1 means:
emerge -pv python:2.7 python:3.3 while using python-3.3.5-r1 as
interpreter.

hitomi system was not fully updated for a bit more than a year,
while another one for about half a year. So dependency calculations
may be of different intencities. Sets of packages installed are
similar but not the same:
2502 on hitomi
2953 on desktop

I understand that most interesting data will be from emerge -DNupv
output, but to obtain it I must first fix all blockers and
conflicts, this really takes a lot of time, so please be patient.

And last but not the list: on both systems portage is configured to
use sqlite3 cache for metadata, so there was no I/O delays (CPU
usage about 100% according to top).

In order not to abuse mail list with large attachments, data is
available here:
ftp://brunestud.info/gentoo/portage.tar.xz

Best regards,
Andrew Savchenko

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-18  3:07                             ` Andrew Savchenko
@ 2014-11-18  3:56                               ` Zac Medico
  2014-11-18  5:47                                 ` Andrew Savchenko
  2014-11-18  7:04                                 ` Zac Medico
  0 siblings, 2 replies; 74+ messages in thread
From: Zac Medico @ 2014-11-18  3:56 UTC (permalink / raw
  To: gentoo-dev

On 11/17/2014 07:07 PM, Andrew Savchenko wrote:
> Hello,
> 
> On Sun, 09 Nov 2014 00:15:56 -0800 Zac Medico wrote:
>> On 11/08/2014 10:53 PM, Andrew Savchenko wrote:
> [...]
>>> On old hardware it may take dozens of minutes of CPU time. I have
>>> that *.pickle files, I have sqlite metadata cache, I have 100% CPU
>>> usage. It's not an I/O problem. Just take into account that due to
>>> instruction set, Lx cache, frequency and memory speed difference
>>> old CPU-based system may be 20x times slower than recent one.
>>
>> It could be useful for us to collect profiling data generated on old
>> hardware. If you'd like to help, you can use python's cProfile module to
>> generate statistics for us to analyze. The statistics can be nicely
>> visualized as a shaded call graph by using gprof2dot and graphviz [1].
> 
> Sorry for delay, I generated samples on two old hosts.
> 
> Tarball contains per host directories. Each one contains:
> - pstats file;
> - generated pdf with call graphs and timing;
> - host-related information:
>   * emerge --info
>   * /proc/cpuinfo
>   * /proc/memnifo
> 
> *.pstats and *.pdf filename should describe command unambiguously,
> e.g. emerge-pv_python:2.7_python:3.3-python-3.3.5-r1 means:
> emerge -pv python:2.7 python:3.3 while using python-3.3.5-r1 as
> interpreter.

Thank you for this data. I will see what I can to do optimize the
problem areas that your data highlights.

> hitomi system was not fully updated for a bit more than a year,
> while another one for about half a year. So dependency calculations
> may be of different intencities. Sets of packages installed are
> similar but not the same:
> 2502 on hitomi

For hitomi, _slot_operator_update_probe/use_reduce is an obvious thing
to optimize. It called use_reduce 53763 times there, so it seems to
repeat use_reduce multiple times for the same packages. That means we
should see a benefit from memoization.

> 2953 on desktop

For desktop, _dynamic_deps_preload is an obvious thing to optimize. You
can avoid this function entirely if you use --dynamic-deps=n. You may
need to run 'emerge @changed-deps' in order to ensure that emerge will
be able to cope with --dynamic-deps=n. IIRC you need at least
sys-apps/portage-2.2.14 for @changed-deps support.

-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-18  3:56                               ` Zac Medico
@ 2014-11-18  5:47                                 ` Andrew Savchenko
  2014-11-18  5:55                                   ` Zac Medico
  2014-11-18  7:04                                 ` Zac Medico
  1 sibling, 1 reply; 74+ messages in thread
From: Andrew Savchenko @ 2014-11-18  5:47 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 2219 bytes --]

Hi,

On Mon, 17 Nov 2014 19:56:02 -0800 Zac Medico wrote:
> On 11/17/2014 07:07 PM, Andrew Savchenko wrote:
[...]
> > Tarball contains per host directories. Each one contains:
> > - pstats file;
> > - generated pdf with call graphs and timing;
> > - host-related information:
> >   * emerge --info
> >   * /proc/cpuinfo
> >   * /proc/memnifo
> > 
> > *.pstats and *.pdf filename should describe command unambiguously,
> > e.g. emerge-pv_python:2.7_python:3.3-python-3.3.5-r1 means:
> > emerge -pv python:2.7 python:3.3 while using python-3.3.5-r1 as
> > interpreter.
> 
> Thank you for this data. I will see what I can to do optimize the
> problem areas that your data highlights.
> 
> > hitomi system was not fully updated for a bit more than a year,
> > while another one for about half a year. So dependency calculations
> > may be of different intencities. Sets of packages installed are
> > similar but not the same:
> > 2502 on hitomi
> 
> For hitomi, _slot_operator_update_probe/use_reduce is an obvious thing
> to optimize. It called use_reduce 53763 times there, so it seems to
> repeat use_reduce multiple times for the same packages. That means we
> should see a benefit from memoization.
> 
> > 2953 on desktop
> 
> For desktop, _dynamic_deps_preload is an obvious thing to optimize. You
> can avoid this function entirely if you use --dynamic-deps=n. You may
> need to run 'emerge @changed-deps' in order to ensure that emerge will
> be able to cope with --dynamic-deps=n. IIRC you need at least
> sys-apps/portage-2.2.14 for @changed-deps support.
 
I use 2.2.14 on both hosts (and usually latest ~x86 portage is
there). I thought that running fixpackages should be enough to run
emerge with --dynamic-deps=n. 

I'm afraid I will not be able to run emerge @changed-deps prior to
@world update due to too old packages installed (e.g. versions not
in tree anymore), blockers and unsatisfied deps.

When I'll manage to run emerge -DNupv @world without errors, I'll
send you stats for both runs with and without dynamic deps.

By the way, do you need pstats files (e.g. for some extra data) or
pdf graphs are sufficient?

Best regards,
Andrew Savchenko

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-18  5:47                                 ` Andrew Savchenko
@ 2014-11-18  5:55                                   ` Zac Medico
  2014-11-19 19:59                                     ` Andrew Savchenko
  0 siblings, 1 reply; 74+ messages in thread
From: Zac Medico @ 2014-11-18  5:55 UTC (permalink / raw
  To: gentoo-dev

On 11/17/2014 09:47 PM, Andrew Savchenko wrote:
> I use 2.2.14 on both hosts (and usually latest ~x86 portage is
> there). I thought that running fixpackages should be enough to run
> emerge with --dynamic-deps=n. 

It depends on how badly the installed deps have diverged from the
corresponding ebuilds in the tree.

> I'm afraid I will not be able to run emerge @changed-deps prior to
> @world update due to too old packages installed (e.g. versions not
> in tree anymore), blockers and unsatisfied deps.

Yeah, it's probably better to run it after, but skip it if emerge
--dynamic-deps=n seems to behave well already.

> When I'll manage to run emerge -DNupv @world without errors, I'll
> send you stats for both runs with and without dynamic deps.

Great, hopefully that will reveal some more good things to optimize.

> By the way, do you need pstats files (e.g. for some extra data) or
> pdf graphs are sufficient?

The pdf graphs are typically enough for me, since they highlight the hot
spots really well. I did not even bother with your pstats files.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-18  3:56                               ` Zac Medico
  2014-11-18  5:47                                 ` Andrew Savchenko
@ 2014-11-18  7:04                                 ` Zac Medico
  1 sibling, 0 replies; 74+ messages in thread
From: Zac Medico @ 2014-11-18  7:04 UTC (permalink / raw
  To: gentoo-dev

On 11/17/2014 07:56 PM, Zac Medico wrote:
> For hitomi, _slot_operator_update_probe/use_reduce is an obvious thing
> to optimize. It called use_reduce 53763 times there, so it seems to
> repeat use_reduce multiple times for the same packages. That means we
> should see a benefit from memoization.

I've written a patch for this [1], and it gives good results (22.4% less
time for dep calc on one of my computers). If you do more profiling, it
would be best to apply this patch first, in order to increase the
contrast for other hot spots.

[1] https://bugs.gentoo.org/show_bug.cgi?id=529660
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-18  5:55                                   ` Zac Medico
@ 2014-11-19 19:59                                     ` Andrew Savchenko
  2014-11-20  2:44                                       ` Andrew Savchenko
  2014-11-20 20:19                                       ` Zac Medico
  0 siblings, 2 replies; 74+ messages in thread
From: Andrew Savchenko @ 2014-11-19 19:59 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 3367 bytes --]

Hello,

On Mon, 17 Nov 2014 21:55:48 -0800 Zac Medico wrote:
> On 11/17/2014 09:47 PM, Andrew Savchenko wrote:
> > I use 2.2.14 on both hosts (and usually latest ~x86 portage is
> > there). I thought that running fixpackages should be enough to run
> > emerge with --dynamic-deps=n. 
> 
> It depends on how badly the installed deps have diverged from the
> corresponding ebuilds in the tree.

I tried fixpackages. It fixed some problems and looks like
dependencies resolution became faster. But not all problems are
fixed and I can't use --dynamic-deps n on both systems for now;
and emerge @changed-deps fails due to numerous conflicts, blocks,
unsatisfied deps (this is not surprising, since it doesn't try to
update all packages in tree).

By the way, is there any way to unroll conflict lists in portage
output? I mean if I have following:

  (dev-lang/ghc-7.6.3-r1:0/7.6.3::gentoo, installed) pulled in by
    >=dev-lang/ghc-6.8.2:0/7.6.3= required by (dev-haskell/random-1.0.1.1-r1:0/1.0.1.1::gentoo, installed)
			^^^^^^^^^
    (and 68 more with the same problem)

How can I see all list of these 68 packages? Sometimes this feature is
really desired, e.g. if I don't want to update all @world but need to
apply GLSA fix which leads to similar conflicts. I can't find any
switch in emerge manual.

> > When I'll manage to run emerge -DNupv @world without errors, I'll
> > send you stats for both runs with and without dynamic deps.
> 
> Great, hopefully that will reveal some more good things to optimize.
> 
> > By the way, do you need pstats files (e.g. for some extra data) or
> > pdf graphs are sufficient?
> 
> The pdf graphs are typically enough for me, since they highlight the hot
> spots really well. I did not even bother with your pstats files.

OK. I managed to run emerge -DNupv @world on desktop without
conflicts. What was done:
1) fixpacgkages run
2) portage is updated to use your patch from bug 529660

At this point performance boost was really great: from ~35
minutes to ~19-20 minutes.

Afterward I tried emerge -DNupv @world with different python
versions:
     (2.7)  (~)2.7.8
     (3.3)  3.3.5-r1
     (3.4)  (~)3.4.2

Results are interesting (confidence level for error is 0.95, time
real value was used for calculations):
3.3 is 3% ± 5% faster than 2.7
3.4 is 20% ± 5% faster than 3.3
And with python:3.4 and steps above it takes now 15.5 minutes
instead of 35. Nice result :)

So there is no evidence that portage on 3.3 is faster than on 2.7,
but 3.4 is faster than 3.3 with very good confidence. Of course
this data is biased by -m cProfile overhead, but bias should
similar for each version. Just checked time to run command for
python:3.4 without profiling: it takes 11.5 minutes!

You may find generated pdf graphs together with system information
for each host here:
ftp://brunestud.info/gentoo/portage-v2.tar.xz

As for hitomi box, it is both slower and have much older packages,
so I'm still struggling to fix conflicts and other issues. Results
will be available later.

P.S. Note for those who would like to use gpro2dot: it should be
run with the same python interpreter active as was used during
pstats data collection, otherwise it will fail to process data.
I spent some time while figuring this out.

Best regards,
Andrew Savchenko

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-19 19:59                                     ` Andrew Savchenko
@ 2014-11-20  2:44                                       ` Andrew Savchenko
  2014-11-20 20:19                                       ` Zac Medico
  1 sibling, 0 replies; 74+ messages in thread
From: Andrew Savchenko @ 2014-11-20  2:44 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 3316 bytes --]

Hi,

On Wed, 19 Nov 2014 22:59:05 +0300 Andrew Savchenko wrote:
> On Mon, 17 Nov 2014 21:55:48 -0800 Zac Medico wrote:
[...]
> > > When I'll manage to run emerge -DNupv @world without errors, I'll
> > > send you stats for both runs with and without dynamic deps.
> > 
> > Great, hopefully that will reveal some more good things to optimize.
> > 
> > > By the way, do you need pstats files (e.g. for some extra data) or
> > > pdf graphs are sufficient?
> > 
> > The pdf graphs are typically enough for me, since they highlight the hot
> > spots really well. I did not even bother with your pstats files.
> 
> OK. I managed to run emerge -DNupv @world on desktop without
> conflicts. What was done:
> 1) fixpacgkages run
> 2) portage is updated to use your patch from bug 529660
> 
> At this point performance boost was really great: from ~35
> minutes to ~19-20 minutes.
> 
> Afterward I tried emerge -DNupv @world with different python
> versions:
>      (2.7)  (~)2.7.8
>      (3.3)  3.3.5-r1
>      (3.4)  (~)3.4.2
> 
> Results are interesting (confidence level for error is 0.95, time
> real value was used for calculations):
> 3.3 is 3% ± 5% faster than 2.7
> 3.4 is 20% ± 5% faster than 3.3
> And with python:3.4 and steps above it takes now 15.5 minutes
> instead of 35. Nice result :)
> 
> So there is no evidence that portage on 3.3 is faster than on 2.7,
> but 3.4 is faster than 3.3 with very good confidence. Of course
> this data is biased by -m cProfile overhead, but bias should
> similar for each version. Just checked time to run command for
> python:3.4 without profiling: it takes 11.5 minutes!
> 
> You may find generated pdf graphs together with system information
> for each host here:
> ftp://brunestud.info/gentoo/portage-v2.tar.xz
> 
> As for hitomi box, it is both slower and have much older packages,
> so I'm still struggling to fix conflicts and other issues. Results
> will be available later.

I managed to get data from hitomi too, see:
ftp://brunestud.info/gentoo/portage-v3.tar.xz
(this archive also contains all graphs previously obtained)

Graphs were obtained the same way as on desktop.
Portage and python versions are the same, time information follows
for _profiled_ runs:

>      (2.7)  (~)2.7.8
real    55m19.892s
user    39m11.913s
sys     15m37.586s

>      (3.3)  3.3.5-r1
real    52m34.640s
user    36m45.325s
sys     15m25.663s

>      (3.4)  (~)3.4.2
real    53m32.657s
user    37m12.369s
sys     15m52.641s

Without profiling using 3.3.5-r1:
real    25m50.439s
user    25m28.260s
sys     0m7.863s

This is quite surprising. On hitomi (Intel Atom N270) there is no
difference between 3.3 and 3.4, but both are slightly better than
2.7. (To exclude possible cache issues I made a blank run before
first test run.) Probably some arch-dependent optimizations.

What surprises me most is that profiling overhead is huge (~105%)
compared to overhead on desktop (~35%). CPU speeds are not that
different, instruction sets too (Atom has sse2, sse3, ssse3, but
lacks 3dnow, 3dnowext). L2 cache is the same (512КB), but L1
differs significantly: 64 KB data/64 KB instruction cache vs 24 KB
data/32 KB instruction cache. Look like this is the reason.

Best regards,
Andrew Savchenko

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-19 19:59                                     ` Andrew Savchenko
  2014-11-20  2:44                                       ` Andrew Savchenko
@ 2014-11-20 20:19                                       ` Zac Medico
  2014-11-21  4:16                                         ` Zac Medico
  1 sibling, 1 reply; 74+ messages in thread
From: Zac Medico @ 2014-11-20 20:19 UTC (permalink / raw
  To: gentoo-dev

On 11/19/2014 11:59 AM, Andrew Savchenko wrote:
> Hello,
> 
> On Mon, 17 Nov 2014 21:55:48 -0800 Zac Medico wrote:
>> On 11/17/2014 09:47 PM, Andrew Savchenko wrote:
>>> I use 2.2.14 on both hosts (and usually latest ~x86 portage is
>>> there). I thought that running fixpackages should be enough to run
>>> emerge with --dynamic-deps=n. 
>>
>> It depends on how badly the installed deps have diverged from the
>> corresponding ebuilds in the tree.
> 
> I tried fixpackages. It fixed some problems and looks like
> dependencies resolution became faster. But not all problems are
> fixed and I can't use --dynamic-deps n on both systems for now;
> and emerge @changed-deps fails due to numerous conflicts, blocks,
> unsatisfied deps (this is not surprising, since it doesn't try to
> update all packages in tree).
> 
> By the way, is there any way to unroll conflict lists in portage
> output? I mean if I have following:
> 
>   (dev-lang/ghc-7.6.3-r1:0/7.6.3::gentoo, installed) pulled in by
>     >=dev-lang/ghc-6.8.2:0/7.6.3= required by (dev-haskell/random-1.0.1.1-r1:0/1.0.1.1::gentoo, installed)
> 			^^^^^^^^^
>     (and 68 more with the same problem)
> 
> How can I see all list of these 68 packages? Sometimes this feature is
> really desired, e.g. if I don't want to update all @world but need to
> apply GLSA fix which leads to similar conflicts. I can't find any
> switch in emerge manual.

There's currently no switch for this. However, you can use a a command
like this to see all installed packages that pull in your installed ghc:

	emerge -pv --depclean dev-lang/ghc

I've filed a feature request bug for the switch that you have requested:

	https://bugs.gentoo.org/show_bug.cgi?id=529988

> As for hitomi box, it is both slower and have much older packages,
> so I'm still struggling to fix conflicts and other issues. Results
> will be available later.

From your results, it seems that _select_pkg_highest_available would be
an obvious thing to optimize. This method already uses memoization, but
the cache is entirely discarded each time that a package is added to the
graph. I will see about making it salvage as much cache as possible when
a package is added.

> P.S. Note for those who would like to use gpro2dot: it should be
> run with the same python interpreter active as was used during
> pstats data collection, otherwise it will fail to process data.
> I spent some time while figuring this out.

I wasn't aware of that, so thanks for the tip.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-20 20:19                                       ` Zac Medico
@ 2014-11-21  4:16                                         ` Zac Medico
  2014-11-24  1:50                                           ` Andrew Savchenko
  0 siblings, 1 reply; 74+ messages in thread
From: Zac Medico @ 2014-11-21  4:16 UTC (permalink / raw
  To: gentoo-dev

On 11/20/2014 12:19 PM, Zac Medico wrote:
> On 11/19/2014 11:59 AM, Andrew Savchenko wrote:
>> Hello,
>>
>> On Mon, 17 Nov 2014 21:55:48 -0800 Zac Medico wrote:
>>> On 11/17/2014 09:47 PM, Andrew Savchenko wrote:
>>>> I use 2.2.14 on both hosts (and usually latest ~x86 portage is
>>>> there). I thought that running fixpackages should be enough to run
>>>> emerge with --dynamic-deps=n. 
>>>
>>> It depends on how badly the installed deps have diverged from the
>>> corresponding ebuilds in the tree.
>>
>> I tried fixpackages. It fixed some problems and looks like
>> dependencies resolution became faster. But not all problems are
>> fixed and I can't use --dynamic-deps n on both systems for now;
>> and emerge @changed-deps fails due to numerous conflicts, blocks,
>> unsatisfied deps (this is not surprising, since it doesn't try to
>> update all packages in tree).
>>
>> By the way, is there any way to unroll conflict lists in portage
>> output? I mean if I have following:
>>
>>   (dev-lang/ghc-7.6.3-r1:0/7.6.3::gentoo, installed) pulled in by
>>     >=dev-lang/ghc-6.8.2:0/7.6.3= required by (dev-haskell/random-1.0.1.1-r1:0/1.0.1.1::gentoo, installed)
>> 			^^^^^^^^^
>>     (and 68 more with the same problem)
>>
>> How can I see all list of these 68 packages? Sometimes this feature is
>> really desired, e.g. if I don't want to update all @world but need to
>> apply GLSA fix which leads to similar conflicts. I can't find any
>> switch in emerge manual.
> 
> There's currently no switch for this. However, you can use a a command
> like this to see all installed packages that pull in your installed ghc:
> 
> 	emerge -pv --depclean dev-lang/ghc
> 
> I've filed a feature request bug for the switch that you have requested:
> 
> 	https://bugs.gentoo.org/show_bug.cgi?id=529988

I forgot, we have a --verbose-conflicts option already.

>> As for hitomi box, it is both slower and have much older packages,
>> so I'm still struggling to fix conflicts and other issues. Results
>> will be available later.
> 
> From your results, it seems that _select_pkg_highest_available would be
> an obvious thing to optimize. This method already uses memoization, but
> the cache is entirely discarded each time that a package is added to the
> graph. I will see about making it salvage as much cache as possible when
> a package is added.

I've written a patch, and it gives good results. On one of my computers
with this patch, 'emerge -puvDN @world' takes 15% less time, and results
in 58% fewer _select_pkg_highest_available_imp calls:

	https://bugs.gentoo.org/show_bug.cgi?id=530010
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-21  4:16                                         ` Zac Medico
@ 2014-11-24  1:50                                           ` Andrew Savchenko
  2014-11-24  2:34                                             ` Zac Medico
  2014-11-24  7:41                                             ` Alexander Berntsen
  0 siblings, 2 replies; 74+ messages in thread
From: Andrew Savchenko @ 2014-11-24  1:50 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 2336 bytes --]

Hi, 

On Thu, 20 Nov 2014 20:16:09 -0800 Zac Medico wrote:
> > There's currently no switch for this. However, you can use a a command
> > like this to see all installed packages that pull in your installed ghc:
> > 
> > 	emerge -pv --depclean dev-lang/ghc
> > 
> > I've filed a feature request bug for the switch that you have requested:
> > 
> > 	https://bugs.gentoo.org/show_bug.cgi?id=529988
> 
> I forgot, we have a --verbose-conflicts option already.

Yeah, that's exactly what I need. Somehow I missed this option,
sorry.
 
> > From your results, it seems that _select_pkg_highest_available would be
> > an obvious thing to optimize. This method already uses memoization, but
> > the cache is entirely discarded each time that a package is added to the
> > graph. I will see about making it salvage as much cache as possible when
> > a package is added.
> 
> I've written a patch, and it gives good results. On one of my computers
> with this patch, 'emerge -puvDN @world' takes 15% less time, and results
> in 58% fewer _select_pkg_highest_available_imp calls:
> 
> 	https://bugs.gentoo.org/show_bug.cgi?id=530010

I've done some profiling with this both patches (from bugs 529660
and 530010) applied. See *p530010*.pdf files for both hosts here:
ftp://brunestud.info/gentoo/portage-v4.tar.xz

For performance enhancement I get the following data for
normal (that is unprofiled) runs:
	time	calls
desktop	13%	62%
hitomi	 6%	62%

where time — is % less real time for emerge run and calls stands
for % less _select_pkg_highest_available_imp calls.

While testing --verbose-conflicts I found that if tree contains
unresolved conflict, it takes much longer time for emerge to
proceed: +36% on desktop and +44% on hitomi.

Unresolved conflict was created by adding apt-text/dvipdfmx to the
@world: it blocks latex-2014 update. (Actually I just reverted one
of fixes I made earlier in order to run -DNupv @world without
errors .) Looks like emerge tries to build more depgraps to solves
this issue. I'm not sure if this may be a subject for optimization.
Graps are also present in the tarball in "*@world_with_blocks*.pdf"
files.

And thank your for optimizations! They really give a new breath to
my old boxes. Awaiting for them upstream :)

Best regards,
Andrew Savchenko

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-24  1:50                                           ` Andrew Savchenko
@ 2014-11-24  2:34                                             ` Zac Medico
  2014-11-24  7:41                                             ` Alexander Berntsen
  1 sibling, 0 replies; 74+ messages in thread
From: Zac Medico @ 2014-11-24  2:34 UTC (permalink / raw
  To: gentoo-dev

On 11/23/2014 05:50 PM, Andrew Savchenko wrote:
> I've done some profiling with this both patches (from bugs 529660
> and 530010) applied. See *p530010*.pdf files for both hosts here:
> ftp://brunestud.info/gentoo/portage-v4.tar.xz
> 
> For performance enhancement I get the following data for
> normal (that is unprofiled) runs:
> 	time	calls
> desktop	13%	62%
> hitomi	 6%	62%
> 
> where time — is % less real time for emerge run and calls stands
> for % less _select_pkg_highest_available_imp calls.
> 
> While testing --verbose-conflicts I found that if tree contains
> unresolved conflict, it takes much longer time for emerge to
> proceed: +36% on desktop and +44% on hitomi.
> 
> Unresolved conflict was created by adding apt-text/dvipdfmx to the
> @world: it blocks latex-2014 update. (Actually I just reverted one
> of fixes I made earlier in order to run -DNupv @world without
> errors .) Looks like emerge tries to build more depgraps to solves
> this issue.

This is controlled by the --backtrack option (10 is default). That means
it can create up to 10 depgraphs before it aborts. On a slow computer,
you might consider putting --backtrack=1 in EMERGE_DEFAULT_OPTS. That
way, it won't waist so mush time searching for a solution before it
ultimately fails. Note that you need at least --backtrack=1 for EAPI 5
sub-slot rebuilds to work.

> I'm not sure if this may be a subject for optimization.

Yes, there are many possible ways to optimize it. Heuristics can be used
to recognize various kinds of solvable conflicts and solve them more
efficiently. For example, the code in this commit added support for
solving some slot conflicts without backtracking:

https://github.com/gentoo/portage/commit/a862cc5dd1a56114fa579c5fb01b518b243666d9

> Graps are also present in the tarball in "*@world_with_blocks*.pdf"
> files.

Unfortunately, I don't see anything else that's easily optimized by
memoization. Further optimization will require more heuristics as
mentioned above.

> And thank your for optimizations! They really give a new breath to
> my old boxes. Awaiting for them upstream :)

You're welcome. Thanks for collecting the data.
-- 
Thanks,
Zac


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Portage dependency solving algorithm
  2014-11-24  1:50                                           ` Andrew Savchenko
  2014-11-24  2:34                                             ` Zac Medico
@ 2014-11-24  7:41                                             ` Alexander Berntsen
  1 sibling, 0 replies; 74+ messages in thread
From: Alexander Berntsen @ 2014-11-24  7:41 UTC (permalink / raw
  To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 24/11/14 02:50, Andrew Savchenko wrote:
> On Thu, 20 Nov 2014 20:16:09 -0800 Zac Medico wrote:
>>> I forgot, we have a --verbose-conflicts option already.
> Yeah, that's exactly what I need. Somehow I missed this option, 
> sorry.
That's OK. I forgot about it too. And I wrote it. :-]

- -- 
Alexander
bernalex@gentoo.org
https://secure.plaimi.net/~alexander
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlRy4RoACgkQRtClrXBQc7XWYAEAhnmSUuhmAIO5coaEBPyqp/VH
GwrWo7ZsvLUdUVEjfgcBAJESgR5qsUULpqdAhQOcWek1Uhny9Y83hROjAc1hKq+E
=Yvs5
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2014-11-07  9:49       ` Luca Barbato
@ 2015-01-16 17:30         ` Jan Matejka
  2015-01-16 20:00           ` Luca Barbato
  0 siblings, 1 reply; 74+ messages in thread
From: Jan Matejka @ 2015-01-16 17:30 UTC (permalink / raw
  To: Luca Barbato; +Cc: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Fri, 07 Nov 2014 10:49:13 +0100
Luca Barbato <lu_zero@gentoo.org> wrote:

> On 07/11/14 06:06, Harsh Bhatt wrote:
> 
> Also make might enjoy improvements.

shake?



- --
Jan Matějka        | Developer
https://gentoo.org | Gentoo Linux
GPG: A33E F5BC A9F6 DAFD 2021  6FB6 3EBF D45B EEB6 CA8B
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBCgAGBQJUuUqnAAoJEIN+7RD5ejahNwMH/R2OruTRy0yi4cwKFhPGqVZv
SKVLp5jNGkY9pTFnMApuqqh53Tb4OW3uDO99wpkzpyzzr0wgarGFg1N6YAMkRf+g
3Vy+WvDrK6zQeu0IYq1VBMODSun6fgWUsNiEBgqYbDPqa/SmfTAGhIF3dt5HH6Gx
J6T2SVFjjPFN+6LtWxVHph3G6/zSvKlHXKevqr4Po7PqnMXDnDBJ24LreNPVV0Aw
9G2lzT8/yuIvTF1x8FMinqOAWlp3CXYcfhizdYaFmMb7ROGZZFZJISx4L4GhkEK+
ojW457sX20Payc3GnY0O6yT29FDAf+1HQEhpEW2WEQ2hdP1lovtAiq+qyJnubrA=
=lB3d
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2015-01-16 17:30         ` Jan Matejka
@ 2015-01-16 20:00           ` Luca Barbato
  2015-02-02 17:26             ` Jan Matejka
  0 siblings, 1 reply; 74+ messages in thread
From: Luca Barbato @ 2015-01-16 20:00 UTC (permalink / raw
  To: Jan Matejka; +Cc: gentoo-dev

On 16/01/15 18:30, Jan Matejka wrote:
> On Fri, 07 Nov 2014 10:49:13 +0100
> Luca Barbato <lu_zero@gentoo.org> wrote:
> 
>> On 07/11/14 06:06, Harsh Bhatt wrote:
> 
>> Also make might enjoy improvements.
> 
> shake?

Anything written in haskell tend to be impractical to deploy.

tup managed to get lots of great ideas spoiled by being impractically
extremist in tracking the directory changes.

lu


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2015-01-16 20:00           ` Luca Barbato
@ 2015-02-02 17:26             ` Jan Matejka
  2015-02-02 18:01               ` hasufell
  0 siblings, 1 reply; 74+ messages in thread
From: Jan Matejka @ 2015-02-02 17:26 UTC (permalink / raw
  To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Fri, 16 Jan 2015 21:00:24 +0100
Luca Barbato <lu_zero@gentoo.org> wrote:

> On 16/01/15 18:30, Jan Matejka wrote:
> > On Fri, 07 Nov 2014 10:49:13 +0100
> > Luca Barbato <lu_zero@gentoo.org> wrote:
> > 
> >> On 07/11/14 06:06, Harsh Bhatt wrote:
> > 
> >> Also make might enjoy improvements.
> > 
> > shake?
> 
> Anything written in haskell tend to be impractical to deploy.

http://code.haskell.org/~dons/talks/dons-google-2015-01-27.pdf

> tup managed to get lots of great ideas spoiled by being impractically
> extremist in tracking the directory changes.

I don't know what tup is but I'm guessing it's an application.

Are you judging a language to be impractical because one
application made (allegedly) a bad design decision?


- --
Jan Matějka        | Developer
https://gentoo.org | Gentoo Linux
GPG: A33E F5BC A9F6 DAFD 2021  6FB6 3EBF D45B EEB6 CA8B
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBCgAGBQJUz7MtAAoJEIN+7RD5ejahmuoH/1CYfKRdrgtcms2U1Rcio2HQ
oJsDY+5SZerGSJrnnohd7l/FHbxcA51H04IUws22GlJ7OnIlVRD/IuYlAyLogc9m
bvg/Tt/OuRavHqdhi5JmJfQqYUVZJiEBQok5jG9Aa6+0+d1rPYzUQFsbNQ4ywO12
LLdVATR/2ovrEgVNmgUJQlfeZy6Axo3MwHbBRjsoi+2eKlSVBwKmAQMvpifLr5bI
8l2hOa7CGis02uWa8t8JixZ3XSkqrcjExGQYcBbWdCYVulfXgUbz0pNkQipOCOh+
+bNzubNDOGMSyiJ1mmtRG46vEKhgefns+IvxEhiOIIeJajPJR+R3EU0cV2LAvD0=
=+ORA
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2015-02-02 17:26             ` Jan Matejka
@ 2015-02-02 18:01               ` hasufell
  2015-02-06 16:34                 ` Alexander Berntsen
  0 siblings, 1 reply; 74+ messages in thread
From: hasufell @ 2015-02-02 18:01 UTC (permalink / raw
  To: gentoo-dev

Jan Matejka:
> On Fri, 16 Jan 2015 21:00:24 +0100
> Luca Barbato <lu_zero@gentoo.org> wrote:
> 
>> On 16/01/15 18:30, Jan Matejka wrote:
>>> On Fri, 07 Nov 2014 10:49:13 +0100
>>> Luca Barbato <lu_zero@gentoo.org> wrote:
>>>
>>>> On 07/11/14 06:06, Harsh Bhatt wrote:
>>>
>>>> Also make might enjoy improvements.
>>>
>>> shake?
> 
>> Anything written in haskell tend to be impractical to deploy.
> 
> http://code.haskell.org/~dons/talks/dons-google-2015-01-27.pdf
> 

Yep, I too think that statement above is incorrect.

Also have a look at https://wiki.haskell.org/Haskell_in_industry

Microsoft, Google, Facebook, Nvidia...


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [gentoo-dev] Regarding my final year thesis
  2015-02-02 18:01               ` hasufell
@ 2015-02-06 16:34                 ` Alexander Berntsen
  0 siblings, 0 replies; 74+ messages in thread
From: Alexander Berntsen @ 2015-02-06 16:34 UTC (permalink / raw
  To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 02/02/15 19:01, hasufell wrote:
> Jan Matejka:
>> On Fri, 16 Jan 2015 21:00:24 +0100 Luca Barbato
>> <lu_zero@gentoo.org> wrote:
>>> Anything written in haskell tend to be impractical to deploy.
>> 
>> http://code.haskell.org/~dons/talks/dons-google-2015-01-27.pdf
>> 
> 
> Yep, I too think that statement above is incorrect.
> 
> Also have a look at https://wiki.haskell.org/Haskell_in_industry
> 
> Microsoft, Google, Facebook, Nvidia...
As someone who deploys things in Haskell, I would have to disagree
with the original comment as well.
- -- 
Alexander
bernalex@gentoo.org
https://secure.plaimi.net/~alexander
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlTU7SgACgkQRtClrXBQc7XsTAD/S8QuDfjokcbNU7b5k/vovYOD
hJhSb97gDLI2I+cTv3gA/1gLO5cbVePebqI0NafJs6tFrr8gZ46/Plb0nVwNJSfz
=UCwp
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2015-02-06 16:34 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-06 12:49 [gentoo-dev] Regarding my final year thesis Harsh Bhatt
2014-11-06 13:25 ` Jauhien Piatlicki
2014-11-06 13:43   ` Ciaran McCreesh
2014-11-06 15:18     ` Ian Stakenvicius
2014-11-06 15:26       ` Ciaran McCreesh
2014-11-07  9:42     ` [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis) Jauhien Piatlicki
2014-11-07 17:36       ` Zac Medico
2014-11-07 18:07       ` Ciaran McCreesh
2014-11-07 18:11         ` Jauhien Piatlicki
2014-11-07 18:30           ` Ciaran McCreesh
2014-11-07 18:54             ` [gentoo-dev] Portage dependency solving algorithm Matthias Maier
2014-11-07 19:08               ` hasufell
2014-11-07 19:56                 ` Jauhien Piatlicki
2014-11-07 20:44                   ` hasufell
2014-11-07 20:55                     ` Jauhien Piatlicki
2014-11-07 21:04                       ` hasufell
2014-11-07 21:41                         ` Zac Medico
2014-11-08 13:40                         ` Ciaran McCreesh
2014-11-08  8:29                 ` vivo75
2014-11-08 13:35                   ` Ciaran McCreesh
2014-11-08 14:08                     ` vivo75
2014-11-08 11:10                 ` Andrew Savchenko
2014-11-08 13:24                 ` Patrick Lauer
2014-11-08 14:48                   ` hasufell
2014-11-08 23:33                     ` Patrick Lauer
2014-11-08 23:45                       ` Zac Medico
2014-11-09  6:53                         ` Andrew Savchenko
2014-11-09  8:15                           ` Zac Medico
2014-11-18  3:07                             ` Andrew Savchenko
2014-11-18  3:56                               ` Zac Medico
2014-11-18  5:47                                 ` Andrew Savchenko
2014-11-18  5:55                                   ` Zac Medico
2014-11-19 19:59                                     ` Andrew Savchenko
2014-11-20  2:44                                       ` Andrew Savchenko
2014-11-20 20:19                                       ` Zac Medico
2014-11-21  4:16                                         ` Zac Medico
2014-11-24  1:50                                           ` Andrew Savchenko
2014-11-24  2:34                                             ` Zac Medico
2014-11-24  7:41                                             ` Alexander Berntsen
2014-11-18  7:04                                 ` Zac Medico
2014-11-09  0:04                       ` hasufell
2014-11-09  0:08                         ` Patrick Lauer
2014-11-09  0:10                           ` hasufell
2014-11-09 12:43                       ` Ciaran McCreesh
2014-11-07 19:21               ` Ciaran McCreesh
2014-11-07 19:57                 ` Jauhien Piatlicki
2014-11-08 13:40                   ` Ciaran McCreesh
2014-11-08 18:11                     ` Hilco Wijbenga
2014-11-08 18:26                       ` Ciaran McCreesh
2014-11-08 19:01                         ` Matthias Dahl
2014-11-08 19:32                           ` hasufell
2014-11-08 19:48                             ` hasufell
2014-11-08 21:30                               ` Rich Freeman
2014-11-08 21:47                                 ` hasufell
2014-11-08 21:52                                   ` Jauhien Piatlicki
2014-11-08 22:05                                     ` hasufell
2014-11-08 22:39                                       ` Zac Medico
2014-11-08 23:52                                         ` hasufell
2014-11-08 23:58                                           ` Zac Medico
2014-11-09 12:41                                     ` Ciaran McCreesh
2014-11-09 12:40                                 ` Ciaran McCreesh
2014-11-09 12:34                           ` Ciaran McCreesh
2014-11-07 20:06                 ` Matthias Maier
2014-11-08 10:56                 ` Andrew Savchenko
2014-11-08 20:59                   ` [gentoo-dev] " Duncan
     [not found]                   ` <20141108133520.6ec790fe@googlemail.com>
2014-11-09  6:57                     ` [gentoo-dev] " Andrew Savchenko
2014-11-06 21:28   ` [gentoo-dev] Regarding my final year thesis Jeroen Roovers
2014-11-07  5:06     ` Harsh Bhatt
2014-11-07  9:49       ` Luca Barbato
2015-01-16 17:30         ` Jan Matejka
2015-01-16 20:00           ` Luca Barbato
2015-02-02 17:26             ` Jan Matejka
2015-02-02 18:01               ` hasufell
2015-02-06 16:34                 ` Alexander Berntsen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox