* [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
@ 2003-12-05 9:58 George Shapovalov
2003-12-05 12:26 ` Paul de Vrieze
2003-12-05 16:54 ` [gentoo-portage-dev] portage-ng design competition -- not yet Daniel Robbins
0 siblings, 2 replies; 27+ messages in thread
From: George Shapovalov @ 2003-12-05 9:58 UTC
To: gentoo-portage-dev, gentoo-dev; +Cc: Daniel Robbins, dholm
Sorry for the crosspost, but it looks like this topic is about
equally active on both lists, and I did not find the "submission
instructions", perhaps because it's not yet time for design submissions :).
On Wednesday 03 December 2003 15:08, Daniel Robbins wrote:
> I haven't looked at twisted, but a good solution suggested by nerdboy is
> to have a design competition once we have the requirements finalized.
So, we are going to do it according to "accepted practices" :).
Seriously, I am glad to see it! And here is my entry ;).
Well, this really is a proposal of the language to use for core stuff, not so
much a design. Ever since the implementation in Prolog was mentioned I have been
keeping some thoughts on the back burner, and I finally decided to submit a
competing entry, for the reasons I'll try to outline.
I have them nicely wrapped up here:
http://dev.gentoo.org/~george/portage-ng_core-proposal.html
To summarize them briefly: Prolog is a really esoteric language and I am not
sure we will be able to find enough people who feel comfortable about having
the very core of portage-ng implemented in it. Also, there might be issues of
portability and efficiency.
On the other hand, I understand the desire to steer clear of C/C++ and
completely support it. Therefore I propose a middle-ground solution: use a
common compiled procedural language that was designed to enhance readability
and modularization and to ease maintenance of a large system. Oh, it is also very
portable, widely available, alive and well supported.
What else? It took me only about two weeks (at about 1-2 hrs per day of
reading) to get into it and start crunching out some code when I decided to
learn it :). (Not Hello World, but real code, mind you.)
But read on for the details.
However, that's not all. I have produced some basic prototype code to
illustrate what could be expected. The prototype is quite crude, as I did
this during relatively rare breaks from writing an article (completely
unrelated to CS :)), but it should serve the purpose. Did I say the code
should be readable? So, even though I do not expect many people to be familiar
with that language, I would still suggest trying to look at the code. You are
in for a nice surprise ;).
(I am deliberately not revealing the name of the language in this posting, because
I want people to read through the arguments first.)
The code is available here:
http://dev.gentoo.org/~george/proto_portage-0.7.5.tar.bz2
but you will probably want to read the text before that.
In any case, if you want to jump in, here is a short install instruction:
run "emerge gnat booch_components",
then untar the package and run make.
Reading the INSTALL file that comes with the package might be useful too ;)
(it has some details in case you experience problems).
George
--
gentoo-portage-dev@gentoo.org mailing list
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-05 9:58 [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page George Shapovalov
@ 2003-12-05 12:26 ` Paul de Vrieze
2003-12-05 21:33 ` George Shapovalov
2003-12-05 16:54 ` [gentoo-portage-dev] portage-ng design competition -- not yet Daniel Robbins
1 sibling, 1 reply; 27+ messages in thread
From: Paul de Vrieze @ 2003-12-05 12:26 UTC
To: gentoo-portage-dev
On Friday 05 December 2003 10:58, George Shapovalov wrote:
> Sorry for the crosspost, but it looks like this topic is about
> equally active on both lists, and I did not find the "submission
> instructions", perhaps because it's not yet time for design submissions :).
Some words about your page:
Prolog != AI
AI != nondeterministic
Actually, most AI research is in the area of deterministic approaches. Most
nondeterministic behaviour outside of games (where you want
nondeterministic behaviour) comes from trying to get low
complexity for problems that naturally have high complexity. In this sense,
nondeterministic means that in some cases the tricks don't work and you get
the high complexity (hell, even caching is nondeterministic in the same way).
Prolog is very good at graph traversal tasks (which is what AI in many cases
involves, and why Prolog is used a lot in AI), as its mode of operation is
basically a depth-first backtracking search algorithm (about 10 lines of
pseudocode). For that reason I think it is a good candidate for the new and
improved dependency algorithm.
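
As an illustration only (this is not code from the thread or from any prototype), the plain depth-first part of such a search might be sketched in Python as follows. It has none of Prolog's backtracking over alternatives; the `resolve` helper, the `DEPS` table and the package names in it are all hypothetical.

```python
def resolve(target, deps):
    """Return an install order for target, dependencies first (DFS post-order)."""
    order = []          # packages in the order they can be installed
    done = set()        # packages already placed in `order`
    in_progress = set() # packages on the current DFS path, for cycle detection

    def visit(pkg):
        if pkg in done:
            return
        if pkg in in_progress:
            raise ValueError(f"dependency cycle through {pkg}")
        in_progress.add(pkg)
        for dep in deps.get(pkg, ()):
            visit(dep)
        in_progress.discard(pkg)
        done.add(pkg)
        order.append(pkg)  # appended only after all of its dependencies

    visit(target)
    return order


# Hypothetical dependency table: package -> direct dependencies.
DEPS = {
    "app": ["libfoo", "libbar"],
    "libfoo": ["glibc"],
    "libbar": ["libfoo", "glibc"],
    "glibc": [],
}

print(resolve("app", DEPS))
```

Because a package is appended only after everything it depends on, a dependency reached along several paths of different lengths still lands before all of its dependents.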
While Prolog is very good for such tasks, I don't think it is suited to
implementing the core of portage. This is not because I think Prolog is a
problem; it is because I think that such languages are not suited to
procedural tasks (dispatch, divide, etc.). That is something that C is good at.
However, procedural languages are very bad at tree traversal, so I think that
Prolog is also a good option.
Then about caching:
While you have a point with your prediction about the number of packages, most of
those packages are leaf packages. That means that the number of
dependencies per leaf package will probably not increase. However, the number
of packages not considered at all will grow. For this, portage should do some
kind of on-demand loading of packages. I see no need whatsoever to have a
normal emerge task load the whole tree and create a graph of it.
I would like portage to remain lean and mean. I do run Gentoo on a 16 MB, 60 MHz
Pentium, and I like it. While the cache should also be a module that replaces
a direct-tree-read module (possibly wrapping it), I think that caching should
be possible at a configurable level, not just dumbly loading in all ebuilds.
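
To make the on-demand idea concrete, here is a rough Python sketch under stated assumptions: `PackageTree`, `closure` and the `FAKE_TREE` data are hypothetical names invented for this example, and a dictionary lookup stands in for reading a package from disk. The cache wraps the direct reader and can be switched off.

```python
class PackageTree:
    """Loads package dependency lists on demand, with an optional cache."""

    def __init__(self, read_package, cache_enabled=True):
        self._read = read_package  # hypothetical callable: name -> list of deps
        self._cache = {} if cache_enabled else None
        self.reads = 0             # how many times the underlying reader ran

    def deps(self, name):
        if self._cache is not None and name in self._cache:
            return self._cache[name]
        self.reads += 1
        result = self._read(name)
        if self._cache is not None:
            self._cache[name] = result
        return result


def closure(tree, target):
    """Collect only the packages reachable from target."""
    seen = set()
    stack = [target]
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(tree.deps(pkg))
    return seen


# Hypothetical tree: five packages, only three of them relevant to "app".
FAKE_TREE = {
    "app": ["libfoo"],
    "libfoo": ["glibc"],
    "glibc": [],
    "unrelated-a": ["glibc"],
    "unrelated-b": [],
}

cached = PackageTree(FAKE_TREE.__getitem__, cache_enabled=True)
uncached = PackageTree(FAKE_TREE.__getitem__, cache_enabled=False)
for tree in (cached, uncached):
    closure(tree, "app")
    closure(tree, "app")  # a second, identical request
print(cached.reads, uncached.reads)
```

In this sketch the unrelated packages are never read at all, and the cache only changes how often the relevant ones are re-read.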
Paul
--
Paul de Vrieze
Gentoo Developer
Mail: pauldv@gentoo.org
Homepage: http://www.devrieze.net
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-05 12:26 ` Paul de Vrieze
@ 2003-12-05 21:33 ` George Shapovalov
2003-12-06 14:26 ` Paul de Vrieze
0 siblings, 1 reply; 27+ messages in thread
From: George Shapovalov @ 2003-12-05 21:33 UTC
To: gentoo-portage-dev
I'll answer this one as this seems a bit more substantial and I would really
like to keep the discussion on this list.
On Friday 05 December 2003 04:26, Paul de Vrieze wrote:
> Prolog != AI
> AI != nondeterministic
Quite often these are synonymous (or were, when I was looking into this. But I
must admit this was quite some time ago), but strictly speaking you are
right. I'll update that section of the page.
> prolog is very good at graph traversal tasks (which is what AI in many
> cases involves and why in AI prolog is used a lot) as the mode of operation
> is basically a depth-first backtracking search algorithm (about 10 lines of
> pseudocode). For that reason I think it is a good candidate for the new and
> improved dependency algorithm.
Shouldn't it be breadth-first? With depth-first you might occasionally need to
adjust the formed list, if multiple nodes reach the same
dependency in a different number of steps. I think I was able to produce an
example of such a situation, but I would appreciate any good pointers to the
theory :).
As for the lines of code, that's anecdotal.
(adding from the other email:)
> Also, your code (which is about 1000 lines long)
> does -only- a simple dfs and topological ordering, while I can do the
> same in about 10 lines in prolog and have backtracking for free. I
The *traversal* code (btw, I use BFS traversal there, not DFS) is completely
localized in bc-graphs-directed-bfs_traverse.adb, is about the same 10
lines :), and is completely generic.
The remaining 990 lines deal with such mundane tasks as reading the (possibly
malformed) ebuilds and dealing with the user (including minimalistic help).
Essentially just doing what you describe here:
> While prolog is very good for such tasks I don't think it is suited to
> implement the core of portage. This is not because I think prolog is a
> problem, it is because I think that such languages are not suited for
> procedural dispatch, divide, etc. tasks. That is something that C is good
> Then about caching
Well, I stated in the writeup that this is a simplistic design relevant only
for the prototype. However, I'll try to explain my reasoning for future
discussions.
(To drobbins: yes, I am afraid we are pulling the "discussion before
requirements" thing again here. But this is the core and unavoidable part of
the design, so it will definitely be there. Also, I am trying to localize the
discussion to this list only.)
So note that all the comments below concern the hypothetical design I used in
the language demonstration, which will probably be rethought for the real
design.
> While you have a point with your prediction on number of packages, most of
> those packages involve leaf packages. That means that the number of
> dependencies per leaf package will probably not increase. However the
> amount of packages totally not regarded will grow. For this portage should
> do some kind of on-demand loading of packages. I see no need whatsoever to
> have a normal emerge task load in the whole tree and create a graph of
> that.
I somewhat agree with that. However, my observation is that these packages most
often are libraries or other helpers that the user is usually not interested in
directly. And the long operations are the ones where you have to do searches
or operations involving world. So I have been thinking along the lines of
making the longest operations fast and bounded, getting O(1)
responsiveness.
But in reality I do not think this should be "the ultimate solution", or even
that there should be one. See the next one.
> I would like portage to remain lean and mean. I do run gentoo on a 16MB
> 60mhz pentium, and I like it. While the cache should also be a module that
> replaces a direct-tree-read module (possibly wrapping it), I think that
> caching should be possible on a configurable level. Not just dumbly load in
> all ebuilds.
Touche :).
That's why I spent so many words discussing memory layout.
However the real solution IMHO should allow multiple dependency trackers
(cores) to be plugged as desired (choice of both at compilation and run-time
(i.e. choosing at compilation to be able to choose at run time :))). And for
that we would need a high level of efficient abstraction. And this is what
got me started thinking about Ada :).
And finally:
> My biggest concern when reading your small paper is that you chose a
> deterministic approach to this problem, while in fact the problem is
> non-deterministic.
Could you please elaborate? I am afraid we are thinking about slightly
different things here.
I stand by my view that any tool important to the system should be dumb. If
there is any uncertainty it should stop and ask, unless it was designed for a
very specific situation where it can be trusted to make the choice. Package
maintenance is not such an area. IMHO every user that has an identical
configuration should get identical results (to the extent possible. So here
we are talking requirements, Daniel ;)). Otherwise we are facing disastrous
consequences, with many people complaining and us being unable to reproduce
anything reliably.
George
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-05 21:33 ` George Shapovalov
@ 2003-12-06 14:26 ` Paul de Vrieze
2003-12-06 19:35 ` Daniel Robbins
2003-12-06 23:00 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page George Shapovalov
0 siblings, 2 replies; 27+ messages in thread
From: Paul de Vrieze @ 2003-12-06 14:26 UTC
To: gentoo-portage-dev
On Friday 05 December 2003 22:33, George Shapovalov wrote:
>
> Shouldn't it be breadth-first? With depth first you might need to do
> adjustments to the formed list occasionally, if multiple nodes go to the
> same dependency in a different number of steps. I think I was able to
> produce the example of such situation, but I would appreciate any good
> pointers to a theory :).
Prolog interpreters work as depth-first searchers. Breadth-first is in most
cases a waste of both memory and clock cycles. In the case of portage, depth-first
would mean something like first getting deep dependencies like glibc, and
then less important ones. Of course, a full dependency tree for portage is a
bit more complicated than just a graph; anyway, that is implementation.
> Well, I stated in the writeup that this is a simplistic design only
> relevant for the prototype. However I'll try to explain my reasoning for
> the future discussions.
> (To drobbins: yes, I am afraid we are pulling the "discussion before
> requirements" thing again here. But this is the core and unavoidable part
> of the design, so it will definitely be there. Also I am trying to localize
> the discussion to this list only).
I didn't really look at your source code, just at your web page.
>
> Thus note, all the comments below concern this hypothetical design I used
> in language demonstration, but that will probably be rethought for the real
> design.
>
> > While you have a point with your prediction on number of packages, most
> > of those packages involve leaf packages. That means that the number of
> > dependencies per leaf package will probably not increase. However the
> > amount of packages totally not regarded will grow. For this portage
> > should do some kind of on-demand loading of packages. I see no need
> > whatsoever to have a normal emerge task load in the whole tree and create
> > a graph of that.
>
> I somewhat agree with that. However my observation is that these packages
> most often are libraries or other helpers that the user is usually not
> interested in directly. And the long operations are the ones where you have
> to do searches or operations involving world. So I have been thinking along
> the lines of making the longest operations fast and bounded, getting
> O(1) responsiveness.
> But in reality I do not think this should be "the ultimate solution" or
> even that there should be one. See the next one.
I don't believe that the average size of world is going to grow significantly.
In any case more installed packages often also means a beefier computer.
>
> Touche :).
> That's why I spent so many words discussing memory layout.
> However the real solution IMHO should allow multiple dependency trackers
> (cores) to be plugged as desired (choice of both at compilation and
> run-time (i.e. choosing at compilation to be able to choose at run time
> :))). And for that we would need a high level of efficient abstraction. And
> this is what got me started thinking about Ada :).
>
We do indeed need a high-level abstraction, and dep trackers are one of the
modules. I think that access to the package tree is another, where caching
can be modular too.
> And finally:
> > My biggest concern when reading your small paper is that you chose a
> > deterministic approach to this problem, while in fact the problem is
> > non-deterministic.
>
> Could you please elaborate? I am afraid we are thinking about slightly
> different things here.
>
I didn't write that, guess you'll need to ask the person who did.
> I stand by my view that any tool important to the system should be dumb.
> If there is any uncertainty it should stop and ask, unless it was designed
> for a very specific situation, where it can be trusted to make the choice.
> Package maintenance is not such an area - IMHO every user that has identical
> configuration should get identical results (to the extent possible. So here
> we are talking requirements Daniel ;)). Otherwise we are facing
> disastrous consequences, with many people complaining and us being unable
> to reproduce anything reliably.
The only less deterministic behaviour I think is acceptable is in the time it
takes for all dependencies to be calculated, as long as the maximum is
sensible.
Paul
--
Paul de Vrieze
Gentoo Developer
Mail: pauldv@gentoo.org
Homepage: http://www.devrieze.net
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-06 14:26 ` Paul de Vrieze
@ 2003-12-06 19:35 ` Daniel Robbins
2003-12-06 19:41 ` Jon Portnoy
` (2 more replies)
2003-12-06 23:00 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page George Shapovalov
1 sibling, 3 replies; 27+ messages in thread
From: Daniel Robbins @ 2003-12-06 19:35 UTC
To: gentoo-portage-dev
On Sat, 2003-12-06 at 07:26, Paul de Vrieze wrote:
> We do indeed need a high-level abstraction, and dep trackers are one of the
> modules. I think that access to the package tree is another, where caching
> can be modular too.
If by "caching" you mean the metadata cache, this is something I want to
eliminate in portage-ng. I would like things to be designed to be fast
from the start, with no slow bash<->python linkage like there is in the
current portage that makes us require a metadata cache for decent
performance.
It should be possible to get portage-ng without caching running as fast
as portage does now when it has a fully up-to-date cache. Then if we
need more performance, portage-ng's datastore can be moved to a
database, or we can add an enhanced caching mode to make it even faster.
For backwards compatibility with existing ebuilds, yes, we will probably
still need the metadata cache since we'll still have some kind of bash
linkage. It's important to point out that the design of portage-ng will
not be tied to ebuilds. Ebuilds will likely become "legacy" build
scripts that are superseded by something a lot better, cleaner, more powerful
and also faster for portage-ng.
Regards,
Daniel
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-06 19:35 ` Daniel Robbins
@ 2003-12-06 19:41 ` Jon Portnoy
2003-12-07 0:13 ` [gentoo-portage-dev] ebuild strengths/weaknesses Daniel Robbins
2003-12-07 1:44 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page Jason Stubbs
2003-12-07 11:05 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page Paul de Vrieze
2003-12-07 19:59 ` Philippe Lafoucrière
2 siblings, 2 replies; 27+ messages in thread
From: Jon Portnoy @ 2003-12-06 19:41 UTC
To: gentoo-portage-dev
On Sat, Dec 06, 2003 at 12:35:11PM -0700, Daniel Robbins wrote:
> For backwards compatibility with existing ebuilds, yes we will probably
> still need the metadata cache since we'll still have some kind of bash
> linkage. It's important to point out that the design of portage-ng will
> not be tied to ebuilds. Ebuilds will likely become "legacy" build
> scripts that are superseded by something a lot better, cleaner, more powerful
> and also faster for portage-ng.
>
Please keep in mind that a significant number of users have expressed a
fondness for ebuilds precisely because they can apply simple bash
scripting knowledge to create a complex build script. Any new format
should probably aim for similar syntax for precisely that reason.
(But this is getting way ahead of things.)
--
Jon Portnoy
avenj/irc.freenode.net
* [gentoo-portage-dev] ebuild strengths/weaknesses
2003-12-06 19:41 ` Jon Portnoy
@ 2003-12-07 0:13 ` Daniel Robbins
2003-12-07 1:44 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page Jason Stubbs
1 sibling, 0 replies; 27+ messages in thread
From: Daniel Robbins @ 2003-12-07 0:13 UTC
To: gentoo-portage-dev
On Sat, 2003-12-06 at 12:41, Jon Portnoy wrote:
> Please keep in mind that a significant number of users have expressed a
> fondness for ebuilds precisely because they can apply simple bash
> scripting knowledge to create a complex build script. Any new format
> should probably aim for similar syntax for precisely that reason.
You mean similar ease of use, I think. It's hard to use bash syntax and
have a high-performance system. But I know where you're coming from. The
goal is to make them easier to use and more powerful than ebuilds.
I'd contend that ebuilds aren't the pinnacle of usability, although they
do have many strengths. There are aspects of ebuilds that can make them
tricky to use, such as tons of conditionals all over the place, strange
side-effects caused by unexpected orders of execution,
limitations on which conditionals are actually *legal* in ebuilds ("foo?"
vs. "use foo" vs. "if [ ]"), etc. There is a lot to improve. We'll want
to make the new format better while keeping or surpassing existing
strengths.
Then when we get to eclasses, we start to see that we are maxing out the
potential for a totally-bash-based system.
My recommendation: for all the stuff you like about ebuilds, make sure
it is in the requirements.
Regards,
Daniel
* RE: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-06 19:41 ` Jon Portnoy
2003-12-07 0:13 ` [gentoo-portage-dev] ebuild strengths/weaknesses Daniel Robbins
@ 2003-12-07 1:44 ` Jason Stubbs
2003-12-07 2:39 ` George Shapovalov
1 sibling, 1 reply; 27+ messages in thread
From: Jason Stubbs @ 2003-12-07 1:44 UTC
To: gentoo-portage-dev
-----Original Message-----
From: Jon Portnoy [mailto:avenj@gentoo.org]
Sent: Sunday, December 07, 2003 4:41 AM
To: gentoo-portage-dev@gentoo.org
Subject: Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated
Portage project page
On Sat, Dec 06, 2003 at 12:35:11PM -0700, Daniel Robbins wrote:
> For backwards compatibility with existing ebuilds, yes we will
> probably still need the metadata cache since we'll still have some
> kind of bash linkage. It's important to point out that the design of
> portage-ng will not be tied to ebuilds. Ebuilds will likely become
> "legacy" build scripts that are superseded by something a lot better,
> cleaner, more powerful and also faster for portage-ng.
>
Please keep in mind that a significant number of users have expressed a
fondness for ebuilds precisely because they can apply simple bash
scripting knowledge to create a complex build script. Any new format
should probably aim for similar syntax for precisely that reason.
(But this is getting way ahead of things.)
---- Jason Stubbs is writing:
It's not getting ahead of things! That's a requirement that's not
covered yet. "Package definition should be powerful but simple with a
small learning curve" or something to that effect.
Regards,
Jason Stubbs
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-07 1:44 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page Jason Stubbs
@ 2003-12-07 2:39 ` George Shapovalov
2003-12-07 3:12 ` Jason Stubbs
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: George Shapovalov @ 2003-12-07 2:39 UTC
To: gentoo-portage-dev
On Saturday 06 December 2003 17:44, Jason Stubbs wrote:
> It's not getting ahead of things! That's a requirement that's not
> covered yet. "Package definition should be powerful but simple with a
> small learning curve" or something to that effect.
Hm, isn't it a bit too late to change the ebuild format, with us sitting on 7000+
ebuilds? The only reasonable way to do so is to make it structurally
compatible and create a converter tool. Even then this is a major endeavor
that would require a very good reason (nothing short of deadly limitations of
the present format, which I wouldn't say is the case). Furthermore, this would
require wide publicity and even votes if we do not want to alienate users, as
this is a change that definitely will affect them (take a look at the number of
new ebuild submissions ;)).
But then I don't really see the problem with the present format. bash involvement
is really necessary only during the pkg_* and src_* steps, when a lot of
other stuff is going to happen anyway, so this is hardly a bottleneck. To get
definitions of various vars and dependency information out is trivial and can
be done in anything. That bash is involved in this step at present is
unfortunate, but there were reasons for it and it definitely may be undone
even for the present portage.
George
* RE: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-07 2:39 ` George Shapovalov
@ 2003-12-07 3:12 ` Jason Stubbs
2003-12-07 4:50 ` Ray Russell Reese III
2003-12-07 7:40 ` Daniel Robbins
2 siblings, 0 replies; 27+ messages in thread
From: Jason Stubbs @ 2003-12-07 3:12 UTC
To: gentoo-portage-dev
> -----Original Message-----
> From: George Shapovalov [mailto:george@gentoo.org]
> Sent: Sunday, December 07, 2003 11:40 AM
> To: gentoo-portage-dev@gentoo.org
> Subject: Re: [gentoo-portage-dev] portage-ng concurse entry
> Was: Updated Portage project page
>
>
> On Saturday 06 December 2003 17:44, Jason Stubbs wrote:
> > It's not getting ahead of things! That's a requirement that's not
> > covered yet. "Package definition should be powerful but simple with a
> > small learning curve" or something to that effect.
>
> Hm, isn't it a bit too late to change the ebuild format, with us sitting on
> 7000+ ebuilds? The only reasonable way to do so is to make it structurally
> compatible and create a converter tool. Even then this is a major endeavor
> that would require a very good reason (nothing short of deadly limitations
> of the present format, which I wouldn't say is the case). Furthermore, this
> would require wide publicity and even votes if we do not want to alienate
> users, as this is a change that definitely will affect them (take a look at
> the number of new ebuild submissions ;)).
There's another requirement! "The package system needs to be backward
compatible or have a compatibility layer for existing ebuilds" or
something to that effect. Having to rewrite them all just for a package
manager migration would not only be insane but would be, dare I say,
impossible - by the time they are all rewritten, they'd all have become
obsolete. If there is backward compatibility, they should be able to be
phased out in less than a year (assuming the format is in fact
replaced).
> But then I don't really see the problem with the present format. bash
> involvement is really necessary only during the pkg_* and src_* steps, when
> a lot of other stuff is going to happen anyway, so this is hardly a
> bottleneck. To get definitions of various vars and dependency information
> out is trivial and can be done in anything. That bash is involved in this
> step at present is unfortunate, but there were reasons for it and it
> definitely may be undone even for the present portage.
Personally, I don't think the current ebuild format is so bad either, at least
while the dev team was relatively small. But there are a number of
places where mistakes can be made, making for bad QA. I think it does make
sense to split the file into the pkg_*/src_* stuff and the various vars
as per the current requirements spec. I would also prefer to see the
pkg_*/src_* stuff abstracted so that an ebuild is never a security risk.
ATM, pkg_postinst is the worst that I can think of from a security
viewpoint.
Regards,
Jason Stubbs
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-07 2:39 ` George Shapovalov
2003-12-07 3:12 ` Jason Stubbs
@ 2003-12-07 4:50 ` Ray Russell Reese III
2003-12-07 7:27 ` Daniel Robbins
2003-12-07 7:40 ` Daniel Robbins
2 siblings, 1 reply; 27+ messages in thread
From: Ray Russell Reese III @ 2003-12-07 4:50 UTC
To: gentoo-portage-dev
Wouldn't it be wise then to allow for multiple ebuild formats through
plug-ins? Like you say, a considerable number of ebuilds need nothing
more than to run configure, make, and make install.
Then there are those ebuilds that are a few hundred lines (or, gasp, more)
of bash script that would benefit from something more structured. What
that something is I honestly don't know. But at least with a pluggable
ebuild format, we could retain compatibility with the current ebuilds,
and slowly phase them into something more appropriate.
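
One way such a plug-in scheme could look, as a sketch only: the registry, the `.structured` extension and the "key: value" format below are all made up for illustration and are not an existing or proposed portage interface.

```python
from typing import Callable, Dict

# Registry mapping a file extension to the parser that handles it.
PARSERS: Dict[str, Callable[[str], dict]] = {}


def register(fmt):
    """Decorator that registers a parser for the given format name."""
    def wrap(fn):
        PARSERS[fmt] = fn
        return fn
    return wrap


@register("ebuild")
def parse_legacy(text):
    # Legacy scripts are kept opaque here; a real handler would hand them to bash.
    return {"format": "ebuild", "script": text}


@register("structured")
def parse_structured(text):
    # Hypothetical "key: value" format, invented for this example.
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return {"format": "structured", **fields}


def load(filename, text):
    """Pick a parser by file extension and apply it."""
    fmt = filename.rsplit(".", 1)[-1]
    if fmt not in PARSERS:
        raise ValueError(f"no plug-in registered for format {fmt!r}")
    return PARSERS[fmt](text)


print(load("foo-1.0.structured", "name: foo\nversion: 1.0"))
print(load("foo-1.0.ebuild", "src_compile() { :; }")["format"])
```

Existing ebuilds keep working through the legacy handler, while new formats can be added without touching `load`.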
Just my $0.02.
- Ray Russell Reese III [ freenode:anti ]
On Sat, 2003-12-06 at 21:39, George Shapovalov wrote:
> On Saturday 06 December 2003 17:44, Jason Stubbs wrote:
> > It's not getting ahead of things! That's a requirement that's not
> > covered yet. "Package definition should be powerful but simple with a
> > small learning curve" or something to that effect.
>
> Hm, isn't it a bit too late to change the ebuild format, with us sitting on 7000+
> ebuilds? The only reasonable way to do so is to make it structurally
> compatible and create a converter tool. Even then this is a major endeavor
> that would require a very good reason (nothing short of deadly limitations of
> the present format, which I wouldn't say is the case). Furthermore, this would
> require wide publicity and even votes if we do not want to alienate users, as
> this is a change that definitely will affect them (take a look at the number of
> new ebuild submissions ;)).
>
> But then I don't really see the problem with the present format. bash involvement
> is really necessary only during the pkg_* and src_* steps, when a lot of
> other stuff is going to happen anyway, so this is hardly a bottleneck. To get
> definitions of various vars and dependency information out is trivial and can
> be done in anything. That bash is involved in this step at present is
> unfortunate, but there were reasons for it and it definitely may be undone
> even for the present portage.
>
> George
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-07 2:39 ` George Shapovalov
2003-12-07 3:12 ` Jason Stubbs
2003-12-07 4:50 ` Ray Russell Reese III
@ 2003-12-07 7:40 ` Daniel Robbins
2003-12-07 9:11 ` Kapil Thangavelu
2003-12-08 16:03 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page, ebuild conversion Sandy McArthur
2 siblings, 2 replies; 27+ messages in thread
From: Daniel Robbins @ 2003-12-07 7:40 UTC
To: gentoo-portage-dev
[-- Attachment #1: Type: text/plain, Size: 2050 bytes --]
On Sat, 2003-12-06 at 19:39, George Shapovalov wrote:
> Hm, isn't it a bit too late to change ebuild format, with us sitting on 7000+
> ebuilds? The only reasonable way to do so is to make it structurally
> compatible and create a converter tool.
It would be very difficult to get good results from a converter tool,
due to the many complexities of ebuild parsing.
> But then I don't really see the problem with present format.
You just explained how much of a chore it would be to convert from
ebuild to something else. Doesn't this point to a weakness in the syntax
of ebuilds themselves? I mean, if they were more formally defined, they
could be converted to XML or anything else without much effort.
> bash involvement
> is really necessary only during the pkg_* and src_* steps, when a lot of
> other stuff is going to happen anyway, so this is hardly a bottleneck.
This isn't an informed comment :) Portage depends on bash for extraction
of metadata as well. Extraction of metadata is *the* Portage bottleneck.
> To get
> definitions of various vars and dependency information out is trivial and can
> be done in anything. That bash is involved in this step at present is
> unfortunate, but there were reasons for it and it definitely may be undone
> even for the present portage.
If it were trivial we would have done it already. The only way to "undo"
the dependence on bash is to make seemingly arbitrary rules of what is
legal and not legal to type inside ebuilds. This leads to a lot of
strange rules (such as rules about where you can and can't use variable
expansion, and where you can and can't use bash conditionals) and makes
ebuild-writing a tricky process. We already have some of these rules in
effect, and it makes ebuild-writing a bit trickier than it should be.
We don't want ebuild-writing to be tricky, so the solution is to
architect a way to represent ebuild data in a way that is more useful to
portage-ng and more natural for ebuild-ng writers.
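To make the trade-off concrete, here is a minimal sketch (not how Portage extracts metadata) of a bash-free reader. It assumes a hypothetical restricted subset in which metadata appears as single-line VAR="value" assignments; the function name and the sample ebuild text are made up for illustration. Anything that uses variable expansion has to be handed back to bash, which is exactly the kind of rule described above.

```python
import re

# Matches only single-line assignments of the form VAR="value".
ASSIGNMENT = re.compile(r'^([A-Z_]+)="([^"]*)"$')


def static_metadata(ebuild_text):
    """Split assignments into statically readable ones and ones needing bash."""
    found, needs_bash = {}, []
    for line in ebuild_text.splitlines():
        match = ASSIGNMENT.match(line.strip())
        if not match:
            continue  # functions, conditionals, multi-line values: not handled
        name, value = match.groups()
        if "$" in value or "`" in value:
            needs_bash.append(name)  # expansion or command substitution
        else:
            found[name] = value
    return found, needs_bash


EBUILD = '''DESCRIPTION="An example package"
SRC_URI="http://example.org/${P}.tar.gz"
DEPEND="sys-libs/ncurses"
src_compile() {
    emake || die
}
'''

found, needs_bash = static_metadata(EBUILD)
print(found)
print(needs_bash)
```

Even in this toy case one of the three variables cannot be read without bash, and conditionals or multi-line values are not covered at all.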
Regards,
Daniel
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-07 7:40 ` Daniel Robbins
@ 2003-12-07 9:11 ` Kapil Thangavelu
2003-12-07 11:11 ` Paul de Vrieze
2003-12-08 16:03 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page, ebuild conversion Sandy McArthur
1 sibling, 1 reply; 27+ messages in thread
From: Kapil Thangavelu @ 2003-12-07 9:11 UTC (permalink / raw
To: gentoo-portage-dev
are we at the stage yet, where some structured metadata representation
outside of the ebuild will be used ?
incidentally the mailing list manager isn't honoring requests for
archives/indexes... so, apologies in advance if this has already been
discussed/decided.
-k
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page, ebuild conversion
2003-12-07 7:40 ` Daniel Robbins
2003-12-07 9:11 ` Kapil Thangavelu
@ 2003-12-08 16:03 ` Sandy McArthur
1 sibling, 0 replies; 27+ messages in thread
From: Sandy McArthur @ 2003-12-08 16:03 UTC (permalink / raw
To: gentoo-portage-dev
Daniel Robbins wrote:
> On Sat, 2003-12-06 at 19:39, George Shapovalov wrote:
>
>>Hm, isn't it a bit too late to change ebuild format, with us sitting on 7000+
>>ebuilds? The only reasonable way to do so is to make it structurally
>>compatible and create a converter tool.
>
> It would be very difficult to get good results from a converter tool,
> due to the many complexities of ebuild parsing.
One transition method from one format to another that I haven't seen mentioned
is to make portage-old capable of converting ebuild-ng into
ebuild-old until portage-ng is ready to be declared stable.
I presume ebuild-ng will be more structured and thus easier to convert
to a less structured format like the current ebuild format. This would
make the conversion tool's job easier and less error-prone ... probably.
Sandy McArthur
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-06 19:35 ` Daniel Robbins
2003-12-06 19:41 ` Jon Portnoy
@ 2003-12-07 11:05 ` Paul de Vrieze
2003-12-07 19:59 ` Philippe Lafoucrière
2 siblings, 0 replies; 27+ messages in thread
From: Paul de Vrieze @ 2003-12-07 11:05 UTC (permalink / raw
To: gentoo-portage-dev
[-- Attachment #1: signed data --]
[-- Type: text/plain, Size: 2198 bytes --]
On Saturday 06 December 2003 20:35, Daniel Robbins wrote:
> On Sat, 2003-12-06 at 07:26, Paul de Vrieze wrote:
> > We need indeed a highlevel abstraction, and dep trackers are one of the
> > modules. I think that access to the package tree is another, where
> > caching can be modular too.
>
> If by "caching" you mean the metadata cache, this is something I want to
> eliminate in portage-ng. I would like things to be designed to be fast
> from the start, with no slow bash<->python linkage like there is in the
> current portage that makes us require a metadata cache for decent
> performance.
What I mean by caching here is a module that masquerades as a tree
representation, but actually is a cache that gets its data from another
"real" tree representation (be that installed, available ebuilds, or
binaries). This cache module would in some way speed up the retrieval of the
data, possibly by a binary database or whatever means (I don't
care). The thing I care about is that it should be optional, and there should
be a caching module that minimizes memory use.
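A minimal sketch of such a module, assuming a hypothetical tree interface with a single get(name) method (DictTree and CachedTree are illustrative names, not part of any Portage design). The cache exposes the same interface as the tree it wraps, so it stays optional, and a bounded size keeps memory use down:

```python
from collections import OrderedDict


class DictTree:
    """Stand-in for a "real" tree (installed, ebuilds, or binaries)."""

    def __init__(self, packages):
        self._packages = dict(packages)
        self.lookups = 0  # counts how often the slow backend is hit

    def get(self, name):
        self.lookups += 1
        return self._packages[name]


class CachedTree:
    """Masquerades as a tree but answers repeated lookups from memory."""

    def __init__(self, backend, max_entries=128):
        self._backend = backend
        self._max_entries = max_entries
        self._cache = OrderedDict()

    def get(self, name):
        if name in self._cache:
            self._cache.move_to_end(name)  # mark as most recently used
            return self._cache[name]
        value = self._backend.get(name)
        self._cache[name] = value
        if len(self._cache) > self._max_entries:
            self._cache.popitem(last=False)  # evict least recently used
        return value


real = DictTree({
    "sys-libs/glibc": {"DEPEND": ""},
    "app-editors/vim": {"DEPEND": "sys-libs/ncurses"},
})
tree = CachedTree(real, max_entries=1)
tree.get("app-editors/vim")
tree.get("app-editors/vim")  # second call is served from the cache
print(real.lookups)
```

Code that only calls get() cannot tell whether it was handed real or tree, which is what makes the cache a drop-in, removable module.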
> It should be possible to get portage-ng without caching running as fast
> as portage does now when it has a fully up-to-date cache. Then if we
> need more performance, portage-ng's datastore can be moved to a
> database, or we can add an enhanced caching mode to make it even faster.
>
That would be cool.
> For backwards compatibility with existing ebuilds, yes we will probably
> still need the metadata cache since we'll still have some kind of bash
> linkage. It's important to point out that the design of portage-ng will
> not be tied to ebuilds. Ebuilds will likely become "legacy" build
> scripts that are superseded by something a lot better, cleaner, more powerful
> and also faster for portage-ng.
While I see the value of keeping ebuilds I agree that ebuilds have serious
upward compatibility problems, so we might get a new format. (Also parseable
without bash, and a way to add more complex dependency formats without
breaking old scripts).
Paul
--
Paul de Vrieze
Gentoo Developer
Mail: pauldv@gentoo.org
Homepage: http://www.devrieze.net
[-- Attachment #2: signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-06 19:35 ` Daniel Robbins
2003-12-06 19:41 ` Jon Portnoy
2003-12-07 11:05 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page Paul de Vrieze
@ 2003-12-07 19:59 ` Philippe Lafoucrière
2003-12-07 20:10 ` Philippe Lafoucrière
2003-12-07 20:12 ` Jeff Smelser
2 siblings, 2 replies; 27+ messages in thread
From: Philippe Lafoucrière @ 2003-12-07 19:59 UTC (permalink / raw
To: gentoo-portage-dev
On Saturday 06 December 2003 20:35, Daniel Robbins wrote:
> If by "caching" you mean the metadata cache, this is something I want to
> eliminate in portage-ng. I would like things to be designed to be fast
> from the start, with no slow bash<->python linkage like there is in the
> current portage that makes us require a metadata cache for decent
> performance.
If you use a backend DB using SQLite, you just have 1 file representing your
DB, and so, your portage tree. Then you can send this file through rsync.
Another DB file should be used to manage installed packages. An emerge -UDp
world should just look in the installed DB for packages, and then compare
status in portage db. Maybe an idea...
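A rough sketch of the comparison, using Python's sqlite3 module. The table name pkg, its columns, and the helper names are hypothetical, and in-memory databases stand in for the two files. Versions are compared for inequality only; real version ordering would need more work.

```python
import sqlite3


def open_db(rows):
    """Create a database with one (name, version) row per package."""
    db = sqlite3.connect(":memory:")  # a file path would be used in practice
    db.execute("CREATE TABLE pkg (name TEXT PRIMARY KEY, version TEXT)")
    db.executemany("INSERT INTO pkg VALUES (?, ?)", rows)
    return db


def pending_updates(installed_db, tree_db):
    """List (name, installed, available) where the two databases differ."""
    updates = []
    installed = installed_db.execute(
        "SELECT name, version FROM pkg ORDER BY name").fetchall()
    for name, have in installed:
        row = tree_db.execute(
            "SELECT version FROM pkg WHERE name = ?", (name,)).fetchone()
        if row is not None and row[0] != have:
            updates.append((name, have, row[0]))
    return updates


tree_db = open_db([
    ("app-editors/vim", "6.2-r2"),
    ("net-misc/rsync", "2.5.7"),
    ("sys-libs/glibc", "2.3.2-r9"),
])
installed_db = open_db([
    ("app-editors/vim", "6.2"),
    ("sys-libs/glibc", "2.3.2-r9"),
])
print(pending_updates(installed_db, tree_db))
```

Only the installed packages are looked up in the tree database, so the cost scales with what is installed rather than with the size of the tree.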
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-07 19:59 ` Philippe Lafoucrière
@ 2003-12-07 20:10 ` Philippe Lafoucrière
2003-12-07 20:12 ` Jeff Smelser
1 sibling, 0 replies; 27+ messages in thread
From: Philippe Lafoucrière @ 2003-12-07 20:10 UTC (permalink / raw
To: gentoo-portage-dev
> If you use a backend db using sqlite, you just have 1 file representing
> your DB, and so, your portage tree. Then you can send this file through
> rsync. Another DB file should be used to manage installed package. an
> emerge -UDp world should just look in the installed DB for packages, and
> then compare status in portage db. Maybe an idea...
I'm not sure everybody has understood this :(
I just wonder why the tree has to be built on the client side. The official tree
should be built on Gentoo servers, and should be updated when new ebuilds are
available in the tree.
This will be part of the python/twisted/sqlite prototype I'll start tomorrow
(with Jason Mobarak, who offered to help!).
--
Philippe
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-07 19:59 ` Philippe Lafoucrière
2003-12-07 20:10 ` Philippe Lafoucrière
@ 2003-12-07 20:12 ` Jeff Smelser
2003-12-07 21:01 ` [gentoo-portage-dev] gpg signing of Manifests Douglas Russell
1 sibling, 1 reply; 27+ messages in thread
From: Jeff Smelser @ 2003-12-07 20:12 UTC (permalink / raw
To: gentoo-portage-dev
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Sunday 07 December 2003 01:59 pm, Philippe Lafoucrière wrote:
> If you use a backend db using sqlite, you just have 1 file representing
> your DB, and so, your portage tree. Then you can send this file through
> rsync. Another DB file should be used to manage installed package. an
> emerge -UDp world should just look in the installed DB for packages, and
> then compare status in portage db. Maybe an idea...
Yes, but people like me would have a problem with that. So I would hope there
will be more of a generic routine for DB calls. I don't care if I download a
sqlite db, but I don't want 4 copies either. I would want a central db,
such as MySQL, with one copy of the portage db and another table of
installed packages. The table could be smart enough to know the machine
name, or separate tables wouldn't bother me much either.
Then again, I heard a rumor there was going to be more of a client/server version,
so maybe it can be solved that way?
I only keep one copy of /usr/portage now due to disk space. I refuse to tie
up that much disk space when I can centralize it.
- --
Speak softly and carry a cellular phone.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)
iD8DBQE/04mXld4MRA3gEwYRAnvQAKDpXn7Oo7Gy8hSAKt125mOzTYXyzQCfS1Vu
Z+HDFOD1Gj3LzuDR0tp5djQ=
=pAur
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 27+ messages in thread
* [gentoo-portage-dev] gpg signing of Manifests
2003-12-07 20:12 ` Jeff Smelser
@ 2003-12-07 21:01 ` Douglas Russell
2003-12-07 21:53 ` Douglas Russell
0 siblings, 1 reply; 27+ messages in thread
From: Douglas Russell @ 2003-12-07 21:01 UTC (permalink / raw
To: gentoo-portage-dev; +Cc: gentoo-core
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
ok. basically I'm trying to get a jump on the rest of portage to allow us
(through repoman) to get the tree populated with signed Manifests ready for
when portage is able to use them.
There are several choices available for where the sigs will be, and various
advantages and disadvantages. I'm basically waiting to implement one of these
until a decision is made. It will then be ready in short order and ready to
use as soon as carpaski applies the patch against portage and commits it,
etc.
Choices:
a) Signing inline in current Manifest file.
Advantages
1) Low filestorage overhead in the short and long term
Disadvantages
1) Current versions of portage will be unable to parse these files
2) More difficult to parse and post than a separate signature.
Overall
Basically (a) is an impossibility as it would require everyone to upgrade
portage before introducing signing.
b) Signing inline in a new Manifest.asc file
Advantages
1) Gets around the problem of old/new portage as old portage will continue to
use the Manifest files and new portage will use the new signed Manifest.asc
files as soon as that "new" portage exists. The old Manifests can be phased
out after a time.
2) Increase in number of files in portage tree is only in the short term
Disadvantages
1) Increase in number of files in portage tree in the short term.
2) More difficult to parse and post than a separate signature.
Overall
Possible, can be implemented now, best implementation from a portage tree size
point of view.
c) Detached Signing in a Manifest.asc file
Advantages
1) Gets around the problem of old/new portage as old portage will continue to
use the Manifest files and new portage will use the new signed Manifest.asc
in conjunction with the old Manifest files as soon as that portage exists.
2) Easy to parse and post, especially for uses such as grabbing the sigs for
posting on packages.gentoo.org
Disadvantages
1) Increase in number of files in portage tree in short and long term
Overall
Possible, can be implemented now, best implementation from a usability point
of view
____________________________
Swift responses would be appreciated as I want to get this into repoman as
soon as possible so that at the very least, wary users can manually check
their Manifests signatures if they are worried. This will also enable the
rest of portage to use the signatures as soon as it is ready to use them.
Apologies for cross-posting this to -core but I thought everyone should be
aware of this issue seeing as it has been brought to all our attentions of
late. Please continue the discussion on gentoo-portage-dev@gentoo.org list.
Puggy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)
iD8DBQE/05UTXYnvgFdTojMRAggGAKCY65KRWeYmTABNbkuUwXOIkcGgqACbBQ/K
8WIcisb+VwYmyEMEQrQts0o=
=cbed
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] gpg signing of Manifests
2003-12-07 21:01 ` [gentoo-portage-dev] gpg signing of Manifests Douglas Russell
@ 2003-12-07 21:53 ` Douglas Russell
0 siblings, 0 replies; 27+ messages in thread
From: Douglas Russell @ 2003-12-07 21:53 UTC (permalink / raw
To: gentoo-portage-dev, Douglas Russell
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ok, it has been brought to my attention that, conveniently, the parsing of the
current Manifest file only looks at lines starting with MD5, so option (a) is
indeed possible after all. It basically replaces option (b) but without the
problem of increasing the number of files in portage in the short term.
This now looks like the easiest solution to implement, but there is still the
ease-of-parsing argument for the separate signatures.
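A small sketch of why this works. It is not the actual portage parser, and it assumes Manifest entries of the form "MD5 <digest> <filename> <size>"; the digests and signature body below are placeholders. A reader that keeps only MD5 lines gets the same result from a plain Manifest and from a clearsigned one, because the OpenPGP armor lines never start with MD5 and clearsigning only dash-escapes lines that begin with "-":

```python
def parse_manifest(text):
    """Return {filename: (digest, size)}, ignoring every non-MD5 line."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0] == "MD5":
            _, digest, filename, size = fields
            entries[filename] = (digest, int(size))
    return entries


PLAIN = """MD5 0123456789abcdef0123456789abcdef ChangeLog 1045
MD5 fedcba9876543210fedcba9876543210 foo-1.0.ebuild 612
"""

# Same entries wrapped the way "gpg --clearsign" wraps a file.
SIGNED = (
    "-----BEGIN PGP SIGNED MESSAGE-----\n"
    "Hash: SHA1\n"
    "\n"
    + PLAIN +
    "-----BEGIN PGP SIGNATURE-----\n"
    "Version: GnuPG v1.2.3 (GNU/Linux)\n"
    "\n"
    "PLACEHOLDERSIGNATUREDATA\n"
    "-----END PGP SIGNATURE-----\n"
)

print(parse_manifest(SIGNED) == parse_manifest(PLAIN))
```

Note that this only shows the old parser is not confused by the signature; actually verifying the signature is a separate step.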
Puggy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)
iD8DBQE/06FPXYnvgFdTojMRAqZXAJ9WZtxtUjSTB8GF19SAmHX/G2UeEQCfYXSY
64boL8x1e5cZCc9GtuSaHgk=
=mynT
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-06 14:26 ` Paul de Vrieze
2003-12-06 19:35 ` Daniel Robbins
@ 2003-12-06 23:00 ` George Shapovalov
2003-12-07 11:18 ` Paul de Vrieze
1 sibling, 1 reply; 27+ messages in thread
From: George Shapovalov @ 2003-12-06 23:00 UTC (permalink / raw
To: gentoo-portage-dev
I have been thinking a bit more about dfs vs bfs traversals, so here goes :).
On Saturday 06 December 2003 06:26, Paul de Vrieze wrote:
>
> Prolog interpreters work as depth-first searchers. Breadth-first is in most
Yea, and unfortunately there isn't much choice about it as recursion is pretty
much the only composition method the standard talks about.
> cases a waste of both memory and clock cycles. In the case of portage this
It's really only O(n) where small n stands for associated dependencies, so
that's not that much. Besides it does not require recursion to be
implemented, so in reality that hit quite often is not even there.
And then it gives:
> would mean something like first getting deep dependencies like glibc, and
> then less important ones.
And isn't this the "right" way to do it? Surely operating on a *complete*
portage tree, that is the one where all the dependencies are explicitly
stated, that wouldn't make any difference. But in reality glibc and similar
dependencies, being a part of the system, are omitted in pretty much all
ebuilds. Thus, with DFS traversal (which is what portage does now) one is
getting glibc somewhere in the middle of world update quite often, which is
not optimal if you want complete coherency (and it can be plain bad in the
case of the major update. This is somewhat alleviated in our case for system
packages by a mandatory bootstrap, but some stuff on the periphery can be hit, if
multiple things use the same library and it is affected for example (and not
explicitly included, say because it is considered "basic")).
> > I somewhat agree with that. However my observation is that these packages
> > most often are libraries or other helpers that user is usually not
> > interested in directly. And the long operations are the ones where you
> > have to do searches or operations involving world. So I have been
> > thinking along the lines of making the longest operations fast and
> > bounded, getting the O(1)
> > responsivity.
> > But in reality I do not think this should be "the ultimate solution" or
> > even that there should be one. See the next one.
>
> I don't believe that the average size of world is going to grow
> significantly.
No, but quite often it includes "major" packages and that means traversal of a
significant portion of the portage tree, though the leaves representing uncommon
packages are not touched, that I agree on.
Thus the more efficient approach (and I had a few thoughts on this, although
without much detail, so didn't document it :)) would be to operate on a
memory-mapped cache and on startup suck into memory only the entries having
weight above a certain threshold. Weights should probably be proportional to
the number of dependencies *coming into* the corresponding package. Thus
commonly used stuff gets read and the traversals are instantaneous, but 99%
of the tree just sits on disk waiting to be called for...
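As a rough illustration of the weighting (in Python rather than Ada, with made-up package data and hypothetical function names), the weight of a package is simply the count of packages that depend on it, and the set to preload is everything at or above a threshold:

```python
from collections import Counter


def incoming_weights(deps):
    """Count, for each package, how many packages list it as a dependency."""
    weights = Counter({pkg: 0 for pkg in deps})
    for needed in deps.values():
        for dep in needed:
            weights[dep] += 1
    return weights


def hot_set(deps, threshold):
    """Packages worth loading at startup: weight at or above the threshold."""
    weights = incoming_weights(deps)
    return sorted(pkg for pkg, w in weights.items() if w >= threshold)


DEPS = {
    "app-editors/vim": ["sys-libs/ncurses", "sys-libs/glibc"],
    "app-shells/bash": ["sys-libs/ncurses", "sys-libs/glibc"],
    "sys-libs/ncurses": ["sys-libs/glibc"],
    "sys-libs/glibc": [],
}

print(hot_set(DEPS, threshold=2))
```

Leaf packages end up with weight 0 and stay on disk until something asks for them.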
> > And finally:
> > > My biggest concern when reading your small paper is that you chose a
> > > deterministic approach to this problem, while in fact the problem is
> > > non-determinisitic.
> >
> > Could you please elaborate? I am afraid we are thinking about slightly
> > different things here.
>
> I didn't write that, guess you'll need to ask the person who did.
Um, sorry, I somehow got confused and thought some postings on -dev were from
you as well. I just sent relevant parts to proper place :).
George
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page
2003-12-06 23:00 ` [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page George Shapovalov
@ 2003-12-07 11:18 ` Paul de Vrieze
0 siblings, 0 replies; 27+ messages in thread
From: Paul de Vrieze @ 2003-12-07 11:18 UTC (permalink / raw
To: gentoo-portage-dev
[-- Attachment #1: signed data --]
[-- Type: text/plain, Size: 3220 bytes --]
On Sunday 07 December 2003 00:00, George Shapovalov wrote:
> I have been thinking a bit more about dfs vs bfs traversals, so here goes
> :).
>
> On Saturday 06 December 2003 06:26, Paul de Vrieze wrote:
> > Prolog interpreters work as depth-first searchers. Breadth-first is in
> > most
>
> Yea, and unfortunately there isn't much choice about it as recursion is
> pretty much the only composition method the standard talks about.
It is pretty trivial to get whatever behaviour you want in prolog for your
problem. The program does not need to follow the way prolog parses your
language statements.
>
> > cases a waste of both memory and clock cycles. In the case of portage
> > this
>
> It's really only O(n) where small n stands for associated dependencies, so
> that's not that much. Besides it does not require recursion to be
> implemented, so in reality that hit quite often is not even there.
>
I don't think there is a significant difference between breadth-first and
depth-first here (breadth-first has the same problems with changing deps as
depth-first), as the whole tree needs to be created/walked anyway. (I first
thought you were thinking of the tree of all available ebuilds.) If you use
Prolog, depth-first is probably easier.
> And isn't this the "right" way to do it? Surely operating on a *complete*
> portage tree, that is the one where all the dependencies are explicitly
> stated, that wouldn't make any difference. But in reality glibc and similar
> dependencies, being a part of the system, are omitted in pretty much all
> ebuilds. Thus, with DFS traversal (which is what portage does now) one is
> getting glibc somewhere in the middle of world update quite often, which is
> not optimal if you want complete coherency (and it can be plain bad in the
> case of the major update. This is somewhat alleviated in our case for
> system packages by a mandatory bootstrap, but some stuff on the periphery can be
> hit, if multiple things use the same library and it is affected for example
> (and not explicitly included, say because it is considered "basic")).
One can easily create a virtual glibc dependency for all ebuilds. Then, as all
ebuilds depend on it, it would come first in the flattened vector result.
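A minimal sketch of the idea, using a depth-first post-order walk in Python. The flatten function, its implicit parameter, and the package data are made up for illustration, and cycles are not handled:

```python
def flatten(deps, targets, implicit=()):
    """Return a merge order in which every package follows its dependencies.

    Packages named in `implicit` are treated as dependencies of every
    other package, even when the dependency data does not list them.
    """
    order, seen = [], set()

    def visit(pkg):
        if pkg in seen:
            return
        seen.add(pkg)
        needed = list(deps.get(pkg, ()))
        if pkg not in implicit:
            needed = list(implicit) + needed  # the virtual dependency
        for dep in needed:
            visit(dep)
        order.append(pkg)  # post-order: dependencies were appended first

    for target in targets:
        visit(target)
    return order


DEPS = {
    "app-editors/vim": ["sys-libs/ncurses"],
    "sys-libs/ncurses": [],
    "sys-libs/glibc": [],
}

print(flatten(DEPS, ["app-editors/vim"]))
print(flatten(DEPS, ["app-editors/vim"], implicit=["sys-libs/glibc"]))
```

Without the implicit dependency glibc never shows up in the result, because no ebuild lists it; with it, glibc is the first entry.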
> No, but quite often it includes "major" packages and that means traversal
> of a significant portion of the portage.., but the leafs representing
> uncommon packages are not touched, that I agree on.
> Thus the more efficient approach (and I had a few thoughts on this,
> although without much detail, so didn't document it :)) would be to operate
> on a memory-mapped cache and on startup suck into memory only the entries
> having weight above certain threshold. Weights should probably be
> proportional to the numbers of dependencies *coming into* the
> corresponding package. Thus commonly used stuff gets read and the
> traversals are instantaneous, but 99% of the tree just sits on disk
> waiting to be called for...
That would be an option for one cache module. However with modules it would
also be easy to not use caching at all ;-)
Paul
--
Paul de Vrieze
Gentoo Developer
Mail: pauldv@gentoo.org
Homepage: http://www.devrieze.net
[-- Attachment #2: signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 27+ messages in thread
* [gentoo-portage-dev] portage-ng design competition -- not yet
2003-12-05 9:58 [gentoo-portage-dev] portage-ng concurse entry Was: Updated Portage project page George Shapovalov
2003-12-05 12:26 ` Paul de Vrieze
@ 2003-12-05 16:54 ` Daniel Robbins
2003-12-05 20:35 ` George Shapovalov
2003-12-05 21:59 ` [gentoo-portage-dev] portage-ng wish list Sandy McArthur
1 sibling, 2 replies; 27+ messages in thread
From: Daniel Robbins @ 2003-12-05 16:54 UTC (permalink / raw
To: George Shapovalov; +Cc: gentoo-portage-dev, gentoo-dev, dholm
[-- Attachment #1: Type: text/plain, Size: 2470 bytes --]
On Fri, 2003-12-05 at 02:58, George Shapovalov wrote:
> On Wednesday 03 December 2003 15:08, Daniel Robbins wrote:
> > I haven't looked at twisted, but a good solution suggested by nerdboy is
> > to have a design competition once we have the requirements finalized.
>
> So, we are going to do it according to "accepted practices" :).
> Seriously, I am glad to see it! And here is my entry ;).
Everyone, please note above that I said "have a design competition *once
we have the requirements finalized*." This hasn't happened yet. Please
focus on the capabilities you want in Portage first. Tell us about
these. These need to be documented first. Any design competition will
not begin until it is officially announced and until we have a set of
requirements for any submitted design to be judged by.
We need to clearly determine what we are shooting for before we choose a
language to get us there.
That being said, ADA is something I'd be comfortable with if the
proposed implementation can meet our requirements, and will be seriously
considered, and we will post george's proposal for ADA to the portage-ng
pages in the proper time (again, when he has the opportunity to see a
complete set of requirements for submissions, and explain how his
implementation would meet those requirements.)
But please, we have not decided on prolog, there is no need to
pre-emptively bash it (no pun intended.) Focus on submitting requests
for what you want portage-ng to be able to *do*, not what language you
think it should be coded in.
Look at it this way (this is something nerdboy explained to me) -- if we
have a solid set of requirements, we could have those requirements
implemented in *any* language, and as long as our requirements are met
fully, we would be happy with the outcome. That is what we want our
requirements to do for us, and why they are so important. I don't want
to add any "rigged" requirements that are designed to steer us towards
prolog, C, C++, python, ADA or anything else. Let's just document
clearly what we need and what we expect portage-ng to be able to do, and
the rest will be sorted out later.
If portability is important, put that in the requirements. Performance?
Put it in the requirements. etc etc. Choice of a particular language
over another will not guarantee that the resultant software will meet
our needs. Having something documented in the requirements will.
Regards,
Daniel
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [gentoo-portage-dev] portage-ng design competition -- not yet
2003-12-05 16:54 ` [gentoo-portage-dev] portage-ng design competition -- not yet Daniel Robbins
@ 2003-12-05 20:35 ` George Shapovalov
2003-12-05 21:59 ` [gentoo-portage-dev] portage-ng wish list Sandy McArthur
1 sibling, 0 replies; 27+ messages in thread
From: George Shapovalov @ 2003-12-05 20:35 UTC (permalink / raw
To: gentoo-portage-dev, gentoo-dev; +Cc: Daniel Robbins
Hi everybody.
On Friday 05 December 2003 08:54, Daniel Robbins wrote:
> Everyone, please note above that I said "have a design competition *once
> we have the requirements finalized*." This hasn't happened yet. Please
Yes, and I apologise about that.
However, as there seems to be a bit of confusion, I want to emphasize again that
this is really not a design contest entry - I would be too ashamed to
propose the few ideas I listed in the design *of the illustrative
prototype* as a portage-ng design :).
The purpose of this entry was to illustrate the capabilities of the Ada language.
The intention is to put up some information, along with code that illustrates
what can be expected when you use it, for people to look at and form an
opinion. I am not trying to push towards Ada-only, as it was taken on one or
two occasions :). However, knowing the state of affairs with the languages and
having programmed in a few of them, I felt it would be helpful to have an
illustration up of the better practices I have encountered.
Probably because of too much time spent on writing my paper (the unrelated
biophys one), I chose a title that was a bit too catchy, for which I
apologise again.
Since a few good points were brought up that might be relevant to the
requirements as well, I feel obliged to answer some of the posters (in
particular pvdabeel) and dispel some misconceptions (spider's are a good
example :)).
However, starting a language flame war is definitely not something I would like
to do. As such, I would prefer to keep the design discussions on the
gentoo-portage-dev list. I will also answer any Ada-related questions you may
have off-list, except maybe for one or two postings on -dev, to which I'll
reply on the list.
George
--
gentoo-portage-dev@gentoo.org mailing list
* [gentoo-portage-dev] portage-ng wish list
2003-12-05 16:54 ` [gentoo-portage-dev] portage-ng design competition -- not yet Daniel Robbins
2003-12-05 20:35 ` George Shapovalov
@ 2003-12-05 21:59 ` Sandy McArthur
1 sibling, 0 replies; 27+ messages in thread
From: Sandy McArthur @ 2003-12-05 21:59 UTC (permalink / raw
To: gentoo-portage-dev
Daniel Robbins wrote:
> focus on the capabilities you want in Portage first. Tell us about
> these. These need to be documented first.
Well, I think I'd like to see implemented:
1. Better mirror detection. Currently portage uses hard-coded data with a
touch of randomness, either from DNS RR or via picking one choice from a
list. Little is done to pick the best mirror for the user.
I think portage should query for a current list of mirrors using DNS
Service Discovery (DNS-SD) and perform some simple connectivity tests
to find the "best" mirror. After portage has discovered the "best"
mirror, it constructs a URL as needed, be it for rsync or wget. A reasonable
caching method for the connectivity tests would be good too.
DNS-SD is currently an Internet standard draft which is used by Zeroconf,
but it doesn't depend on Zeroconf. It is well thought out and can
support things like path components, which would be needed for a distfile
mirror since that is more than a hostname and a fixed path.
More info at: http://www.dns-sd.org/ and
http://files.dns-sd.org/draft-cheshire-dnsext-dns-sd.txt
2. Moving the default location for distfiles from /usr/portage/distfiles
to /var/cache/distfiles. Write access to /usr should only be needed
when installing a new package. I should also be able to export an NFS
mount of a portage tree read-only and share it across a cluster.
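
To make the connectivity test and caching idea from item 1 a bit more
concrete, here is a minimal Python sketch. It assumes the list of candidate
mirrors has already been obtained somehow (for example via DNS-SD, which is
not shown). The names `MirrorPicker` and `tcp_connect_time`, the one hour
cache lifetime, and the use of TCP connect time as the measure of "best" are
all assumptions made for illustration, not anything portage does today.

```python
import socket
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

Mirror = Tuple[str, int]  # (hostname, port)


def tcp_connect_time(mirror: Mirror, timeout: float = 2.0) -> Optional[float]:
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection(mirror, timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


class MirrorPicker:
    """Pick the fastest-responding mirror, caching probe results for `ttl` seconds.

    Hypothetical helper: `probe` and `clock` are injectable so the selection
    logic can be exercised without touching the network.
    """

    def __init__(
        self,
        probe: Callable[[Mirror], Optional[float]] = tcp_connect_time,
        ttl: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._ttl = ttl
        self._clock = clock
        # mirror -> (time of measurement, latency or None if unreachable)
        self._cache: Dict[Mirror, Tuple[float, Optional[float]]] = {}

    def latency(self, mirror: Mirror) -> Optional[float]:
        """Return a cached latency if it is still fresh, otherwise probe again."""
        now = self._clock()
        cached = self._cache.get(mirror)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        result = self._probe(mirror)
        self._cache[mirror] = (now, result)
        return result

    def best(self, mirrors: Iterable[Mirror]) -> Optional[Mirror]:
        """Return the reachable mirror with the lowest latency, or None."""
        reachable = []
        for mirror in mirrors:
            measured = self.latency(mirror)
            if measured is not None:
                reachable.append((measured, mirror))
        if not reachable:
            return None
        return min(reachable, key=lambda pair: pair[0])[1]
```

With real mirrors one would call something like
`MirrorPicker().best([("rsync.example.org", 873)])`, where the hostname is a
placeholder and 873 is the standard rsync port. Connect time is only a rough
proxy for throughput, so a real implementation might want a small timed
download instead.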
------------
Below here are wild ideas that probably need more baking but I'll throw
them out there anyway.
3. Make it so the portage tree is sparse until part of it is requested.
There are thousands of packages in the portage tree and I use maybe a
few hundred. Why can't I enable an online mode that only rsyncs the
herds and empty package dirs until I try to `emerge herd/foo`? Then it
rsyncs the package, fetching the ebuilds and related files as they are
needed. While this would make many times more connections to rsync
mirrors, the amount of bandwidth used would be a lot less. If I'm a KDE
user and don't touch Gnome, then when changes happen to Gnome packages I
don't have to use bandwidth to sync them, as I don't care about each
little update to packages I haven't installed.
4. Support Zero Install as a portage mirroring system. Zero Install
seems to be httpfs but doesn't require anything special about the web
server, so we could take advantage of existing web servers as
mirrors. More info at: http://zero-install.sourceforge.net/
5. Do something to speed up the long 'updating portage cache' step in an
`emerge sync`.
6. Support BitTorrent distfile distribution.
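
The sparse tree idea in item 3 could look roughly like the Python sketch
below. It assumes the local tree already contains the empty package
directories and fills one in the first time it is requested. The class
`SparseTree` and the `fetch_package` callback are hypothetical names used
only for illustration; in practice the callback might run rsync against a
single package directory on a mirror.

```python
from pathlib import Path
from typing import Callable


class SparseTree:
    """A local tree whose package directories stay empty until first use."""

    def __init__(self, root: Path, fetch_package: Callable[[str, Path], None]) -> None:
        self.root = root
        # Hypothetical callback: fetch_package(atom, destination_dir) must
        # download the ebuilds and related files for `atom` into the directory.
        self._fetch = fetch_package

    def ensure(self, atom: str) -> Path:
        """Return the directory for `category/name`, fetching it if still empty."""
        category, sep, name = atom.partition("/")
        if not sep or not category or not name:
            raise ValueError(f"expected 'category/name', got {atom!r}")
        pkg_dir = self.root / category / name
        pkg_dir.mkdir(parents=True, exist_ok=True)
        # An empty directory means the package has not been synced yet.
        if not any(pkg_dir.iterdir()):
            self._fetch(atom, pkg_dir)
        return pkg_dir
```

This only covers the first fetch. Keeping already fetched packages up to
date, and resolving dependencies that point at packages not fetched yet,
would need more thought.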
Sandy McArthur
--
gentoo-portage-dev@gentoo.org mailing list