public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] Categories
From: Sebastian Werner @ 2003-06-03 18:01 UTC
  To: gentoo-dev

Hi!

To continue the story of music-compose a bit...

There are some other categories I want to split. For example: net-mail,
net-www, ...

net-mail ->
   - net-mail-server (exim, sendmail, ...)
   - net-mail-clients (sylpheed, evolution, ...)

net-www ->
   - net-www-server (apache, cherokee, ...)
   - net-www-plugins (netscape-flash, netscape-plugger, ...)
   - net-www-modules (mod_dav, mod_gzip, ...)
   - net-www-clients (epiphany, mozilla, opera, ...)

Another, more advanced solution, I think, would be to handle categories
the way epiphany does: "Add one or more categories to an application
and find it in all of them". We would have to move all apps out of their
categories and put them flat in the main directory, or, possibly better
in this case, in one newly created dir.

Comments?

Sebastian






-- 
Sebastian Werner
Karlsruhe

http://www.sebastian-werner.net


--
gentoo-dev@gentoo.org mailing list



* Re: [gentoo-dev] Categories
From: George Shapovalov @ 2003-06-04  0:32 UTC
  To: gentoo-dev

On Tuesday 03 June 2003 11:01, Sebastian Werner wrote:
> net-www ->
>    - net-www-server (apache, cherokee, ...)
>    - net-www-plugins (netscape-flash, netscape-plugger, ...)
>    - net-www-modules (mod_dav, mod_gzip, ...)
>    - net-www-clients (epiphany, mozilla, opera, ...)
So, are we finally getting serious about more than two levels of 
categorisation?

I have been following the related topics with interest lately and will try
to summarise what has been said so far. If I am missing something, please
do comment on it.

Basically, most of the related proposals came down to treating every
categorisation level as a dir in order to enhance browsability of the tree.
Thus we would get only a handful of top-level categories containing
subcategories, which may in turn contain more subcategories. In this case it
would be /net/www/{server,plugins...}.
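
A minimal sketch of that mapping, for illustration only. It assumes that
every hyphen in a category name separates two levels, which the thread does
not state and which may not hold for every existing category; the function
name `nested_category_path` is made up for this example.

```python
from pathlib import PurePosixPath


def nested_category_path(category: str) -> PurePosixPath:
    """Turn a hyphenated category such as 'net-www-server' into 'net/www/server'.

    Assumption: each hyphen marks one categorisation level.
    """
    parts = [part for part in category.split("-") if part]
    if not parts:
        raise ValueError("empty category name")
    return PurePosixPath(*parts)


# Categories taken from the proposal at the top of the thread.
for category in ("net-www-server", "net-www-plugins", "net-mail-clients"):
    print(category, "->", nested_category_path(category))
```

The reverse direction (joining the path components with hyphens) would be
just as simple, so tools could accept either spelling.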

Alternatively, there was a question of whether we might consider abandoning
categorisation altogether and going with a flat namespace, relying only on
search capabilities. More on searches later.
The main objection to this was that categories are actually used and
appreciated (by me included), since some people do browse the tree on
occasion. It's like library access: sure, you can search for pretty much
everything, but sometimes nothing beats going down to the shelves and
seeing what other books are near the one you were able to find.

Other complications with such an approach include a high probability of name
clashes in a flat namespace (which we already experience) and the fact that
it indirectly contradicts the policy, which requires categories to be
listed in [R|P]DEPEND statements.
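
To get a feel for how often a flat namespace would clash, one could scan the
existing category/name pairs for names that occur in more than one category.
This is only a sketch: the package list below is invented for the example
(the `foo` entries are hypothetical), and `find_name_clashes` is not an
existing portage function.

```python
from collections import defaultdict


def find_name_clashes(atoms):
    """Return {name: sorted categories} for names found in more than one category."""
    categories_by_name = defaultdict(set)
    for atom in atoms:
        category, _, name = atom.partition("/")
        categories_by_name[name].add(category)
    return {
        name: sorted(categories)
        for name, categories in categories_by_name.items()
        if len(categories) > 1
    }


# Hypothetical tree contents; only the category/name shape matters here.
tree = ["net-www/apache", "net-mail/exim", "app-misc/foo", "dev-util/foo"]
print(find_name_clashes(tree))
```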

One downside of a purely tree-like structure that was mentioned is the
difficulty of unambiguously categorising certain packages. Calls for
enhancement were made, such as the one below:
> Another, more advanced solution, I think, would be to handle categories
> the way epiphany does: "Add one or more categories to an application
> and find it in all of them". We would have to move all apps out of their
> categories and put them flat in the main directory, or, possibly better
> in this case, in one newly created dir.

I should probably leave it up to Nick to comment on implementation details;
however, my understanding is that it is really beneficial to have a tree-like
structure for at least the internal data representation, as it makes it easy
to traverse and search by a particular relation. If this internal structure
is also useful for browsing, so much the better.

However, the question of multiple logical placement still stands. It has been
addressed in general by proposals to add some search keywords to the ebuilds
or the package directory in some way. KEYWORDS is already used, albeit for a
different purpose. IIRC (and I am not really sure here), such use was
originally considered for it as well, but I think it was decided to narrow
the scope in order to avoid overloading it. It may be possible, though, to
introduce some other var, say SEARCHKEYS, and to add indexing by these keys
to portage (naturally caching the index tables in addition to what is already
being cached). This should allow the portage tree to be treated like a "real"
database and should cover the requirement of multiple logical adherence.
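
A rough sketch of what such indexing could look like. SEARCHKEYS is only the
variable name floated above, not something portage supports; the ebuild
snippets, the key values and the helper names are all invented for this
example, and portage's real cache format is not modelled.

```python
import re
from collections import defaultdict

# Assumption: SEARCHKEYS would be a double-quoted, space-separated list,
# written on one line in the same style as other ebuild variables.
SEARCHKEYS_RE = re.compile(r'^SEARCHKEYS="([^"]*)"', re.MULTILINE)


def extract_searchkeys(ebuild_text):
    """Return the list of search keys declared in one ebuild (possibly empty)."""
    match = SEARCHKEYS_RE.search(ebuild_text)
    return match.group(1).split() if match else []


def build_index(ebuilds):
    """Map each lower-cased key to the sorted packages that declare it.

    `ebuilds` maps 'category/name' to the text of one ebuild.
    """
    index = defaultdict(set)
    for package, text in ebuilds.items():
        for key in extract_searchkeys(text):
            index[key.lower()].add(package)
    return {key: sorted(packages) for key, packages in index.items()}


# Made-up ebuild fragments for illustration.
ebuilds = {
    "net-www/epiphany": 'DESCRIPTION="browser"\nSEARCHKEYS="browser gnome web"\n',
    "net-www/apache": 'SEARCHKEYS="web server http"\n',
    "net-mail/exim": 'DESCRIPTION="MTA"\n',
}
index = build_index(ebuilds)
print(index["web"])
```

The resulting dictionary is plain data, so caching it next to the existing
cache would amount to serialising it in whatever format the cache uses.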

One remark is possible here: the ebuilds themselves are probably not the
ideal place for SEARCHKEYS, as this information will for the most part be
duplicated among all the different versions. It would make sense to store
this information in some place "central" to the package, and I seem to
remember this being mentioned as one of the proposed provisions of the
"enhanced" ChangeLogs discussed quite some time back.

Ok, this seems to be pretty much it, at least from what I remember being 
mentioned on this topic. Again, if anybody thinks I omitted something,
please stand up and mention it :).

George




* Re: [gentoo-dev] Categories
From: Rolf Veen @ 2003-06-04 14:23 UTC
  To: gentoo-dev

George Shapovalov wrote:
> Ok, this seems to be pretty much it, at least from what I remember
> being mentioned on this topic. Again, if anybody thinks I omitted
> something, please stand up and mention it :).

Namespace orthogonal to categories.

Categories change as packages are added; if a category has more
than N (let's say 50, for example) entries, it loses its usefulness.
While browsing for packages it is natural to have some level of depth;
two or three levels of categories should be OK. Also, a package can fit
into more than one category. Let categories be a graph: a symlink
hierarchy, for example.

But since categories are variable and somewhat arbitrary, don't let
the basic system, the core algorithms, depend on them. So take a flat
namespace for packages, resolving name conflicts in the download (URL
to local dir) phase and adding the necessary information to the ebuild.

Concluding, have a flat namespace for machine interaction, and an
arbitrarily complicated category graph on top of that for user
interaction.
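
As an illustration only, the two layers could look roughly like this: a flat
set of unique package names, plus a category graph that merely points into
it. The category names, the placement of packages and the function
`packages_under` are assumptions made for this sketch.

```python
# Flat namespace used by the tools: every package name is unique by itself.
packages = {"epiphany", "mozilla", "apache", "sylpheed"}

# Category graph used for browsing. A category may have several parents
# ("www-clients" sits under both "www" and "gnome"), and a package may be
# listed in several categories ("mozilla").
categories = {
    "net": {"subcategories": ["www", "mail"], "packages": []},
    "www": {"subcategories": ["www-clients"], "packages": ["apache"]},
    "mail": {"subcategories": [], "packages": ["sylpheed", "mozilla"]},
    "www-clients": {"subcategories": [], "packages": ["epiphany", "mozilla"]},
    "gnome": {"subcategories": ["www-clients"], "packages": []},
}


def packages_under(category, graph, seen=None):
    """Collect every package reachable from `category`, visiting each node once."""
    seen = set() if seen is None else seen
    if category in seen:
        return set()  # already visited: guards against cycles and repeats
    seen.add(category)
    node = graph[category]
    found = set(node["packages"])
    for child in node["subcategories"]:
        found |= packages_under(child, graph, seen)
    return found


print(sorted(packages_under("net", categories)))
print(sorted(packages_under("gnome", categories)))
```

Rearranging the graph never renames a package, which is the point of keeping
the two layers separate.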

Well, it's an opinion.
Rolf.



* Re: [gentoo-dev] Categories
From: Paul de Vrieze @ 2003-06-06 16:35 UTC
  To: gentoo-dev


On Wednesday 04 June 2003 16:23, Rolf Veen wrote:
> George Shapovalov wrote:
> > Ok, this seems to be pretty much it, at least from what I remember
> > being mentioned on this topic. Again, if anybody thinks I omitted
> > something, please stand up and mention it :).
>
> Namespace orthogonal to categories.
>
> Categories change as packages are added; if a category has more
> than N (let's say 50, for example) entries, it loses its usefulness.
> While browsing for packages it is natural to have some level of depth;
> two or three levels of categories should be OK. Also, a package can fit
> into more than one category. Let categories be a graph: a symlink
> hierarchy, for example.

Unfortunately CVS does not work well with symlinks, so this is not really an 
option. Also there is an advantage in being able to have one unique name for 
a package.

>
> But since categories are variable and somewhat arbitrary, don't let
> the basic system, the core algorithms, depend on them. So take a flat
> namespace for packages, resolving name conflicts in the download (URL
> to local dir) phase and adding the necessary information to the ebuild.

We need unique names. For me category/name is a good way, and certainly
better than UIDs (like Microsoft uses), as those are impossible to remember
and easy to get wrong. Also, we have a central repository, so we don't need
to worry about clashes that much.

>
> Concluding, have a flat namespace for machine interaction, and an
> arbitrarily complicated category graph on top of that for user
> interaction.

Flat namespaces are actually slower in machine interaction. There are already
very many packages in portage. Thousands of entries in a directory are NOT
fun to look at, nor for a computer to search (albeit doable).

My suggestion would be an alias list similar to the virtuals list, but one
that is not allowed inside ebuilds. That way packages can still be
presented in multiple categories, while the aliases do not interfere with
the inner workings of portage.
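
A small sketch of how such an alias list might behave. The alias entries, the
table layout and both function names are assumptions for illustration; the
format of the real virtuals list is not modelled here.

```python
# Hypothetical alias table: alias "category/name" -> canonical "category/name".
ALIASES = {
    "net-mail/mozilla": "net-www/mozilla",
    "gnome-extra/epiphany": "net-www/epiphany",
}


def resolve_for_browsing(name):
    """Map an alias to its canonical package; canonical names pass through."""
    return ALIASES.get(name, name)


def check_dependencies(depend):
    """Reject dependency atoms that use an alias, so ebuilds only see real names."""
    offending = [atom for atom in depend if atom in ALIASES]
    if offending:
        raise ValueError(f"aliases are not allowed in ebuilds: {offending}")
    return list(depend)


print(resolve_for_browsing("net-mail/mozilla"))
print(check_dependencies(["net-www/mozilla", "net-www/apache"]))
```

Browsing and search front ends would call `resolve_for_browsing`, while
dependency handling would only ever see canonical category/name pairs.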

Paul

-- 
Paul de Vrieze
Researcher
Mail: pauldv@cs.kun.nl
Homepage: http://www.devrieze.net


* Re: [gentoo-dev] Categories
From: Rolf Veen @ 2003-06-09  7:30 UTC
  To: gentoo-dev

Paul de Vrieze wrote:

> Unfortunately CVS does not work well with symlinks, so this is not
> really an option. 

That can be avoided. Let a script and a descriptor combination
reconstruct the whole hierarchy, i.e., "ebuild" the categories.
Starting from a descriptor in XML (for example), you could build
a symlink hierarchy, or you could choose other backends too, such
as a database (in a distant future).
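
A sketch of the script-plus-descriptor idea, under stated assumptions: the
XML layout (`<categories>`, `<category path=...>`, `<package>`) is invented
for this example, as are the directory names and the function
`build_symlink_tree`. It needs a platform where creating symlinks is
permitted.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical descriptor: only this file would live in CVS, not the symlinks.
DESCRIPTOR = """
<categories>
  <category path="net/www/clients">
    <package>epiphany</package>
    <package>mozilla</package>
  </category>
  <category path="net/mail/clients">
    <package>mozilla</package>
  </category>
</categories>
"""


def build_symlink_tree(descriptor_xml, flat_dir, browse_dir):
    """Create browse_dir/<category path>/<package> symlinks into flat_dir."""
    root = ET.fromstring(descriptor_xml.strip())
    created = []
    for category in root.findall("category"):
        category_dir = os.path.join(browse_dir, *category.get("path").split("/"))
        os.makedirs(category_dir, exist_ok=True)
        for package in category.findall("package"):
            name = package.text.strip()
            link = os.path.join(category_dir, name)
            if not os.path.lexists(link):
                os.symlink(os.path.join(flat_dir, name), link)
            created.append(link)
    return created


# Demo in a throwaway directory: a flat package dir plus a generated view.
with tempfile.TemporaryDirectory() as tmp:
    flat = os.path.join(tmp, "packages")
    browse = os.path.join(tmp, "browse")
    for name in ("epiphany", "mozilla"):
        os.makedirs(os.path.join(flat, name))
    for link in build_symlink_tree(DESCRIPTOR, flat, browse):
        print(os.path.relpath(link, browse))
```

Because the hierarchy is regenerated from the descriptor, another backend
could consume the same file without touching the flat package directory.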

Or you could even include categories in each ebuild descriptor. Each
package says to which categories it belongs.

> Flat namespaces are actually slower in machine interaction. There are
> already very many packages in portage. Thousands of entries in a
> directory are NOT fun to look at, nor for a computer to search
> (albeit doable).

Examples of what I'm proposing are Sourceforge and Freshmeat. Both
use flat namespaces, with a search engine and a complex category
structure on top. And they manage a lot of entries! Sourceforge
solves the directory problem with paths of the form /g/ge/gentoo;
that can be handled transparently by the tools.
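
The /g/ge/gentoo layout can be derived from the name alone, which is what
lets tools hide it. A minimal sketch, assuming the shard directories are the
lower-cased first one and first two characters of the name; the function
name `sharded_path` is made up for this example.

```python
from pathlib import PurePosixPath


def sharded_path(name: str) -> PurePosixPath:
    """Return '<first char>/<first two chars>/<name>', e.g. 'g/ge/gentoo'."""
    if not name:
        raise ValueError("empty package name")
    prefix = name.lower()
    return PurePosixPath(prefix[:1], prefix[:2], name)


print(sharded_path("gentoo"))
```

With this scheme no single directory has to hold thousands of entries, while
the package keeps one flat, unique name.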

Cheers / Groeten.
Rolf.

