From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.gentoo.org (smtp.gentoo.org [134.68.220.30])
    by robin.gentoo.org (8.13.4/8.13.4) with ESMTP id j4BAfnGr024256
    for ; Wed, 11 May 2005 10:41:49 GMT
Received: from adsl-67-39-48-193.dsl.milwwi.ameritech.net ([67.39.48.193] helo=exodus)
    by smtp.gentoo.org with esmtpa (Exim 4.43) id 1DVof4-0000DU-1K
    for gentoo-dev@lists.gentoo.org; Wed, 11 May 2005 10:41:54 +0000
Date: Wed, 11 May 2005 05:42:18 -0500
From: Brian Harring
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Re: New category proposal
Message-ID: <20050511104218.GC13132@exodus.wit.org>
References: <20050511050920.GC17034@exodus.wit.org>
    <91AEL.27774KEO@kevquinn.com>
    <20050511084004.GB13132@exodus.wit.org>
    <20050511090116.GA3093@ols-dell.gg3.net>
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-dev@gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050511090116.GA3093@ols-dell.gg3.net>
User-Agent: Mutt/1.5.8i
X-Archives-Salt: 520cf460-919f-482e-8ada-a8b15b00d4c7
X-Archives-Hash: 3cc5c862aaf0626a2fc06de91b9a76a5

On Wed, May 11, 2005 at 06:01:17PM +0900, Georgi Georgiev wrote:
> maillog: 11/05/2005-03:40:04(-0500): Brian Harring types
>
> > On Wed, May 11, 2005 at 09:46:03AM +0200, Kevin F. Quinn wrote:
> > > Here's my suggestion, for what it's worth :)
> > >
> > > The layout on disk and the semantics of categories do not need to be related.
> > Yes and no. You're assuming that people don't use the layout on disk for digging
> > around without calling portage. Personally, I do.
> >
> > > I like the idea of using the first character of a package name as the
> > > sub-directory name. This can be extended more deeply as and when necessary to
> > > avoid over-large directories which cause problems on some filesystems. e.g.
> > > for sudo you get "s/sudo" and vim-sudo "v/vim-sudo".
> > > This is architecture-neutral, rsyncable, scalable, and not too difficult for users to
> > > parse manually (see later for searching through categories). If the algorithm
> > > portage would use to locate a package is such that it doesn't mandate the depth
> > > (i.e. tries "package", "p/package" if "p/" exists, "p/a/package" if "p/a/"
> > > exists) then overlays can have a different depth to the rsync tree; if you only
> > > have a few packages in overlay then they need not be in subdirectories at all.
> > Re-asserting that the fs layout *does* matter, how is that more intuitive when trying
> > to track down the ebuild for dev-util/diffball ?
> >
> The fact that the directory where diffball is is easily deducable by its
> name. As it is, I'd be a bit lost if I had to guess whether diffball is
> in app-arch or dev-util. Even if I remembered it was something
> dev-related I'd still be inclined to look in sys-devel.

dev-util is accurate (it's a compressor, but a specialized variant, same
as patch is). From its current fs location/layout, we get two things: a
quick lookup on the atom, and an inference of its intentions. This is
why we have xml at the category level, for example.

One thing that's also going unstated: it's implicitly assumed that this
directory structure somehow makes it easier to look up a package. If you
know the _exact_ package name, maybe. Otherwise, you're falling back to
a search tool (which defeats to some degree the whole argument for a
flattened namespace).

Some quicky python, grouping by first char of the package name, and you
wind up with (top 8):
421, 521, 571, 582, 657, 663, 664, 746
separate directories within an individual directory. Say 'd' for
example, and we'll pretend 746 is the count of packages that start with
'd'. That's a buttload of directories to go digging in.

The response would be, "well then extend it to the first two chars after
the first dir". You narrow it down, but add another layer of dirs,
again, for what gain?
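
For anyone wanting to reproduce that kind of count, here is a rough
sketch. It is not the script that produced the numbers above. It
assumes a tree laid out as category/package, treats any top-level
directory with a '-' in its name (plus "virtual") as a category, and
buckets on the exact first character without folding case. The function
names and the tree path in the usage note are made up for illustration.

```python
import os
from collections import Counter


def iter_package_names(tree_root):
    """Yield package directory names from a category/package tree.

    Assumption: a category is any top-level directory whose name
    contains '-' or is exactly 'virtual'; everything else is skipped.
    """
    for category in sorted(os.listdir(tree_root)):
        cat_path = os.path.join(tree_root, category)
        if not os.path.isdir(cat_path):
            continue
        if "-" not in category and category != "virtual":
            continue
        for package in sorted(os.listdir(cat_path)):
            if os.path.isdir(os.path.join(cat_path, package)):
                yield package


def first_char_buckets(package_names):
    """Count how many packages would land in each first-character dir."""
    return Counter(name[0] for name in package_names if name)


def largest_buckets(package_names, top=8):
    """Return the `top` largest bucket sizes, in ascending order."""
    counts = first_char_buckets(package_names)
    return sorted(counts.values())[-top:]
```

Something like largest_buckets(iter_package_names("/usr/portage")) would
then give the sizes of the eight fullest first-letter directories, with
the tree path being whatever your local checkout uses.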

See, the thing I find odd about this thread/request is that essentially
breaking it down to first-letter grouping is being argued as being
_easier_ for people, while allowing multi cats, or just flat out
dropping the category aspect. The 'd' example above, imo, proves
otherwise.

Keep in mind at this point, the discussion is what's easiest for
_humans_. What's easiest for code/comp is another matter, and code
(within limits) can work with anything that's thrown at it.

> > How many directories deep would I have to go before I reached the
> > ebuild?
>
> Does it matter? You know the path exactly. It's p/portage. It's
> not ... "was it sys-apps/portage or app-portage/portage"?

I know the path as sys-apps/portage already though. Doesn't that count
for something? :)
Or, a bit more likely case, I know the type of the package, the
category, but don't recall its exact name. What y'all are proposing
forces the user to use a tool, rather than just a quicky ls.

> > > Having said that, some things could be done now. If a flat package namespace
> > > is desirable, the existing package name clashes could be resolved by renaming
> > > the few packages that clash.
> > 74 packages, roughly, out of 9429 roughly.
>
> 76/9295, which is not that bad, considering about half of them are
> emacs/xemacs related.

'cept either you or someone else was proposing a totally flat
namespace, no cats in the atoms. That means the count of changes (the 76
above is just # of conflicting packages) is around 19000, plus a fairly
large amount of portage modifications.

> > > Category could be added as a field in
> > > metadata.xml, so that a package could "belong" to multiple categories.
> > > The query/search tools could be enhanced to scan this metadata (perhaps including
> > > the current category directory as an implied entry in the metadata.xml).
> > If that's the goal of the "belong to N categories" thread, strictly searching,
> > sure, although I don't like it.
> > It can't become an atom for *DEPEND due to the cpv
> > nonconflicting bit.
>
> I personally want to see the category part *disappear* from the *DEPEND
> which is one of the reasons to advocate a flat tree. If the category (or
> part of it) goes in the package name, so be it, but at least there will
> be no package moves to break older ebuilds or outdated overlays.

Frankly, you need to give a *really* damn good reason for why this is
better. I don't think it is; convince me otherwise.

What do we gain from a flat namespace? Right now, I can infer the
purpose of an atom in a DEPEND string to some degree, based upon its
category. To head off the "well you don't need to know the category,
you should know the package's intentions if you're modifying the
ebuild", that dodges the point; via the category portion of an atom, I
can infer at least the -intention- of a package.

Ignoring changes required (have stated them already, no point in
sniping ya over it), what _exactly_ do we gain from the change?
~brian
-- 
gentoo-dev@gentoo.org mailing list