From: Joshua Kinard
To: Michał Górny, gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
Date: Sun, 23 Mar 2014 17:40:20 -0400
Subject: Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Message-ID: <532F54C4.7080205@gentoo.org>
In-Reply-To: <20140323220515.22bcced5@pomiot.lan>
References: <20140323204428.58216f16@pomiot.lan> <532F43BF.7070405@gentoo.org> <20140323220515.22bcced5@pomiot.lan>

On 03/23/2014 17:05, Michał Górny wrote:
> On 2014-03-23, at 16:27:43,
> Joshua Kinard wrote:
>
>> On 03/23/2014 15:44, Michał Górny wrote:
>>> Tags, on the other hand, are more 'live'. They place the package
>>> somewhere in the 'global' tag hierarchy that can change over time.
>>> I expect that people other than maintainers will be adding tags to
>>> packages (and changing them), and that people will invent new tags
>>> and apply them to more packages.
>>>
>>> So, first of all, your solution would mean that every commit adding
>>> a new tag or changing one of the tags would modify the package
>>> metadata.xml. This means a Manifest update and a ChangeLog entry (please
>>> don't get into more rules for ChangeLogs now), and this means it will be
>>> harder to find actually useful entries there.
>>>
>>> So we make tag updates harder, and increase time and size of rsync.
>>
>> Instead of individual lines in metadata.xml for each tag, why not a
>> single line that contains a comma-delimited list of up to five tags,
>> whitespace optional? That should help reduce the "fluff" that adding
>> this feature brings to the tree.
>>
>> E.g.,
>>
>> <tags>one,two,three,four,five</tags>
>
> Either use XML, or don't use XML. Don't make this some kind of ugly
> mixture of XML with non-XML.
>
> So:
>
> <tags>
> <tag>one</tag>
> <tag>two</tag>
> </tags>
>
> if we're really going for this. But I guess our DTD doesn't allow easy
> definition of a single <tags/> with no forced position.

TBH, I don't like the use of XML at all. Never have and never will. I am
a big fan of INI-style definitions (like Samba's config).
XML just leads to a lot of unneeded fluff in what should be a really small
file, which is why I was proposing a single element instead of multiple
elements.

E.g., for local USE, instead of this:

<use>
<flag name="foo">FOO</flag>
<flag name="bar">BAR</flag>
<flag name="baz">BAZ</flag>
</use>

(96 bytes)

This would be better:

[local use]
foo = "FOO"
bar = "BAR"
baz = "BAZ"

(47 bytes)

Not a complicated example, but that would be a >50% reduction in size.
But, I digress...

>>> Secondly, since tags for every package will be held in different files,
>>> people will need dedicated tools to collect tags from all those files
>>> and add matching tags to their own packages. Long story short, we're
>>> going to have many 'duplicate' tags that will require even more commits
>>> with ChangeLog entries and Manifest updates.
>>
>> If we automate the generation of a master tag index file, like
>> use.local.desc, this can be avoided. emerge can simply go rummage through
>> the master index for matching tag entries instead of going through the
>> entire tree. Because if we wanted to sift through the entire tree, grep
>> would be a far better method (compiled C and probably better text-matching
>> algorithms than emerge).
>
> And this goes pretty much backwards to what we were aiming at. We
> should finally kill use.local.desc, not get inspired by the redundancy.

And what replaces it? What differentiates a global USE flag that has
purpose across multiple packages (like 'ipv6') from a flag that only
exists for a single package? I'll agree that USE flags have definitely
gotten out of control, and the trend now seems to be moving sharply away
from a global USE definition in make.conf toward per-package USE flags in
/etc/portage/package.use. Which, while offering more granular control,
can be mind-numbingly annoying at times.

The automated generation of use.local.desc definitely made maintenance of
some things easier. We've gotta index USE flags somehow, and separating
them into global and local categories still makes sense to me.
But, I'm probably just going senile...

>>> Worse than that, your GLEP doesn't even have any basic rules for naming
>>> tags -- like what language form to use and, say, which character to use
>>> instead of space. This sounds like the sort of thing that's going to
>>> make it even harder to get some consistency, especially if some
>>> developers are going to follow someone else committing earlier and some
>>> will follow their own rules.
>>
>> Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no
>> spaces. A lot of problems are avoided if we keep tags to one-word
>> descriptors only. E.g., for mail clients, they would carry both 'mail' and
>> 'client' as two of their five tags. For kmail, a third tag would be 'kde'
>> and Evolution would have 'gnome' instead.
>
> I'm pretty sure you will finally hit something that goes with two
> words. Protocol name or something.

Perhaps, but we can fight that battle when we get there. Starting off
with one-word tags keeps things simple for now, and that'll make it easier
to determine whether this experiment actually pans out or not.

>> I'd also suggest that 'all' be considered a default, global tag for all
>> packages, that it be a reserved tag internal to emerge and other package
>> managers, and that it not count against the number of allowed tags
>> (meaning that technically, a package is allowed five tags + 'all').
>>
>> As for default tags when a package does not define any, the package category
>> gets split at the hyphen and becomes two independent tags. This is
>> overridden when at least one tag is defined in metadata.xml.
>
> Will this have a real benefit? Sounds like unnecessary confusion for
> a minor gain to me.

Which? The internal 'all' tag or the use of existing category names as a
default set of tags for packages that don't have any tags defined? The
'all' thing is probably unnecessary, as the same effect can be achieved
with wildcarding or some other programming trick.
The latter is just a way to avoid having to handle the lack of tags.
Because if this is implemented, it's going to take years for most of the
packages in the tree to get tags assigned to them. Having a default set of
tags to link most packages to makes finding them via a tag search easy.
E.g., even if a particular package in dev-python lacks tags, you can still
find it by searching for the tag "python".

Granted, a tag of "dev" offers no value (dev-python -> 'dev','python'),
but if you were looking for a web browser versus a web server, having
default tags of 'www','client' or 'www','servers' helps for packages in
www-client and www-servers.

Tags aside, wasn't there a proposal long ago to re-categorize the entire
tree because someone felt that the double-atom naming mechanism for
categories (atom1-atom2) was neither flexible nor descriptive enough? The
entire Portage tree idea derives from Ports, and it's really ballooned
over the years, while a modern-day Ports tree in /usr/ports is still
pretty small and self-contained. I've always wondered whether allowing
Portage one additional level of nesting would help any (i.e.,
games-* -> games/*).

It really seems like this is what tags are attempting to solve, so maybe
that problem needs to be revisited instead.

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible
in-between."

--Emperor Turhan, Centauri Republic
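
The naming rules and the category fallback discussed in this message
(lowercase ASCII alphanumeric one-word tags that start with a letter, at
most five per package, and default tags taken from the category split at
the hyphen) could be sketched roughly as below. This is only an
illustration under assumptions: the function names, the regular
expression, and the error handling are hypothetical and are not part of
the GLEP, Portage, or any existing tool.

```python
import re

# Assumed encoding of the proposed rule: ASCII, alphanumeric only,
# must start with a letter, lowercase, no spaces.
TAG_RE = re.compile(r"[a-z][a-z0-9]*")
MAX_TAGS = 5  # proposed per-package limit from the thread


def is_valid_tag(tag):
    """Return True if tag is a single lowercase ASCII alphanumeric word."""
    return TAG_RE.fullmatch(tag) is not None


def parse_tags(text):
    """Split a comma-delimited tag list (whitespace optional) and validate it."""
    tags = [t.strip() for t in text.split(",") if t.strip()]
    bad = [t for t in tags if not is_valid_tag(t)]
    if bad:
        raise ValueError(f"invalid tags: {bad}")
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags allowed, got {len(tags)}")
    return tags


def effective_tags(category, declared):
    """Use the declared tags if there are any; otherwise fall back to the
    package category split at its first hyphen."""
    if declared:
        return list(declared)
    return category.split("-", 1)
```

With this sketch, a package in www-servers that declares no tags would be
found under 'www' and 'servers', while a package that declares at least
one tag keeps only its own tags.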