From: Kent Fredric
To: gentoo-dev@lists.gentoo.org
Cc: perl@gentoo.org
Subject: Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd
Date: Fri, 20 Apr 2012 20:26:36 +1200
References: <20120419153111.GD24273@falgoret>

On 20 April 2012 19:46, Corentin Chary wrote:
> On Fri, Apr 20, 2012 at 9:37 AM, Kent Fredric wrote:
>> On 20 April 2012 03:31, Corentin Chary wrote:
>>> Add rubygems, github, gitorious, pecl, pear, bitbucket.
>>> All of them are handled by my remoteids.py script.
>>>
>>> ref: https://bugs.gentoo.org/show_bug.cgi?id=406287
>>> ref: https://github.com/iksaif/portage-janitor/blob/master/remoteids.py
>>>
>>> --- a/metadata/dtd/metadata.dtd 2010-03-02 18:52:11.000000000 +0100
>>> +++ b/metadata/dtd/metadata.dtd 2012-04-19 14:22:14.077954310 +0200
>>> @@ -61,7 +61,7 @@
>>>
>>>
>>>
>>> -
>>> +
>>>
>>>
>>>
>>> --
>>> Corentin Chary
>>> http://xf.iksaif.net/
>>
>>
>> I suggested last week on #gentoo-perl that it might be nice to have
>> 'cpan' and 'cpan-module' ( or something like that ) to disambiguate 2
>> queryable terms. ( where 'cpan' => 'the package name on cpan' )
>>
>> For some purposes, its most convenient to use the distribution name,
>> and for other purposes, (ie: cpan clients) its more convenient to use
>> a Module name, and its not easy to translate between the two, as
>> Module names sometimes switch between packages they're shipped in.
>>
>> For instance, a while ago, the BioPerl module was shipped in a
>> distribution 'bioperl' , which has only recently been changed to
>> BioPerl
>>
>>
>> http://api.metacpan.org/release/_search?q=distribution:bioperl&fields=archive,author,date,download_url
>>
>> http://api.metacpan.org/release/_search?q=distribution:BioPerl&fields=archive,author,date,download_url
>>
>> vs
>>
>>
>> http://api.metacpan.org/module/_search?q=module.name:Bio\:\:Perl&fields=distribution,author,release
>
> Looks sane since the goal of remote-id is being able to identify the
> package upstream.
> Do you think you could patch remotesid.py to generate tags for cpan /
> cpan-modules ? Or at least give me a pseudo-algo that does the trick.
> Thanks :)
>
> --
> Corentin Chary
> http://xf.iksaif.net
>

That is sadly not straightforward.

Extracting the package name can be straightforward if you have the
URL, because the package name is literally the same as the archive
name in SRC_URI, sans version information. However, if you look at
many perl ebuilds, you'll notice many lack this field and we've got
other things in place, so the parsing technique you currently use to
detect uses of SRC_URI won't work there ( I could be wrong, I don't
fully grok your python code ).

Moreover, determining the value of 'cpan-module' may be impossible
without access to the tar.gz itself, or querying the MetaCPAN API.

Usually, upstream are sensible and have package names which closely
correspond with the module names, ie: "Dist::Zilla" is shipped in
'Dist-Zilla-$VERSION.tar.gz', but there are many packages which don't
do this. A notable example is
https://metacpan.org/release/Scalar-List-Utils , which has no module
corresponding to the package name, and no way to divine the/a 'main'
module from the package itself. This is exacerbated by packages
changing names, package joins ( 2 packages becoming 1 by releasing
their modules together ), and package splits ( 1 package splitting
into 2 sets of modules ).

Essentially, using a cpan-module as an identifier is somewhat
"forwards only", and even then, what it resolves to depends on when
you ask.

This is fine for CPAN clients, which do the resolution hot, using the
whole of CPAN as their data. If a user asks for "Foo::Bar", their cpan
client will ask a cpan server ( or a regularly (hourly) updated list )
which package that module can be found in. This only returns the most
recent package, so name changes and so forth are invisible to the
user.
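
For the easy half mentioned earlier (you already have the archive name
from SRC_URI and only need to drop the version), a rough sketch could
look like the following. The function name, the regex and the set of
archive suffixes are my own assumptions, not anything taken from
remoteids.py, and unusual version schemes will slip past it.

```python
import re

# Hypothetical helper, not part of remoteids.py.
# Assumes CPAN-style archive names: <Distribution>-<version>.<suffix>,
# where the version starts with a digit (optionally prefixed by "v").
_ARCHIVE_RE = re.compile(
    r"^(?P<dist>.+?)-v?(?P<version>\d[\w.]*?)"
    r"\.(?:tar\.(?:gz|bz2|xz)|tgz|zip)$"
)


def cpan_dist_from_uri(uri):
    """Return the distribution name for one SRC_URI entry, or None."""
    # Only the last path component (the archive file name) matters.
    filename = uri.rsplit("/", 1)[-1]
    match = _ARCHIVE_RE.match(filename)
    return match.group("dist") if match else None
```

Calling it with an entry whose file name is 'Dist-Zilla-$VERSION.tar.gz'
( with a real version filled in ) should give back 'Dist-Zilla'. It
only recovers the distribution name; it says nothing about which
modules live inside the archive, which is the part that needs the
tar.gz or MetaCPAN.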

Being helpful to CPAN clients is one of the reasons we want this value
as a specifiable option in the first place.

For us, it's easier to track the package name, and then when that has
to change we can manually resolve the issue.

-- 
Kent

perl -e "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3, 3 ) for ( 9,8,0,7,1,6,5,4,3,2 );"

http://kent-fredric.fox.geek.nz