From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from pigeon.gentoo.org ([208.92.234.80] helo=lists.gentoo.org) by finch.gentoo.org with esmtp (Exim 4.60) (envelope-from ) id 1MHi0x-0000TP-Gt for garchives@archives.gentoo.org; Fri, 19 Jun 2009 17:36:35 +0000
Received: from pigeon.gentoo.org (localhost [127.0.0.1]) by pigeon.gentoo.org (Postfix) with SMTP id D993CE0326; Fri, 19 Jun 2009 17:36:33 +0000 (UTC)
Received: from smtprelay07.ispgateway.de (smtprelay07.ispgateway.de [80.67.31.30]) by pigeon.gentoo.org (Postfix) with ESMTP id AB8F6E0326 for ; Fri, 19 Jun 2009 17:36:33 +0000 (UTC)
Received: from [85.179.16.156] (helo=[192.168.0.3]) by smtprelay07.ispgateway.de with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.68) (envelope-from ) id 1MHi0u-0001KO-2Z; Fri, 19 Jun 2009 19:36:32 +0200
Message-ID: <4A3BCC9F.4050102@hartwork.org>
Date: Fri, 19 Jun 2009 19:36:31 +0200
From: Sebastian Pipping
User-Agent: Thunderbird 2.0.0.21 (X11/20090502)
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
MIME-Version: 1.0
To: Paul Wise
CC: PackageKit users and developers list , gentoo-dev@lists.gentoo.org, Christian Faulhammer , =?ISO-8859-1?Q?Petteri_R=E4ty?= , Robert Buchholz
Subject: Re: [packagekit] [gentoo-dev] Inviting you to project "PackageMap"
References: <4A3206DA.3090907@hartwork.org> <200906151552.08793.rbu@gentoo.org> <4A367F08.3050209@hartwork.org> <200906152024.54418.rbu@gentoo.org> <4A369D57.2030004@hartwork.org> <4A36AE9C.40207@gentoo.org> <4A383A12.2010004@hartwork.org> <4A38B949.8050601@gentoo.org> <4A3985BD.4040005@hartwork.org> <1245295820.11471.223.camel@chianamo.mine.nu> <4A3AC0D5.60107@hartwork.org> <1245382383.14805.281.camel@chianamo.mine.nu>
In-Reply-To: <1245382383.14805.281.camel@chianamo.mine.nu>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Df-Sender: 874396
X-Archives-Salt: 18f7676b-2e85-4ebe-bf11-8b87d20221b9
X-Archives-Hash: 74f67d6b255b95f13672682c47a9956c

Paul Wise wrote:
> The scripts were in my mail and the files are on every Debian mirror:
>
> wget -O - http://ftp.debian.org/debian/dists/unstable/main/binary-amd64/Packages | grep -h ^Homepage | sort | uniq -c | sort -n -r | head -n 10
> wget -O - http://ftp.debian.org/debian/dists/unstable/main/source/Sources | grep -h ^Homepage | sort | uniq -c | sort -n -r | head -n 10

I see, thanks.

I wrote:
> I'd like to determine the subset of URLs that appear
> exactly once in both gentoo and debian source packages.

I made a script for this job now.
With zero normalization I get this result:

  Mappable homepages in Debian: 6222
  Mappable homepages in Gentoo: 9582
  Shared (without normalization): 1183

That's about 11% of the Gentoo tree.

The script is up here:
http://git.goodpoint.de/?p=packagemap.git;a=tree;f=code/debian



Sebastian
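
For readers without access to the repository, the idea can be sketched roughly as below. This is not the script linked above, only a minimal illustration of "homepages that appear exactly once on each side, intersected without normalization". The function names and the sample URLs are made up for the example, and the input is assumed to be a plain list of homepage strings, one per source package.

```python
from collections import Counter


def unique_homepages(homepages):
    """Return the set of URLs that occur exactly once in the input."""
    counts = Counter(homepages)
    return {url for url, n in counts.items() if n == 1}


def shared_unique_homepages(debian_homepages, gentoo_homepages):
    """URLs that are unique on both sides, compared as raw strings.

    No normalization is applied, so "http://example.org/baz" and
    "http://example.org/baz/" count as different URLs.
    """
    return unique_homepages(debian_homepages) & unique_homepages(gentoo_homepages)


if __name__ == "__main__":
    # Hypothetical sample data, purely for illustration.
    debian = [
        "http://example.org/foo",
        "http://example.org/bar",
        "http://example.org/bar",  # used by two packages, so not mappable
        "http://example.org/baz",
    ]
    gentoo = [
        "http://example.org/foo",
        "http://example.org/baz/",  # trailing slash, no match without normalization
        "http://example.org/qux",
    ]
    print("Mappable homepages in Debian:", len(unique_homepages(debian)))
    print("Mappable homepages in Gentoo:", len(unique_homepages(gentoo)))
    print("Shared (without normalization):",
          len(shared_unique_homepages(debian, gentoo)))
```

A homepage shared by several packages on one side is dropped because it cannot identify a single package, which is why only URLs with a count of one are kept before intersecting.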