From: Richard Freeman
Date: Sat, 30 May 2009 21:57:30 -0400
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Re: How not to discuss
Message-id: <4A21E40A.60500@gentoo.org>
In-reply-to: <20090530145613.37514ceb@halo.dirtyepic.sk.ca>

Ryan Hill wrote:
> I'm tired of playing, as I'm sure you are.
> So please, let's be quiet now, and let the big people talk.

This is a public list designed to facilitate discussion of Gentoo software development. Anybody with something constructive to say is more than welcome to speak up - particularly Gentoo staff.

I don't pretend to be an expert on package management. However, hiding internal implementation details is just good design. I can see how putting EAPI in the filename can be a convenience to the package manager, but it still seems like a bad design, as it exposes end users to an implementation detail of the package management system.

There are lots of ways that EAPI could be cached that would avoid the various penalties that have been referred to. Even without an improved cache, paying the penalty seems preferable to accepting the design compromise of EAPI in the filename.

As for how EAPI could be cached - I can think of a few high-level design options:

1. Cache files are distributed with the portage tree. EAPIs that break the cache format would use different files that older package managers would ignore. The downside is that this doesn't handle user-modified ebuilds (unless the user tells the package manager to regenerate the cache), and it doesn't handle overlays unless the maintainer generates the cache.

2. Cache files are generated when the tree is synced. The package manager would look at the list of modified files and scan only those files, one time, to index them. The index could contain the mtime and path of each file. Then, when you perform an operation, the package manager could check the mtimes in the directories containing those files, see if anything was touched, and regenerate the cache if needed. This takes a little more time during syncing, but I suspect that it would perform very well - after all, after a sync all those files would be in the disk cache anyway. A suitably clever package manager could read the files as they are being synced and guarantee they are in memory.
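To make option 2 a bit more concrete, here is a minimal Python sketch of an mtime-keyed index. It is only an illustration under stated assumptions, not how Portage or any other package manager actually works: the names read_eapi and refresh_index are hypothetical, the index is a plain in-memory dict rather than a file on disk, and the EAPI is pulled out with a naive regex instead of by sourcing the ebuild. It also stats each ebuild rather than each directory, since a directory's mtime does not change when a file inside it is edited in place.

```python
import os
import re

# Assumption: the EAPI is set by a plain, unindented "EAPI=..." line.
_EAPI_RE = re.compile(r"^EAPI=(\S+)", re.MULTILINE)


def read_eapi(path):
    """Return the EAPI declared in an ebuild, or "0" if none is declared."""
    with open(path, encoding="utf-8", errors="replace") as f:
        match = _EAPI_RE.search(f.read())
    if match is None:
        return "0"
    # Drop optional surrounding quotes, e.g. EAPI="2" -> 2.
    return match.group(1).strip("\"'")


def refresh_index(tree_root, index):
    """Update index ({path: (mtime_ns, eapi)}) in place.

    Only ebuilds that are new or whose mtime differs from the cached
    value are opened and rescanned. Entries for ebuilds that no longer
    exist are dropped. Returns the list of rescanned paths.
    """
    seen = set()
    rescanned = []
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        for name in filenames:
            if not name.endswith(".ebuild"):
                continue
            path = os.path.join(dirpath, name)
            seen.add(path)
            mtime_ns = os.stat(path).st_mtime_ns
            cached = index.get(path)
            if cached is None or cached[0] != mtime_ns:
                index[path] = (mtime_ns, read_eapi(path))
                rescanned.append(path)
    # Forget ebuilds that were removed from the tree.
    for path in set(index) - seen:
        del index[path]
    return rescanned
```

Calling refresh_index(tree_root, index) right after a sync would open only the ebuilds that are new or changed; a later call with nothing touched does nothing but stat calls and returns an empty list.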
If we were talking about a 300TB table that got 300k transactions per second, I could see why we'd be talking about hacks that sacrifice normalized design for speed. We're talking about a package database - one that contains < 150k records. Sacrificing good design for speed (instead of improving the algorithm) is a short-term gain at a long-term cost.