Date: Sun, 25 Oct 2009 18:50:05 -0700
From: Brian Harring
To: gentoo-dev@lists.gentoo.org
Cc: zmedico@gentoo.org, solar@gentoo.org, ciaran.mccreesh@googlemail.com, fuzzyray@gentoo.org
Subject: [gentoo-dev] adding a modification timestamp to the installed pkgs database (vdb)
Message-ID: <20091026015005.GA12250@hrair.hsd1.ca.comcast.net>

First of all, feel free to forward this to anyone who is responsible
for code packaged in the tree that accesses the vdb (/var/db/pkg) in
some fashion.

The proposal is pretty simple: if code modifies the vdb in any
fashion, it needs to update the mtime on a file named
'.modification_time' in the root of the vdb. For example:

1) ${PACKAGE_MANAGER} fires up and builds a pkg. It's now ready to
   install it.
2) This step isn't strictly required, but it's a zero cost safety
   measure: prior to modifying the vdb, the manager updates the
   timestamp. The reason for doing this is to protect against the
   manager blowing up in some fashion and not updating the timestamp.
   There is still a window if the manager breaks down during merging,
   but it's far reduced.
3) The manager does its thing to the livefs, and to the vdb.
4) Once finished, it updates the timestamp again.

This isn't an incredibly complex change. What it enables, however, is
package managers getting serious about optimizing access to the vdb.
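
As a rough illustration of steps 2 through 4, here is a minimal Python
sketch. Only the file name '.modification_time' comes from the
proposal; the helper names touch_vdb_timestamp and vdb_modification
are hypothetical and not part of any existing manager. Bumping the
timestamp even when the modification fails is an assumption made here
to stay on the safe side, since a failed merge may have partially
changed the vdb.

```python
import os
from contextlib import contextmanager

TIMESTAMP_NAME = ".modification_time"  # file name taken from the proposal


def touch_vdb_timestamp(vdb_root):
    """Create the timestamp file if it is missing and set its mtime to now."""
    path = os.path.join(vdb_root, TIMESTAMP_NAME)
    with open(path, "a"):
        pass  # opening in append mode creates the file without truncating it
    os.utime(path, None)  # None means "use the current time"
    return path


@contextmanager
def vdb_modification(vdb_root):
    """Hypothetical helper: bump the timestamp before and after a change."""
    touch_vdb_timestamp(vdb_root)  # step 2: safety bump before touching the vdb
    try:
        yield  # step 3: the caller modifies the livefs and the vdb here
    finally:
        # step 4: bump again once finished (assumption: also on failure)
        touch_vdb_timestamp(vdb_root)
```

A manager would wrap its merge or unmerge phase in
"with vdb_modification('/var/db/pkg'):" so both updates happen without
the calling code having to remember them.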
For example, for the 3 managers:

paludis:
installed-cache currently needs to be manually run by the user;
specifically, the user is responsible for regenerating this cache if
they use a non-paludis manager to modify the VDB. This can be
automated by checking the vdb timestamp against a stored copy of the
vdb timestamp from the time of the cache generation.

portage:
portage maintains a set of denormalized caches of the vdb. It however
has to validate those caches on each access, meaning quite a few
stats. Same thing: it can compare the current vdb timestamp to the one
from when the cache was generated to identify if the cache is no
longer authoritative.

pkgcore:
pkgcore maintains a denormalized old style virtuals cache. Same thing
as w/ portage: it has to do validation (stat'ing) whenever it uses
that cache to ensure the data is accurate. It can likewise compare the
current vdb timestamp to the one from when the cache was generated to
identify if the cache is no longer authoritative.

The existing vdb caching could all be modified to use this timestamp.
That means one stat in the best (common) case, instead of having to
either scan the whole vdb each time or do a subset of stats.

This change enables further caching/denormalization of the vdb data
while maintaining the old format. Basically, it allows the manager to
build out a helluva lot faster access to the vdb while keeping on-disk
compatibility in /var/db/pkg.

Now unfortunately, since the vdb is not format versioned in any
fashion, to get this timestamp we have to do the following:

1) Nudge everyone who has code poking into the vdb to update their
   code to update the timestamp.
2) Sit on our hands for N months until such time as we've deemed
   "everyone we care about has upgraded".
3) Push out a new release, and start pushing out versions of the
   managers/vdb consumers that use this timestamp instead of just
   updating it.
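
The timestamp comparison described above for the existing caches could
look roughly like the following Python sketch. It is not how paludis,
portage, or pkgcore actually store their caches: the JSON layout, the
"vdb_mtime_ns" key, and the function names load_cache and
rebuild_cache are all hypothetical. The check uses inequality rather
than "newer than", on the assumption that any change in mtime should
invalidate the cache.

```python
import json
import os

TIMESTAMP_NAME = ".modification_time"  # file name taken from the proposal


def vdb_mtime_ns(vdb_root):
    """Return the timestamp file's mtime in nanoseconds, or None if absent."""
    try:
        return os.stat(os.path.join(vdb_root, TIMESTAMP_NAME)).st_mtime_ns
    except FileNotFoundError:
        return None


def load_cache(vdb_root, cache_path):
    """Return the cached data if it is still authoritative, else None."""
    current = vdb_mtime_ns(vdb_root)  # the single stat in the common case
    if current is None:
        return None  # no timestamp file, so no cache can be trusted
    try:
        with open(cache_path) as f:
            cache = json.load(f)
    except (FileNotFoundError, ValueError):
        return None  # missing or corrupt cache
    if cache.get("vdb_mtime_ns") != current:
        return None  # the vdb changed after this cache was generated
    return cache["data"]


def rebuild_cache(vdb_root, cache_path, scan):
    """Rescan the vdb with the caller-supplied scan() and store the result."""
    # Read the timestamp before scanning, so a modification that happens
    # during the scan leaves the stored value stale and forces a rebuild.
    stamp = vdb_mtime_ns(vdb_root)
    data = scan(vdb_root)
    with open(cache_path, "w") as f:
        json.dump({"vdb_mtime_ns": stamp, "data": data}, f)
    return data
```

A consumer would call load_cache() first and fall back to
rebuild_cache() with its own full-scan function only when that returns
None.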
For anyone who has been around gentoo for a couple of years, this is a
pretty familiar pattern: eapi, profile changes, etc., all go through
this, unfortunately.

That's the core of the proposal. There is a ticket open
( http://bugs.gentoo.org/290428 ) regarding this, although there is
some debate from ciaran which I'll now try to summarize, along w/ the
counterarguments.

1) Do a new vdb.
Counter: this mechanism provides a way to synchronize the new vdb
while maintaining the old one during its transition period, so this is
needed anyway. Further, pinning all of our optimization hopes on a
new vdb is daft: it's been discussed for 5+ years now and still
hasn't materialized. (pkgcore has been able to have a new vdb for
several years, but without a synchronization mechanism it would
require locking users into the new format and locking out old
consumers of the vdb. That's an unfriendly choice to push on users,
hence it never being implemented.)

2) Code that hasn't been updated to adjust the timestamp, but is still
in use after the transition period, will break things.
Counter: that's the nature of any modification of this sort; frankly,
the gains outweigh the costs of users being ridiculously out of date.
Not saying it's perfect, but until someone comes up with a proposal
that versions every PMS component (meaning PMS has to start
documenting the VDB), it's what we have if we wish to move forward in
refactoring.

3) The correct approach is to require users to tell each manager that
changes have occurred outside its purview (run paludis
--regenerate-installed-cache after every time you invoke pmerge or
emerge).
Counter: that's rather unfriendly to users, and isn't what
pkgcore/portage do. Further, it's historically the opposite of the
norm: consider the ebuild cache (we do validation as we go there,
instead of expecting users to do an emerge --regen every time they
modify an ebuild).
That's roughly the three points raised; there is some minor quibbling
that mtime cannot be trusted, but that's mostly a variation of #2.
Feel free to dig into the bug for exact specifics, or wait for
ciaran's reply to this post.

So... thoughts?
~harring