From: Tiziano Müller
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Gentoo Council Reminder for May 28
Date: Thu, 28 May 2009 11:35:01 +0200
Message-Id: <1243503301.10450.83.camel@localhost>
In-Reply-To: <200905280923.46297.patrick@gentoo.org>
References: <1243460607.3480.3@NeddySeagoon> <1243489596.10450.24.camel@localhost> <200905280923.46297.patrick@gentoo.org>
Organization: Gentoo

On Thursday, 28 May 2009 at 09:23 +0200, Patrick Lauer wrote:
> On Thursday 28 May 2009 07:46:36 Tiziano Müller wrote:
>
> > And
here is why (I'm only looking at the non-degenerate case with valid
> > metadata, ignoring overlays which some consider a corner case (I don't
> > understand that argument, but that's another thing)):
>
> overlays tend to come without metadata. Just enabling the KDE overlay changed
> the time for "emerge -upNDv world" from ~30 seconds cold cache to ~120
> seconds. Running emerge --metadata gets the performance back to pretty much
> the old levels.
>
> > When the package manager looks at a package, it first reads the
> > package's ebuild directory and gets the mtimes. It does the same for the
> > cache entries and validates the caches (there is more stuff in here,
> > like checking eclasses and so on).
> Eclasses are negligible because you only have to look at them once for the
> whole calculation. You can cache the mtime for the duration of your operation.
>
> > Then the following happens based on the "solution" we choose:
> > eapi-in-filename: the package manager starts from the highest version
> > with a supported eapi (the others are nonexistent with the used glob).
> > For that ebuild it reads the cache entry and decides whether or not it
> > can be used.
> In this case you amusingly do NOT want to cache the eapi in the cache, so you
> can even defer sourcing the ebuild until you actually need the metadata.

By "whether or not it can be used" I meant "keyword-like", surely not
eapi-like, since you already know it at that point.

> (You don't want to cache it because you need to check the file mtime anyway,
> and then you read the filename anyway. No need to look for it in another place
> then :) )
> > If not, it proceeds to the next version, if yes, it's done.
> > eapi-in-ebuild: the package manager reads all cache entries and sorts
> > out those with an EAPI it doesn't support. The rest gets ordered and the
> > same procedure as above applies.
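
To make the difference between the two lookups concrete, here is a minimal Python sketch that counts cache-entry reads for a single package. It is an illustration only, not Portage or Paludis code: `CacheEntry`, `SUPPORTED_EAPIS`, `pick_eapi_in_filename` and `pick_eapi_in_ebuild` are hypothetical names, versions are simplified to plain integers, and the `keyworded` flag stands in for the "keyword-like" usability check.

```python
from dataclasses import dataclass

# Assumption: the set of EAPIs this (hypothetical) package manager supports.
SUPPORTED_EAPIS = {"0", "1", "2"}


@dataclass(frozen=True)
class CacheEntry:
    version: int     # simplified: real ebuild versions are not plain integers
    eapi: str
    keyworded: bool  # stands in for the "keyword-like" usability check


def _supported_newest_first(entries):
    """Drop unsupported EAPIs and order the rest from highest version down."""
    supported = [e for e in entries if e.eapi in SUPPORTED_EAPIS]
    return sorted(supported, key=lambda e: e.version, reverse=True)


def pick_eapi_in_filename(entries):
    """EAPI is visible in the file name: filtering costs no cache reads."""
    reads = 0
    for entry in _supported_newest_first(entries):
        reads += 1  # read this version's cache entry to check usability
        if entry.keyworded:
            return entry, reads
    return None, reads


def pick_eapi_in_ebuild(entries):
    """EAPI is only known from the cache: every entry is read up front."""
    reads = len(entries)  # one read per version just to learn its EAPI
    for entry in _supported_newest_first(entries):
        if entry.keyworded:
            return entry, reads
    return None, reads


if __name__ == "__main__":
    entries = [
        CacheEntry(1, "0", True),
        CacheEntry(2, "2", True),
        CacheEntry(3, "2", True),
        CacheEntry(4, "3", True),  # unsupported EAPI, must be skipped
    ]
    best, reads = pick_eapi_in_filename(entries)
    print(best.version, reads)  # 3 1
    best, reads = pick_eapi_in_ebuild(entries)
    print(best.version, reads)  # 3 4
```

With four versions, one of which uses an unsupported EAPI, both strategies select version 3, but the first reads one cache entry and the second reads four.
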
> >
> > So, one of the main differences is: "reading one cache file" (if running
> > unstable you can assume you support the highest version, thus reading
> > only one cache file) vs. "reading all cache files".
> That assumes a dumb cache format.
> Why don't we make the cache more efficient so you read one file per package /
> category / ... ?
>
> >
> > I did some performance measurements based on that. I have 1507 installed
> > packages with 5541 different versions/revisions.
> >
> > Reading from hot cache:
> > 1507 files: ~50ms
> > 5541 files: ~170ms
> >
> > Reading from cold cache:
> > 1507 files: ~2.8s
> > 5541 files: ~6s
> And now you need to pull metadata for dependency calculation. How big is the
> impact of that?

The 1507 files are the complete dep-tree cache entries for the highest
version, whereas the 5541 files are all the cache entries for all packages
in the dep-tree.
I did say that I simplified the case a lot, didn't I? :)

>
> >
> > I made a lot of assumptions here (neglecting seek between ebuild-dir and
> > metadata-dir, other processes using the drive, 80 ebuilds from overlays
> > where the ebuild would have to be read, etc.). But estimating from the
> > numbers above I'd say that a "emerge -uD world"/"paludis -i world" will
> > be at least twice as slow, which I think is not acceptable.
> I find that quite acceptable. As long as we're using such a bad layout the
> performance is secondary.

... and I don't :)

>
> To fix the performance you'd "only" have to guarantee that the repo is
> unchanged (readonly), so you can add lots of simple caches/indexes - no need
> to source any ebuild for metadata again, one cachefile for eapi if you want
> ... I bet you find lots of small improvements that that would yield. Much more
> impressive than managing to avoid a few open() here and there ...
>
>
> > And I also don't understand your point of stating it's "bad design".
> Bad design is like smelly feet.
It's hard not to notice ...
>
> > I mean: when coding you should "not optimize prematurely", but with
> > eapi-in-ebuild it is against the other principle of "not pessimize
> > prematurely" (Sutter/Alexandrescu: C++ Coding Standards).
> If you quote that try the full quote:
>
> "We should forget about small efficiencies, say about 97% of the time:
> premature optimization is the root of all evil."
>
> In other words, we should not try to make that path faster when we can avoid
> hitting it at all with a small design revision.
>

Which you still failed (after one year or so) to provide a nice cleanly
written document for.

-- 
Tiziano Müller
Gentoo Linux Developer, Council Member
Areas of responsibility:
  Samba, PostgreSQL, CPP, Python, sysadmin, GLEP Editor
E-Mail   : dev-zero@gentoo.org
GnuPG FP : F327 283A E769 2E36 18D5 4DE2 1B05 6A63 AE9C 1E30