public inbox for gentoo-dev@lists.gentoo.org
From: "Tiziano Müller" <dev-zero@gentoo.org>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Gentoo Council Reminder for May 28
Date: Thu, 28 May 2009 11:35:01 +0200
Message-ID: <1243503301.10450.83.camel@localhost>
In-Reply-To: <200905280923.46297.patrick@gentoo.org>

On Thursday, 28.05.2009 at 09:23 +0200, Patrick Lauer wrote:
> On Thursday 28 May 2009 07:46:36 Tiziano Müller wrote:
> 
> > And here is why (I'm only looking at the non-degenerate case with valid
> > metadata, ignoring overlays which some consider a corner case (I don't
> > understand that argument, but that's another thing)):
> 
> overlays tend to come without metadata. Just enabling the KDE overlay changed 
> the time for "emerge -upNDv world" from ~30 seconds cold cache to ~120 
> seconds. Running emerge --metadata gets the performance back to pretty much 
> the old levels.
> 
> > When the package manager looks at a package, it first reads the
> > package's ebuild directory and gets the mtimes. It does the same for the
> > cache entries and validates the caches (there is more stuff in here,
> > like checking eclasses and so on).
> Eclasses are negligible because you only have to look at them once for the 
> whole calculation. You can cache the mtime for the duration of your operation.
> 
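
A minimal sketch of the per-operation mtime caching suggested above. The function name eclass_mtime is hypothetical and this is not how any particular package manager implements it; it only shows that each eclass needs to be stat()ed once per run.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def eclass_mtime(path):
    """Return the mtime of an eclass file.

    The filesystem is hit only on the first call for a given path;
    later calls within the same operation are served from memory.
    (Hypothetical helper, for illustration only.)
    """
    return os.stat(path).st_mtime


def start_new_operation():
    """Drop cached mtimes so the next operation sees fresh values."""
    eclass_mtime.cache_clear()
```

Calling start_new_operation() before each dependency calculation keeps the cached values from outliving the run they were collected for.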
> > Then the following happens based on the "solution" we choose:
> > eapi-in-filename: the package manager starts from the highest version
> > with a supported eapi (the others are nonexistent with the used glob).
> > For that ebuild it reads the cache entry and decides whether or not it
> > can be used. 
> In this case you amusingly do NOT want to cache the eapi in the cache, so you 
> can even defer sourcing the ebuild until you actually need the metadata.
By "whether or not it can be used" I meant "keyword-like", surely not
eapi-like, since you already know the eapi at that point.

> (You don't want to cache it because you need to check the file mtime anyway, 
> and then you read the filename anyway. No need to look for it in another place 
> then :) )
> > If not, it proceeds to the next version, if yes, it's done.
> > eapi-in-ebuild: the package manager reads all cache entries and sorts
> > out those with an EAPI it doesn't support. The rest gets ordered and the
> > same procedure as above applies.
> >
> > So, one of the main differences is: "reading one cache file" (if running
> > unstable you can assume you support the highest version, thus reading
> > only one cache file) vs. "reading all cache files".
> That assumes a dumb cache format. 
> Why don't we make the cache more efficient so you read one file per package / 
> category / ... ?
> 
> >
> > I did some performance measurements based on that. I have 1507 installed
> > packages with 5541 different versions/revisions.
> >
> > Reading from hot cache:
> > 1507 files: ~50ms
> > 5541 files: ~170ms
> >
> > Reading from cold cache:
> > 1507 files: ~2.8s
> > 5541 files: ~6s
> And now you need to pull metadata for dependency calculation. How big is the 
> impact of that?
The 1507 files are the complete dep-tree cache entries for the highest
version, where the 5541 files are all the cache entries for all packages
in dep-tree.
I did say that I simplified the case a lot, didn't I? :)
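
The two lookup strategies compared above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code of any real package manager: CacheEntry, SUPPORTED_EAPIS and both function names are hypothetical, the "usable" flag stands in for the keyword-like check, and the "read" callback stands in for opening one cache file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    version: tuple   # simplified version key, e.g. (1, 2)
    eapi: str        # EAPI of the ebuild
    usable: bool     # stands in for the keyword-like check


SUPPORTED_EAPIS = {"0", "1", "2"}  # assumption for this example


def pick_eapi_in_filename(entries, read):
    """EAPI is part of the file name: unsupported ebuilds never match
    the glob, so no cache entry is read for them."""
    candidates = [e for e in entries if e.eapi in SUPPORTED_EAPIS]
    for entry in sorted(candidates, key=lambda e: e.version, reverse=True):
        read(entry)  # one cache read per version actually tried
        if entry.usable:
            return entry
    return None


def pick_eapi_in_ebuild(entries, read):
    """EAPI is only known from the cache: every entry must be read
    before unsupported ones can be sorted out."""
    for entry in entries:
        read(entry)
    candidates = [e for e in entries if e.eapi in SUPPORTED_EAPIS]
    for entry in sorted(candidates, key=lambda e: e.version, reverse=True):
        if entry.usable:
            return entry
    return None
```

Passing a list's append method as "read" makes the difference countable: for a package with three versions where the highest supported one is usable, the first function records one read and the second records three.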

> 
> >
> > I made a lot of assumptions here (neglecting seek between ebuild-dir and
> > metadata-dir, other processes using the drive, 80 ebuilds from overlays
> > where the ebuild would have to be read, etc.). But estimating from the
> > numbers above I'd say that an "emerge -uD world"/"paludis -i world" will
> > be at least twice as slow, which I think is not acceptable.
> I find that quite acceptable. As long as we're using such a bad layout the 
> performance is secondary.
... and I don't :)
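
For reference, the ratios behind that estimate, computed only from the cache-read timings quoted above. The dictionary layout and the slowdown helper are made up for this example, and the numbers cover cache reads alone, not a full dependency resolution.

```python
# Timings quoted in this thread, in seconds.
timings = {
    "hot": {"highest_only_s": 0.050, "all_versions_s": 0.170},
    "cold": {"highest_only_s": 2.8, "all_versions_s": 6.0},
}


def slowdown(case):
    """Ratio of reading all cache entries to reading only the highest version."""
    return case["all_versions_s"] / case["highest_only_s"]


for name, case in timings.items():
    print(f"{name} cache: {slowdown(case):.1f}x slower")

# Ratio of the file counts themselves (5541 versions vs. 1507 packages).
print(f"files: {5541 / 1507:.1f}x more")
```

This prints 3.4x for the hot cache, 2.1x for the cold cache, and 3.7x for the file count, so the cold-cache read time grows less than the number of files read.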

> 
> To fix the performance you'd "only" have to guarantee that the repo is 
> unchanged (readonly), so you can add lots of simple caches/indexes - no need 
> to source any ebuild for metadata again, one cachefile for eapi if you want 
> ... I bet you find lots of small improvements that would yield. Much more 
> impressive than managing to avoid a few open() here and there ...
> 
> 
> > And I also don't understand your point of stating it's "bad design".
> Bad design is like smelly feet. It's hard not to notice ...
> 
> > I mean: when coding you should "not optimize prematurely", but with
> > eapi-in-ebuild it is against the other principle of "not pessimize
> > prematurely" (Sutter/Alexandrescu: C++ Coding Standards).
> If you quote that try the full quote:
> 
> "We should forget about small efficiencies, say about 97% of the time: 
> premature optimization is the root of all evil."
> 
> In other words, we should not try to make that path faster when we can avoid 
> hitting it at all with a small design revision.
> 
For which you have still failed (after a year or so) to provide a nice,
cleanly written document.

-- 
Tiziano Müller
Gentoo Linux Developer, Council Member
Areas of responsibility:
  Samba, PostgreSQL, CPP, Python, sysadmin, GLEP Editor
E-Mail   : dev-zero@gentoo.org
GnuPG FP : F327 283A E769 2E36 18D5  4DE2 1B05 6A63 AE9C 1E30
