public inbox for gentoo-portage-dev@lists.gentoo.org
* [gentoo-portage-dev] exclude cache from sync, was:Cache rewrite backport
@ 2005-10-13 17:12 Francesco R.
  2005-10-13 17:47 ` Brian Harring
  0 siblings, 1 reply; 2+ messages in thread
From: Francesco R. @ 2005-10-13 17:12 UTC
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 3459 bytes --]

Elaborating on the previous subject, an idea came to mind: avoid
syncing $PORTDIR/metadata/cache. It's 80 MB and 20000 files.

So I've tried the first Python patch of my life; obviously it's also my 
first Portage one (subliminal message: take the result with care and 
check it yourself).

How? The patch adds the option "--withregen", which excludes 
/metadata/cache from the sync and issues a regen of the cache 
before updating the metadata.
Calling "emerge --withregen --sync" does the work.
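
For readers who have not opened the patch, the idea can be sketched 
roughly as below. This is not the patch itself: the function name, the 
mirror URL and the exact set of rsync flags are assumptions made for 
illustration. Only the exclusion of /metadata/cache and the regen that 
follows come from the description above.

```python
import shlex


def build_sync_commands(mirror, portdir, with_regen=True):
    """Return the argv lists a cache-less sync might run (sketch only).

    mirror and portdir are caller-supplied; nothing is executed here.
    """
    rsync_cmd = ["rsync", "--recursive", "--links", "--times", "--delete"]
    if with_regen:
        # Anchored pattern: skip metadata/cache at the root of the transfer.
        rsync_cmd.append("--exclude=/metadata/cache/")
    # Trailing slashes make rsync copy directory contents, not the directory.
    rsync_cmd += [mirror.rstrip("/") + "/", portdir.rstrip("/") + "/"]

    commands = [rsync_cmd]
    if with_regen:
        # Rebuild the cache locally instead of downloading it.
        commands.append(["emerge", "--regen"])
    return commands


if __name__ == "__main__":
    # Placeholder mirror URL (assumption); substitute your local mirror.
    for cmd in build_sync_commands("rsync://mirror.example/gentoo-portage",
                                   "/usr/portage"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The sketch only prints the commands, so it is safe to run on a machine 
without Portage installed.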

The following test was run with a dual Opteron as rsync server and an 
Athlon XP 2500+ with an old and slow disk as client, connected by 
gigabit Ethernet. The rsync is forced by removing the timestamp.

The portage tree was synced _before_ starting the bench, so the 
overhead of rsync is reduced to the bare minimum: checking the 
differences between the files on server and client.

The surprising result is that recreating the cache is faster than 
rsyncing it (real time).

===================================|===================================
===================================|===================================
1st try, portage [2.0.53_rc5] + cac|1st try, portage [2.0.53_rc5] + cac
===================================|===================================
sync with cache and _no_ regen     |sync without cache and regen after
                                   |Regenerating cache entries...
                                   |done regen!
                                   |
real    6m23.727s                  |real    4m14.837s
user    0m12.373s                  |user    0m18.681s
sys     0m13.849s                  |sys     0m7.744s
===================================|===================================
===================================|===================================
2nd try, portage [2.0.53_rc5] + cac|2nd try, portage [2.0.53_rc5] + cac
===================================|===================================
sync with cache and _no_ regen     |sync without cache and regen after
                                   |Regenerating cache entries...
                                   |done regen!
                                   |
real    6m53.649s                  |real    2m40.794s
user    0m12.361s                  |user    0m18.361s
sys     0m14.117s                  |sys     0m6.800s
===================================|===================================
===================================|===================================
3rd try, portage [2.0.53_rc5] + cac|3rd try, portage [2.0.53_rc5] + cac
===================================|===================================
sync with cache and _no_ regen     |sync without cache and regen after
                                   |Regenerating cache entries...
                                   |done regen!
                                   |
real    6m46.973s                  |real    4m19.261s
user    0m12.605s                  |user    0m18.593s
sys     0m13.733s                  |sys     0m7.648s
                                   |
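
To compare the two columns at a glance, the "real" figures above can be 
converted to seconds and averaged. The snippet below is only a 
convenience sketch; the helper names are made up, and the timings are 
copied from the three tries in the table. The means come out at roughly 
401 s with the cache synced and roughly 225 s with the cache excluded 
and regenerated.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '6m23.727s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError("unrecognised time stamp: %r" % stamp)
    return int(match.group(1)) * 60 + float(match.group(2))


def mean_seconds(stamps):
    """Average a list of time stamps, in seconds."""
    values = [to_seconds(stamp) for stamp in stamps]
    return sum(values) / len(values)


# "real" times from the three tries in the table.
WITH_CACHE = ["6m23.727s", "6m53.649s", "6m46.973s"]   # sync with cache, no regen
WITH_REGEN = ["4m14.837s", "2m40.794s", "4m19.261s"]   # sync without cache, regen

if __name__ == "__main__":
    cache_mean = mean_seconds(WITH_CACHE)
    regen_mean = mean_seconds(WITH_REGEN)
    print("sync with cache:   %.1f s" % cache_mean)
    print("sync + regen:      %.1f s" % regen_mean)
    print("ratio:             %.2f" % (cache_mean / regen_mean))
```

Keep in mind that these numbers are for a tree that was already in 
sync, so very few cache entries actually had to be rebuilt.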

To run the test yourself (please with a local rsync mirror)

================
mkdir tmptest ; cd tmptest
# download the patch here
bunzip2 emerge.patch.bz2
cp /usr/bin/emerge .
patch -i emerge.patch emerge
grep -B1000 "###end test with" emerge.patch > test_emerge
# modify test_emerge for your needs
. test_emerge
==================
(/please with a local rsync mirror)

Please check the correctness of what the patch does before testing it.

Cheers 

[-- Attachment #2: emerge.patch.bz2 --]
[-- Type: application/x-bzip2, Size: 2179 bytes --]


* Re: [gentoo-portage-dev] exclude cache from sync, was:Cache rewrite backport
  2005-10-13 17:12 [gentoo-portage-dev] exclude cache from sync, was:Cache rewrite backport Francesco R.
@ 2005-10-13 17:47 ` Brian Harring
  0 siblings, 0 replies; 2+ messages in thread
From: Brian Harring @ 2005-10-13 17:47 UTC
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 4369 bytes --]

On Thu, Oct 13, 2005 at 07:12:42PM +0200, Francesco R. wrote:
> Elaborating on the previous subject, an idea came to mind: avoid
> syncing $PORTDIR/metadata/cache. It's 80 MB and 20000 files.
> 
> So I've tried the first Python patch of my life; obviously it's also my 
> first Portage one (subliminal message: take the result with care and 
> check it yourself).
> 
> How? The patch adds the option "--withregen", which excludes 
> /metadata/cache from the sync and issues a regen of the cache 
> before updating the metadata.
> Calling "emerge --withregen --sync" does the work.
> 
> The following test was run with a dual Opteron as rsync server and an 
> Athlon XP 2500+ with an old and slow disk as client, connected by 
> gigabit Ethernet. The rsync is forced by removing the timestamp.
> 
> The portage tree was synced _before_ starting the bench, so the 
> overhead of rsync is reduced to the bare minimum: checking the 
> differences between the files on server and client.
> 
> The surprising result is that recreating the cache is faster than 
> rsyncing it (real time).
Nuke the cache and try it ;)
Being slightly sarcastic there: the fastest I've managed to get a 
full regen down to was around 22 minutes for ebd, with stable being 
around 34m.
http://dev.gentoo.org/~ferringb/ebuild-daemon/stats/paired-stats

I'd expect the gap has narrowed a bit since those timing runs, 
although it should still be sizable.  Either way, the longer timing 
run (stable ebuild.sh) is the one to note :)

So... that --metadata run has an extra stat call per check in the best 
case, but the cost of getting a cache entry is pretty heavily different.
For metadata, it's just a file read/write; for regen, it's exec(bash -c 
ebuild.sh), which, going by a quick commandline test that sets a floor 
for it (time bash /usr/lib/portage/bin/ebuild.sh), costs you around 
.1s per call.

Pretty heavy difference on updates, e.g. the worst case: something a 
user hits on first sync, or after nuking the tree :)
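
As a back-of-the-envelope check on that floor, multiply the per-call 
cost by the number of cache entries. The sketch below assumes one 
ebuild.sh invocation per entry and borrows the "20000 files" figure 
quoted below as the entry count; both are assumptions, and the function 
name is made up.

```python
def regen_floor_seconds(entries, seconds_per_call):
    """Lower bound for a full regen: one ebuild.sh exec per cache entry."""
    return entries * seconds_per_call


if __name__ == "__main__":
    entries = 20000        # assumption: the file count quoted for metadata/cache
    per_call = 0.1         # floor measured with `time bash .../ebuild.sh`
    floor = regen_floor_seconds(entries, per_call)
    print("floor: %.0f s (about %.0f minutes)" % (floor, floor / 60))
```

That gives about 2000 s, or a bit over 33 minutes, which is in the same 
range as the stable ebuild.sh timing run mentioned above.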

> ===================================|===================================
> ===================================|===================================
> 1st try, portage [2.0.53_rc5] + cac|1st try, portage [2.0.53_rc5] + cac
> ===================================|===================================
> sync with cache and _no_ regen     |sync without cache and regen after
>                                    |Regenerating cache entries...
>                                    |done regen!
>                                    |
> real    6m23.727s                  |real    4m14.837s
> user    0m12.373s                  |user    0m18.681s
> sys     0m13.849s                  |sys     0m7.744s
> ===================================|===================================
> ===================================|===================================
> 2nd try, portage [2.0.53_rc5] + cac|2nd try, portage [2.0.53_rc5] + cac
> ===================================|===================================
> sync with cache and _no_ regen     |sync without cache and regen after
>                                    |Regenerating cache entries...
>                                    |done regen!
>                                    |
> real    6m53.649s                  |real    2m40.794s
> user    0m12.361s                  |user    0m18.361s
> sys     0m14.117s                  |sys     0m6.800s
> ===================================|===================================
> ===================================|===================================
> 3rd try, portage [2.0.53_rc5] + cac|3rd try, portage [2.0.53_rc5] + cac
> ===================================|===================================
> sync with cache and _no_ regen     |sync without cache and regen after
>                                    |Regenerating cache entries...
>                                    |done regen!
>                                    |
> real    6m46.973s                  |real    4m19.261s
> user    0m12.605s                  |user    0m18.593s
> sys     0m13.733s                  |sys     0m7.648s
>                                    |

That said, I'm curious wth is triggering the 2x sys activity, which 
probably is to blame for the ~90s difference

Anyone game for profiling a --metadata run, vs --regen run via 
profile.run?
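
For anyone taking that up, a minimal harness along these lines might be 
a starting point. It uses the standard profile and pstats modules; the 
two workload functions are hypothetical stand-ins, not Portage code, 
and would need to be replaced with whatever entry points the --metadata 
and --regen code paths really use.

```python
import io
import profile
import pstats


def profile_report(func, *args, limit=10):
    """Profile one call and return (total call count, text report)."""
    prof = profile.Profile()
    prof.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(prof, stream=buf)
    # Sort by cumulative time so the expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(limit)
    return stats.total_calls, buf.getvalue()


# Hypothetical stand-ins for the two code paths being compared.
def fake_metadata_transfer(n):
    return [str(i) for i in range(n)]


def fake_regen(n):
    return [sum(range(50)) for _ in range(n)]


if __name__ == "__main__":
    for workload in (fake_metadata_transfer, fake_regen):
        calls, report = profile_report(workload, 1000)
        print("%s: %d calls" % (workload.__name__, calls))
        print(report)
```

Comparing the two reports side by side should show where the time goes 
in each run, although the profiler only sees Python-level calls, not 
what happens inside a spawned bash process.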
~harring

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
