public inbox for gentoo-project@lists.gentoo.org
From: Raymond Jennings <shentino@gmail.com>
To: gentoo-project@lists.gentoo.org
Subject: Re: [gentoo-project] RFC: Dropping rsync as a tree distribution method
Date: Tue, 18 Dec 2018 03:36:14 -0800
Message-ID: <CAGDaZ_p88TOB0ufrtZO3ebR5YzyCZn6NREANddOieM=w6NVGqw@mail.gmail.com>
In-Reply-To: <20181218125555.1927321328046d0a2ecd3e16@gentoo.org>

On Tue, Dec 18, 2018 at 1:56 AM Andrew Savchenko <bircoph@gentoo.org> wrote:
> On Sat, 15 Dec 2018 23:15:47 -0500 Alec Warner wrote:
> > Hi,
> >
> > I am currently embarking on a plan to redo our existing rsync[0] mirror
> > network. The current network has aged a bit. It's likely too large and is
> > under-maintained. I think in the ideal case we would instead pivot this
> > project to scaling out our git mirror capabilities and slowly migrate all
> > consumers to pulling the git tree directly. To that end, I'm looking for
> > blockers as to why various customers cannot switch to pulling the gentoo
> > ebuild repository from git[1] instead of rsync.
> >
> > So for example:
> >
> > - bandwidth concerns (preferably with documentation / data.)
> > - Firewall concerns
> > - CPU concerns (e.g. rsync is great for tiny systems?)
> > - Disk usage for git vs rsync
> > - Other things I have not thought of.
>
> My main concern with git is downlink fault tolerance. If an rsync
> connection is broken, it can be easily restored without much data
> retransmission. If a git download connection is broken, it has to
> start all over again. So there are cases where rsync will always be
> much preferable to git.

Are you comparing against the initial clone?
If so, would having the clone default to shallow mitigate this?

For the curious, I ran a benchmark.

With a completely purged /usr/portage:

emerge-webrsync took 30.302s
emerge-sync (with git clone --depth 1) took 33.902s
emerge-sync (with regular rsync) took a whopping 1m25.863s

After a fresh sync:

emerge-sync (with regular rsync) took 7.564s
emerge-sync (with git fetch --depth 1, and after priming the repo with
a full clone) took 2.086s



Up front, webrsync seems to be a small winner for initial setups, with
git clone a close second, and regular rsync is nearly three times slower.
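
The ratios between the initial-sync timings above can be checked with a
few lines of Python. This is only a sketch of the arithmetic: the
function and variable names are made up for illustration, and the
numbers are the single-run timings quoted in this message, not a
repeated benchmark.

```python
# Wall-clock timings quoted in this message, converted to seconds.
# Single runs on one machine; treat them as rough figures.
initial_sync_seconds = {
    "emerge-webrsync": 30.302,
    "git clone --depth 1": 33.902,
    "rsync": 60 + 25.863,  # 1m25.863s
}


def slowdown_vs_fastest(timings):
    """Return each method's time divided by the fastest method's time."""
    fastest = min(timings.values())
    return {name: seconds / fastest for name, seconds in timings.items()}


if __name__ == "__main__":
    for name, ratio in slowdown_vs_fastest(initial_sync_seconds).items():
        print(f"{name}: {ratio:.2f}x")
```

On these figures the shallow git clone is about 1.12x the webrsync time
and plain rsync is about 2.83x.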

Routine syncs would seem to favor git, especially if they are done
regularly, which IMO would amortize the cost of the initial clone.  My
opinion is that over time git would also place less stress on the
servers since it only has to look at the commit chain instead of
checksumming every single file.
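
To put a rough number on the amortization point, here is a small Python
sketch that adds the initial sync time to a number of routine syncs.
It assumes every routine sync costs the same as the single measurement
above, which ignores how much the tree changed between syncs. It also
mixes the shallow-clone figure for the initial git sync with a routine
fetch that was measured on a fully cloned repo. The names are
hypothetical.

```python
# (initial sync seconds, routine sync seconds), taken from the timings
# quoted in this message. Assumption: each routine sync costs the same.
RSYNC = (85.863, 7.564)
GIT = (33.902, 2.086)


def cumulative_seconds(method, routine_syncs):
    """Total time for one initial sync plus `routine_syncs` routine syncs."""
    initial, per_sync = method
    return initial + per_sync * routine_syncs


if __name__ == "__main__":
    for n in (0, 7, 30):
        rsync_total = cumulative_seconds(RSYNC, n)
        git_total = cumulative_seconds(GIT, n)
        print(f"{n:>2} routine syncs: rsync {rsync_total:.1f}s, git {git_total:.1f}s")
```

Under those assumptions, a month of daily syncs (30) comes to roughly
313s for rsync against roughly 96s for git.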



That said, would I be correct to surmise that you're raising a
robustness concern and not simply a performance one?


> Best regards,
> Andrew Savchenko


