From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by finch.gentoo.org (Postfix) with ESMTPS id 07EA6138334 for ; Tue, 18 Dec 2018 18:37:21 +0000 (UTC) Received: from pigeon.gentoo.org (localhost [127.0.0.1]) by pigeon.gentoo.org (Postfix) with SMTP id C4C70E0C35; Tue, 18 Dec 2018 18:37:19 +0000 (UTC) Received: from mail-ed1-f44.google.com (mail-ed1-f44.google.com [209.85.208.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by pigeon.gentoo.org (Postfix) with ESMTPS id 77918E0C31 for ; Tue, 18 Dec 2018 18:37:19 +0000 (UTC) Received: by mail-ed1-f44.google.com with SMTP id g22so8695849edr.7 for ; Tue, 18 Dec 2018 10:37:19 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=3Gd2AvJZt8phwJzo2NtTRTkVyBd+kCxeBzenTUVponU=; b=DOJB67npbtm/+nBGz0obbqsqKb58m6SCt0M0Q+RZ6H9exdaWXE3aqiJXq9899yTVtj KOr1FlrNegZXvqA9y6sEbbJdsYykT6uLYH4ZIbX/VoFagf0R5oOfn9J9ooTK/ow3Okdr 6Da3d4W40dJQ+RXD8fflRBnpjWscd7/+yA3jI9i+UWrS83QwavMCJR9jgg5NtyRkdogt SI+THeG8alyAyGzXiatlNg6hcmx3dvGcvpe9b1bJNCB7CJJpLEBVVGX0WwKrsXMNaJKx TT5OZQidGcZpoFZznFyiEdFfl7/n2Kt8m1bl8SzvLsVkkS/gAYXFVrWtvxOS4M00ppEX FG9A== X-Gm-Message-State: AA+aEWYdk3n3qvCp4jSAEk/P4NQIt/hcL3dRCJ5iamZaPI9i8upmVvqs qe5qeZNed5BMzwhRiGROhkYsFiAhm9ASILKdfe2Nk4ry60w= X-Google-Smtp-Source: AFSGD/VkTtugRRDEormaE6+oeQfjVIK8gRg5ncBto4l+Sc90SAGKB5HeS38vAIlMdrhTqBAJaGyWtri5Ol+4A6GMMZE= X-Received: by 2002:a50:9472:: with SMTP id q47mr17608399eda.251.1545158237518; Tue, 18 Dec 2018 10:37:17 -0800 (PST) Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-Id: Gentoo Project discussion list X-BeenThere: gentoo-project@lists.gentoo.org Reply-To: 
gentoo-project@lists.gentoo.org X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply MIME-Version: 1.0 References: <7da3ce86-d7c5-0336-886d-c43a6144a5fb@gentoo.org> In-Reply-To: <7da3ce86-d7c5-0336-886d-c43a6144a5fb@gentoo.org> From: Alec Warner Date: Tue, 18 Dec 2018 13:37:05 -0500 Message-ID: Subject: Re: [gentoo-project] RFC: Dropping rsync as a tree distribution method To: gentoo-project Content-Type: multipart/alternative; boundary="000000000000de941c057d502f43" X-Archives-Salt: e5e0b6e5-69d4-425f-b47f-c8a675b85ded X-Archives-Hash: 6f5a21d2b43aef4ee04b3ce16f56c2ba

--000000000000de941c057d502f43
Content-Type: text/plain; charset="UTF-8"

On Tue, Dec 18, 2018 at 1:15 PM Brian Evans <grknight@gentoo.org> wrote:

> On 12/15/2018 11:15 PM, Alec Warner wrote:
> > Hi,
> >
> > I am currently embarking on a plan to redo our existing rsync[0] mirror
> > network. The current network has aged a bit. It's likely too large and is
> > under-maintained. I think in the ideal case we would instead pivot this
> > project to scaling out our git mirror capabilities and slowly migrate
> > all consumers to pulling the git tree directly. To that end, I'm looking
> > for blockers as to why various customers cannot switch to pulling the
> > gentoo ebuild repository from git[1] instead of rsync.
> >
> > So for example:
> >
> > - bandwidth concerns (preferably with documentation / data.)
> > - Firewall concerns
> > - CPU concerns (e.g. rsync is great for tiny systems?)
> > - Disk usage for git vs rsync
> > - Other things I have not thought of.
> >
> > -A
> >
> > [0] This excludes emerge-webrsync; which I don't plan on touching.
> > [1] Rich talked about some downsides earlier
> > at https://lwn.net/Articles/759539/; but while these are challenges
> > (some fixable) they are not necessarily blockers.
>
> I personally would be sad to see rsync go as I use the git developer
> tree as my main repository on 2 machines. This is so I can develop and
> update from the single source.  These have no news or md5-cache and it
> can be painful to generate metadata on one of them.
>

So my strawperson response is that you should have 2 repos.

PORTDIR=https://gitweb.gentoo.org/repo/sync/gentoo.git/log/?h=master # a local copy of this thing.
PORTDIR_OVERLAY=/path/to/your/checkout/of/gentoo.git

I suspect, however, that this likely performs ... poorly, particularly
in worst-case situations, as the 'overlay' would of course be massive in
this configuration.

>
> I rely on scripts to pull down the rsync metadata to expedite this
> process. e.g. rsync <host>/gentoo-portage/metadata/md5-cache/.  Git has
> no easy sub-tree download equivalent that I know of.
>

So I think overlaying the news and GLSA bits is easy (you have a
post-sync script that cd's into various directories and clones the news
and GLSA repos). The costly bit is likely the metadata regeneration for
your development branch of the tree. I'd be curious to see how much this
costs (both cold and hot) for you to generate locally.

-A

>
> Brian
>
>

--000000000000de941c057d502f43
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--000000000000de941c057d502f43--
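
A rough sketch of the two steps suggested in the reply above: a post-sync step that clones or updates the news and GLSA repos inside a git checkout, and a helper that times a metadata regeneration command once cold and once hot. This is only an illustration under stated assumptions, not anything posted in the thread: the clone URLs, the metadata/news and metadata/glsa target directories, and all function names are hypothetical and should be checked against your own setup.

```python
import shutil
import subprocess
import time
from pathlib import Path

# ASSUMPTION: these URLs and target directories are illustrative only;
# neither is given in the thread.
AUX_REPOS = {
    "metadata/news": "https://anongit.gentoo.org/git/data/gentoo-news.git",
    "metadata/glsa": "https://anongit.gentoo.org/git/data/glsa.git",
}


def plan_sync(tree_root, aux_repos=AUX_REPOS):
    """Return the git commands needed to clone or update each auxiliary repo.

    A target that already contains a .git directory is updated with a
    fast-forward-only pull; anything else is treated as not yet cloned.
    """
    commands = []
    for subdir, url in aux_repos.items():
        target = Path(tree_root) / subdir
        if (target / ".git").is_dir():
            commands.append(["git", "-C", str(target), "pull", "--ff-only"])
        else:
            commands.append(["git", "clone", "--depth", "1", url, str(target)])
    return commands


def post_sync(tree_root, aux_repos=AUX_REPOS):
    """Run the planned git commands, stopping at the first failure."""
    for argv in plan_sync(tree_root, aux_repos):
        subprocess.run(argv, check=True)


def timed(argv):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


def cold_and_hot(argv, cache_dir):
    """Time argv with cache_dir removed first (cold), then again (hot).

    WARNING: this deletes cache_dir before the first run.
    """
    shutil.rmtree(cache_dir, ignore_errors=True)
    cold = timed(argv)
    hot = timed(argv)
    return cold, hot
```

For the measurement, one would pass the regeneration command and the cache directory of the development checkout, for instance cold_and_hot(["egencache", "--update", "--repo", "gentoo", "--jobs", "4"], "/path/to/your/checkout/of/gentoo.git/metadata/md5-cache"), assuming egencache is the regeneration tool in use and that it writes its cache to that directory. "Cold" here only means the md5-cache was removed first; it does not flush the operating system's file cache.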