* [gentoo-user] sync-type: rsync vs git
From: Grant Edwards @ 2022-04-27 14:22 UTC
To: gentoo-user

A while back I switched the sync-type for the gentoo repo on one of my
machines from rsync to git, using

    https://anongit.gentoo.org/git/repo/sync/gentoo.git

because that machine is behind a firewall that stopped allowing rsync
connections.

Is there any advantage (either to me or the Gentoo community) to
continue to use rsync and the rsync pool instead of switching the
rest of my machines to git?

I've been very impressed with the reliability and speed of sync
operations using git; they never take more than a few seconds. When
using rsync, it seems like I regularly had to spend time trying
different mirrors and hard-wiring one in my config file because the
one I (or the pool) had chosen had fallen back to using a Bell-212
modem for its internet connection. Sync operations often used to take
many minutes and would sometimes just hang.

--
Grant
* Re: [gentoo-user] sync-type: rsync vs git
From: Rich Freeman @ 2022-04-27 15:18 UTC
To: gentoo-user

On Wed, Apr 27, 2022 at 10:22 AM Grant Edwards
<grant.b.edwards@gmail.com> wrote:
>
> Is there any advantage (either to me or the Gentoo community) to
> continue to use rsync and the rsync pool instead of switching the
> rest of my machines to git?
>
> I've been very impressed with the reliability and speed of sync
> operations using git; they never take more than a few seconds.

With git you might need to occasionally wipe your repository to delete
history if you don't want it to accumulate (I don't think there is a
way to do that automatically, but if you can tell git to drop history
let me know). Of course that history can come in handy if you need to
revert something, etc.

If you sync infrequently, say once a month or less, then I'd expect
rsync to be faster. This is because git has to fetch every single set
of changes since the last sync, while rsync just compares everything
at the file level. Over a long period of time that means that if a
package was revised 4 times and old versions were pruned 4 times, then
you end up fetching and ignoring 2-3 versions of the package that
would never be fetched at all with rsync. That can add up if it has
been a long time.

On the other hand, if you sync frequently (especially daily or more
often), then git is FAR less expensive in both IO and CPU, on both
your side and the server side. Your git client and the server just
communicate what revision they're at, the server can see all the
versions you're missing, and it sends the history in between. Then
your client can see which objects it is missing and wants, and fetch
them. Since it is all de-duped by design, anything that hasn't changed
or that the repo has already seen does not need to be transferred.

With rsync you need to scan at least the entire filesystem metadata on
both ends to figure out what has changed, and if your metadata isn't
trustworthy you need to hash all the file contents (which isn't done
by default). Since git is content-hashed, you basically get more data
integrity than the default level for rsync, and the only thing that
needs to be read is the git metadata, which is packed efficiently.

Bottom line is that I think git just makes more sense these days for
the typical gentoo user, who is far more likely to be interested in
things like changelogs and commit histories than users of other
distros. I'm not saying it is always the best choice for everybody,
but you should consider it and improve your git-fu if you need to.

Oh, and if you want the equivalent of an old changelog, just go into a
directory and run "git whatchanged ."

--
Rich
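The changelog tip can be wrapped in a small shell helper. This is only a sketch: the function name pkg_changelog, the repo path and the package name in the usage line are illustrative assumptions, and it only shows useful output when the checkout actually holds history (not a depth-1 clone). It uses "git log --raw --no-merges", which the git-whatchanged manual page describes as essentially the same thing.

```shell
# Hypothetical helper: per-package "changelog" from a git-synced repo.
#   $1 = path to the repo checkout (assumed: /var/db/repos/gentoo)
#   $2 = path inside the repo, e.g. a category/package directory
pkg_changelog() {
    # --raw lists the files each commit touched; --no-merges skips merges,
    # matching the defaults of "git whatchanged".
    git -C "$1" log --raw --no-merges -- "$2"
}
```

Usage might look like "pkg_changelog /var/db/repos/gentoo app-editors/vim | less", with both arguments adjusted to the local setup.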
* [gentoo-user] Re: sync-type: rsync vs git
From: Grant Edwards @ 2022-04-27 16:24 UTC
To: gentoo-user

On 2022-04-27, Rich Freeman <rich0@gentoo.org> wrote:
> On Wed, Apr 27, 2022 at 10:22 AM Grant Edwards
> <grant.b.edwards@gmail.com> wrote:
>>
>> Is there any advantage (either to me or the Gentoo community) to
>> continue to use rsync and the rsync pool instead of switching the
>> rest of my machines to git?
>>
>> I've been very impressed with the reliability and speed of sync
>> operations using git; they never take more than a few seconds.
>
> With git you might need to occasionally wipe your repository to
> delete history if you don't want it to accumulate (I don't think
> there is a way to do that automatically but if you can tell git to
> drop history let me know).

I don't think I have any history. I use sync-depth=1 and
clone-depth=1. Both git log and git whatchanged only show one commit.

> Of course that history can come in handy if you need to revert
> something/etc.

Perhaps I should keep a few levels of history...

> If you sync infrequently - say once a month or less frequently, then
> I'd expect rsync to be faster.

I generally sync several times a week, and git is often very much
faster than rsync. Git is always done in a few seconds. The time
required for rsync varies widely, from a handful of seconds to tens of
minutes.

> This is because git has to fetch every single set of changes since
> the last sync, while rsync just compares everything at a file level.
> [...]
> That can add up if it has been a long time.

AFAICT, the emerge repo git "depth" settings of 1 prevent that: the
intermediate versions are discarded on the server side, as is previous
local history. The end result is similar to rsync: you fetch only the
current version of what's changed since the last "sync", and there's
no local history.

> Bottom line is that I think git just makes more sense these days for
> the typical gentoo user, who is far more likely to be interested in
> things like changelogs and commit histories than users of other
> distros. I'm not saying it is always the best choice for everybody,
> but you should consider it and improve your git-fu if you need to.
> Oh, and if you want the equivalent of an old changelog, just go into a
> directory and run "git whatchanged ."

Right now, with a depth of 1, git log/whatchanged don't provide any
information (they think all files were new as of the last "sync").

What I should figure out is what settings will preserve a few levels
of the changes that have been made to my local repo, without
preserving intermediate changes to the master repo that never got used
locally.

IOW, I want all the changes made during a single "sync" to go into my
local repo as a single commit, regardless of how many commits have
been made to the master repo since my previous "sync". I think git can
do that -- whether the emerge sync settings in
/etc/portage/repos.conf/gentoo.conf allow me to tell emerge to tell
git to do that is the question.

--
Grant
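For reference, the settings named above can be sketched as a repos.conf stanza. The shell function below only prints the stanza so it can be compared with the existing file. The function name gentoo_git_stanza and the location path are assumptions, not taken from the thread; sync-type, sync-uri, clone-depth and sync-depth are the keys and the URL mentioned in the messages. The exact defaults and semantics of the depth keys should be checked in portage(5) for the installed Portage version.

```shell
# Hypothetical helper: print a repos.conf stanza for git sync.
#   $1 = history depth to request (defaults to 1, as described above)
gentoo_git_stanza() {
    depth=${1:-1}
    cat <<EOF
[gentoo]
# location is an assumption; keep whatever the existing file uses
location = /var/db/repos/gentoo
sync-type = git
sync-uri = https://anongit.gentoo.org/git/repo/sync/gentoo.git
clone-depth = ${depth}
sync-depth = ${depth}
EOF
}
```

Running "gentoo_git_stanza 5" prints a stanza asking for a depth of 5. As far as git's --depth option goes, a larger depth keeps that many upstream commits; it does not squash each local sync into a single commit, so it is only an approximation of what is asked for here.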
* Re: [gentoo-user] Re: sync-type: rsync vs git
From: Wol @ 2022-04-27 20:10 UTC
To: gentoo-user

On 27/04/2022 17:24, Grant Edwards wrote:
> IOW, I want all the changes made during a single "sync" to go into my
> local repo as a single commit regardless of how many commits have been
> made to the master repo since my previous "sync". I think git can do
> that -- whether the emerge sync settings in
> /etc/portage/repos.conf/gentoo.conf allow me to tell emerge to tell
> git to do that is the question.

I don't know as that will do you any good. Just use git tags: every
time you do a "sync; emerge", just tag the repository with the date.
So when you list the tags you'll see all the dates you did an update,
and by branching to that tag, you'll be able to go back to that date.

I just use "lvm snapshot" :-)

Cheers,
Wol
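A minimal sketch of the tagging idea in shell. The function names and the "sync-YYYY-MM-DD" tag format are made up for illustration, and whether tagged commits stay usable in a shallow (depth 1) checkout after later syncs is an open assumption that would need testing.

```shell
# Hypothetical tag name for a sync on a given day, e.g. sync-2022-04-27.
#   $1 = optional date string; defaults to today's date in UTC
sync_tag_name() {
    printf 'sync-%s\n' "${1:-$(date -u +%Y-%m-%d)}"
}

# Tag the current HEAD of the repo at $1 with today's sync tag.
# -f moves the tag if more than one sync happens on the same day.
tag_sync() {
    git -C "$1" tag -f "$(sync_tag_name)"
}
```

After a sync, "tag_sync /var/db/repos/gentoo" (path assumed) would record the date, "git tag -l 'sync-*'" run in the repo lists the recorded dates, and "git checkout <tag>" goes back to one of them.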
* Re: [gentoo-user] sync-type: rsync vs git
From: Wol @ 2022-04-27 20:07 UTC
To: gentoo-user

On 27/04/2022 16:18, Rich Freeman wrote:
> On Wed, Apr 27, 2022 at 10:22 AM Grant Edwards
> <grant.b.edwards@gmail.com> wrote:
>> Is there any advantage (either to me or the Gentoo community) to
>> continue to use rsync and the rsync pool instead of switching the
>> rest of my machines to git?
>>
>> I've been very impressed with the reliability and speed of sync
>> operations using git; they never take more than a few seconds.
> With git you might need to occasionally wipe your repository to delete
> history if you don't want it to accumulate (I don't think there is a
> way to do that automatically but if you can tell git to drop history
> let me know).

Look into "git pack". It won't get rid of old versions, but I think it
compresses all the old stuff. But once the repository has been packed,
I gather it's normal for the old packed stuff to take up less space
than the current stuff.

Cheers,
Wol
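As far as I know there is no subcommand literally named "git pack"; the packing described here is normally done by "git gc" (or the lower-level "git repack"). A small shell sketch for checking how much space the repository's history takes and for repacking it follows; the function names are illustrative and the repo path passed in is an assumption about the local setup.

```shell
# Hypothetical helper: report object counts and on-disk size of a repo.
#   $1 = path to the repo checkout
repo_git_size() {
    # -v prints detailed counters (including size-pack); -H uses
    # human-readable units.
    git -C "$1" count-objects -vH
}

# Hypothetical helper: repack loose objects and prune unreachable ones.
repack_repo() {
    git -C "$1" gc
}
```

Comparing the "size-pack" line of "repo_git_size /var/db/repos/gentoo" before and after "repack_repo /var/db/repos/gentoo" shows whether packing made a difference.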