From: Duncan <1i5t5.duncan@cox.net>
To: gentoo-portage-dev@lists.gentoo.org
Subject: [gentoo-portage-dev] Re: How to have several gentoo repos on one machine?
Date: Thu, 22 Oct 2015 11:26:51 +0000 (UTC)
Message-ID: <pan$10876$c7dac759$895e9925$b575df20@cox.net>
In-Reply-To: 1445496485.31293.42.camel@transmode.se

Joakim Tjernlund posted on Thu, 22 Oct 2015 06:48:06 +0000 as excerpted:

> On Thu, 2015-10-22 at 02:29 +0000, Duncan wrote:
>> Joakim Tjernlund posted on Wed, 21 Oct 2015 11:08:02 +0000 as
>> excerpted:
>>
>> > I need to have more than one gentoo repo on my computer.
>> > this did not work as "portageq repositories_configuration /"
>> > complains:
>> > !!! Section 'tm-cusfpv3' in repos.conf has name different from
>> > repository name 'gentoo' set inside repository
>> >
>> > I figured the name in repos.conf would just override
>> > /usr/local/portage/tm-cusfpv3/profiles/repo_name ?
>>
>> While it's not quite clear to me either why you'd need two identical
>> gentoo repos[...]
>
> I use one for my host and the other for cross building our products root
> FS and they are not in sync. That rules out the aliases I guess?

I think so, yes.  However, as a user I'd really like to understand
aliases, their purpose, and at a high level how they work, and the
current manpage doesn't help much there.  Without that, I don't know
enough about aliases to say anything further.

But meanwhile, I was in much the same situation for a while: I was
building for my main amd64 system and, in a 32-bit chroot, for a
32-bit-only netbook, with a separate portage config for each.  In my
case both configs pointed at the same gentoo repo and overlays, using
bind-mounts into the 32-bit chroot.  Without those bind-mounts it would
have been two parallel and separate portage installations, one
configured for 32-bit x86 inside the chroot, one configured for amd64
outside it.

And that's what I'd use in your case: two separate portage
installations, which could then of course have separate configs.

That said, while I understand the principle of stability, and if it's
private there shouldn't be legal issues, I still wonder at the idea.
One of the reasons I could and did use bind-mounts, and thus literally
the same repos, was that to me the gentoo repo is the gentoo repo.
Other than snapshotting it for archiving purposes (and using one of
those snapshots should it be needed, say because I left the netbook
unupgraded for too long and it could no longer jump from the version on
it to current), a local copy that wasn't synced would no longer
represent the present state of the gentoo repo.

If I were to un-sync for other than very temporary recovery purposes,
I'd thus want to call the repo something other than gentoo, since it
would no longer represent the current state of the true gentoo repo.

And if I made changes to that unsynced repo, say to stabilize it further
(and if I wasn't doing so, what would be the purpose of keeping it
unsynced for so long?), that'd be even /more/ reason to call it
something other than gentoo, because then it would no longer properly
represent the state of the true gentoo repo at /any/ time.
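
As a rough sketch of what calling it something else could look like,
assuming the copy lives at the path from the error message above and
that its name is set only in profiles/repo_name.  The scratch directory
standing in for /, the file name tm-cusfpv3.conf and the choice of
tm-cusfpv3 as the new name are all illustrative assumptions; a copy that
also sets repo-name in metadata/layout.conf would presumably need the
same change there:

```shell
#!/bin/sh
set -e

# Scratch directory standing in for / so nothing real is touched.
root=$(mktemp -d)
repo="$root/usr/local/portage/tm-cusfpv3"
mkdir -p "$repo/profiles" "$root/etc/portage/repos.conf"

# An unsynced copy of the gentoo repo still carries the upstream name...
echo gentoo > "$repo/profiles/repo_name"

# ...so give the copy its own name instead.
echo tm-cusfpv3 > "$repo/profiles/repo_name"

# repos.conf section whose name matches profiles/repo_name.
cat > "$root/etc/portage/repos.conf/tm-cusfpv3.conf" <<EOF
[tm-cusfpv3]
location = /usr/local/portage/tm-cusfpv3
EOF

# Compare the section name with the name set inside the repo.
section=$(sed -n 's/^\[\(.*\)]$/\1/p' "$root/etc/portage/repos.conf/tm-cusfpv3.conf")
name=$(cat "$repo/profiles/repo_name")
if [ "$section" = "$name" ]; then
    echo "names agree: $name"
else
    echo "mismatch: section '$section' vs repo_name '$name'"
fi
```

Anything else that refers to the repo by name (masters entries in other
overlays, for instance) would need to follow the rename.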

But having the git repo available changes the way that works
dramatically; see below...

> I don't plan on renaming anything in the repo_name file, it should just
> be ignored and the name I have selected in repos.conf should be used.
>
> I don't see any value in repo_name file now that we have the new
> repos.conf, possibly it could be a fallback only for PORTDIR users.

The portage devs are welcome to contradict me if they like, but AFAIK it
still serves a useful purpose: double-checking that you don't, for
instance, have two repos accidentally syncing to the same place, and
that the names used to refer to the repo stay consistent.  (Again, part
of the need for consistency is that the metadata, and thus the metadata
cache, is repo-specific, so the cache is automatically invalidated if
the remote name and local name don't agree.  Locally regenerating the
metadata cache goes a long way toward avoiding that problem, but it's an
expensive operation that most users won't want to do, and keeping the
names in sync helps avoid inadvertent cache invalidation.)

>> I actually use gentoo's git-based usersync
>> repo on github, now, and thus don't rsync any repos at all any more, here,
>> and git of course has its git-ignore feature/files, which I use now.
>> But I used rsync's exclude as suggested above, for years. Worked fine.
>> =:^)
>
> Nice, I am heading the same way, using git all the way, but I'm not
> there yet.
> One problem with using git is disk space, I think. Files are just
> ignored but still present in the repo, so syncing to our embedded
> target will take a lot more space.
> Any thoughts on that?

Well, at least once your trailing target (presumably the embedded repo)
is safely past the git repo's epoch (the date of the import from cvs,
for our purposes), git's flexibility will let you check out older
versions on demand, then check out HEAD once again.

In a scenario where both copies aren't likely to be used at once, you
can use a single local git repo and just check out the version you want
dynamically.
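
The single-repo approach can be sketched with a throwaway repository.
The file name, commit messages and demo identity below are made up for
illustration; in real use the repo would be the local gentoo clone and
the pinned commit whatever the embedded target was last built from:

```shell
#!/bin/sh
set -e

# Throwaway repo standing in for the local gentoo clone.
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
# Local identity so the demo commits work without global git config.
git config user.name "Demo User"
git config user.email "demo@example.invalid"

echo "tree as used by the embedded target" > tree.txt
git add tree.txt
git commit -q -m "older tree"
pinned=$(git rev-parse HEAD)             # commit the embedded build uses
branch=$(git symbolic-ref --short HEAD)  # whatever the default branch is

echo "current tree" > tree.txt
git commit -q -am "current tree"

# Switch the working tree back to the older state for the embedded build...
git checkout -q "$pinned"
old_content=$(cat tree.txt)

# ...then return to the branch tip for the host.
git checkout -q "$branch"
new_content=$(cat tree.txt)

echo "pinned: $old_content"
echo "tip:    $new_content"
```

Only one of the two states is in the working tree at any moment, which
is why this fits only the case where the copies aren't used at once.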

In a concurrent-use scenario, there are a few ways you could go.  What
I'd probably do is keep two git repos: one synced to the gentoo remote,
presumably with full git history (or at least history back to the other
checkout), the other cloned locally from that "current" repo and parked
at the checkout of interest.
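
A minimal sketch of the two-repo layout, again with a throwaway
repository.  The directory names "current" and "embedded", the file name
and the demo identity are assumptions for illustration; "current" stands
in for the clone that tracks the gentoo remote:

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# "current" stands in for the repo that is kept synced to the remote.
git init -q current
cd current
git config user.name "Demo User"
git config user.email "demo@example.invalid"
echo "older" > tree.txt
git add tree.txt
git commit -q -m "older tree"
pinned=$(git rev-parse HEAD)    # the checkout of interest
echo "newer" > tree.txt
git commit -q -am "current tree"
cd ..

# Second repo: cloned from the local "current" repo, then parked at the
# commit the embedded target needs.
git clone -q current embedded
git -C embedded checkout -q "$pinned"

current_content=$(cat current/tree.txt)
embedded_content=$(cat embedded/tree.txt)
echo "current:  $current_content"
echo "embedded: $embedded_content"
```

A clone from a local path normally hardlinks the object files when both
repos are on the same filesystem, so the second repo should cost little
beyond its own working tree.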

If you're doing this sort of thing, the space the git repo takes up
shouldn't be a big concern.  But in case it is, it's worth noting that
given the right filesystem and dedup tools, there will actually be only
one copy of the "common" data on storage, with the two git repos
reflinking (think of a lower-level, copy-on-write hard-link) the data
they share.  That will be pretty much everything in the earlier one,
since the current one has the earlier one as history.

I'm a regular on the btrfs list, for instance, and on btrfs a very
space-efficient solution would be to do an initial git checkout of the
older, presumably embedded-target, repo, create a btrfs snapshot of it,
and then (in the working copy, not the snapshot) git-pull from the
remote to update to current.  The btrfs snapshot locks the older version
in place, while the git pull in the working copy creates any new files,
deletes any remote-deleted ones (they'll still be in the btrfs
snapshot), leaves unchanged files shared with the snapshot, and writes
updated files as new data while the old versions stay in the snapshot
(cow, copy-on-write).  For this scenario you wouldn't even need any
additional dedup tools, though if you had them they'd probably save even
more space: multiple versions of the same package often have very nearly
the same ebuilds, differing in little more than name, and dedup would
catch these as well, while the pure native btrfs snapshot method
probably wouldn't.
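
The snapshot sequence might look roughly like the following.  This is a
dry-run sketch: by default it only prints the commands, and every path,
the REMOTE_URL and PINNED_COMMIT placeholders and the branch name are
hypothetical and would need replacing before setting RUN=1 (which also
needs root, the btrfs tools and a btrfs filesystem):

```shell
#!/bin/sh
# Dry run by default: each command is printed rather than executed.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# All of these values are hypothetical placeholders.
TOP=/mnt/btrfs/repos      # directory on a btrfs filesystem
REMOTE=REMOTE_URL         # the gentoo git remote actually in use
PINNED=PINNED_COMMIT      # commit the embedded target is built from
BRANCH=master             # local branch that tracks the remote

snapshot_plan() {
    # The working copy has to be its own subvolume to be snapshotted.
    run btrfs subvolume create "$TOP/gentoo"
    run git clone "$REMOTE" "$TOP/gentoo"
    run git -C "$TOP/gentoo" checkout "$PINNED"
    # A read-only snapshot locks the older tree in place.
    run btrfs subvolume snapshot -r "$TOP/gentoo" "$TOP/gentoo-embedded"
    # Back in the working copy, move on to current.
    run git -C "$TOP/gentoo" checkout "$BRANCH"
    run git -C "$TOP/gentoo" pull
}

plan=$(snapshot_plan)
echo "$plan"
```

The embedded build would then point at the snapshot, the host at the
working copy.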

Of course I'm conservative enough that I only call btrfs "stabilizing
and maturing, but not fully stable or mature yet", for various reasons
you'll see enumerated in my posts on the btrfs list.  But if you're
following the standard sysadmin backup rule, in general I and others
have found it stable /enough/.  That rule: if it's not backed up, then
by definition you value it less than the time and resources needed to do
the backup, factored against the risk of actually needing it.  (This
nicely deals with second, third and Nth level backups as well, since the
risk of actually needing them drops accordingly, though they may well be
worth it out to some higher value of N for very highly valued data.)

Other filesystems may offer reflink or dedup features as well (xfs, as
far as I know; I'm less sure about ext4), but I went straight from
reiserfs to btrfs and am thus not really familiar with them.  (And zfs
of course is the more mature btrfs, but it has some downsides, like
needing loads of ram, with ecc strongly recommended, as well as license
concerns for people like me.  Those may eliminate it from consideration
even if it'd otherwise really be a stable and reliable version of where
btrfs is still headed but hasn't yet arrived.)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman