Pardon the bluntness; I don't mean offense, I'm just specifically picking the hell out of this proposal to make a point (lucky you) :)

On Sat, Dec 17, 2005 at 03:17:44AM -0500, Andrew Muraco wrote:
> Attached is a draft of a glep for formalizing multiple-repository support

I appreciate you trying to chip in, but frankly this glep needs a lot more thought put into it.

Further, I _really_ do not see the point of glepping this either. Puking up proposals because folks are making noise is a waste of time. Document/propose for large scale changes, or conflict, not because someone requires a glep/spec before they're willing to listen to the _developers_ of a project about how to integrate a new feature into _their_ project.

> This is far from ideal in many ways, but i'm too tired and I drank too
> much caffeine to be sane.
>
> Comments, objections, anything constructive is welcome.

Inlined...

> Abstract
> ========
> To implement a functional and expandable method for Portage to support multiple repositories.
>
> Motivation
> ==========
> Multiple Repository support is needed, this GLEP is to address this need.

Define multiple repository. We _have_ multi repo already (binpkg and portdir, let alone overlays).

> Specification
> =============
>
> Portage will make use of two (2) ways to address repositories:
> * A User-defined name, which is likely to be used as a convenience in most situations - this will be referred to as REPO_NAME in this GLEP
> * A hard-coded repository-id which will be found in the repository tree at: metadata/repo_id - this will be referred to as REPO_ID in this GLEP
> Both names will contain no spaces, and only standard characters [TODO: references]

Externally portage will use the user-defined name; internally it will do its own thing.

> Repositories
> ------------
>
> Each repository will contain:
> * the repo name in metadata/repo_id
> * repo information such as maintainer of the repo, notes on who hosts it, etc will be contained in metadata/repo_info
> * unique packages.mask which will only apply to ebuilds within that specific repo.
>
> The REPO_ID must match the name that will be used for rsync
> Therefore, rsync://MyServer.tdl/REPO_ID/

No. It's arbitrary, and invalid to assume rsync is the only sync uri that's going to be used- this isn't even getting into _remote_ repos.

*ANY* unique id tagged into a repo is just that, a magic constant in its metadata. Just that. No mandates about SYNC, file layout, etc, that bind to the metadata id will fly.

> /etc/portage/*
> -------------
>
> In order to provide users with the current set of options and extend them so they can be customized to each repository, the structure of /etc/portage
> will remain similar with these changes:
> * /etc/portage/REPO_NAME/* will be the location of repository-specific portage files.
> * /etc/portage/ will continue to function over all repos
> ** ex) =sys-devel/gcc-4 -* in /etc/portage/package.keywords would use the latest gcc-4 regardless of what tree it comes from.
>
> The following new files will be added to /etc/portage:
> * /etc/portage/repositories.perfer - will contain each REPO_NAME in order of preference, higher is more preferred. (Each REPO_NAME will be on a separate line)

Yuck. This is a bit of a waste for a single file.

> ** In the absence of this file portage should use repositories in alphabetical order.

Directory-returned ordering, not alpha: no ordering == a set of repositories, thus don't try to induce a fallback ordering.

> * /etc/portage/REPO_NAME/repository.id - contains the specific REPO_ID which REPO_NAME applies to.

This is going to induce more slowdown for any config instantiation- more directories/files to scan.
Iow, python -c 'import portage' (which instantiates a config obj) is going to get slower, which will piss off the slower system folk even further (hell, even with 1.5ghz and decent IO it's still a 10s import for me uncached).

> * /etc/portage/REPO_NAME/repository.conf - will contain any repository-specific options, which can include, but is not limited to, FEATURES="" C[XX]FLAGS="".
> ** This will also include a new variable; OPTIONS="" of which is similar to FEATURES, but modifies the way portage will handle that specific repository.
> A few examples of options which could be useful:

This seems a bit arbitrary.

> *** EXCLUDESYNC - Prevents portage from doing a sync on this repo.

And how are you going to specify the sync method for that repo?

> *** EXCLUDEUPDATE - Prevents portage from using ebuilds in this repo as updates for packages which currently reside in a different repo.
> *** EXCLUSIVEUPDATE - forces any update to any package which is from this repository to a newer version which resides inside of this repo.

And this is implemented how? Why is it specifying resolver directives as repo attributes?

When is this forced -U going to be triggered? Sync? Next random emerge call?

How is this going to be bound to the resolver, let alone the question of how build configuration data is going to be bound to the actual build? Yes, leading question there. :)

Where in this proposal is it extending similar capabilities to binpkgs?

> *** et al.
>
> All of the repository rsync URIs will be stored in /etc/make.conf
> SYNC="rsync://myfavoriterepo.org/myportage \
> rsync://rsync.namerica.gentoo.org/gentoo-portage"

No. No no no no no.... at least this explains the metadata ID being part of SYNC comment though :)

> The Tree: /usr/portage -> /var/repositories/REPO_ID/
> ----------------------
> The repository tree will need to be moved, each repository will have its own folder: /var/repositories/REPO_ID/.

This directly castrates the user's ability to store the tree wherever they want; loss of that long standing capability is an issue. No go there.

> For compatibility reasons, /usr/portage will be treated as /var/repositories/gentoo-portage

Building hacks into portage isn't going to fly; no special cases. The need for this is a sign that the forced FHS compliant path (instead of letting the user do whatever they want) is a bad idea.

This beast can be accomplished without sacrificing our existing flexibility.

> Ebuilds
> -------
>
> Ebuilds will now be able to have dependencies based on packages from specific repositories.
>
> * DEP Atoms now support the following format: =REPO_ID:SLOTNUM:CAT/EBUILD-X.Y.Z
> ** Ex1) >=MyRepo:2:sys-devel/gcc-4.0
> ** Ex2)
> ** Ex3 ::foo/bar
> Dependency atoms that lack the new format (::) or do not have a REPO_ID will then just use any ebuild which fulfills the requirements.

Backwards compatibility? Protection for portage versions that see this and aren't capable of handling it?

Why are you proposing the use/slot additions be changed so slot is prefixed rather than postfixed?

> Backwards Compatibility
> =======================
> /usr/portage will be treated as /var/repositories/gentoo-portage so it would be possible to function with no changes after the upgrade.

This is a bit short... see above. There's a lot more to it, especially since you're trying to use stable.

The glep doesn't provide any way of knowing what repos are active... nor what repos are overlaying each other, nor does it extend the capabilities to anything beyond ebuild trees. Bintree? VDB? (yes, extending slaving capabilities to vdb has uses).

The shortcomings I'm pointing at are historical- you're trying to shoehorn into the existing portage configuration, thus I'm pointing out the failings of it :)

Basically... this glep is exactly opposite of what I'm after, and what I've already written in saviour.

Saviour's configuration *currently* is ini based, although that isn't a requirement (pluggable config parser is there, although marienz is working on making my initial prototype not suck). Make.conf support will be there (backwards compatibility); it'll just require a config parser written to convert make.conf format into the internal config definitions.

The existing stable configuration is inherently single master repo set, single configuration, single domain, single root, single sync. There's no grouping in make.conf (nor aux files), thus it's pretty fricking hard to try and make it grouped. Further, there's no control over the individual objects- no way to specify that a repo is remote, for example.

All the make.conf support would do for saviour crap is translate it into a single domain, single repo + overlays, etc. If folks need more power (separate caches, separate syncs for repos, N domains/roots...), they need to use a more powerful config format.

Via ini, you have repo definitions, defining the repo class to use, location (if applicable), host (if applicable), cache reference (cache instance to use, defined via its own configuration item), slaving, multiplexing together, package.* specifications (state the filepath)... etc. Plus, user label (section label for that repo definition).

The framework in place allows for a helluva lot more crazy stuff (your own repo classes, say a strict R->L union of repos), I'm just rambling off the obvious targets.

The configuration parser/handler actually addresses all of this crap on its own without forcing us to introduce hacks to try and shoehorn N repos into the current single repo config.

Old doc of details/intentions: http://dev.gentoo.org/~ferringb/portage/3.0/dev-notes/framework/config.txt

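For a rough idea of the shape, here's a minimal sketch of an ini file along those lines, read with Python's stdlib configparser. Every section name and key in it (type, class, location, cache, slave_of) is made up for illustration; it is not saviour's actual format, which is what the doc linked above describes.

```python
import configparser

# Hypothetical ini layout: section labels and keys are illustrative only,
# NOT the real saviour/portage 3 configuration format.
SAMPLE = """
[rsync cache]
type = cache
class = flat_list
location = /var/cache/edb/dep

[gentoo]
type = repo
class = ebuild_tree
location = /usr/portage
cache = rsync cache

[local overlay]
type = repo
class = ebuild_tree
location = /usr/local/portage
slave_of = gentoo
"""


def load_repos(text):
    """Return {user label: settings} for every section declared as a repo."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    repos = {}
    for label in parser.sections():
        section = parser[label]
        if section.get("type") != "repo":
            continue  # caches etc. are separate configuration items
        settings = dict(section)
        # A cache reference must name another section defined in the file.
        cache = settings.get("cache")
        if cache is not None and not parser.has_section(cache):
            raise ValueError(f"repo {label!r} references unknown cache {cache!r}")
        repos[label] = settings
    return repos


repos = load_repos(SAMPLE)
for label, settings in repos.items():
    print(label, "->", settings["location"])
```

The point of the sketch is just that the section label doubles as the user-visible repo name, and that things like caches and slaving are expressed as references between sections rather than as extra directories to scan.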
The config framework is generalized and flexible- prior to screaming "off with his head" for the ini usage, please read through http://dev.gentoo.org/~ferringb/portage/3.0/dev-notes ; there are a _lot_ of good reasons, jotting down of intention, layout/design of portage 3, etc. Do the reading prior to the screaming ;)

Either way... here's how it *should* proceed from where I'm sitting.

Introduce full PORTDIR capabilities to all repos available- vdb, binpkg, portdir/portdir_overlay. That right there is a major request people have been pushing for, and in doing so the framework gets hammered on/debugged.

Introducing resolver level constraints (match strictly from this repo) requires atom extension among other things, and introduces its own class of problems. Introduce the capabilities while treating existing repos as a single repo set, then extend it with what's needed for true stand alone repos.

Either way, while discussion of standalone repos is good, I'm going to kindly remind y'all that glep42 just needs to pass in a repo id- it's not blocked by anything but minor api changes to portageq so that the portage devs aren't forced to fix someone else's proposal down the line (in the process breaking crap for users when we do so).

True stand alone repository capabilities aren't required by/bound to glep42; all that's required out of glep42 is that the syncing repo id be used now (even if it may seem superfluous). Iow, news *should* work regardless of whether it's an overlay or a stand alone repo.

Please keep that in mind.
~harring