From: Brian Harring
Date: Fri, 26 Aug 2005 02:35:29 -0500
To: gentoo-dev@lists.gentoo.org
Cc: gentoo-portage-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] EBUILD_FORMAT support
Message-ID: <20050826073529.GP1701@nightcrawler>
In-Reply-To: <200508251234.00876.pauldv@gentoo.org>
References: <20050707002002.GH20687@lightning.stealer.net> <200508231520.16966.pauldv@gentoo.org> <20050823160045.GJ10816@nightcrawler> <200508251234.00876.pauldv@gentoo.org>

Pardon the delay, been putting this one off since it's going to be a
fun one to address, and will be a bit long :)

On Thu, Aug 25, 2005 at 12:34:00PM +0200, Paul de Vrieze wrote:
> What I mean is compatibility with
> current portage versions. Current versions do not understand EAPI.
> There would be a good chance that they could choke on packages with
> all kinds of new features, even in the sync phase. A different
> extension would ensure that those portage versions would still work
> (crippled) on a new tree. Of course such an extension change should
> only be done once. Once the API versions are available this is not an
> issue.

General portage stance towards EAPI: an unset EAPI == 0 (current stable
ebuild format); if the ebuild's EAPI > portage's internal EAPI, portage
is unable to merge it, which should be detectable while working out the
buildplan.

Current portage doesn't know about EAPI; boned in that respect I'll
admit, but it's the case for all new features rolled out. Three options
for dealing with this:

1) Usual method, deploy support, N months later use support.
2) Tweak stable so it's aware and can complain. Still requires
   people to upgrade, just makes it so that they're not forced into
   upgrading to 3.x; this is mainly a benefit for those who may not
   care to try the first few releases of 3.x when it hits (akin to
   people dodging the first release or two of a gcc release).

Worth noting that one rather visible aspect of EAPI=1 is that
(assuming the council votes on it, and yay's it) glep33 *will* result
in current eclasses being effectively abandoned w/in the N months
after an EAPI capable portage is released.

Sounds kind of bad, but people will have to upgrade for the
capabilities. If EAPI was pegged into portage/ebuilds already
it wouldn't be an issue, issues could be detected prior.
Unfortunately it's not, and introduction of it (and use of it) is
going to involve a few road bumps.

Plus side, once it's in, portage *will* know if the ebuild is
incompatible with the pythonic/bash ebuild code, and portage/the UI
can act accordingly.
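
A rough sketch of the stance described above (unset EAPI == 0, anything
newer than what portage supports is flagged before merging). Every name
here (SUPPORTED_EAPI, effective_eapi, can_merge, unmergeable) is made up
for illustration and is not portage code; it also assumes EAPI values
are plain integers, which is how they are compared in this thread.

```python
# Hypothetical sketch; none of these names are real portage APIs.
SUPPORTED_EAPI = 1  # highest EAPI this imaginary portage understands


def effective_eapi(metadata):
    """Return the ebuild's EAPI as an int; unset or empty means 0."""
    raw = str(metadata.get("EAPI", "")).strip()
    return int(raw) if raw else 0


def can_merge(metadata, supported=SUPPORTED_EAPI):
    """True if the ebuild's EAPI is not newer than the supported one."""
    return effective_eapi(metadata) <= supported


def unmergeable(plan, supported=SUPPORTED_EAPI):
    """Given a build plan as a list of (cpv, metadata) pairs, return
    the cpvs that cannot be merged because their EAPI is too new."""
    return [cpv for cpv, md in plan if not can_merge(md, supported)]
```

Running unmergeable() over the whole plan up front is the point: the
complaint comes while the plan is built, not halfway through a merge.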
Meanwhile, the changes that are being pushed into EAPI are addition of
a configure phase (broken out from compile), elib addition, and eclass2
support (same beast, different rules due to env save/restoration).

The potential for horkage on sync'ing isn't there due to the fact
that's purely python side; ebuild*sh doesn't play into it.

Re: regen, issue isn't really there either; if you try and merge an
eapi=0 ebuild on a non eapi aware portage, it works, same as it did
before. If you try to merge an eapi=1 ebuild you hit either an issue
with inherit, or a bail immediately in src_compile, due to the fact
eapi=1 ebuilds will separate configure out from compile (eapi=0 portage
won't know to call it; no configure == failed compile).

That said, there also quite likely is a change coming down the pipe to
the tree's cache; the change will shift the rsync'd metadata cache
over to a key/val based cache.

Why oh why, yet another cache change? Simple. The change moves away
from list based format to key:value pairs; in short it's a change that,
once made, means keys can be added to the cache from that point on
without causing cache complaints on sync'ing. Last cache breakage, I
swear :P

EAPI addition being the next key tagged in; stable (not surprising)
needs to be released with a version capable of reading both old and
new format; once that's done, time for the usual "yo, upgrade people,
something's coming down the line". Same version that supports
old and new cache format can also include rudimentary eapi awareness.

At least that's what I'm thinking. It's roughly in line with the
previous forced cache breakages, just in this case slipping in some
extra support in the process.

Notices obviously would go out prior to moving on this also, along
with a good chunk of waiting.

> > > ps. I would also suggest requiring that EAPI can be retrieved by a
> > > simple line by line parsing without using bash.
> > > (This allows for changing the parsing system)
> >
> > No, that yanks EAPI setting away from eclasses.
>
> If the eclasses follow similar rules that would be easily parseable.
> (taking inherit ...) into account is easy as long as the inherit line is
> on one line of its own. (unconditionally) These rules that would
> already be followed out of style reasons would make various kinds of
> parsers able to parse them.

While it's insane, people *can* use indirection (eg inherit $var) for
inherits as long as it's deterministic, always the same inherit call
for that ebuild's data. Don't see a good reason to ixnay that, which
means we'd have to parse the whole enchilada, eclasses and the ebuild.

Effectively, raiding a single var out wouldn't fly; eclasses could
override an ebuild's eapi setting for example, just like any other
metadata key (imo).

A *true* format change, moving away from bash for example or moving to
an executing design of ebuilds, would require an extension change; such
a change must, imo, since it's not a change of the ebuild env's
template/hooks; it's a fundamentally different model for ebuilds,
either via no longer being bash based, or moving away from our
declarative design of ebuilds.

> > Only time this would be required is if we move away from bash; if that
> > occurs, then I'd think a new extension would be required.

Contradicting myself there; what I said above is correct.

>
> It would allow to for example restrict the ebuild format such that initial
> parsing is not done by bash (but the files are still parseable by bash).
> If we perform changes I think it should be done right in the first place.

Elaborate please.

> > As is, shifting the 'template' loaded for an ebuild can be done in
> > ebd's init_environ easy enough, so no reason to add the extra
> > restrictions/changes.
>
> One of the issues of ebuilds is the cache/metadata stuff.
> Parsing an ebuild for basic information takes a lot of time. This can
> be done lots faster with a less featured parser (I've written one some
> day) that accepts 98% of all current ebuilds, just doesn't like dynamic
> features in the toplevel. Such a parser could be a python plugin and as
> such easy to use from python. However to ensure compatibility with a
> faster parser the EAPI variable should be there in a way that is a
> little more strict than the other variables. And such a restriction is
> in practice not a restriction.

Any parser that doesn't support full bash syntax isn't acceptable from
where I sit; re: slow down, 2.1 is around 33% faster sourcing the
whole tree (some cases 60% faster, some 5%, etc). The speed ups are
also what allow templates to be swapped, the eapi concept.

I'd note limiting the bash capabilities is a restriction that
transcends anything EAPI should supply; changes to what's possible in
the language (a subset of bash syntax as you're suggesting) are a
separate format from where I draw the line in the sand.

Mainly, limiting the syntax has the undesired effect of deviating from
what users/devs know already; mistakes *will* occur. QA tools can be
written, but people are fallible; both in writing a QA tool, and
abiding by the syntax subset allowed.

> The restriction I propose would be:
> - If EAPI is defined in the ebuild it should be unconditional, on its own
>   line in the toplevel of the ebuild before any functions are defined.
>   (preferably the first element after the comments and whitespace)
>
> - If EAPI is not defined in the ebuild, but in an eclass, the inherit
>   chain should be unconditional and direct. Furthermore in the eclass the
>   above rules should be followed.
>
> Please note that many of the conditions are already true for current
> ebuilds, just portage can "handle" more.

The inherit chain must be unconditional anyways.
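
For reference, the first quoted rule (EAPI unconditional, on its own
toplevel line, before any function) is roughly what a bash-free scan
like the one below would rely on. This is only a sketch of the proposal
under discussion, not the parser mentioned in the quoted mail and not
portage code; scan_eapi and both regexes are hypothetical.

```python
import re

# Hypothetical helper, not part of portage.
# Matches an unindented assignment such as EAPI=1 or EAPI="1".
_ASSIGN = re.compile(r'^EAPI=(["\']?)([A-Za-z0-9._+-]*)\1\s*(?:#.*)?$')
# Matches the start of a function definition such as: src_compile() {
_FUNC = re.compile(r'^\s*(?:function\s+)?[A-Za-z_][A-Za-z0-9_]*\s*\(\s*\)')


def scan_eapi(text):
    """Return the EAPI string from a toplevel assignment, or None.

    Scanning stops at the first function definition, since the
    proposed rule requires the assignment to come before any functions.
    """
    for line in text.splitlines():
        if _FUNC.match(line):
            break
        m = _ASSIGN.match(line)
        if m:
            return m.group(2)
    return None
```

Note what it cannot do: for an ebuild that only does "inherit $var" and
picks up EAPI from the eclass, it returns None. The second quoted rule
would need the eclasses to be followed and scanned the same way, and
inherit indirection defeats that without actually running bash.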
Re: eapi placement, I would view that as somewhat arbitrary; the
question is what gain it would give.

I'd wonder about the parsing speed of your parser; the difference
between parsing ebuilds and running from cache metadata is several
orders of magnitude. The current cache backend flat_list and portage
design, properly corrected, ought to widen the gap too.

General cache lookup is slow due to:
A) bad call patterns, allowed by the api; N calls to get different
   bits of metadata from a cpv, resulting in potentially N sets of
   disk ops.
B) default cache requires opening/closing a file per cpv lookup;
   syscalls are killer here.
C) every metadata lookup incurs 2 stats, ebuild and cache file.

Getting to the point; cache is 100x to 400x faster than sourcing for
<=2.0.51. Haven't tested it under 2.1, should be different due to
cache and regen fixups/rewrites.

Back to the point, essentially, EAPI matters in two places:
1) metadata transfer from the ebuild env into python side during
   depends phase; has to know what to transfer key wise.
2) actual ebuild build phase executions; if it isn't the depends phase,
   eapi is required so that the parser can swap in the appropriate
   ebuild env template.

The restrictions suggested for EAPI would only make sense if eyeing
#1, an alternative parser; no reason to drop the cache unless the
parser is capable of hitting the same runtime performance the cache
can hit (frankly, it's not possible from where I'm sitting although
the gap can be narrowed).

So... the EAPI limitations, not much in favor of them due to the
conclusion above. Interested in the parser however, since ebd is
effectively a pipe hack so that pythonic portage can control ebuild.sh.
I (and others) have been after a bashlib for a while, just no one has
crunched down and done it (easier said than done I suspect).

My 2 cents at least.
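
To make cache point A and the key/val cache idea concrete, a small
sketch: one load per cpv answers any number of keys, and unknown keys
are simply carried along. The "KEY=value" line format, parse_entry and
EntryCache are assumptions for illustration only; the actual on-disk
format and portage's cache API are not shown in this mail.

```python
# Hypothetical sketch; the KEY=value line format and all names here
# are assumptions for illustration, not portage's cache code.

def parse_entry(text):
    """Parse one cache entry of KEY=value lines into a dict.

    Keys are not checked against a fixed list, so a new key (EAPI,
    say) can appear later without older readers complaining.
    """
    entry = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without a separator
            entry[key] = value
    return entry


class EntryCache:
    """Loads each cpv's entry once and answers many keys from it."""

    def __init__(self, loader):
        self._loader = loader  # callable: cpv -> raw entry text
        self._entries = {}
        self.loads = 0  # how many times the loader was hit

    def get(self, cpv, keys):
        """Return the values for keys, in order; missing keys give ''."""
        if cpv not in self._entries:
            self._entries[cpv] = parse_entry(self._loader(cpv))
            self.loads += 1
        entry = self._entries[cpv]
        return [entry.get(k, "") for k in keys]
```

Asking for DEPEND, SLOT and EAPI in one get() call costs one load;
asking for them in three separate uncached calls is the N-trips-to-disk
pattern described in A.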
~harring

-- 
gentoo-dev@gentoo.org mailing list