* [gentoo-dev] A new glep: Ebuild format and metadata handling
@ 2009-05-31 13:56 Patrick Lauer
From: Patrick Lauer @ 2009-05-31 13:56 UTC
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 911 bytes --]

Hello people,

as the discussion about glep55 has gone in circles long enough, I decided to
collect the various ideas presented around that topic and compare them in a
way that might allow us to reach a sane decision.

Since it has become quite a lot of text I've kept some parts in sentence 
fragments and bullet points. If anyone feels the need to change that into more 
verbose wording feel free to do so, I hope the idea is clear enough. I feel it 
is still a draft and could use some massaging.

If I have forgotten or misrepresented any approach, I'd appreciate an updated
or rephrased section so that it can be merged easily.

For anyone not interested in reading the whole thing, the conclusion is that 
we want to have the eapi in an easily parsed form in the ebuilds. The 
versioning rule change discussion (mostly glep54) should happen in an 
independent discussion.  

hth,

Patrick

[-- Attachment #2: glep-xxx.txt --]
[-- Type: text/plain, Size: 11356 bytes --]

GLEP: xxx
Title: Ebuild format and metadata handling
Version: $Revision: 1.0 $
Last-Modified: $Date $
Author: Patrick Lauer <patrick@gentoo.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Obsoletes: GLEP55
Created: 31-May-2009
Post-History: 31-May-2009

Problem statement
=================

As ebuild formats evolve there are potentially disruptive changes that are
technically easy to implement, but may break backwards compatibility. To
mitigate these issues in the future multiple proposals have been brought
forward.

Their common goal is to reduce the negative impact of changes on users,
especially in terms of upgrade paths and error reporting. 

The issues mentioned in GLEP55 are eapi discovery and backwards-incompatible
structural changes (global scope functions, per-package eclasses, etc.).

A completely independent issue is the change in versioning rules. It is
debatable if such a change is even wanted or needed and as such should be
discussed in an independent GLEP. 


Eapi discovery - proposals
==========================

There are currently at least four "big" proposals with various small variations
being discussed:

"haubi"
For lack of a better name we have labeled this one after the person who brought
it up the last few times, haubi. He proposes to use an eapi.eclass and define
eapi as a function. If the eclass discovers an older package manager it aborts
cleanly with a nice error message.

To quote from the initial email:
"""
To fulfill this requirement, and to make it easy for the PM to query the
EAPI without sourcing, we could specify to have the EAPI definition be
the first non-comment line, and to look like:

    inherit eapi 4

Now when the PM is capable of pre-source EAPI detection, it will set
EAPI before sourcing, eapi.eclass can see EAPI already being set and not
do the 'exit' in global scope. Or even the PM's inherit-implementation
expects to be first called with arguments "eapi 4", and not reading the
eapi.eclass at all, so the 'eapi.eclass' does not need to check for
anything, just needs to 'exit' when inherited.

After that 'inherit eapi X' line, we can specify EAPI X to do whatever
we want. It even does not need to be bash-sourceable.
Yes, it is a compromise, but it looks acceptable to me.
"""

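As a rough illustration of the pre-source detection described in the quote, a
package manager could look only at the first non-comment line. This is a
minimal sketch, not part of the original proposal: the function name, the
regular expression and the decision to skip blank lines are assumptions made
for the example.

```python
import re

# Hypothetical pattern for the proposed first line, e.g. "inherit eapi 4".
INHERIT_EAPI = re.compile(r"^inherit\s+eapi\s+(\S+)\s*$")


def eapi_from_inherit(ebuild_text):
    """Return the EAPI named on the first non-comment line, or None."""
    for line in ebuild_text.splitlines():
        stripped = line.strip()
        # Assumption: blank lines are skipped along with comments.
        if not stripped or stripped.startswith("#"):
            continue
        match = INHERIT_EAPI.match(stripped)
        # Only the first non-comment line counts, so stop here either way.
        return match.group(1) if match else None
    return None


if __name__ == "__main__":
    sample = "# Copyright header\n\ninherit eapi 4\nDESCRIPTION='demo'\n"
    print(eapi_from_inherit(sample))
```

A result of None would mean the ebuild does not follow the proposed layout,
and the package manager would have to fall back to sourcing it.
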
nihilist proposal: Keep things as they are. Things have worked acceptably well
in the past. The need to change things is overstated and assumes that the
current state is broken. New EAPIs will be reasonably close to current EAPIs so
that compatibility (forward as well as backwards) can be sustained without
losing too many potential features.
Any disruptive changes can either be kept out of the main tree or be
implemented by shifting the main tree to a different location so that old
package managers won't see the incompatibilities (elaborated in detail further
below).

parsers: The current practise of putting the eapi definition near the top of
the ebuild, combined with the need to state it for all non-EAPI0 ebuilds,
suggests that it can be parsed without having to source the ebuild. It enforces
some minor limitations, for example EAPI needs to be unique and cannot be
overridden by eclasses. These limitations are only enforcing current behaviour
and make QA easier.
Specific suggestion: 

""" The EAPI value shall be the righthand side of the first expression starting
with a string matching "^EAPI=". """

This definition does not allow multiple redefinitions of the eapi value and
ignores comments and malformed lines. It also disallows setting eapi in
eclasses or through other indirect methods.
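
The suggested rule is small enough to sketch in a few lines. The following is
only an illustration under stated assumptions: the function name is invented,
one pair of surrounding quotes is stripped, an ebuild without a matching line
is treated as EAPI 0 (following the remark that only non-EAPI0 ebuilds need to
state it), and trailing comments on the EAPI line are not handled.

```python
import re

# The rule from the text: the first line that starts with "EAPI=".
EAPI_LINE = re.compile(r"^EAPI=(.*)$")


def parse_eapi(ebuild_text):
    """Return the EAPI of an ebuild without sourcing it."""
    for line in ebuild_text.splitlines():
        match = EAPI_LINE.match(line)
        if match is None:
            continue  # comments, indented and malformed lines never match
        value = match.group(1).strip()
        # Assumption: strip one pair of matching quotes around the value.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        return value  # later EAPI= lines are never looked at
    return "0"  # assumption: no EAPI line means EAPI 0


if __name__ == "__main__":
    print(parse_eapi('# EAPI=9\nEAPI="2"\nEAPI=3\n'))
```

Because the loop returns on the first match, a second assignment further down
the file or inside an eclass cannot change the result, which is the limitation
described above.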


glep55: See GLEP55. To summarize: The eapi is put into the file name so that
the package manager knows the EAPI (and thus how to handle this file format).
While it simplifies the eapi discovery this comes at a high price as there is
no reliable way to find and validate all ebuilds. Some people also see it as bad
design as it exposes file internals in the filename.
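
For comparison, discovery from the file name needs no file access at all. The
sketch below assumes the ".ebuild-EAPI" suffix form named in the title of
GLEP55; the function name and the choice to report None when no suffix is
present are assumptions for illustration, and GLEP55 itself remains the
authority on the exact naming rules.

```python
import re

# Assumed layout: "<package>-<version>.ebuild" optionally followed by "-<EAPI>".
EBUILD_NAME = re.compile(r"^(?P<pf>.+)\.ebuild(?:-(?P<eapi>.+))?$")


def split_ebuild_name(filename):
    """Return (package-version, eapi-suffix) or None for non-ebuild files."""
    match = EBUILD_NAME.match(filename)
    if match is None:
        return None
    # The suffix is None when the name ends in a plain ".ebuild".
    return match.group("pf"), match.group("eapi")


if __name__ == "__main__":
    for name in ("foo-1.0.ebuild-2", "foo-1.0.ebuild", "ChangeLog"):
        print(name, split_ebuild_name(name))
```

Note that the sketch accepts any suffix at all, which mirrors the QA concern:
nothing in the name says whether the file content is a valid ebuild.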



EAPI discovery - Discussion 
===========================

nihilist: 
  + no compatibility problems, things stay as they are
  + smallest impact - nothing changes
  (+-) prevents some forms of disruptive changes
       (but do we need them?)
  - doesn't fix some of the perceived issues

haubi:
  + simple and clean solution
  + easy to extend 
  (-) needs package manager support to expose which eapis are supported
  - format change, meaning of inherit changes
  

parsers:
  + small impact, only codifies current practise
  + good backwards compatibility
  (-) needs some support tools written
  (+-) enforces some restriction on the possible changes in future EAPIs

glep55:
  + allows changing everything (file format, versioning rules)
  - allows changing everything (makes QA impossible, allows adding non-ebuild
    formats, makes version sorting potentially impossible)
  - cannot be reversed in the near future
      if for any reason we decide eapi-in-filename is a bad idea we're stuck
      with it unless we are willing to break backwards compatibility in the
      same way glep55 tries to avoid
  - has not been accepted after over a year of discussion
  - exposes extra metadata in the filename 
      this has been considered to be bad design by many

EAPI discovery - Performance
============================
Mostly irrelevant anyway. On a reasonably fast machine with ~1200 packages,
``emerge -upNDv world`` takes:

- with hot cache: ~10 seconds
- with cold cache: ~75 seconds
- without cache: ~15 minutes

Sourcing ebuilds is exquisitely slow. The biggest slowdowns are IO, caused by
(i) lack of metadata cache (needs sourcing the ebuild) and
(ii) inefficient metadata cache (one file per ebuild is easy to work with, but
inherently inefficient)

nihilist: known "bad" performance. It's slow, but we've come to accept it
somehow.

haubi: slightly better performance on early abort, still needs to partially
source the ebuild and start a full bash. The expected speedup is negligible.

parsers: one open() per file. Only IO-heavy, parsing is cheap. 

glep55: one stat() per file. Saves one open() compared to the parsers, but that
open happens later when that file is sourced in the case of a valid ebuild
anyway.

performance discussion - caching
--------------------------------

- best case: Full valid cache. Moderately fast. Changing the metadata cache
format to improve performance is possible. None of the four proposals affects
the base performance. GLEP55 has the potential to save a few metadata cache
open()s because the eapi can be determined from the filename.

- worst case: No metadata cache. Very slow. GLEP55 saves time on EAPI
discovery, but the relative impact of sourcing ebuilds later must be taken into
account.
The parser approach is slower than GLEP55 as it needs to open the file.

- average case: ??? We lack information to decide that.

Metadata extraction:
GLEP55 saves time on eapi extraction, but metadata extraction is still slow.
Rough calculation, assuming a disk seek takes 10ms and 10 ebuild versions are
available:

worst case: only the last ebuild examined is readable by the package manager

g55: 
  10ms readdir to find versions
  10ms seek for ebuild content 
  100ms sourcing 
  x ms eclass seek 
  --> 120ms+

nihil: 
  10ms readdir to find versions 
  10x10ms seek for ebuild content
  no extra seek for sourcing 
  10x100ms sourcing 
  --> 1110ms+

haubi:
same as nihil plus one eclass read (eapi eclass)
  --> 1120ms+

parser: 
  10ms readdir, 
  10x10ms seek for ebuild content
  100ms source for one ebuild 
  x ms eclasses 
  --> 210ms+

best case: the first ebuild examined is readable
g55: readdir, seek, source -> 120ms+
nihil: readdir, seek, source -> 120ms+
haubi: readdir, seek, source, one eclass -> 130ms+
parser: readdir, seek, source -> 120ms+

This ignores visibility checks (package.mask etc.), these should have a
constant overhead that does not change the basic figures much.
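
The rough figures above can be reproduced with a few lines of arithmetic. This
is only a restatement of the numbers in the text: the function and variable
names are invented for the example, the extra eapi eclass read for haubi is
taken as one 10ms seek (as implied by the 1110ms versus 1120ms totals), and the
unknown "x ms" for other eclasses is left out, as in the "+" totals.

```python
SEEK_MS = 10     # one disk seek, also used for the readdir
SOURCE_MS = 100  # sourcing one ebuild
VERSIONS = 10    # ebuild versions available for the package
ECLASS_MS = 10   # haubi only: one extra read for the eapi eclass


def cost_ms(opened, sourced, extra=0):
    """readdir + seeks for opened ebuilds + sourcing + fixed extras."""
    return SEEK_MS + opened * SEEK_MS + sourced * SOURCE_MS + extra


# Worst case: only the last of the ten ebuilds is usable.
worst = {
    "g55": cost_ms(1, 1),                             # name already gives the EAPI
    "nihil": cost_ms(VERSIONS, VERSIONS),             # every ebuild is sourced
    "haubi": cost_ms(VERSIONS, VERSIONS, ECLASS_MS),  # same plus the eapi eclass
    "parser": cost_ms(VERSIONS, 1),                   # all opened, one sourced
}

# Best case: the first ebuild examined is usable.
best = {
    "g55": cost_ms(1, 1),
    "nihil": cost_ms(1, 1),
    "haubi": cost_ms(1, 1, ECLASS_MS),
    "parser": cost_ms(1, 1),
}

if __name__ == "__main__":
    for name in worst:
        print(f"{name:6s} worst {worst[name]:5d}ms  best {best[name]:4d}ms")
```

Changing ``VERSIONS`` or ``SOURCE_MS`` shows how strongly the worst-case gap
depends on the assumed sourcing time rather than on the seeks.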

performance discussion - conclusion
-----------------------------------
A speedup of 5-10x in the worst case with glep55 and parser, which brings the
time for such a scenario down from 15 minutes to 90-180 seconds. In the best
case (valid complete cache) there is no performance difference. The average
case depends on too many assumptions.

If performance is considered important the parser proposal would be the
preferred non-glep55 solution. The decision then is limited to parser or g55.

Conclusion
==========

There are multiple options for how the ebuild format can evolve in the future.
The easiest path would be to collect ideas and not to change anything now (the
nihilist approach).
In terms of backwards compatibility and error reporting the
haubi method is the most elegant as it can provide clean notices independent of
the package manager. 
If performance is valued more the parser approach is an excellent compromise in
terms of compatibility and efficiency. It does not change the semantics of
inherit while still allowing quite large changes.
GLEP55 might offer the largest flexibility, but that comes at a high price. It
has been very controversial, so implementing it also has a high social price.
There are many points in GLEP55 that have not been defined well enough so that
it is not a good base to start from.

As it seems to cover the widest range of issues (performance, compatibility,
subjective aesthetics, simplicity) we suggest implementing a parser-based
approach. 
 
Versioning changes - ideas
==========================

One of the potential issues mentioned in GLEP55 is the option to extend the
version syntax. One example for such extensions is GLEP54.

Advantages:
  - More expressive syntax for special cases (live ebuilds)
  - Adding more syntactic sugar (-rc instead of _rc)

Disadvantages:
  - more rules, more complex
  - per-eapi versioning can potentially be inconsistent
    (either force strict subset/superset between EAPIs or risk incomparable
    versions)

GLEP55 is the only proposal that allows easy per-eapi changes of versioning
rules. One downside is that a priori no limitations of eapis exist, so adding
random formats that are unreadable by the official package manager(s) is
possible and cannot be detected by QA tools. Also the versioning rules per EAPI
can be changed to arbitrary things, so keeping the rules consistent still needs
a defined process that could be used to change the global rules instead.

All other proposals disallow extending the versioning rules as it would
potentially break older package managers that are not aware of that versioning
scheme yet. One way to still change the versioning would be a freeze of the
current repository and migrating to a new repository location on each
incompatible change. This would allow an upgrade path to the last state before
that change (with package managers that know how to handle the new format
available) for all users without ever exposing "new-format" ebuilds to older
package managers. 

Versioning changes should be discussed independently of the EAPI / ebuild
sourcing problem complex. See glep54 and lu_zero's proposal.


References
==========

.. [#GLEP55] GLEP 55, Use EAPI-suffixed ebuilds (.ebuild-EAPI)
    (http://glep.gentoo.org/glep-0055.html)


.. [#GLEP54] GLEP 54, scm package version suffix
    (http://glep.gentoo.org/glep-0054.html)

.. [#lu_zero] glep54 counterproposal
    (http://dev.gentoo.org/~lu_zero/glep/liveebuild.rst)

.. [#haubi] glep55 counterproposal
    (http://archives.gentoo.org/gentoo-dev/msg_348f84690f03e597fb14d6602337c45f.xml)
