* [gentoo-portage-dev] [RFC] Adding extra vars to md5-cache, for QA&tooling purposes
@ 2019-07-25 12:20 Michał Górny
2019-07-25 19:57 ` Zac Medico
2019-07-26 6:49 ` Fabian Groffen
0 siblings, 2 replies; 6+ messages in thread
From: Michał Górny @ 2019-07-25 12:20 UTC
To: gentoo-portage-dev; +Cc: Tim Harder
Hi,
TL;DR: I'd like to make it possible for ebuilds to define additional
variables that will be stored in md5-cache. This would be useful for CI
and other tooling that right now has to parse ebuilds for other data.
The idea is to add a new incremental ebuild/eclass variable (technical
name: QA_EXTRA_CACHE_VARS) that would define additional data to be
stored in cache. For example, python*-r1 eclasses would define
'PYTHON_COMPAT', acct-user would define 'ACCT_USER_ID', etc.
When regenerating cache, the PM would read this variable, and store
the values of all defined variables into md5-cache. As a result,
programs needing those variables can get them straight from cache
without having to attempt to run or parse ebuilds (which is both slow
and prone to bugs).
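To illustrate the consumer side, here is a minimal sketch of how a tool might read such a value back. It assumes only that cache entries are plain 'KEY=value' lines; the PYTHON_COMPAT line in the sample is hypothetical, since nothing stores it in md5-cache today, and parse_cache_entry is an illustrative name rather than an existing Portage or pkgcore function.

```python
def parse_cache_entry(text):
    """Parse md5-cache style 'KEY=value' lines into a dict."""
    entry = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values may contain '='.
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that carry no '='
            entry[key] = value
    return entry


# Hypothetical cache entry: the PYTHON_COMPAT key is an assumption.
SAMPLE = """\
EAPI=7
SLOT=0
PYTHON_COMPAT=python3_6 python3_7
"""

entry = parse_cache_entry(SAMPLE)
python_compat = entry.get("PYTHON_COMPAT", "").split()
print(python_compat)
```

The point of the sketch is that no bash needs to be run or parsed: a tool such as gpyutils would only need a dictionary lookup.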
This would benefit e.g. gpyutils, which right now has to attempt to parse
PYTHON_COMPAT from ebuilds. It would also help with writing future
pkgcheck checks for user/group ID collisions.
Notes:
- since md5-cache uses a key-value format and allows for future
extensions, the new values can be added without breaking anything;
- md5-cache is not specified in the PMS, so the whole thing can be
implemented without the need for an EAPI bump;
- I would like to have this implemented consistently in both Portage
and pkgcore;
- we will need to clearly define how to dump arrays.
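On the last point, a small sketch of why the array format needs defining. It compares two possible encodings, neither of which has been chosen: a plain space-joined value, and a shell-quoted value using Python's shlex module. The function names are illustrative only.

```python
import shlex


def dump_array_plain(items):
    """Join elements with single spaces (loses boundaries if an element has spaces)."""
    return " ".join(items)


def dump_array_quoted(items):
    """Join elements with spaces, shell-quoting each one so it can be split back."""
    return " ".join(shlex.quote(item) for item in items)


def load_array_quoted(value):
    """Inverse of dump_array_quoted."""
    return shlex.split(value)


compat = ["python3_6", "python3_7"]
tricky = ["foo bar", "baz"]

# Simple values survive the plain encoding.
assert dump_array_plain(compat).split() == compat
# Values containing whitespace do not.
assert dump_array_plain(tricky).split() != tricky
# The quoted encoding round-trips both.
assert load_array_quoted(dump_array_quoted(compat)) == compat
assert load_array_quoted(dump_array_quoted(tricky)) == tricky
```

The plain form is enough for PYTHON_COMPAT-like values. Elements containing newlines would still need separate handling in a line-based cache file, whichever encoding is picked.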
What do you think?
--
Best regards,
Michał Górny
* Re: [gentoo-portage-dev] [RFC] Adding extra vars to md5-cache, for QA&tooling purposes
From: Zac Medico @ 2019-07-25 19:57 UTC
To: gentoo-portage-dev, Michał Górny; +Cc: Tim Harder
On 7/25/19 5:20 AM, Michał Górny wrote:
> Hi,
>
> TL;DR: I'd like to make it possible for ebuilds to define additional
> variables that will be stored in md5-cache. This would be useful for CI
> and other tooling that right now has to parse ebuilds for other data.
>
>
> The idea is to add a new incremental ebuild/eclass variable (technical
> name: QA_EXTRA_CACHE_VARS) that would define additional data to be
> stored in cache. For example, python*-r1 eclasses would define
> 'PYTHON_COMPAT', acct-user would define 'ACCT_USER_ID', etc.
>
> When regenerating cache, the PM would read this variable, and store
> the values of all defined variables into md5-cache. As a result,
> programs needing those variables can get them straight from cache
> without having to attempt to run or parse ebuilds (which is both slow
> and prone to bugs).
>
> This would benefit e.g. gpyutils that right now need to attempt to parse
> PYTHON_COMPAT from ebuilds. It would also benefit writing future
> pkgcheck checks for user/group ID collisions.
>
>
> Notes:
>
> - since md5-cache uses key-value format and allows for future
> extensions, the new values can be added without breaking anything;
>
> - md5-cache is not specified in the PMS, and the whole thing can be
> implemented without need for EAPI bump,
>
> - I would like to have this implemented consistently both in Portage
> and pkgcore,
>
> - we will need to clearly define how to dump arrays.
>
>
> What do you think?
Sounds good. Some thoughts:
* Maybe omit QA from the variable name, since it could be
generally useful for things that are unrelated to QA.
* In the md5-cache entry, maybe use a common prefix like EXT_ for the
extra keys in order to distinguish them from normal keys.
--
Thanks,
Zac
* Re: [gentoo-portage-dev] [RFC] Adding extra vars to md5-cache, for QA&tooling purposes
From: Michał Górny @ 2019-07-25 20:29 UTC
To: gentoo-portage-dev; +Cc: Tim Harder
On Thu, 2019-07-25 at 12:57 -0700, Zac Medico wrote:
> Sounds good. Some thoughts:
>
> * Maybe omit QA from the variable name, since it could be
> generally useful for things that are unrelated to QA.
>
> * In the md5-cache entry, maybe use a common prefix like EXT_ for the
> extra keys in order to distinguish them from normal keys.
Yeah, I was thinking of something like '__ext_foo', or '__ext[foo]'.
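For illustration, a minimal sketch of how a reader could separate the two kinds of keys if the '__ext_foo' spelling were used. The prefix is only one of the options floated here, and split_entry is a hypothetical helper name.

```python
EXT_PREFIX = "__ext_"  # assumption: one candidate spelling, nothing is decided


def split_entry(entry):
    """Split a parsed cache entry into (standard keys, extra keys without prefix)."""
    standard, extra = {}, {}
    for key, value in entry.items():
        if key.startswith(EXT_PREFIX):
            extra[key[len(EXT_PREFIX):]] = value
        else:
            standard[key] = value
    return standard, extra


standard, extra = split_entry({"EAPI": "7", "__ext_ACCT_USER_ID": "101"})
print(standard, extra)
```

With a common prefix, tools that only know the normal keys can skip the extra ones without keeping a list of names.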
--
Best regards,
Michał Górny
* Re: [gentoo-portage-dev] [RFC] Adding extra vars to md5-cache, for QA&tooling purposes
From: Michael Orlitzky @ 2019-07-25 22:53 UTC
To: gentoo-portage-dev
On 7/25/19 4:29 PM, Michał Górny wrote:
>>
>> * In the md5-cache entry, maybe use a common prefix like EXT_ for the
>> extra keys in order to distinguish them from normal keys.
>
> Yeah, I was thinking of something like '__ext_foo', or '__ext[foo]'.
>
What are the pros/cons of this? The names refer to global variables, so
they should already be safely namespaced, right?
There is a possibility that an eclass variable name (e.g. PATCHES) could
become standardized at a later date. If that happens, we could wind up
with both FOO and __ext_FOO in the cache, and tools would have to figure
out what to do with zero, one, or both present. (This has happened in
email/web protocols when an X-Foo header was standardized.) It's not the
end of the world, but someone would have to stop and think about it.
Finally, just having the name be predictable so that I can grep '^FOO='
without having to care where it came from is nice.
OTOH for testing, and for figuring out why these weird variables are
showing up in my cache, the prefix would help.
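To make the "zero, one, or both" case concrete, here is a sketch of the lookup a tool would need if a prefixed key later gained an unprefixed standard twin. The policy shown (prefer the plain key, reject conflicting values) is purely an assumption for illustration, as is the '__ext_' prefix.

```python
def lookup(entry, name, prefix="__ext_"):
    """Return the value for `name`, checking both the plain and prefixed key."""
    plain = entry.get(name)
    prefixed = entry.get(prefix + name)
    if plain is not None and prefixed is not None and plain != prefixed:
        # Both present and different: someone has to decide which one wins.
        raise ValueError(f"conflicting values for {name!r}")
    # Zero present -> None, one present -> that value, both equal -> that value.
    return plain if plain is not None else prefixed


print(lookup({}, "FOO"))                           # neither key present
print(lookup({"__ext_FOO": "a"}, "FOO"))           # only the prefixed key
print(lookup({"FOO": "a", "__ext_FOO": "a"}, "FOO"))  # both, same value
```

Without a prefix, the same lookup collapses to a single entry.get(name), which is the grep '^FOO=' convenience mentioned above.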
* Re: [gentoo-portage-dev] [RFC] Adding extra vars to md5-cache, for QA&tooling purposes
From: Fabian Groffen @ 2019-07-26 6:49 UTC
To: gentoo-portage-dev
Hi,
On 25-07-2019 14:20:50 +0200, Michał Górny wrote:
> Hi,
>
> TL;DR: I'd like to make it possible for ebuilds to define additional
> variables that will be stored in md5-cache. This would be useful for CI
> and other tooling that right now has to parse ebuilds for other data.
The only downside I can think of is an increase in disk space for the
md5-cache. I'm not sure whether it would be substantial, but given things
like PYTHON_COMPAT, perhaps a quick calculation of the extra "cost" can be
made. Should disk space become a problem, one could consider using a
separate file/dir that users could rsync-exclude, since Portage won't need
it to operate properly.
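A back-of-the-envelope version of that calculation could look like the sketch below. The sample value and the ebuild count are placeholders rather than measurements, and the function names are illustrative.

```python
def extra_bytes(key, value):
    """Bytes added to one cache entry by a single 'KEY=value' line."""
    return len(f"{key}={value}\n".encode("utf-8"))


def estimate_total(key, value, ebuild_count):
    """Total added bytes if `ebuild_count` entries all carried this line."""
    return extra_bytes(key, value) * ebuild_count


per_entry = extra_bytes("PYTHON_COMPAT", "python3_6 python3_7")
print(per_entry)  # 34 bytes for this sample line

# Placeholder count: substitute the real number of affected ebuilds.
print(estimate_total("PYTHON_COMPAT", "python3_6 python3_7", 10000))  # 340000
```

This only counts raw bytes. The actual on-disk growth also depends on the filesystem's block size, so small additions to existing small files may not change disk usage at all.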
Thanks,
Fabian
--
Fabian Groffen
Gentoo on a different level
* Re: [gentoo-portage-dev] [RFC] Adding extra vars to md5-cache, for QA&tooling purposes
From: Zac Medico @ 2019-07-28 21:26 UTC
To: gentoo-portage-dev, Fabian Groffen
On 7/25/19 11:49 PM, Fabian Groffen wrote:
> Hi,
>
> On 25-07-2019 14:20:50 +0200, Michał Górny wrote:
>> Hi,
>>
>> TL;DR: I'd like to make it possible for ebuilds to define additional
>> variables that will be stored in md5-cache. This would be useful for CI
>> and other tooling that right now has to parse ebuilds for other data.
>
> Only downside I can think of, is a diskspace increase for the md5-cache.
> Not sure if this is going to be substantial, but given things like
> PYTHON_COMPAT, perhaps a quick calculation of extra "cost" can be made.
> Should diskspace become a problem, one could consider to use a separate
> file/dir, that users could rsync-exclude, since Portage won't need it to
> operate properly.
Yes, using a separate directory from md5-cache will provide useful
isolation. There's a lot of potential for bloat here, and by keeping it
separate we can easily render the bloat harmless.
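As a sketch of what the separate-directory layout could mean for tools: the reader below looks in a sibling directory and treats a missing file as "no extra data", which is what a user who rsync-excludes it would present. The directory name metadata/extra-cache is invented for this example, as is the load_extra helper.

```python
import tempfile
from pathlib import Path

EXTRA_DIR = "metadata/extra-cache"  # hypothetical name, not an existing location


def load_extra(repo_root, cpv):
    """Read extra variables for one package version; empty dict if absent."""
    path = Path(repo_root) / EXTRA_DIR / cpv
    if not path.is_file():
        return {}  # directory excluded or entry never generated
    entry = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:
            entry[key] = value
    return entry


# Demonstration against a throwaway repository layout.
with tempfile.TemporaryDirectory() as root:
    target = Path(root) / EXTRA_DIR / "acct-user" / "example-0"
    target.parent.mkdir(parents=True)
    target.write_text("ACCT_USER_ID=101\n", encoding="utf-8")
    present = load_extra(root, "acct-user/example-0")
    missing = load_extra(root, "acct-user/other-0")

print(present, missing)
```

Because the package manager never reads this directory in the sketch, dropping it affects only the QA tools that opt in.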
> Thanks,
> Fabian
--
Thanks,
Zac