* [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts
@ 2014-12-24 1:51 Zac Medico
2014-12-24 5:16 ` Matthew Thode
2014-12-27 14:25 ` [gentoo-portage-dev] " Rick "Zero_Chaos" Farina
0 siblings, 2 replies; 13+ messages in thread
From: Zac Medico @ 2014-12-24 1:51 UTC
To: gentoo-portage-dev
Hi,
As discussed in bug 150031 [1], it would be useful if PKGDIR could
accommodate multiple binary packages built from the same source ebuild.
Use cases for preserving multiple builds typically involve supporting
multiple clients (with partially compatible configurations) from a
single unified binhost. In this context, some of the reasons to retain
multiple builds are:
* Different USE flag combinations enabled (--newuse/--binpkg-respect-use
needed)
* Different versions of installed dependencies (EAPI 5 slot := operators
needed)
* Different repositories/overlays, with variance in the time of the last
sync (--changed-deps/--binpkg-changed-deps needed if dependencies change
due to eclass changes or ebuild modifications without revbump)
Given the above variety of reasons to retain previous builds, a simple
counter (1, 2, 3,...) seems like a reasonable means to generate unique
file names.
In order to avoid having too many files in a directory, we can use a
separate directory for each ${CATEGORY}/${PN}, like we do for the source
ebuild repositories.
In order to avoid having to deal with multiple file extensions for
different compression types, we can simply use .xpak for the file
extension [2], since that's the name of the format that we use to append
metadata to our existing tbz2 files. We can simply probe the first few
bytes of the file in order to determine the compression type:
gzip: 1f 8b
bzip2: 42 5a 68 39
xz: fd 37 7a 58 5a 00
Users will be able to change their compression settings at any time, but
the .xpak file extension will remain constant regardless of that
setting. It won't matter if they have a mixture of files compressed with
different compressors.
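The magic-byte probe described above could look roughly like the following. This is only a sketch: the function name detect_compression and the MAGIC table are made up for illustration, and the bzip2 prefix is shortened to "BZh" on the assumption that the fourth byte is the block-size digit ("9" being just the default level).

```python
# Magic-byte prefixes taken from the list above. The bzip2 entry is cut to
# "BZh" (42 5a 68) on the assumption that the fourth byte is the block-size
# digit, so files compressed at other levels still match.
MAGIC = (
    (bytes.fromhex("1f8b"), "gzip"),
    (bytes.fromhex("425a68"), "bzip2"),
    (bytes.fromhex("fd377a585a00"), "xz"),
)


def detect_compression(path):
    """Return 'gzip', 'bzip2', 'xz', or None if the header is not recognized."""
    with open(path, "rb") as f:
        head = f.read(6)  # the longest magic above is 6 bytes
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return None
```

Because only the first few bytes are read, the check costs the same no matter how large the package is.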
A tool like eclean-pkg will be needed to clean up old binary packages
based on user preferences. We might also provide a variety of on-the-fly
garbage collection settings.
Based on the above discussion, the location of any particular binary
package can be expressed as follows:
${PKGDIR}/${CATEGORY}/${PN}/${PF}-${COUNTER}.xpak
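As a rough illustration of how a builder might pick the next ${COUNTER} for this layout, here is a minimal sketch. The helper name next_binpkg_path is hypothetical, and the "highest existing counter plus one" rule is an assumption; the proposal only says that a simple counter is used.

```python
import os
import re


def next_binpkg_path(pkgdir, category, pn, pf):
    """Return the path for the next build of pf under pkgdir/category/pn.

    Hypothetical helper: scans the package directory for existing
    ${PF}-${COUNTER}.xpak files and uses the highest counter plus one.
    """
    pkg_dir = os.path.join(pkgdir, category, pn)
    pattern = re.compile(re.escape(pf) + r"-(\d+)\.xpak\Z")
    counters = []
    if os.path.isdir(pkg_dir):
        for name in os.listdir(pkg_dir):
            m = pattern.match(name)
            if m:
                counters.append(int(m.group(1)))
    counter = max(counters, default=0) + 1
    return os.path.join(pkg_dir, "%s-%d.xpak" % (pf, counter))
```

Note that this scan is not atomic, so two concurrent builds of the same package would need a lock around it.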
The existing format of the ${PKGDIR}/Packages index will work fine,
since it allows each package to specify a PATH attribute which
corresponds to the path of the file relative to the base directory. If
the .xpak files use bzip2 compression, it will even be compatible with
existing clients (though they won't be able to intelligently choose
between multiple packages of the same version). If all the packages of a
given version are ordered by ${COUNTER}, then existing clients will
simply download the latest build.
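To illustrate why ordering by ${COUNTER} matters, here is a sketch of a client that keeps the last entry it sees for each CPV. It assumes a simplified index made of blank-line-separated blocks of "KEY: value" lines, with a header block that carries no CPV key; parse_index and latest_by_cpv are illustrative names, not Portage functions, and "last entry wins" is an assumption about client behavior.

```python
def parse_index(text):
    """Parse blank-line-separated blocks of 'KEY: value' lines into dicts."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(": ")
            if sep:
                entry[key] = value
        if "CPV" in entry:  # assumption: the header block has no CPV key
            entries.append(entry)
    return entries


def latest_by_cpv(entries):
    """Map each CPV to the PATH of the last entry listed for it."""
    latest = {}
    for entry in entries:
        latest[entry["CPV"]] = entry["PATH"]  # later entries override earlier ones
    return latest
```

With entries for the same version listed in counter order, such a client ends up with the PATH of the highest counter.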
[1] https://bugs.gentoo.org/show_bug.cgi?id=150031
[2] http://dev.gentoo.org/~zmedico/portage/doc/man/xpak.5.html
--
Thanks,
Zac
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts
2014-12-24 1:51 [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts Zac Medico
@ 2014-12-24 5:16 ` Matthew Thode
2014-12-24 8:13 ` Zac Medico
2014-12-27 14:25 ` [gentoo-portage-dev] " Rick "Zero_Chaos" Farina
1 sibling, 1 reply; 13+ messages in thread
From: Matthew Thode @ 2014-12-24 5:16 UTC
To: gentoo-portage-dev
On 12/23/2014 07:51 PM, Zac Medico wrote:
> Hi,
>
> As discussed in bug 150031 [1], it would be useful if PKGDIR could
> accommodate multiple binary packages built from the same source ebuild.
> Use cases for preserving multiple builds typically involve supporting
> multiple clients (with partially compatible configurations) from a
> single unified binhost. In this context, some of the reasons to retain
> multiple builds are:
>
> * Different USE flag combinations enabled (--newuse/--binpkg-respect-use
> needed)
>
> * Different versions of installed dependencies (EAPI 5 slot := operators
> needed)
>
> * Different repositories/overlays, with variance in the time of the last
> sync (--changed-deps/--binpkg-changed-deps needed if dependencies change
> due to eclass changes or ebuild modifications without revbump)
>
> Given the above variety of reasons to retain previous builds, a simple
> counter (1, 2, 3,...) seems like a reasonable means to generate unique
> file names.
>
> In order to avoid having too many files in a directory, we can use a
> separate directory for each ${CATEGORY}/${PN}, like we do for the source
> ebuild repositories.
>
> In order to avoid having to deal with multiple file extensions for
> different compression types, we can simply use .xpak for the file
> extension [2], since that's the name of the format that we use to append
> metadata to our existing tbz2 files. We can simply probe the first few
> bytes of the file in order to determine the compression type:
>
> gzip: 1f 8b
> bzip2: 42 5a 68 39
> xz: fd 37 7a 58 5a 00
>
> Users will be able change their compression settings at any time, but
> the .xpak file extension will remain constant regardless of that
> setting. It won't matter if they have a mixture of files compressed with
> different compressors.
>
> A tool like eclean-pkg will be needed to clean up old binary packages
> based on user preferences. We might also provide a variety of on-the-fly
> garbage collection settings.
>
> Based on the above discussion, the location of any particular binary
> package can be expressed as follows:
>
> ${PKGDIR}/${CATEGORY}/${PN}/${PF}-${COUNTER}.xpak
>
> The existing format of the ${PKGDIR}/Packages index will work fine,
> since it allows each package to specify a PATH attribute which
> corresponds to the path of the file relative to the base directory. If
> the .xpak files use bzip2 compression, it will even be compatible with
> existing clients (though they won't be able to intelligently choose
> between multiple packages of the same version). If all the packages of a
> given version are ordered by ${COUNTER}, then existing clients will
> simply download the latest build.
>
> [1] https://bugs.gentoo.org/show_bug.cgi?id=150031
> [2] http://dev.gentoo.org/~zmedico/portage/doc/man/xpak.5.html
>
I like this (and it has been a long time coming). What format are we
going to store the metadata of the use flag combinations and the rest?
I guess that's already stored since portage knows not to use binpkgs if
those change.
Also, would this change be a good time to change to store that metadata
externally? Running portage over NFS with binpkgs takes forever, I
don't think a binhost makes it faster either. If there were some way to
get all the info for the binpkgs into one file (so it could be run on
cron or something), this could mean that I'd only have to do one file
request for all that metadata and would be much quicker than inspecting
all those files.
--
-- Matthew Thode (prometheanfire)
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts
2014-12-24 5:16 ` Matthew Thode
@ 2014-12-24 8:13 ` Zac Medico
2014-12-24 12:01 ` vivo75
2014-12-25 10:03 ` [gentoo-portage-dev] " Duncan
0 siblings, 2 replies; 13+ messages in thread
From: Zac Medico @ 2014-12-24 8:13 UTC
To: gentoo-portage-dev
On 12/23/2014 09:16 PM, Matthew Thode wrote:
> I like this (and it has been a long time coming). What format are we
> going to store the metadata of the use flag combinations and the rest?
The current approach is to store the data in an xpak segment that is
appended to the end of the tbz2 file. The $PKGDIR/Packages file serves
as a cache for the essential parts of the xpak data that are used in
dependency calculations.
> I guess that's already stored since portage knows not to use binpkgs if
> those change.
>
> Also, would this change be a good time to change to store that metadata
> externally?
That's why we have the $PKGDIR/Packages cache, which is validated using
stat.st_size and stat.st_mtime.
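A size-and-mtime check of this kind could be sketched as follows. The function name cache_entry_is_valid is hypothetical, and comparing the mtime as a whole number of seconds is an assumption for the example, not a statement about how Portage stores it.

```python
import os


def cache_entry_is_valid(path, cached_size, cached_mtime):
    """Return True if the file still matches the size and mtime in the cache."""
    try:
        st = os.stat(path)
    except OSError:
        return False  # a missing or unreadable file invalidates the entry
    return st.st_size == cached_size and int(st.st_mtime) == cached_mtime
```

Each check is a single stat call per package, which is cheap locally but adds up on a high-latency filesystem.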
> Running portage over NFS with binpkgs takes forever, I
It's probably all of the readdir and stat calls. If we simply assumed
that $PKGDIR/Packages was valid, we could eliminate the readdir and stat
calls. Binhost clients operate under this assumption.
> don't think a binhost makes it faster either. If there were some way to
> get all the info for the binpkgs into one file (so it could be run on
> cron or something), this could mean that I'd only have to do one file
> request for all that metadata and would be much quicker than inspecting
> all those files.
That's what $PKGDIR/Packages is.
--
Thanks,
Zac
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts
2014-12-24 8:13 ` Zac Medico
@ 2014-12-24 12:01 ` vivo75
2014-12-24 16:07 ` Zac Medico
2014-12-25 10:03 ` [gentoo-portage-dev] " Duncan
1 sibling, 1 reply; 13+ messages in thread
From: vivo75 @ 2014-12-24 12:01 UTC
To: gentoo-portage-dev
On 24/12/2014 09:13, Zac Medico wrote:
>> I like this (and it has been a long time coming). What format are we
>> going to store the metadata of the use flag combinations and the rest?
> The current approach is to store the data in an xpak segment that is
> appended to the end of the tbz2 file. The $PKGDIR/Packages files serves
> as a cache for the essential parts of the xpak data that are used in
> dependency calculations.
>
I'd like to see the xpak data put in its own file at the
_beginning_ of the tar file.
tar -Jcf \
${PKGDIR}/${CATEGORY}/${PN}/${PF}-${COUNTER}.xpak \
tmp/${CATEGORY}:${PN}:${PF}-${COUNTER}.xpak \
*all_the_other_stuff*
This way, reading it could be faster on some media and filesystems, and
the package would not deviate from standard tar.
Being in /tmp/ is only for convenience; the location is debatable.
What is not debatable is that it _must_ be the first file: in a
sequential archive format like tar, fast access depends on it.
This seems to be the right time to make the change, since the tools need
to be rewritten anyway, but I'll leave it to you to analyze the fallout.
Best regards,
Francesco Riosa
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts
2014-12-24 12:01 ` vivo75
@ 2014-12-24 16:07 ` Zac Medico
2014-12-24 18:36 ` vivo75
0 siblings, 1 reply; 13+ messages in thread
From: Zac Medico @ 2014-12-24 16:07 UTC
To: gentoo-portage-dev
On 12/24/2014 04:01 AM, vivo75@gmail.com wrote:
> On 24/12/2014 09:13, Zac Medico wrote:
>>> I like this (and it has been a long time coming). What format are we
>>> going to store the metadata of the use flag combinations and the rest?
>> The current approach is to store the data in an xpak segment that is
>> appended to the end of the tbz2 file. The $PKGDIR/Packages files serves
>> as a cache for the essential parts of the xpak data that are used in
>> dependency calculations.
>>
> I'd like to see the xpak data being put in it's own file at the
> _beginning_ of the tar file.
>
> tar -Jcf \
> ${PKGDIR}/${CATEGORY}/${PN}/${PF}-${COUNTER}.xpak \
> tmp/${CATEGORY}:${PN}:${PF}-${COUNTER}.xpak \
> *all_the_other_stuff*
>
> this way reading it could be faster on some media and filesystem and it
> would not deviate from the standard tar.
There wouldn't be any benefit, because the data is practically always
read from the $PKGDIR/Packages cache anyway. The cache is generated when
the package is built, and the rate-limiting step there is the building
of the package.
> Being in /tmp/ is only for commodity but the place is debatable.
> Instead the fact it _must_ be the first file it's not, in a sequential
> archive file like tar some things depend on it.
With the current approach, the xpak segment is not part of the tar file.
The tar file is compressed, and the xpak segment is appended to the end
of the resulting bzip2 file.
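To make the layout concrete, here is a sketch that splits such a file into the compressed tarball and the xpak segment. It assumes the trailer layout described in xpak(5): the xpak segment runs from "XPAKPACK" to "XPAKSTOP" and is followed by its length as a 4-byte big-endian integer and the literal "STOP". The function name split_tbz2 is hypothetical.

```python
import struct


def split_tbz2(data):
    """Split a tbz2 byte string into (compressed_tar, xpak_segment).

    Assumed layout: <compressed tar><xpak segment><4-byte length>"STOP",
    where the xpak segment itself ends with "XPAKSTOP".
    """
    if len(data) < 16 or data[-4:] != b"STOP" or data[-16:-8] != b"XPAKSTOP":
        raise ValueError("no xpak trailer found")
    (xpak_len,) = struct.unpack(">I", data[-8:-4])
    start = len(data) - 8 - xpak_len
    if start < 0 or data[start:start + 8] != b"XPAKPACK":
        raise ValueError("xpak segment does not start with XPAKPACK")
    return data[:start], data[start:-8]
```

Only the last 16 bytes are needed to locate the segment, so the metadata can be found without decompressing the tarball.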
> seem to be the right time to do the change, since tool need to be
> rewritten anyway, but I'll leave to you analyze the fallout of this change.
There will be zero benefits from doing that.
--
Thanks,
Zac
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts
2014-12-24 16:07 ` Zac Medico
@ 2014-12-24 18:36 ` vivo75
2014-12-24 19:17 ` Zac Medico
0 siblings, 1 reply; 13+ messages in thread
From: vivo75 @ 2014-12-24 18:36 UTC
To: gentoo-portage-dev
On 24/12/2014 17:07, Zac Medico wrote:
> On 12/24/2014 04:01 AM, vivo75@gmail.com wrote:
>> On 24/12/2014 09:13, Zac Medico wrote:
>>>> I like this (and it has been a long time coming). What format are we
>>>> going to store the metadata of the use flag combinations and the rest?
>>> The current approach is to store the data in an xpak segment that is
>>> appended to the end of the tbz2 file. The $PKGDIR/Packages files serves
>>> as a cache for the essential parts of the xpak data that are used in
>>> dependency calculations.
>>>
>> I'd like to see the xpak data being put in it's own file at the
>> _beginning_ of the tar file.
>>
>> tar -Jcf \
>> ${PKGDIR}/${CATEGORY}/${PN}/${PF}-${COUNTER}.xpak \
>> tmp/${CATEGORY}:${PN}:${PF}-${COUNTER}.xpak \
>> *all_the_other_stuff*
>>
>> this way reading it could be faster on some media and filesystem and it
>> would not deviate from the standard tar.
> There wouldn't be any benefit, because the data is practically always
> read from the $PKGDIR/Packages cache anyway. The cache is generated when
> the package is built, and the rate-limiting step there is the building
> of the package.
Ack, and what about emerge on the destination host?
>> Being in /tmp/ is only for commodity but the place is debatable.
>> Instead the fact it _must_ be the first file it's not, in a sequential
>> archive file like tar some things depend on it.
> With the current approach, the xpak segment is not part of the tar file.
> The tar file is compressed, and the xpak segment is appended to the end
> of the resulting bzip2 file.
>
>> seem to be the right time to do the change, since tool need to be
>> rewritten anyway, but I'll leave to you analyze the fallout of this change.
> There will be zero benefits from doing that.
I see at least two, though admittedly neither is big:
1) I see having a canonical tarball as an advantage, as opposed to being
forced to use /usr/bin/{qtbz2,qxpak}.
2) There is a small saving in space (which increases with xz); for
bigger packages it would be smaller in percentage terms.
-rw-r--r-- 1 root root 8720 24 dic 18.25 linuxtv-dvb-headers-5.8.tar.bz2
-rw-r--r-- 1 root root 9691 24 dic 18.07 linuxtv-dvb-headers-5.8.tbz2
This was obtained by unpacking the xpak _and_ environment.bz2 (to
avoid double compression) and putting them in a subdirectory:
ls -l tmp/linuxtv-dvb-headers-5.8/ usr
tmp/linuxtv-dvb-headers-5.8/:
total 80
-rw-r--r-- 1 root root 11 Dec 24 18:20 BUILD_TIME
-rw-r--r-- 1 root root 8 Dec 24 18:20 CATEGORY
-rw-r--r-- 1 root root 2 Dec 24 18:20 DEFINED_PHASES
-rw-r--r-- 1 root root 52 Dec 24 18:20 DESCRIPTION
-rw-r--r-- 1 root root 2 Dec 24 18:20 EAPI
-rw-r--r-- 1 root root 394 Dec 24 18:20 FEATURES
-rw-r--r-- 1 root root 1 Dec 24 18:20 IUSE
-rw-r--r-- 1 root root 30 Dec 24 18:20 KEYWORDS
-rw-r--r-- 1 root root 24 Dec 24 18:20 PF
-rw-r--r-- 1 root root 31 Dec 24 18:20 RDEPEND
-rw-r--r-- 1 root root 2 Dec 24 18:20 SIZE
-rw-r--r-- 1 root root 2 Dec 24 18:20 SLOT
-rw-r--r-- 1 root root 65 Dec 24 18:20 USE
-rw-r--r-- 1 root root 22859 Dec 24 18:21 environment
-rw-r--r-- 1 root root 443 Dec 24 18:20 linuxtv-dvb-headers-5.8.ebuild
-rw-r--r-- 1 root root 7 Dec 24 18:20 repository
usr:
total 9
drwxr-xr-x 3 root root 3 Dec 5 02:42 src
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts
2014-12-24 18:36 ` vivo75
@ 2014-12-24 19:17 ` Zac Medico
0 siblings, 0 replies; 13+ messages in thread
From: Zac Medico @ 2014-12-24 19:17 UTC
To: gentoo-portage-dev
On 12/24/2014 10:36 AM, vivo75@gmail.com wrote:
> On 24/12/2014 17:07, Zac Medico wrote:
>> On 12/24/2014 04:01 AM, vivo75@gmail.com wrote:
>>> On 24/12/2014 09:13, Zac Medico wrote:
>>>>> I like this (and it has been a long time coming). What format are we
>>>>> going to store the metadata of the use flag combinations and the rest?
>>>> The current approach is to store the data in an xpak segment that is
>>>> appended to the end of the tbz2 file. The $PKGDIR/Packages files serves
>>>> as a cache for the essential parts of the xpak data that are used in
>>>> dependency calculations.
>>>>
>>> I'd like to see the xpak data being put in it's own file at the
>>> _beginning_ of the tar file.
>>>
>>> tar -Jcf \
>>> ${PKGDIR}/${CATEGORY}/${PN}/${PF}-${COUNTER}.xpak \
>>> tmp/${CATEGORY}:${PN}:${PF}-${COUNTER}.xpak \
>>> *all_the_other_stuff*
>>>
>>> this way reading it could be faster on some media and filesystem and it
>>> would not deviate from the standard tar.
>> There wouldn't be any benefit, because the data is practically always
>> read from the $PKGDIR/Packages cache anyway. The cache is generated when
>> the package is built, and the rate-limiting step there is the building
>> of the package.
> ack, and what about emerge on destination host?
The destination host uses a downloaded copy of $PKGDIR/Packages for
dependency calculations.
Also, I suspect that you're drastically over-estimating the cost of
accessing the xpak metadata, since otherwise you wouldn't be asking this
question.
>>> Being in /tmp/ is only for commodity but the place is debatable.
>>> Instead the fact it _must_ be the first file it's not, in a sequential
>>> archive file like tar some things depend on it.
>> With the current approach, the xpak segment is not part of the tar file.
>> The tar file is compressed, and the xpak segment is appended to the end
>> of the resulting bzip2 file.
>>
>>> seem to be the right time to do the change, since tool need to be
>>> rewritten anyway, but I'll leave to you analyze the fallout of this change.
>> There will be zero benefits from doing that.
> I see at least two however admittedly not too big
>
> 1) I see having a canonical tarball as an advantage, opposed to be
> forced to use /usr/bin/{qtbz2,qxpak}
You can already use tar to unpack our existing tbz2/xpak files. You only
need special tools if you want to access the package metadata, and I
think it's reasonable to expect anyone who wants to access the metadata
to have the appropriate tools.
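The same point can be shown with a decompressor that stops at the end of the bzip2 stream and treats everything after it as trailing data. This is a sketch assuming a single bzip2 stream followed by the xpak segment; decompress_payload is an illustrative name, not an existing tool.

```python
import bz2


def decompress_payload(data):
    """Decompress the leading bzip2 stream; return (tar_bytes, trailing_bytes)."""
    d = bz2.BZ2Decompressor()
    tar_bytes = d.decompress(data)
    if not d.eof:
        raise ValueError("truncated bzip2 stream")
    # Anything after the end of the stream (here, the xpak segment) is
    # left untouched in unused_data.
    return tar_bytes, d.unused_data
```

The returned tar_bytes can then be handed to any ordinary tar reader.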
> 2) there is a small benefit in space (which increase using xz) for
> bigger packages it would be smaller in percentage.
>
> -rw-r--r-- 1 root root 8720 24 dic 18.25 linuxtv-dvb-headers-5.8.tar.bz2
> -rw-r--r-- 1 root root 9691 24 dic 18.07 linuxtv-dvb-headers-5.8.tbz2
I think this savings is negligible. The environment.bz2 file accounts
for most of the space consumed by the xpak segment, and it is compressed
in order to save space.
> this has been obtained unpacking the xpak _and_ environment.bz2 (to
> avoid double compression) and putting them in a subdirectory
>
> ls -l tmp/linuxtv-dvb-headers-5.8/ usr
> tmp/linuxtv-dvb-headers-5.8/:
> total 80
> -rw-r--r-- 1 root root 11 Dec 24 18:20 BUILD_TIME
> -rw-r--r-- 1 root root 8 Dec 24 18:20 CATEGORY
> -rw-r--r-- 1 root root 2 Dec 24 18:20 DEFINED_PHASES
> -rw-r--r-- 1 root root 52 Dec 24 18:20 DESCRIPTION
> -rw-r--r-- 1 root root 2 Dec 24 18:20 EAPI
> -rw-r--r-- 1 root root 394 Dec 24 18:20 FEATURES
> -rw-r--r-- 1 root root 1 Dec 24 18:20 IUSE
> -rw-r--r-- 1 root root 30 Dec 24 18:20 KEYWORDS
> -rw-r--r-- 1 root root 24 Dec 24 18:20 PF
> -rw-r--r-- 1 root root 31 Dec 24 18:20 RDEPEND
> -rw-r--r-- 1 root root 2 Dec 24 18:20 SIZE
> -rw-r--r-- 1 root root 2 Dec 24 18:20 SLOT
> -rw-r--r-- 1 root root 65 Dec 24 18:20 USE
> -rw-r--r-- 1 root root 22859 Dec 24 18:21 environment
> -rw-r--r-- 1 root root 443 Dec 24 18:20 linuxtv-dvb-headers-5.8.ebuild
> -rw-r--r-- 1 root root 7 Dec 24 18:20 repository
We could save some more space in the xpak segment if we omitted the
ebuild, since environment.bz2 contains everything that we need from the
ebuild. We could also omit FEATURES, since it's non-essential and it's
included in environment.bz2 anyway. However, these are trivial
micro-optimizations.
--
Thanks,
Zac
^ permalink raw reply [flat|nested] 13+ messages in thread
* [gentoo-portage-dev] Re: [RFC] New file layout for PKGDIR and binhosts
2014-12-24 8:13 ` Zac Medico
2014-12-24 12:01 ` vivo75
@ 2014-12-25 10:03 ` Duncan
2014-12-25 11:04 ` Zac Medico
2015-01-07 5:32 ` Brian Dolbec
1 sibling, 2 replies; 13+ messages in thread
From: Duncan @ 2014-12-25 10:03 UTC
To: gentoo-portage-dev
Zac Medico posted on Wed, 24 Dec 2014 00:13:57 -0800 as excerpted:
>> If there were some way to
>> get all the info for the binpkgs into one file (so it could be run on
>> cron or something), this could mean that I'd only have to do one file
>> request for all that metadata and would be much quicker than inspecting
>> all those files.
>
> That's what $PKGDIR/Packages is.
This is a good excuse to ask a question that has bothered me for some
time, plus make a request for the new eclean-pkg replacement...
Normally I like to keep several old binpkgs around for troubleshooting
reference or quick-install, but the combined set of kde packages, for
instance, can get pretty big, and with monthly iterations, they build up
pretty fast.
But eclean-pkg doesn't have an easy way to say clean up /just/ kde-base/*,
leaving the currently installed version and one previous version for
reference, cleaning out all others. (Sure I could put /everything/ else
on the exclude list, but what's really needed is an include list, plus a
"keep N more" option.)
So I often end up cleaning packages like that out manually by simply
deleting them. My question is thus, does the remaining index/db/cache
entry get cleaned properly (when?) or am I leaving uncleanable garbage
behind when I do this?
Which of course translates into a couple of feature requests for the new
eclean-pkg:
1) Make it possible to clean only selected pkgs or categories.
2) Have an option to clean up and/or regenerate the cache/index/db/whatever
files, re-syncing them with what's actually there.
Meanwhile, another possible usage for multiple binpkgs per ebuild:
I commonly run live-builds (mostly kde, tho I'm not ATM), and would
/love/ to be able to keep about three generations (current plus two back)
around, ideally IDed by their git-commit or the like. Running live
packages can mean even more risk than usual of something not working out,
but currently, if the package builds, by the time you find that out,
you've obliterated your previous *9999 binpkg and even figuring out which
git commit it was isn't easy, let alone doing a quick binpkg rollback,
like you'd do with an ordinary version upgrade. Were multiple
live-binpkg-versions kept around, IDed by the git-commit or at least with that
info in the metadata somewhere, it'd make things /so/ much easier! =:^)
But of course that does make having an automated cleaner that can keep
just the last N package versions around while deleting the others, that
much more important.
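A "keep the last N, but only in selected categories" policy like the one requested above might be sketched as follows. The function select_for_removal is hypothetical and is not an eclean-pkg interface; it assumes the proposed ${CATEGORY}/${PN}/file layout and ranks builds by modification time.

```python
from collections import defaultdict


def select_for_removal(builds, categories, keep):
    """Pick builds to delete, keeping the `keep` newest per CATEGORY/PN.

    builds: iterable of (relative_path, mtime) pairs, where relative_path
    looks like "kde-base/kate/kate-4.14.3-2.xpak".
    categories: set of category names to clean; all others are left alone.
    """
    by_pkg = defaultdict(list)
    for relpath, mtime in builds:
        category, pn, _filename = relpath.split("/", 2)
        if category in categories:
            by_pkg[(category, pn)].append((mtime, relpath))
    doomed = []
    for group in by_pkg.values():
        group.sort(reverse=True)  # newest first
        doomed.extend(relpath for _mtime, relpath in group[keep:])
    return sorted(doomed)
```

The function only returns a list, so a caller could print it as a dry run before deleting anything.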
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [gentoo-portage-dev] Re: [RFC] New file layout for PKGDIR and binhosts
2014-12-25 10:03 ` [gentoo-portage-dev] " Duncan
@ 2014-12-25 11:04 ` Zac Medico
2014-12-26 5:20 ` Duncan
2015-01-07 5:32 ` Brian Dolbec
1 sibling, 1 reply; 13+ messages in thread
From: Zac Medico @ 2014-12-25 11:04 UTC
To: gentoo-portage-dev
On 12/25/2014 02:03 AM, Duncan wrote:
> Zac Medico posted on Wed, 24 Dec 2014 00:13:57 -0800 as excerpted:
>
>>> If there were some way to
>>> get all the info for the binpkgs into one file (so it could be run on
>>> cron or something), this could mean that I'd only have to do one file
>>> request for all that metadata and would be much quicker than inspecting
>>> all those files.
>>
>> That's what $PKGDIR/Packages is.
>
> This is a good excuse to ask a question that has bothered me for some
> time, plus make a request for the new eclean-pkg replacement...
>
> Normally I like to keep several old binpkgs around for troubleshooting
> reference or quick-install, but the combined set of kde packages, for
> instance, can get pretty big, and with monthly iterations, they build up
> pretty fast.
>
> But eclean-pkg doesn't have an easy way to say clean up /just/ kde-base/*,
> leaving the currently installed version and one previous version for
> reference, cleaning out all others. (Sure I could put /everything/ else
> on the exclude list, but what's really needed is an include list, plus a
> "keep N more" option.)
>
> So I often end up cleaning packages like that out manually by simply
> deleting them. My question is thus, does the remaining index/db/cache
> entry get cleaned properly (when?) or am I leaving uncleanable garbage
> behind when I do this?
You can run 'emaint --fix binhost' to sync $PKGDIR/Packages up with the
existing tbz2 files.
If you don't run that emaint command, the packages that you removed will
be pruned from $PKGDIR/Packages the next time that you run an 'emerge
--usepkg' command with sufficient privileges.
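Conceptually, the pruning step amounts to dropping index entries whose files are gone. The sketch below shows only that idea; prune_missing is a hypothetical helper operating on already-parsed entries with a PATH key, and it is not how emaint or emerge is actually implemented.

```python
import os


def prune_missing(entries, pkgdir):
    """Return only the index entries whose PATH still exists under pkgdir."""
    return [
        entry
        for entry in entries
        if os.path.isfile(os.path.join(pkgdir, entry["PATH"]))
    ]
```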
> Which of course translates into a couple of feature requests for the new
> eclean-pkg:
>
> 1) Make it possible to clean only selected pkgs or categories.
>
> 2) Have an option to clean up and/or regenerate the cache/index/db/whatever
> files, re-syncing them with what's actually there.
eclean-pkg calls 'emaint --fix binhost' to fix $PKGDIR/Packages after it
removes some tbz2 files.
> Meanwhile, another possible usage for multiple binpkgs per ebuild:
>
> I commonly run live-builds (mostly kde, tho I'm not ATM), and would
> /love/ to be able to keep about three generations (current plus two back)
> around, ideally IDed by their git-commit or the like. Running live
> packages can mean even more risk than usual of something not working out,
> but currently, if the package builds, by the time you find that out,
> you've obliterated your previous *9999 binpkg and even figuring out which
> git commit it was isn't easy, let alone doing a quick binpkg rollback,
FWIW, you can use 'cp -rl $PKGDIR $PKGDIR.backup' to make a hardlink
snapshot of $PKGDIR. Portage will break hardlinks whenever it needs to
update tbz2 files, so you don't have to worry about updates in $PKGDIR
affecting the files in $PKGDIR.backup.
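For anyone who wants to script the snapshot, a rough Python equivalent of the 'cp -rl' command is shown below. It is a sketch for regular files only; hardlink_snapshot is an illustrative name, and both directories must live on the same filesystem for hard links to work.

```python
import os


def hardlink_snapshot(src, dst):
    """Recreate the directory tree of src under dst, hard-linking every file."""
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target, exist_ok=True)
        for name in files:
            # Both names now refer to the same inode; no file data is copied.
            os.link(os.path.join(root, name), os.path.join(target, name))
```

The snapshot stays intact as long as updates replace files (write a new file, then rename it into place) instead of rewriting them in place, which is the hardlink-breaking behavior described above.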
> like you'd do with an ordinary version upgrade. Were multiple
> live-binpkg-versions kept around, IDed by the git-commit or at least with that
> info in the metadata somewhere, it'd make things /so/ much easier! =:^)
That's going to require an EAPI extension [1], in order to establish a
protocol for the ebuild to communicate the commit id to the package manager.
> But of course that does make having an automated cleaner that can keep
> just the last N package versions around while deleting the others, that
> much more important.
Yes, and we might integrate some on-the-fly garbage collection directly
into portage, or provide a hook for this purpose.
[1] https://bugs.gentoo.org/show_bug.cgi?id=182028
--
Thanks,
Zac
^ permalink raw reply [flat|nested] 13+ messages in thread
* [gentoo-portage-dev] Re: [RFC] New file layout for PKGDIR and binhosts
2014-12-25 11:04 ` Zac Medico
@ 2014-12-26 5:20 ` Duncan
0 siblings, 0 replies; 13+ messages in thread
From: Duncan @ 2014-12-26 5:20 UTC
To: gentoo-portage-dev
Zac Medico posted on Thu, 25 Dec 2014 03:04:20 -0800 as excerpted:
> FWIW, you can use 'cp -rl $PKGDIR $PKGDIR.backup' to make a hardlink
> snapshot of $PKGDIR. Portage will break hardlinks whenever it needs to
> update tbz2 files, so you don't have to worry about updates in $PKGDIR
> affecting the files in $PKGDIR.backup.
Thanks. Your portage knowledge, and more to the point your calm patience
explaining its workings and fixing its bugs, always amaze me. =:^) I
sure missed it while you were gone and sure am glad you're back, with
more help now. =:^)
I had overlooked emaint --fix binhost (a hazard for those who have been
around long enough to have tools seriously evolve after the original
learning period), and hadn't thought at all about hardlinks (or btrfs
reflinks, since I'm using it for that partition). I expect I'll find
this quite useful. =:^)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts
2014-12-24 1:51 [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts Zac Medico
2014-12-24 5:16 ` Matthew Thode
@ 2014-12-27 14:25 ` Rick "Zero_Chaos" Farina
2014-12-29 3:53 ` Zac Medico
1 sibling, 1 reply; 13+ messages in thread
From: Rick "Zero_Chaos" Farina @ 2014-12-27 14:25 UTC
To: gentoo-portage-dev
[-- Attachment #1: Type: text/plain, Size: 3448 bytes --]
On 12/23/14 20:51, Zac Medico wrote:
> Hi,
>
> As discussed in bug 150031 [1], it would be useful if PKGDIR could
> accommodate multiple binary packages built from the same source ebuild.
> Use cases for preserving multiple builds typically involve supporting
> multiple clients (with partially compatible configurations) from a
> single unified binhost. In this context, some of the reasons to retain
> multiple builds are:
>
> * Different USE flag combinations enabled (--newuse/--binpkg-respect-use
> needed)
>
> * Different versions of installed dependencies (EAPI 5 slot := operators
> needed)
>
> * Different repositories/overlays, with variance in the time of the last
> sync (--changed-deps/--binpkg-changed-deps needed if dependencies change
> due to eclass changes or ebuild modifications without revbump)
I'm not saying don't take this into account, but in reality, this isn't
a problem we should have to deal with. If users want to rely on binpkgs,
they should be syncing to the same rev the binhost used to generate
them. It's a reasonably trivial task to do this, even as simple as a
daily webrsync or something. Handling users with a different eclass or
ebuild version will prove difficult, I believe, and worse, it will make
possibly dozens of extra binpkg revs for basically no reason.
-Zero_Chaos
PS> This is so exciting....
>
> Given the above variety of reasons to retain previous builds, a simple
> counter (1, 2, 3,...) seems like a reasonable means to generate unique
> file names.
>
> In order to avoid having too many files in a directory, we can use a
> separate directory for each ${CATEGORY}/${PN}, like we do for the source
> ebuild repositories.
>
> In order to avoid having to deal with multiple file extensions for
> different compression types, we can simply use .xpak for the file
> extension [2], since that's the name of the format that we use to append
> metadata to our existing tbz2 files. We can simply probe the first few
> bytes of the file in order to determine the compression type:
>
> gzip: 1f 8b
> bzip2: 42 5a 68 39
> xz: fd 37 7a 58 5a 00
>
> Users will be able change their compression settings at any time, but
> the .xpak file extension will remain constant regardless of that
> setting. It won't matter if they have a mixture of files compressed with
> different compressors.
>
> A tool like eclean-pkg will be needed to clean up old binary packages
> based on user preferences. We might also provide a variety of on-the-fly
> garbage collection settings.
>
> Based on the above discussion, the location of any particular binary
> package can be expressed as follows:
>
> ${PKGDIR}/${CATEGORY}/${PN}/${PF}-${COUNTER}.xpak
>
> The existing format of the ${PKGDIR}/Packages index will work fine,
> since it allows each package to specify a PATH attribute which
> corresponds to the path of the file relative to the base directory. If
> the .xpak files use bzip2 compression, it will even be compatible with
> existing clients (though they won't be able to intelligently choose
> between multiple packages of the same version). If all the packages of a
> given version are ordered by ${COUNTER}, then existing clients will
> simply download the latest build.
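The layout above could be sketched in Python roughly as follows. This
is only an illustration under assumptions: the helper names are
hypothetical, the counter is assumed to be scoped per ${PF}, and
posixpath is used so that the result can double as a forward-slash
PATH value relative to ${PKGDIR}.

```python
import posixpath
import re

# Matches "<PF>-<COUNTER>.xpak"; the counter is the last dash-separated integer.
_XPAK_NAME = re.compile(r"^(?P<pf>.+)-(?P<counter>\d+)\.xpak$")


def binpkg_path(pkgdir, category, pn, pf, counter):
    """Build ${PKGDIR}/${CATEGORY}/${PN}/${PF}-${COUNTER}.xpak."""
    return posixpath.join(pkgdir, category, pn, "%s-%d.xpak" % (pf, counter))


def next_counter(existing_names, pf):
    """Return the next unused counter for pf among the given file names."""
    used = []
    for name in existing_names:
        match = _XPAK_NAME.match(name)
        # Comparing against pf resolves names where the version also ends in digits.
        if match and match.group("pf") == pf:
            used.append(int(match.group("counter")))
    return max(used, default=0) + 1
```

For example, binpkg_path("/usr/portage/packages", "app-misc", "foo",
"foo-1.0-r1", 3) gives
"/usr/portage/packages/app-misc/foo/foo-1.0-r1-3.xpak".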
>
> [1] https://bugs.gentoo.org/show_bug.cgi?id=150031
> [2] http://dev.gentoo.org/~zmedico/portage/doc/man/xpak.5.html
>
* Re: [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts
2014-12-27 14:25 ` [gentoo-portage-dev] " Rick "Zero_Chaos" Farina
@ 2014-12-29 3:53 ` Zac Medico
0 siblings, 0 replies; 13+ messages in thread
From: Zac Medico @ 2014-12-29 3:53 UTC (permalink / raw
To: gentoo-portage-dev
On 12/27/2014 06:25 AM, Rick "Zero_Chaos" Farina wrote:
> On 12/23/14 20:51, Zac Medico wrote:
>> Hi,
>>
>> As discussed in bug 150031 [1], it would be useful if PKGDIR could
>> accommodate multiple binary packages built from the same source ebuild.
>> Use cases for preserving multiple builds typically involve supporting
>> multiple clients (with partially compatible configurations) from a
>> single unified binhost. In this context, some of the reasons to retain
>> multiple builds are:
>>
>> * Different USE flag combinations enabled (--newuse/--binpkg-respect-use
>> needed)
>>
>> * Different versions of installed dependencies (EAPI 5 slot := operators
>> needed)
>>
>> * Different repositories/overlays, with variance in the time of the last
>> sync (--changed-deps/--binpkg-changed-deps needed if dependencies change
>> due to eclass changes or ebuild modifications without revbump)
>
> I'm not saying don't take this into account, but in reality, this isn't
> a problem we should have to deal with.
Who is we? How do you know that whatever practices you use will also be
utilized by everyone else out there?
> if users want to rely on binpkgs
> they should be syncing to the same rev the binhost used to generate
> them. it's a reasonably trivial task to do this, even as simple as
> daily webrsync or something.
The --changed-deps/--binpkg-changed-deps options are also useful for the
same reason that emerge --dynamic-deps is enabled by default:
* On the binhost server side, --changed-deps is an easy way to rebuild
packages so the resulting binary packages have the latest deps, which
may be necessary in order for the dependencies to be *satisfiable*.
* On the binhost client side, --binpkg-changed-deps can be used to
reject binary packages that haven't been rebuilt with the latest
dependency specifications, avoiding inconsistent dependencies that may
not be *satisfiable*.
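To make the client-side check concrete, here is a deliberately
simplified Python sketch of the idea: a binary package is rejected when
the dependency strings recorded in it differ from those of the current
ebuild. This is not Portage's actual implementation; deps_changed,
DEP_KEYS and the plain-dict metadata are assumptions made for
illustration, and the comparison ignores USE-conditional evaluation.

```python
# Dependency variables compared in this sketch (an assumed, minimal set).
DEP_KEYS = ("DEPEND", "RDEPEND", "PDEPEND")


def _normalize(dep_string):
    """Collapse whitespace so formatting differences are not seen as changes."""
    return " ".join(dep_string.split())


def deps_changed(binpkg_metadata, ebuild_metadata):
    """Return True if any dependency string differs between the two dicts."""
    return any(
        _normalize(binpkg_metadata.get(key, ""))
        != _normalize(ebuild_metadata.get(key, ""))
        for key in DEP_KEYS
    )
```

A client following this logic would skip any binary package for which
deps_changed() returns True and fall back to another build or to a
source build.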
Sure, the --binpkg-changed-deps thing may not be needed for whatever
limited use cases you are thinking of, but let's not force the same
limits on everyone else.
> To handle users with a different eclass or
> ebuild version will prove difficult I believe, and worse, it will make
> possibly dozens of extra binpkg revs for basically no reason.
As discussed above, these "extra binpkg revs" may be needed in order
for the dependencies to be *satisfiable*. The cost of rebuilding
packages can be considered negligible in comparison to the time that
people would otherwise have to spend in order to manually resolve issues
involving dependencies that are unsatisfiable.
--
Thanks,
Zac
* Re: [gentoo-portage-dev] Re: [RFC] New file layout for PKGDIR and binhosts
2014-12-25 10:03 ` [gentoo-portage-dev] " Duncan
2014-12-25 11:04 ` Zac Medico
@ 2015-01-07 5:32 ` Brian Dolbec
1 sibling, 0 replies; 13+ messages in thread
From: Brian Dolbec @ 2015-01-07 5:32 UTC (permalink / raw
To: gentoo-portage-dev
On Thu, 25 Dec 2014 10:03:05 +0000 (UTC)
Duncan <1i5t5.duncan@cox.net> wrote:
> Zac Medico posted on Wed, 24 Dec 2014 00:13:57 -0800 as excerpted:
>
> >> If there were some way to
> >> get all the info for the binpkgs into one file (so it could be run
> >> on cron or something), this could mean that I'd only have to do
> >> one file request for all that metadata and would be much quicker
> >> than inspecting all those files.
> >
> > That's what $PKGDIR/Packages is.
>
> This is a good excuse to ask a question that has bothered me for some
> time, plus make a request for the new eclean-pkg replacement...
>
> Normally I like to keep several old binpkgs around for
> troubleshooting reference or quick-install, but the combined set of
> kde packages, for instance, can get pretty big, and with monthly
> iterations, they build up pretty fast.
>
> But eclean-pkg doesn't have an easy way to say clean up /just/
> kde-base/*, leaving the currently installed version and one previous
> version for reference, cleaning out all others. (Sure I could
> put /everything/ else on the exclude list, but what's really needed
> is an include list, plus a "keep N more" option.)
>
> So I often end up cleaning packages like that out manually by simply
> deleting them. My question is thus, does the remaining
> index/db/cache entry get cleaned properly (when?) or am I leaving
> uncleanable garbage behind when I do this?
>
> Which of course translates into a couple of feature requests for the
> new eclean-pkg:
>
> 1) Make it possible to clean only selected pkgs or categories.
>
> 2) Have an option to clean up and/or regenerate the cache/index/db/
> whatever files, re-syncing them with what's actually there.
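The "include list plus keep N" request could look something like the
following Python sketch. It is not an existing eclean feature.
select_for_removal and its (cpn, path, build_time) tuples are
hypothetical, and the sketch does not try to protect the currently
installed version.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def select_for_removal(entries, include_patterns, keep):
    """Return paths to delete, keeping the `keep` newest builds per package.

    entries: iterable of (cpn, path, build_time) tuples, where cpn is
    "category/package" and build_time is any sortable timestamp.
    include_patterns: glob patterns such as "kde-base/*"; packages that
    match none of them are never selected.
    """
    by_pkg = defaultdict(list)
    for cpn, path, build_time in entries:
        if any(fnmatchcase(cpn, pattern) for pattern in include_patterns):
            by_pkg[cpn].append((build_time, path))

    doomed = []
    for builds in by_pkg.values():
        builds.sort(reverse=True)  # newest first
        doomed.extend(path for _, path in builds[keep:])
    return sorted(doomed)
```

With three kde-base/kdelibs builds and keep=2, only the oldest of the
three is returned, and packages outside kde-base/* are left alone.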
>
>
...
> But of course that does make having an automated cleaner that can
> keep just the last N package versions around while deleting the
> others, that much more important.
>
Please file a feature request bug for gentoolkit/eclean.
If I can ever get caught up with other things or we get a GSoC student
that needs some extra work...
Also, eclean and emaint have both been rewritten (by me) so that eclean
could use emaint modules directly. I have thought a few times that we
might migrate eclean from gentoolkit to emaint module(s). They can even
be installed separately from portage via ebuild. The emaint system is
fully modular with a plug-in system.
--
Brian Dolbec <dolsen>
end of thread, other threads:[~2015-01-07 5:32 UTC | newest]
Thread overview: 13+ messages
2014-12-24 1:51 [gentoo-portage-dev] [RFC] New file layout for PKGDIR and binhosts Zac Medico
2014-12-24 5:16 ` Matthew Thode
2014-12-24 8:13 ` Zac Medico
2014-12-24 12:01 ` vivo75
2014-12-24 16:07 ` Zac Medico
2014-12-24 18:36 ` vivo75
2014-12-24 19:17 ` Zac Medico
2014-12-25 10:03 ` [gentoo-portage-dev] " Duncan
2014-12-25 11:04 ` Zac Medico
2014-12-26 5:20 ` Duncan
2015-01-07 5:32 ` Brian Dolbec
2014-12-27 14:25 ` [gentoo-portage-dev] " Rick "Zero_Chaos" Farina
2014-12-29 3:53 ` Zac Medico