public inbox for gentoo-dev@lists.gentoo.org
From: Rich Freeman <rich0@gentoo.org>
To: Zac Medico <zmedico@gentoo.org>
Cc: gentoo-dev <gentoo-dev@lists.gentoo.org>, binhost@gentoo.org
Subject: Re: [gentoo-dev] New project: binhost
Date: Sun, 14 Feb 2021 08:30:35 -0500
Message-ID: <CAGfcS_==dFEcSSc_3ZE78uH+GRmkSZVQHpgHp2BJmM1Hqc6v9A@mail.gmail.com>
In-Reply-To: <818c2e65-2501-9429-a9e7-95868c2d1c96@gentoo.org>

On Sat, Feb 13, 2021 at 8:51 PM Zac Medico <zmedico@gentoo.org> wrote:
>
> > 2.  Generate a hash of the file contents - this can go in the filename
> > so that the file can co-exist with other files, and be located
> > assuming you have a full matching set of metadata.
>
> For FEATURES=binpkg-multi-instance we currently use an integer BUILD_ID
> to ensure that file names are unique.
>
> > 3.  Start dropping attributes from the file based on a list of
> > priorities and generate additional hashes.  Create symlinked files to
> > the original file using these hashes (overwriting or not existing
> > symlinks based on policy).  This allows the binary package to be found
> > using either an exact set of attributes or a subset of higher-priority
> > attributes.  This is analogous to shared object symlinking.
> > 4.  The package manager will look for a binary package first using the
> > user's full config, and then by dropping optional elements of the
> > config (so maybe it does the search without CFLAGs, then without USE
> > flags).  Eventually it aborts based on user prefs (maybe the user only
> > wants an exact match, or is willing to accept alternate CFLAGs but not
> > USE flags, or maybe anything for the arch is selected).
> > 5.  As always the final selected binary package still gets evaluated
> > like any other binary package to ensure it is usable.
> >
> > Such a system can identify whether a potentially usable file exists
> > using only filename, cutting down on fetching.  In the interests of
> > avoiding useless fetches we would only carry step 3 reasonably far -
> > packages would have to match based on architecture and any dynamic
> > linking requirements.  So we wouldn't generate hashes that didn't
> > include at least those minimums, and the package manager wouldn't
> > search for them.
> >
> > Obviously you could do more (if you have 5 combinations of use flags,
> > look for the set that matches most closely).  That couldn't be done
> > using hashes alone in an efficient way.  You could have a small
> > manifest file alongside the binary package that could be fetched
> > separately if the package manager wants to narrow things down and
> > fetch a few of those to narrow it down further.
>
> All of the above is oriented toward multi-profile binhosts, so we'll
> have to do a cost/benefit analysis to determine whether it's worth the
> effort to introduce the complexity that multi-profile binhosts add.

The hash label in the filenames was also designed with multi-profile
binhosts in mind.  I figured that if you're going to be building
variants of packages you'd want to parallelize, and hashes work better
for that.  Plus, at least in concept, you could identify and fetch
files by hash using info already in the local repo, without having to
sync additional metadata from the binhost.  User-contributed binaries
would also work better in such a world, though for obvious security
reasons that might just take the form of local user-generated repos
(allowing users to supplement the upstream repo with local builds for
a cluster, without having to mirror/reproduce the entire upstream).
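
As a rough illustration of quoted steps 2-4 (a sketch only, not Portage
code): the attribute names, the priority order, the 16-character hash
prefix, the .tbz2 naming and every function name below are assumptions
made for the example, and a plain dict stands in for the binhost
directory and its symlinks.

```python
import hashlib

# Hypothetical attribute sets.  REQUIRED is the minimum that always
# enters the hash (architecture and dynamic linking requirements).
# OPTIONAL is ordered by priority and is dropped from the right:
# cflags first, then use.
REQUIRED = ("arch", "abi")
OPTIONAL = ("use", "cflags")


def attr_hash(pf, attrs, keys):
    """Hash the package name plus the selected attributes, in a fixed order."""
    parts = [pf] + [f"{k}={attrs[k]}" for k in keys]
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()[:16]


def candidate_names(pf, attrs, max_drop=len(OPTIONAL)):
    """Yield file names from the most specific match to the least specific."""
    for dropped in range(max_drop + 1):
        keys = REQUIRED + OPTIONAL[: len(OPTIONAL) - dropped]
        yield f"{pf}-{attr_hash(pf, attrs, keys)}.tbz2"


def publish(index, pf, attrs):
    """Binhost side: register the exact name plus the relaxed 'symlink' names."""
    names = list(candidate_names(pf, attrs))
    target = names[0]
    for name in names:
        # Policy chosen for this sketch: never overwrite an existing link.
        index.setdefault(name, target)
    return target


def find(index, pf, attrs, max_drop):
    """Client side: try names in order, give up after max_drop relaxations."""
    for name in candidate_names(pf, attrs, max_drop):
        if name in index:
            return index[name]
    return None
```

With max_drop=0 the client only accepts an exact match; max_drop=1
tolerates different CFLAGS, and max_drop=2 tolerates different USE
flags as well.  Whatever find() returns would still go through the
normal binary package checks (step 5).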

I do get that multi-profile support isn't strictly essential, but
when you consider stuff like X11 support or stable/unstable it seems
like we're probably going to have to provide at least a few variants
on packages for this to be practical.  You could just put each profile
in a separate repo, but then anything that doesn't actually change
across profiles gets built multiple times.  The hash-based solution is
also a form of deduping.
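
To make the deduping point concrete, here is a minimal sketch.  It
assumes (hypothetically) that each package lists the profile settings
it actually reacts to and that only those settings enter the hash; the
function names and data layout are invented for the example.

```python
import hashlib


def build_key(pf, relevant):
    """Hash the package name plus only the settings that affect its build."""
    lines = [pf] + [f"{k}={v}" for k, v in sorted(relevant.items())]
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()[:16]


def plan_builds(profiles, packages):
    """Return {key: pf} with one entry per distinct build across all profiles.

    profiles: list of dicts mapping setting name -> value
    packages: dict mapping package name -> list of settings it reacts to
    """
    builds = {}
    for profile in profiles:
        for pf, settings in packages.items():
            relevant = {s: profile.get(s, "") for s in settings}
            # The same key from two profiles means the same binary: build once.
            builds.setdefault(build_key(pf, relevant), pf)
    return builds
```

A package that reacts to none of the settings that differ between two
profiles hashes to the same key under both, so it shows up once in the
plan, whereas separate per-profile repos would build it twice.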

But, hey, it is great to see anything like this being done at all.
Walking before running isn't a bad thing!

-- 
Rich
