From: Rich Freeman
Date: Sun, 14 Feb 2021 08:30:35 -0500
Subject: Re: [gentoo-dev] New project: binhost
To: Zac Medico
Cc: gentoo-dev, binhost@gentoo.org

On Sat, Feb 13, 2021 at 8:51 PM Zac Medico wrote:
>
> > 2. Generate a hash of the file contents - this can go in the filename
> > so that the file can co-exist with other files, and be located
> > assuming you have a full matching set of metadata.
>
> For FEATURES=binpkg-multi-instance we currently use an integer BUILD_ID
> to ensure that file names are unique.
>
> > 3. Start dropping attributes from the file based on a list of
> > priorities and generate additional hashes. Create symlinked files to
> > the original file using these hashes (overwriting existing symlinks
> > or not, based on policy). This allows the binary package to be found
> > using either an exact set of attributes or a subset of higher-priority
> > attributes. This is analogous to shared object symlinking.
> > 4. The package manager will look for a binary package first using the
> > user's full config, and then by dropping optional elements of the
> > config (so maybe it does the search without CFLAGS, then without USE
> > flags). Eventually it aborts based on user prefs (maybe the user only
> > wants an exact match, or is willing to accept alternate CFLAGS but not
> > USE flags, or maybe anything for the arch is selected).
> > 5. As always the final selected binary package still gets evaluated
> > like any other binary package to ensure it is usable.
> >
> > Such a system can identify whether a potentially usable file exists
> > using only the filename, cutting down on fetching. In the interests of
> > avoiding useless fetches we would only carry step 3 reasonably far -
> > packages would have to match based on architecture and any dynamic
> > linking requirements. So we wouldn't generate hashes that didn't
> > include at least those minimums, and the package manager wouldn't
> > search for them.
> >
> > Obviously you could do more (if you have 5 combinations of USE flags,
> > look for the set that matches most closely). That couldn't be done
> > using hashes alone in an efficient way. You could have a small
> > manifest file alongside the binary package that could be fetched
> > separately if the package manager wants to narrow things down and
> > fetch a few of those to narrow it down further.
>
> All of the above is oriented toward multi-profile binhosts, so we'll
> have to do a cost/benefit analysis to determine whether it's worth the
> effort to introduce the complexity that multi-profile binhosts add.

The hash label on the filenames was also designed with multi-profiles
in mind. I figured that if you're going to be building variants of
packages you'd want to parallelize, and hashes work better for that.
Plus, at least in concept, you could identify and fetch files by hash
using info already in the local repo, without having to sync
additional metadata from the binhost. User-contributed binaries would
also work better in such a world, though for obvious security reasons
that might just take the form of local user-generated repos (allowing
users to supplement the upstream repo with local builds for a cluster,
without having to mirror/reproduce the entire upstream).

I do get that multi-profiles aren't entirely an essential feature, but
when you consider stuff like X11 support or stable/unstable it seems
like we're probably going to have to provide at least a few variants
on packages for this to be practical.
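
To make the hash idea concrete, here is a rough sketch of steps 2-4 from
the quoted proposal. It is only an illustration under assumptions that
are not part of the proposal itself: the attribute names and their
priority order (PRIORITY), the REQUIRED cutoff, the truncated SHA-256,
the ".pkg" suffix, and a plain dict standing in for the binhost
directory and its symlinks are all hypothetical.

```python
import hashlib

# Hypothetical attribute priority: earlier entries matter more.
# Lower-priority attributes (the end of the list) are dropped first.
PRIORITY = ["arch", "abi", "use", "cflags"]
# Never drop the first REQUIRED attributes (the "minimums": architecture
# and dynamic linking requirements).
REQUIRED = 2


def attr_hash(pkg, attrs, keys):
    """Hash the package name plus the selected attributes in a fixed order."""
    canon = pkg + "\n" + "\n".join(f"{k}={attrs[k]}" for k in keys)
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()[:16]


def candidate_names(pkg, attrs):
    """Yield file names from the most specific match to the least specific."""
    for n in range(len(PRIORITY), REQUIRED - 1, -1):
        yield f"{pkg}-{attr_hash(pkg, attrs, PRIORITY[:n])}.pkg"


def publish(pkg, attrs, index, overwrite=False):
    """Binhost side: record the real file and its fallback aliases."""
    names = list(candidate_names(pkg, attrs))
    target = names[0]
    index[target] = target  # the real file, named by the full hash
    for alias in names[1:]:
        # Policy decision from step 3: overwrite an existing alias or not.
        if overwrite or alias not in index:
            index[alias] = target  # stands in for a symlink
    return target


def lookup(pkg, attrs, index, max_dropped=len(PRIORITY) - REQUIRED):
    """Client side: try the exact config first, then looser matches."""
    for dropped, name in enumerate(candidate_names(pkg, attrs)):
        if dropped > max_dropped:
            break  # user preference: give up rather than loosen further
        if name in index:
            return index[name]
    return None


# Example: one package built on the binhost, two clients looking for it.
index = {}
built = {"arch": "amd64", "abi": "glibc", "use": "X gtk", "cflags": "-O2 -pipe"}
target = publish("vim-8.2", built, index)

other_cflags = dict(built, cflags="-O3")
found_loose = lookup("vim-8.2", other_cflags, index)            # alias without cflags
found_strict = lookup("vim-8.2", other_cflags, index, max_dropped=0)  # exact only
```

Here found_loose resolves to the same file as target, while
found_strict is None because only an exact match was allowed. The
client computes every candidate name locally, so it can tell whether
anything usable exists from file names alone, and the selected package
would still need the usual validation from step 5.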
You could just put each profile in a separate repo, but then anything
that doesn't actually change across profiles gets built multiple times.
The hash-based solution is also a form of deduping.

But, hey, it is great to see anything like this being done at all.
Walking before running isn't a bad thing!

-- 
Rich