From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by finch.gentoo.org (Postfix) with ESMTPS id 30089138334
    for ; Sun, 20 Oct 2019 06:51:42 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
    by pigeon.gentoo.org (Postfix) with SMTP id 71A4FE08C0;
    Sun, 20 Oct 2019 06:51:38 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183])
    (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
    (No client certificate requested)
    by pigeon.gentoo.org (Postfix) with ESMTPS id D40D5E08AE
    for ; Sun, 20 Oct 2019 06:51:37 +0000 (UTC)
Received: from pomiot (c134-66.icpnet.pl [85.221.134.66])
    (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
    (No client certificate requested)
    (Authenticated sender: mgorny)
    by smtp.gentoo.org (Postfix) with ESMTPSA id E850034BE55;
    Sun, 20 Oct 2019 06:51:35 +0000 (UTC)
Message-ID: <752be6c75f337df8ee8124a804247d2fb27e73b4.camel@gentoo.org>
Subject: Re: [gentoo-dev] New distfile mirror layout
From: Michał Górny
To: gentoo-dev@lists.gentoo.org
Date: Sun, 20 Oct 2019 08:51:31 +0200
In-Reply-To: <2d15507e-98ad-9466-75b7-7e8268ef2eb9@gentoo.org>
References: <4c7465824f1fb69924c826f6bbe3ee73afa08ec8.camel@gentoo.org>
    <2d15507e-98ad-9466-75b7-7e8268ef2eb9@gentoo.org>
Organization: Gentoo
Content-Type: multipart/signed; micalg="pgp-sha512";
    protocol="application/pgp-signature"; boundary="=-7oclxBSTHPW5KY055pas"
User-Agent: Evolution 3.32.4
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply
MIME-Version: 1.0
X-Archives-Salt: e23429f8-d4a6-488c-ba90-a5d0fb863221
X-Archives-Hash: ae6be1a52c13efa28dd2b1c3e59e4384
--=-7oclxBSTHPW5KY055pas
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

On Sat, 2019-10-19 at 19:24 -0400, Joshua Kinard wrote:
> On 10/18/2019 09:41, Michał Górny wrote:
> > Hi, everybody.
> >
> > It is my pleasure to announce that yesterday (EU) evening we switched
> > to a new distfile mirror layout.  Users will be switching to the new
> > layout either as they upgrade Portage to 2.3.77 or -- if they upgraded
> > already -- as their caches expire (24hrs).
> >
> > The new layout is mostly a bow towards mirror admins, for some of whom
> > having 60000+ files in a single directory has been a problem.
> > However, I suppose some of you also found e.g. the directory index
> > hardly usable due to its size.
> >
> > Throughout a transitional period (whose exact length hasn't been decided
> > yet), both layouts will be available.  Afterwards, the old layout will
> > be removed from mirrors.  This has a few implications:
> >
> > 1. Users who don't upgrade their package managers in time will lose
> > the ability to fetch from Gentoo mirrors.  This shouldn't be that
> > much of a problem given that the core software needed to upgrade Portage
> > should all have reliable upstream SRC_URIs.
> >
> > 2. mirror://gentoo/file URIs will stop working.  While technically you
> > could use mirror://gentoo/XX/file, I'd rather recommend finally
> > discarding its usage and moving distfiles to devspace.
> >
> > 3. Directly fetching files from distfiles.gentoo.org will become
> > a little harder.  To fetch a distfile named 'foo-1.tar.gz', you'd have
> > to use something like:
> >
> >   $ printf '%s' foo-1.tar.gz | b2sum | cut -c1-2
> >   1b
> >   $ wget http://distfiles.gentoo.org/distfiles/1b/foo-1.tar.gz
> >   ...
> >
> >
> > Alternatively, you can:
> >
> >   $ wget http://distfiles.gentoo.org/distfiles/INDEX
> >
> > and grep for the right path there.
> > This INDEX is also a more lightweight alternative to HTML indexes
> > generated by the servers.
> >
> >
> > If you're interested in more background details and some plots, see [1].
> >
> > [1] https://dev.gentoo.org/~mgorny/articles/improving-distfile-mirror-structure.html
> >
>
> So the answer I didn't really see directly stated here is, where do new
> distfiles need to go //now//?  E.g., if on woodpecker, I currently cp a
> distfile to /space/distfiles-local.  What is the new directory I need to
> use?  And if mirror://gentoo/${FOO} is going away, for the new distfiles
> target, what would be the applicable prefix to use?
>
> Directly using devspace seems like a bad idea, IMHO.  Once long ago, we all
> got chastised for doing exactly that.  Too much possibility of fragmentation
> as devs retire or package maintainership changes hands.

Today you get chastised for using /space/distfiles-local and for not
following policy changes.  The devmanual has marked it as deprecated
since at least 2011, and points to d.g.o (dev.gentoo.org) as the place
to use instead [1].

> I looked at the whitepaper'ish-like writeup, and I kinda don't like using a
> hash-based naming scheme on the new distfiles layout.  I really kind of prefer
> breaking the directories up based on the first letter of the distfiles in
> question, factoring case-sensitivity in (so you'd have 52 top-level
> directories for A-Z and a-z, plus 10 more for 0-9).  Under each of those
> directories, additional subdirectories for the next few letters (say,
> letters 2-3).  Yes, this leads to some orphan cases where a distfile might
> live on its own, but from a direct navigation standpoint, it's easy to find
> for someone browsing the distfiles server and easy to predict where a
> distfile is at.
>
> No math, statistical analysis, or deep-rooted knowledge of filesystems
> behind that paragraph.  Just a plain old unfiltered opinion.
> Sometimes, I need to go get a distfile off the Gentoo mirrors, and being
> able to quickly find it in the mirror root is great.  Having to do hash
> calculations to work out the file path will be *really* annoying.

Your solution still doesn't solve the problem of having 8k-24k files in
a single directory, even with a 7-letter prefix.  It just creates a lot
of tiny directories, which is noise for no practical gain.

[1] https://devmanual.gentoo.org/general-concepts/mirrors/index.html#suitable-download-hosts

-- 
Best regards,
Michał Górny

--=-7oclxBSTHPW5KY055pas
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----

iQGTBAABCgB9FiEEx2qEUJQJjSjMiybFY5ra4jKeJA4FAl2sA/RfFIAAAAAALgAo
aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldEM3
NkE4NDUwOTQwOThEMjhDQzhCMjZDNTYzOUFEQUUyMzI5RTI0MEUACgkQY5ra4jKe
JA65GggAkRU4c7UdzFtgGzstpKtjdndXQXJGKWAKaDPESDdJLeHhMDypTDLkjIzh
Q780qIW3UR5B7xRNqfRS+xfKsVnbsp7dWot5n5YfM/nItHqKDd13AgCRYOx9y59y
tpgsHawwZxvsPjI2nHjDFt0+IU6owPpg4Kw8C1F5vG+YaAAEHdU2MFDmFoSiQNBO
peSmTwDzq/b5V9Mo4G6tbKwjToi7g4wl3FjAk4r9B21ECGy0gzB0Ryhko6bkCUeD
1/oEWyf+hvi9lZ49MOd2eiqZQEgZ7DalYEgAW24+N9SD/5OaHM7xYxhSx3eGgRMT
KLgXFrSL+cpZhmIRov+OWaV8rfFBIw==
=tdOi
-----END PGP SIGNATURE-----

--=-7oclxBSTHPW5KY055pas--