From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
        (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
        (No client certificate requested)
        by finch.gentoo.org (Postfix) with ESMTPS id 12D51138334
        for ; Mon, 21 Oct 2019 00:05:48 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
        by pigeon.gentoo.org (Postfix) with SMTP id C8B9DE0D04;
        Mon, 21 Oct 2019 00:05:44 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [IPv6:2001:470:ea4a:1:5054:ff:fec7:86e4])
        (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
        (No client certificate requested)
        by pigeon.gentoo.org (Postfix) with ESMTPS id 6FA58E0CE2
        for ; Mon, 21 Oct 2019 00:05:44 +0000 (UTC)
Received: from [192.168.1.13] (c-76-114-240-162.hsd1.md.comcast.net [76.114.240.162])
        (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
        (No client certificate requested)
        (Authenticated sender: kumba)
        by smtp.gentoo.org (Postfix) with ESMTPSA id 2EB8934C0CA
        for ; Mon, 21 Oct 2019 00:05:43 +0000 (UTC)
Subject: Re: [gentoo-dev] New distfile mirror layout
To: gentoo-dev@lists.gentoo.org
References: <4c7465824f1fb69924c826f6bbe3ee73afa08ec8.camel@gentoo.org>
        <2d15507e-98ad-9466-75b7-7e8268ef2eb9@gentoo.org>
        <752be6c75f337df8ee8124a804247d2fb27e73b4.camel@gentoo.org>
        <100ae6ba-fdd3-b697-0ccc-860c9b8e4521@gentoo.org>
        <01086c53bfbf7702dac10b75a25927b62ef90b53.camel@gentoo.org>
From: Joshua Kinard
Openpgp: preference=signencrypt
Message-ID:
Date: Sun, 20 Oct 2019 20:05:40 -0400
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Archives-Salt: b5c1a8b1-1b74-447b-9240-9415d7ba5e27
X-Archives-Hash: 2614af3b836f70962d01fcc8cd7c9d4d

On 10/20/2019 16:57, Joshua Kinard wrote:
> On 10/20/2019 05:44, Michal Górny wrote:
>> On Sun, 2019-10-20 at 05:21 -0400, Joshua Kinard wrote:
>>> On 10/20/2019 04:32, Michal Górny wrote:

[snip]

>> You believe it to be a problem. Don't expect others to bother upstream
>> with your preferences.
>
> Hah. So you consider texlive having 16k+ distfiles to be completely within
> operating norms then?
>
> I did a quick look, and it looks like the TeX project has a fairly
> comprehensive mirroring system distributed around the world. In fact, it
> looks like they emulate Perl's CPAN system with "CTAN":
>
> https://ctan.org/
>
> I don't know the history of the texlive and other associated tex packages in
> Gentoo, but my guess is instead of doing what our Perl packages do, someone
> just decided to mirror the CTAN archive directly on the Gentoo distfiles
> system. It seems to me that what should actually happen is that we leverage
> CTAN itself, much like CPAN, and use their mirroring system instead of
> burdening our infrastructure as an unofficial CTAN archive.
>
> I know we've got a ton of Perl packages for the core set of Perl modules,
> but doesn't the CPAN eclass also have the capability to auto-generate an
> ebuild package for virtually any Perl package distributed via CPAN? Can
> that logic be used with the CTAN system in its own eclass and then we remove
> the 16k+ texlive modules off of our mirrors completely? Or at the worst, we
> might just have to generate ebuilds for texlive modules and treat them as
> discrete, installed packages.
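The texlive-module.eclass loops quoted further down in this message add one
mirror://gentoo URI to SRC_URI per module name. To make that concrete, here is
a minimal standalone sketch of the same expansion. It is not the eclass itself:
the module names, the PV value, and the PKGEXT value are made up for
illustration, and the build_src_uri helper is a hypothetical wrapper that exists
only in this sketch.

```shell
#!/bin/sh
# Standalone sketch of the per-module SRC_URI expansion.
# Assumptions (not taken from the real ebuild): the module names below,
# PV=2019 and PKGEXT=tar.xz are illustrative values only.

PV="2019"
PKGEXT="tar.xz"
TEXLIVE_MODULE_CONTENTS="achemso altfont"
TEXLIVE_MODULE_DOC_CONTENTS="achemso.doc altfont.doc"
TEXLIVE_MODULE_SRC_CONTENTS="achemso.source"

# Hypothetical helper: prints the SRC_URI string the three loops would build.
build_src_uri() {
    uri=""
    for i in ${TEXLIVE_MODULE_CONTENTS}; do
        uri="${uri} mirror://gentoo/texlive-module-${i}-${PV}.${PKGEXT}"
    done
    if [ -n "${TEXLIVE_MODULE_DOC_CONTENTS}" ]; then
        uri="${uri} doc? ("
        for i in ${TEXLIVE_MODULE_DOC_CONTENTS}; do
            uri="${uri} mirror://gentoo/texlive-module-${i}-${PV}.${PKGEXT}"
        done
        uri="${uri} )"
    fi
    if [ -n "${TEXLIVE_MODULE_SRC_CONTENTS}" ]; then
        uri="${uri} source? ("
        for i in ${TEXLIVE_MODULE_SRC_CONTENTS}; do
            uri="${uri} mirror://gentoo/texlive-module-${i}-${PV}.${PKGEXT}"
        done
        uri="${uri} )"
    fi
    # Drop the single leading space left by the first append.
    printf '%s\n' "${uri# }"
}

SRC_URI="$(build_src_uri)"

# Count one distfile per mirror:// token. Globbing is switched off so that
# the "doc?" and "source?" tokens are not treated as filename patterns.
set -f
distfile_count=0
for token in ${SRC_URI}; do
    case "${token}" in
        mirror://*) distfile_count=$((distfile_count + 1)) ;;
    esac
done
set +f

echo "${distfile_count} distfiles"
```

With the five made-up names above this prints "5 distfiles". Applying the same
one-URI-per-name rule to the counts given below for
texlive-latexextra-2019-r2.ebuild yields 1,241 + 1,227 + 745 = 3,213 distfiles
for that single ebuild when both the doc and source USE flags are enabled.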

So looking at texlive-latexextra-2019-r2.ebuild, it defines three variables:
- TEXLIVE_MODULE_CONTENTS, with 1,241 space-delimited module names
- TEXLIVE_MODULE_DOC_CONTENTS, with 1,227 space-delimited doc names
- TEXLIVE_MODULE_SRC_CONTENTS, with 745 space-delimited src names

Then, in texlive-module.eclass, there are these loops:

    for i in ${TEXLIVE_MODULE_CONTENTS}; do
        SRC_URI="${SRC_URI} mirror://gentoo/texlive-module-${i}-${PV}.${PKGEXT}"
    done

    # Forge doc SRC_URI
    [ -n "${TEXLIVE_MODULE_DOC_CONTENTS}" ] && SRC_URI="${SRC_URI} doc? ("
    for i in ${TEXLIVE_MODULE_DOC_CONTENTS}; do
        SRC_URI="${SRC_URI} mirror://gentoo/texlive-module-${i}-${PV}.${PKGEXT}"
    done
    [ -n "${TEXLIVE_MODULE_DOC_CONTENTS}" ] && SRC_URI="${SRC_URI} )"

    # Forge source SRC_URI
    if [ -n "${TEXLIVE_MODULE_SRC_CONTENTS}" ] ; then
        SRC_URI="${SRC_URI} source? ("
        for i in ${TEXLIVE_MODULE_SRC_CONTENTS}; do
            SRC_URI="${SRC_URI} mirror://gentoo/texlive-module-${i}-${PV}.${PKGEXT}"
        done
        SRC_URI="${SRC_URI} )"
    fi

I think this is definitely an issue with how this package lays out its
needed distfiles. At a minimum, it really should be leveraging the CTAN
system to fetch all of the needed distfiles so we can get them off of our
distfiles mirror. Then it would be interesting to re-run the math on the
distfiles distribution using the different schemes highlighted in the
GLEP-75 paper.

Longer-term, I think this entire approach should be revisited by the TeX
team to make it behave more like Perl or Python packages by having discrete
ebuilds for these modules. That's not exactly a small undertaking, but the
current approach feels very kludgy in its design and is probably asking for
trouble.

I looked at several of the modules on CTAN, and they each have their own
version and even have different licenses. E.g.,
- altfont is licensed under "GNU General Public License" (version ??)
- achemso is licensed under "The LaTeX Project Public License 1.3c"
- arraysort is licensed under "The LaTeX Project Public License 1.2"
- amsfonts is licensed under "The SIL Open Font License"
- a0poster is licensed under "The LaTeX Project Public License" (ver ??)
- arydshln is licensed under "The LaTeX Project Public License 1"
- aurl is licensed under "Public Domain Software"

That's just a random selection from the 'a' category. Do we have copies of
those licenses in the tree? Do they allow redistribution of the distfiles?
For the users that want "free" software, do any of the licenses in any of
the TeX modules put up any disagreeable restrictions? Etc...

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic