* [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-12 12:56 Michał Górny
From: Michał Górny @ 2025-01-12 12:56 UTC
To: gentoo-dev; +Cc: Michał Górny

Emit a QA warning suggesting the use of a crate tarball when the package
in question uses 300 crates or more.  Such long crate lists cause
ebuilds and Manifests to grow very fast, causing significant space
consumption on end user systems (including users who are not using
the package in question) and git history growth.  On top of that,
fetching that many crates takes significant time.

The number of 300 is pretty arbitrary, chosen approximately to match
Manifests that are over 100 KiB in size.  We should probably look into
lowering it in the future, as more packages are transitioned.
---
 eclass/cargo.eclass | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass
index b1285e13a5b2..c8dd7c51bcfe 100644
--- a/eclass/cargo.eclass
+++ b/eclass/cargo.eclass
@@ -527,6 +527,12 @@ cargo_src_unpack() {
 		done < <(sha256sum -z "${crates[@]}" || die)
 
 		popd >/dev/null || die
+
+		if [[ ${#crates[@]} -ge 300 ]]; then
+			eqawarn "This package uses a very large number of CRATES. Please provide"
+			eqawarn "a crate tarball instead and fetch it via SRC_URI. You can use"
+			eqawarn "'pycargoebuild --crate-tarball' to create one."
+		fi
 	fi
 
 	cargo_gen_config
-- 
2.48.0
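For maintainers who want to know ahead of time whether a package would trip the proposed warning, the check can be approximated outside the eclass. The following is only a rough sketch in POSIX sh: `count_crates` and `exceeds_crate_limit` are hypothetical helper names that do not exist in cargo.eclass, and it assumes the crate list is available as a whitespace-separated string in the style of the ebuild's CRATES variable. The default of 300 is taken from the patch above.

```sh
# Hypothetical helpers, not part of cargo.eclass.

# Print the number of entries in a whitespace-separated crate list.
# Runs in a subshell so that "set -f" and "set --" do not leak out.
count_crates() (
	set -f          # disable globbing while the list is word-split
	set -- $1       # intentional unquoted expansion: split on whitespace
	echo "$#"
)

# Succeed when the list has at least LIMIT entries (default 300,
# the threshold proposed in the patch).
exceeds_crate_limit() {
	[ "$(count_crates "$1")" -ge "${2:-300}" ]
}
```

A possible use would be `exceeds_crate_limit "${CRATES}" && echo "consider a crate tarball"`, keeping in mind that the eclass itself counts the parsed crate file names rather than the raw string.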
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-12 13:15 Agostino Sarubbo
From: Agostino Sarubbo @ 2025-01-12 13:15 UTC
To: gentoo-dev; +Cc: Michał Górny

On Sunday, 12 January 2025 13:56:39 CET, Michał Górny wrote:
> +		if [[ ${#crates[@]} -ge 300 ]]; then
> +			eqawarn "This package uses a very large number of CRATES. Please provide"
> +			eqawarn "a crate tarball instead and fetch it via SRC_URI. You can use"
> +			eqawarn "'pycargoebuild --crate-tarball' to create one."
> +		fi

I would suggest using the "QA Notice: " prefix if you want these
warnings to be reported.

Agostino
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-12 14:30 Alexey Sokolov
From: Alexey Sokolov @ 2025-01-12 14:30 UTC
To: gentoo-dev

On 12.01.2025 13:15, Agostino Sarubbo wrote:
> On Sunday, 12 January 2025 13:56:39 CET, Michał Górny wrote:
>> +		if [[ ${#crates[@]} -ge 300 ]]; then
>> +			eqawarn "This package uses a very large number of CRATES. Please provide"
>> +			eqawarn "a crate tarball instead and fetch it via SRC_URI. You can use"
>> +			eqawarn "'pycargoebuild --crate-tarball' to create one."
>> +		fi
>
> I would like to suggest to use "QA Notice: " prefix if you want to
> have them reported.
>
> Agostino

Side question: maybe eqawarn should add such a prefix automatically?
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-12 21:20 Ionen Wolkens
From: Ionen Wolkens @ 2025-01-12 21:20 UTC
To: gentoo-dev

On Sun, Jan 12, 2025 at 02:30:10PM +0000, Alexey Sokolov wrote:
> On 12.01.2025 13:15, Agostino Sarubbo wrote:
> > On Sunday, 12 January 2025 13:56:39 CET, Michał Górny wrote:
> >> +		if [[ ${#crates[@]} -ge 300 ]]; then
> >> +			eqawarn "This package uses a very large number of CRATES. Please provide"
> >> +			eqawarn "a crate tarball instead and fetch it via SRC_URI. You can use"
> >> +			eqawarn "'pycargoebuild --crate-tarball' to create one."
> >> +		fi
> >
> > I would like to suggest to use "QA Notice: " prefix if you want to
> > have them reported.
> >
> > Agostino
>
> Side question: maybe eqawarn should add such prefix automatically?

In the context of automatically filing bugs, sometimes we also want
to warn about low-priority things (e.g. either just something to be
aware of, or something to ideally fix on a bump when one happens to
see the warning) without filing a hundred bugs.

So the question is more whether we want this to happen here or not,
and put pressure on maintainers (incl. proxied) to fix it asap.

From a technical standpoint, eqawarn would need to know when it's the
"header" of a notice (like optfeature_header), given that we often
have several eqawarn calls in a row and "QA Notice:" on each line
would be weird. This means needing to modify all usages of it anyway,
which doesn't bring much vs. just inlining it, unless we wanted to do
something more special with this.
-- 
ionen
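To make the "header" point above concrete, here is a minimal sketch of what a wrapper could look like, assuming one wanted the prefix on the first line only. `eqawarn_notice` is a hypothetical name and does not exist in any eclass, and the `eqawarn` defined here is merely a stand-in so the snippet runs outside an ebuild environment. The real function is provided by the package manager or eclasses.

```sh
# Stand-in for the real eqawarn so the sketch is runnable on its own;
# it simply prints the message to stderr.
eqawarn() { printf ' * %s\n' "$*" >&2; }

# Hypothetical wrapper (not an existing eclass function): treat the
# first argument as the notice header and prefix only that line with
# "QA Notice: "; remaining arguments are emitted unchanged.
eqawarn_notice() {
	[ "$#" -gt 0 ] || return 0
	eqawarn "QA Notice: $1"
	shift
	for line in "$@"; do
		eqawarn "$line"
	done
}
```

With such a wrapper, a multi-line warning would be a single call with one argument per line. As noted in the message, though, every existing caller would still have to be changed to use it.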
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-13 9:40 Florian Schmaus
From: Florian Schmaus @ 2025-01-13 9:40 UTC
To: gentoo-dev, Michał Górny

On 12/01/2025 13.56, Michał Górny wrote:
> Emit a QA warning suggesting the use of crate tarball, when the package
> in question uses 300 crates or more.  Such a long crate lists cause
> ebuilds and Manifests to grow very fast, causing significant space
> consumption on end user systems (including users who are not using
> the package in question) and git history growth.  On top of that,
> fetching that many crates takes significant time.
>
> The number of 300 is pretty arbitrary, chosen approximately to match
> Manifests that are over 100 KiB in size.  We should probably look into
> lowering in the future, as more packages are transitioned.

Thanks for your proposal. I know you wrote it because Gentoo is
important to you.

I am sorry, however, but the arbitrary limit you propose is harmful,
and its necessity is questionable.

It is unnecessary, at least in its current form, because the size
growth of Gentoo's package repository is manageable. See the previous
analysis for EGO_SUM [1].

What is more worrisome, however, is that it is harmful.

First, switching from individual crates to a single crate tarball
prevents inter-package crate archive reuse. Often, users will already
have the required crates downloaded because another installed package
used them. With an artificial crate count limit, users must download
rather large crate tarballs, causing unnecessary traffic and increasing
disk space use on Gentoo's mirrors and end-user systems. The crate
tarballs quickly eat away the disk space saved in the ebuild
repository.

Even worse, crate tarballs negatively impact the security of Gentoo
users, as they make it harder to audit ebuilds, and third-party crate
tarballs add a further distinct party that can inject malicious code.
Considering the recent supply chain attacks, this alone is a
show-stopper.

Why is this warning suddenly necessary? Did a user run into an issue
caused by more than 300 entries?

- Flow

1: https://public-inbox.gentoo.org/gentoo-dev/6ed0f286-f9eb-9e93-4fec-296646f79871@gentoo.org/
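The reuse argument above could in principle be checked per pair of packages. As a rough sketch only, assuming two whitespace-separated crate lists in name@version form (as in an ebuild's CRATES variable), the hypothetical helper `shared_crates` below prints the entries common to both. It is not an existing tool, and it says nothing about how often such sharing occurs in practice.

```sh
# Hypothetical helper: print the crates (name@version) that appear in
# both whitespace-separated lists, one per line, sorted in the C locale.
shared_crates() (
	a=$(mktemp) && b=$(mktemp) || exit 1
	# Put one entry per line, drop empty lines, sort and de-duplicate.
	printf '%s\n' "$1" | tr -s '[:space:]' '\n' | sed '/^$/d' | LC_ALL=C sort -u >"$a"
	printf '%s\n' "$2" | tr -s '[:space:]' '\n' | sed '/^$/d' | LC_ALL=C sort -u >"$b"
	# comm -12 keeps only the lines present in both sorted inputs.
	LC_ALL=C comm -12 "$a" "$b"
	rm -f "$a" "$b"
)
```

For example, `shared_crates "$crates_of_pkg_a" "$crates_of_pkg_b" | wc -l` would give the number of crate archives that a second package could reuse from distfiles. The two variable names are placeholders for lists extracted from the respective ebuilds.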
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-13 13:23 orbea
From: orbea @ 2025-01-13 13:23 UTC
To: gentoo-dev

On Mon, 13 Jan 2025 10:40:30 +0100
Florian Schmaus <flow@gentoo.org> wrote:

> On 12/01/2025 13.56, Michał Górny wrote:
> > Emit a QA warning suggesting the use of crate tarball, when the
> > package in question uses 300 crates or more.  Such a long crate
> > lists cause ebuilds and Manifests to grow very fast, causing
> > significant space consumption on end user systems (including users
> > who are not using the package in question) and git history growth.
> > On top of that, fetching that many crates takes significant time.
> >
> > The number of 300 is pretty arbitrary, chosen approximately to match
> > Manifests that are over 100 KiB in size.  We should probably look
> > into lowering in the future, as more packages are transitioned.
>
> Thanks for your proposal. I know you wrote it because Gentoo is
> important to you.
>
> I am sorry, however, but the arbitrary limit you propose is harmful,
> and its necessity is questionable.

It's worth pointing out that this is already being done in Gentoo; see
dev-util/maturin for one example.

>
> It is unnecessary, at least in its current form, because the size
> growth of Gentoo's package repository is manageable. See the previous
> analysis for EGO_SUM [1].
>
> What is more worrisome, however, is that it is harmful.
>
> First, switching from individual crates to a single crate tarball
> disallows inter-package crate archive reuse. Often, users will
> already have the required crates downloaded because another installed
> package used them. With an artificial crate count limit, users must
> download rather large crate tarballs, causing unnecessary traffic and
> increasing the disk space on Gentoo's mirrors and end-user systems.
> The crate tarballs quickly eat away the saved disk space in the
> ebuild repository.
>
> Even worse, crate tarballs negatively impact the security of Gentoo
> users as they make it harder to audit ebuilds, and third-party crate
> tarballs add a further distinct party that can inject malicious code.
> Considering the recent supply chain attacks, this alone is a
> show-stopper.
>
> Why is this warning suddenly necessary? Did a user run into an issue
> caused by more than 300 entries?
>
> - Flow
>
> 1:
> https://public-inbox.gentoo.org/gentoo-dev/6ed0f286-f9eb-9e93-4fec-296646f79871@gentoo.org/
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-13 16:10 Ionen Wolkens
From: Ionen Wolkens @ 2025-01-13 16:10 UTC
To: gentoo-dev

On Mon, Jan 13, 2025 at 05:23:54AM -0800, orbea wrote:
> On Mon, 13 Jan 2025 10:40:30 +0100
> Florian Schmaus <flow@gentoo.org> wrote:
>
> > On 12/01/2025 13.56, Michał Górny wrote:
> > > Emit a QA warning suggesting the use of crate tarball, when the
> > > package in question uses 300 crates or more.  Such a long crate
> > > lists cause ebuilds and Manifests to grow very fast, causing
> > > significant space consumption on end user systems (including users
> > > who are not using the package in question) and git history growth.
> > > On top of that, fetching that many crates takes significant time.
> > >
> > > The number of 300 is pretty arbitrary, chosen approximately to match
> > > Manifests that are over 100 KiB in size.  We should probably look
> > > into lowering in the future, as more packages are transitioned.
> >
> > Thanks for your proposal. I know you wrote it because Gentoo is
> > important to you.
> >
> > I am sorry, however, but the arbitrary limit you propose is harmful,
> > and its necessity is questionable.
>
> Its worth pointing out that is already being done in Gentoo, see
> dev-util/maturin for one example.

FTR, this is something I was planning to do either way, but I kept
procrastinating given that the package needs special handling for
crates used by tests (it builds separate Rust packages for its tests,
each with its own crates).

This just prompted me to finally have a look before a potential
warning hits.
-- 
ionen
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-13 13:36 Michał Górny
From: Michał Górny @ 2025-01-13 13:36 UTC
To: gentoo-dev

On Mon, 2025-01-13 at 10:40 +0100, Florian Schmaus wrote:
> First, switching from individual crates to a single crate tarball
> disallows inter-package crate archive reuse. Often, users will already
> have the required crates downloaded because another installed package
> used them. With an artificial crate count limit, users must download
> rather large crate tarballs, causing unnecessary traffic and increasing
> the disk space on Gentoo's mirrors and end-user systems. The crate
> tarballs quickly eat away the saved disk space in the ebuild repository.

I'm sure you've also done a thorough analysis of how much crate reuse
actually happens, as well as of the impact of adding thousands of tiny
files to Gentoo mirrors, the inefficiency of fetching them one by one,
and especially how badly crates.io actually handles that.

I'm also sure you've done a thorough analysis of actual disk space use
that also takes into consideration the space wasted by thousands of
tiny, inefficiently compressed files, compared to crate tarballs that
benefit both from a much stronger compression algorithm and from
the opportunity to process much larger data blocks.

> Even worse, crate tarballs negatively impact the security of Gentoo
> users as they make it harder to audit ebuilds, and third-party crate
> tarballs add a further distinct party that can inject malicious code.
> Considering the recent supply chain attacks, this alone is a show-stopper.

`cargo audit` does not care about how crates are delivered to Gentoo
systems.

> Why is this warning suddenly necessary? Did a user run into an issue
> caused by more than 300 entries?

It is not "sudden".  It is an ongoing effort.

-- 
Best regards,
Michał Górny
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-14 16:56 Florian Schmaus
From: Florian Schmaus @ 2025-01-14 16:56 UTC
To: gentoo-dev, Michał Górny

On 13/01/2025 14.36, Michał Górny wrote:
> On Mon, 2025-01-13 at 10:40 +0100, Florian Schmaus wrote:
>> First, switching from individual crates to a single crate tarball
>> disallows inter-package crate archive reuse. Often, users will already
>> have the required crates downloaded because another installed package
>> used them. With an artificial crate count limit, users must download
>> rather large crate tarballs, causing unnecessary traffic and increasing
>> the disk space on Gentoo's mirrors and end-user systems. The crate
>> tarballs quickly eat away the saved disk space in the ebuild repository.
>
> I'm sure you've also done a thorough analysis on how much crate reuse
> actually happens, as well as of the impact of adding thousands of tiny
> files to Gentoo mirrors, the inefficiency of fetching them one by one,
> and especially how badly crates.io actually handles that.
>
> I'm also sure you've done a thorough analysis of actual disk space use,
> that also takes into consideration the space wasted by thousands of
> tiny, inefficiently compressed files, compared to crate tarballs that
> benefit both from much stronger compression algorithm, as well
> as the opportunity to process much larger data blocks.

If you have numbers backing up the claimed adverse effects, please share
them. I have demonstrated my calculations regarding ::gentoo size growth
and its negligible effect.

I do not think I should be the one to prove that your change is
required. That is the responsibility of the person suggesting the
change.

>> Even worse, crate tarballs negatively impact the security of Gentoo
>> users as they make it harder to audit ebuilds, and third-party crate
>> tarballs add a further distinct party that can inject malicious code.
>> Considering the recent supply chain attacks, this alone is a show-stopper.
>
> `cargo audit` does not care about how crates are delivered to Gentoo
> systems.

By "auditing" I was referring to detecting malicious modifications.
What 'cargo audit' does is unrelated to this.

>> Why is this warning suddenly necessary? Did a user run into an issue
>> caused by more than 300 entries?
>
> It is not "sudden".  It is an ongoing effort.

It certainly feels all of a sudden to me. At least, as far as I
understand, there is no trigger event or similar. I am sorry, but
instead, it appears that you have decided that today is the day when we
need this.

- Flow
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-14 17:43 Michał Górny
From: Michał Górny @ 2025-01-14 17:43 UTC
To: gentoo-dev

On Tue, 2025-01-14 at 17:56 +0100, Florian Schmaus wrote:
>
> It certainly feels like all of a sudden to me. At least, as far as I
> understand, there is no trigger event or similar. I am sorry, but
> instead, it appears that you have decided that today is the day when we
> need this.

I know it's hard to imagine but some of us aren't paid to work on
Gentoo, and have to earn our living + deal with other responsibilities,
so we do things when we find time to do them.

-- 
Best regards,
Michał Górny
end of thread, other threads:[~2025-01-14 17:43 UTC | newest]

Thread overview: 10+ messages
2025-01-12 12:56 [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates Michał Górny
2025-01-12 13:15 ` Agostino Sarubbo
2025-01-12 14:30   ` Alexey Sokolov
2025-01-12 21:20     ` Ionen Wolkens
2025-01-13  9:40 ` Florian Schmaus
2025-01-13 13:23   ` orbea
2025-01-13 16:10     ` Ionen Wolkens
2025-01-13 13:36   ` Michał Górny
2025-01-14 16:56     ` Florian Schmaus
2025-01-14 17:43       ` Michał Górny