* [gentoo-dev] [PATCH] cargo.eclass: Optimize crate unpacking @ 2024-05-12 2:26 Michał Górny 2024-05-12 2:44 ` Sam James 2024-05-12 17:22 ` Florian Schmaus 0 siblings, 2 replies; 7+ messages in thread From: Michał Górny @ 2024-05-12 2:26 UTC (permalink / raw) To: gentoo-dev; +Cc: Michał Górny Unpack crates in parallel using xargs to utilize multicore systems better. Perform checksumming via a single sha256sum invocation. For dev-python/watchfiles, this speeds up unpacking on my machine from 2.6 s to 0.75 s (warm cache). Signed-off-by: Michał Górny <mgorny@gentoo.org> --- eclass/cargo.eclass | 56 ++++++++++++++++++++++++++------------------- 1 file changed, 33 insertions(+), 23 deletions(-) diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass index 0f2da982f60c..5a16d3a30528 100644 --- a/eclass/cargo.eclass +++ b/eclass/cargo.eclass @@ -329,40 +329,50 @@ _cargo_gen_git_config() { cargo_src_unpack() { debug-print-function ${FUNCNAME} "$@" - mkdir -p "${ECARGO_VENDOR}" || die - mkdir -p "${S}" || die + mkdir -p "${ECARGO_VENDOR}" "${S}" || die local archive shasum pkg + local crates=() for archive in ${A}; do case "${archive}" in *.crate) - # when called by pkgdiff-mg, do not unpack crates - [[ ${PKGBUMPING} == ${PVR} ]] && continue - - ebegin "Loading ${archive} into Cargo registry" - tar -xf "${DISTDIR}"/${archive} -C "${ECARGO_VENDOR}/" || die - # generate sha256sum of the crate itself as cargo needs this - shasum=$(sha256sum "${DISTDIR}"/${archive} | cut -d ' ' -f 1) - pkg=$(basename ${archive} .crate) - cat <<- EOF > ${ECARGO_VENDOR}/${pkg}/.cargo-checksum.json - { - "package": "${shasum}", - "files": {} - } - EOF - # if this is our target package we need it in ${WORKDIR} too - # to make ${S} (and handle any revisions too) - if [[ ${P} == ${pkg}* ]]; then - tar -xf "${DISTDIR}"/${archive} -C "${WORKDIR}" || die - fi - eend $? + crates+=( "${archive}" ) ;; *) - unpack ${archive} + unpack "${archive}" ;; esac done + if [[ ${PKGBUMPING} != ${PVR} ]]; then + pushd "${DISTDIR}" >/dev/null || die + + ebegin "Unpacking crates" + printf '%s\0' "${crates[@]}" | + xargs -0 -P "$(makeopts_jobs)" -n 1 -- \ + tar -x -C "${ECARGO_VENDOR}" -f + assert + eend $? + + while read -d '' -r shasum archive; do + pkg=${archive%.crate} + cat <<- EOF > ${ECARGO_VENDOR}/${pkg}/.cargo-checksum.json || die + { + "package": "${shasum}", + "files": {} + } + EOF + + # if this is our target package we need it in ${WORKDIR} too + # to make ${S} (and handle any revisions too) + if [[ ${P} == ${pkg}* ]]; then + tar -xf "${archive}" -C "${WORKDIR}" || die + fi + done < <(sha256sum -z "${crates[@]}" || die) + + popd >/dev/null || die + fi + cargo_gen_config } -- 2.45.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [gentoo-dev] [PATCH] cargo.eclass: Optimize crate unpacking 2024-05-12 2:26 [gentoo-dev] [PATCH] cargo.eclass: Optimize crate unpacking Michał Górny @ 2024-05-12 2:44 ` Sam James 2024-05-12 17:22 ` Florian Schmaus 1 sibling, 0 replies; 7+ messages in thread From: Sam James @ 2024-05-12 2:44 UTC (permalink / raw) To: Michał Górny; +Cc: gentoo-dev Michał Górny <mgorny@gentoo.org> writes: > Unpack crates in parallel using xargs to utilize multicore systems > better. Perform checksumming via a single sha256sum invocation. > > For dev-python/watchfiles, this speeds up unpacking on my machine > from 2.6 s to 0.75 s (warm cache). > > Signed-off-by: Michał Górny <mgorny@gentoo.org> For completeness (acked on PR), lgtm. I assume tested by building all cargo inheritees (nearly said 'heirs' but.. lol)? Give a chance for others to look though. Also, thank you! > --- > eclass/cargo.eclass | 56 ++++++++++++++++++++++++++------------------- > 1 file changed, 33 insertions(+), 23 deletions(-) > > diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass > index 0f2da982f60c..5a16d3a30528 100644 > --- a/eclass/cargo.eclass > +++ b/eclass/cargo.eclass > @@ -329,40 +329,50 @@ _cargo_gen_git_config() { > cargo_src_unpack() { > debug-print-function ${FUNCNAME} "$@" > > - mkdir -p "${ECARGO_VENDOR}" || die > - mkdir -p "${S}" || die > + mkdir -p "${ECARGO_VENDOR}" "${S}" || die > > local archive shasum pkg > + local crates=() > for archive in ${A}; do > case "${archive}" in > *.crate) > - # when called by pkgdiff-mg, do not unpack crates > - [[ ${PKGBUMPING} == ${PVR} ]] && continue > - > - ebegin "Loading ${archive} into Cargo registry" > - tar -xf "${DISTDIR}"/${archive} -C "${ECARGO_VENDOR}/" || die > - # generate sha256sum of the crate itself as cargo needs this > - shasum=$(sha256sum "${DISTDIR}"/${archive} | cut -d ' ' -f 1) > - pkg=$(basename ${archive} .crate) > - cat <<- EOF > ${ECARGO_VENDOR}/${pkg}/.cargo-checksum.json > - { > - "package": "${shasum}", > - "files": {} > - } > - EOF > - # if this is our target package we need it in ${WORKDIR} too > - # to make ${S} (and handle any revisions too) > - if [[ ${P} == ${pkg}* ]]; then > - tar -xf "${DISTDIR}"/${archive} -C "${WORKDIR}" || die > - fi > - eend $? > + crates+=( "${archive}" ) > ;; > *) > - unpack ${archive} > + unpack "${archive}" > ;; > esac > done > > + if [[ ${PKGBUMPING} != ${PVR} ]]; then > + pushd "${DISTDIR}" >/dev/null || die > + > + ebegin "Unpacking crates" > + printf '%s\0' "${crates[@]}" | > + xargs -0 -P "$(makeopts_jobs)" -n 1 -- \ > + tar -x -C "${ECARGO_VENDOR}" -f > + assert > + eend $? > + > + while read -d '' -r shasum archive; do > + pkg=${archive%.crate} > + cat <<- EOF > ${ECARGO_VENDOR}/${pkg}/.cargo-checksum.json || die > + { > + "package": "${shasum}", > + "files": {} > + } > + EOF > + > + # if this is our target package we need it in ${WORKDIR} too > + # to make ${S} (and handle any revisions too) > + if [[ ${P} == ${pkg}* ]]; then > + tar -xf "${archive}" -C "${WORKDIR}" || die > + fi > + done < <(sha256sum -z "${crates[@]}" || die) > + > + popd >/dev/null || die > + fi > + > cargo_gen_config > } ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-dev] [PATCH] cargo.eclass: Optimize crate unpacking 2024-05-12 2:26 [gentoo-dev] [PATCH] cargo.eclass: Optimize crate unpacking Michał Górny 2024-05-12 2:44 ` Sam James @ 2024-05-12 17:22 ` Florian Schmaus 2024-05-12 18:21 ` Michał Górny 1 sibling, 1 reply; 7+ messages in thread From: Florian Schmaus @ 2024-05-12 17:22 UTC (permalink / raw) To: gentoo-dev, Michał Górny [-- Attachment #1.1.1: Type: text/plain, Size: 3256 bytes --] On 12/05/2024 04.26, Michał Górny wrote: > Unpack crates in parallel using xargs to utilize multicore systems > better. Perform checksumming via a single sha256sum invocation. > > For dev-python/watchfiles, this speeds up unpacking on my machine > from 2.6 s to 0.75 s (warm cache). > > Signed-off-by: Michał Górny <mgorny@gentoo.org> > --- > eclass/cargo.eclass | 56 ++++++++++++++++++++++++++------------------- > 1 file changed, 33 insertions(+), 23 deletions(-) > > diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass > index 0f2da982f60c..5a16d3a30528 100644 > --- a/eclass/cargo.eclass > +++ b/eclass/cargo.eclass > @@ -329,40 +329,50 @@ _cargo_gen_git_config() { > cargo_src_unpack() { > debug-print-function ${FUNCNAME} "$@" > > - mkdir -p "${ECARGO_VENDOR}" || die > - mkdir -p "${S}" || die > + mkdir -p "${ECARGO_VENDOR}" "${S}" || die > > local archive shasum pkg > + local crates=() > for archive in ${A}; do > case "${archive}" in > *.crate) > - # when called by pkgdiff-mg, do not unpack crates > - [[ ${PKGBUMPING} == ${PVR} ]] && continue > - > - ebegin "Loading ${archive} into Cargo registry" > - tar -xf "${DISTDIR}"/${archive} -C "${ECARGO_VENDOR}/" || die > - # generate sha256sum of the crate itself as cargo needs this > - shasum=$(sha256sum "${DISTDIR}"/${archive} | cut -d ' ' -f 1) > - pkg=$(basename ${archive} .crate) > - cat <<- EOF > ${ECARGO_VENDOR}/${pkg}/.cargo-checksum.json > - { > - "package": "${shasum}", > - "files": {} > - } > - EOF > - # if this is our target package we need it in ${WORKDIR} too > - # to make ${S} (and handle any revisions too) > - if [[ ${P} == ${pkg}* ]]; then > - tar -xf "${DISTDIR}"/${archive} -C "${WORKDIR}" || die > - fi > - eend $? > + crates+=( "${archive}" ) > ;; > *) > - unpack ${archive} > + unpack "${archive}" > ;; > esac > done > > + if [[ ${PKGBUMPING} != ${PVR} ]]; then > + pushd "${DISTDIR}" >/dev/null || die > + > + ebegin "Unpacking crates" > + printf '%s\0' "${crates[@]}" | > + xargs -0 -P "$(makeopts_jobs)" -n 1 -- \ Consider using get_makeopts_jobs instead of makeopts_jobs, as it searches more variables for --jobs. N.B.: If this where asking for a load-average limit, then using get_makeopts_loadavg would be the ideal way to pick up portage's default wrt --load-average. Therefore we should IMHO encourage using the get_makeopts_* functions over the (legacy?) non-get_ variants. > + tar -x -C "${ECARGO_VENDOR}" -f > + assert > + eend $? > + > + while read -d '' -r shasum archive; do > + pkg=${archive%.crate} > + cat <<- EOF > ${ECARGO_VENDOR}/${pkg}/.cargo-checksum.json || die > + { > + "package": "${shasum}", > + "files": {} > + } > + EOF > + > + # if this is our target package we need it in ${WORKDIR} too > + # to make ${S} (and handle any revisions too) > + if [[ ${P} == ${pkg}* ]]; then > + tar -xf "${archive}" -C "${WORKDIR}" || die > + fi > + done < <(sha256sum -z "${crates[@]}" || die) > + > + popd >/dev/null || die > + fi > + > cargo_gen_config > } > [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 17797 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 618 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-dev] [PATCH] cargo.eclass: Optimize crate unpacking 2024-05-12 17:22 ` Florian Schmaus @ 2024-05-12 18:21 ` Michał Górny 2024-05-13 14:00 ` Florian Schmaus 2024-05-16 6:55 ` [gentoo-dev] Obtaining values for --jobs and --load-average Florian Schmaus 0 siblings, 2 replies; 7+ messages in thread From: Michał Górny @ 2024-05-12 18:21 UTC (permalink / raw) To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 2715 bytes --] On Sun, 2024-05-12 at 19:22 +0200, Florian Schmaus wrote: > On 12/05/2024 04.26, Michał Górny wrote: > > Unpack crates in parallel using xargs to utilize multicore systems > > better. Perform checksumming via a single sha256sum invocation. > > > > For dev-python/watchfiles, this speeds up unpacking on my machine > > from 2.6 s to 0.75 s (warm cache). > > > > Signed-off-by: Michał Górny <mgorny@gentoo.org> > > --- > > eclass/cargo.eclass | 56 ++++++++++++++++++++++++++------------------- > > 1 file changed, 33 insertions(+), 23 deletions(-) > > > > diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass > > index 0f2da982f60c..5a16d3a30528 100644 > > --- a/eclass/cargo.eclass > > +++ b/eclass/cargo.eclass > > @@ -329,40 +329,50 @@ _cargo_gen_git_config() { > > cargo_src_unpack() { > > debug-print-function ${FUNCNAME} "$@" > > > > - mkdir -p "${ECARGO_VENDOR}" || die > > - mkdir -p "${S}" || die > > + mkdir -p "${ECARGO_VENDOR}" "${S}" || die > > > > local archive shasum pkg > > + local crates=() > > for archive in ${A}; do > > case "${archive}" in > > *.crate) > > - # when called by pkgdiff-mg, do not unpack crates > > - [[ ${PKGBUMPING} == ${PVR} ]] && continue > > - > > - ebegin "Loading ${archive} into Cargo registry" > > - tar -xf "${DISTDIR}"/${archive} -C "${ECARGO_VENDOR}/" || die > > - # generate sha256sum of the crate itself as cargo needs this > > - shasum=$(sha256sum "${DISTDIR}"/${archive} | cut -d ' ' -f 1) > > - pkg=$(basename ${archive} .crate) > > - cat <<- EOF > ${ECARGO_VENDOR}/${pkg}/.cargo-checksum.json > > - { > > - "package": "${shasum}", > > - "files": {} > > - } > > - EOF > > - # if this is our target package we need it in ${WORKDIR} too > > - # to make ${S} (and handle any revisions too) > > - if [[ ${P} == ${pkg}* ]]; then > > - tar -xf "${DISTDIR}"/${archive} -C "${WORKDIR}" || die > > - fi > > - eend $? > > + crates+=( "${archive}" ) > > ;; > > *) > > - unpack ${archive} > > + unpack "${archive}" > > ;; > > esac > > done > > > > + if [[ ${PKGBUMPING} != ${PVR} ]]; then > > + pushd "${DISTDIR}" >/dev/null || die > > + > > + ebegin "Unpacking crates" > > + printf '%s\0' "${crates[@]}" | > > + xargs -0 -P "$(makeopts_jobs)" -n 1 -- \ > > Consider using get_makeopts_jobs instead of makeopts_jobs, as it > searches more variables for --jobs. Whose bright idea was to add a second similarly named function that does roughly the same thing but apparently differently? It can hardly get more confusing. -- Best regards, Michał Górny [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 512 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-dev] [PATCH] cargo.eclass: Optimize crate unpacking 2024-05-12 18:21 ` Michał Górny @ 2024-05-13 14:00 ` Florian Schmaus 2024-05-13 14:02 ` Michał Górny 2024-05-16 6:55 ` [gentoo-dev] Obtaining values for --jobs and --load-average Florian Schmaus 1 sibling, 1 reply; 7+ messages in thread From: Florian Schmaus @ 2024-05-13 14:00 UTC (permalink / raw) To: gentoo-dev, Michał Górny On 12/05/2024 20.21, Michał Górny wrote: > On Sun, 2024-05-12 at 19:22 +0200, Florian Schmaus wrote: >> On 12/05/2024 04.26, Michał Górny wrote: >>> + if [[ ${PKGBUMPING} != ${PVR} ]]; then >>> + pushd "${DISTDIR}" >/dev/null || die >>> + >>> + ebegin "Unpacking crates" >>> + printf '%s\0' "${crates[@]}" | >>> + xargs -0 -P "$(makeopts_jobs)" -n 1 -- \ >> >> Consider using get_makeopts_jobs instead of makeopts_jobs, as it >> searches more variables for --jobs. > > Whose bright idea was to add a second similarly named function that does > roughly the same thing but apparently differently? It can hardly get > more confusing. You are absolutely right, it sucks that we have two very similar methods. You are invited to suggest how the situation can be improved. However, rambling without presenting alternatives is not helpful in any way. Potentially, you will either discover that there is a reason why things are the way they are, or find a better solution. - Flow ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-dev] [PATCH] cargo.eclass: Optimize crate unpacking 2024-05-13 14:00 ` Florian Schmaus @ 2024-05-13 14:02 ` Michał Górny 0 siblings, 0 replies; 7+ messages in thread From: Michał Górny @ 2024-05-13 14:02 UTC (permalink / raw) To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 1211 bytes --] On Mon, 2024-05-13 at 16:00 +0200, Florian Schmaus wrote: > On 12/05/2024 20.21, Michał Górny wrote: > > On Sun, 2024-05-12 at 19:22 +0200, Florian Schmaus wrote: > > > On 12/05/2024 04.26, Michał Górny wrote: > > > > + if [[ ${PKGBUMPING} != ${PVR} ]]; then > > > > + pushd "${DISTDIR}" >/dev/null || die > > > > + > > > > + ebegin "Unpacking crates" > > > > + printf '%s\0' "${crates[@]}" | > > > > + xargs -0 -P "$(makeopts_jobs)" -n 1 -- \ > > > > > > Consider using get_makeopts_jobs instead of makeopts_jobs, as it > > > searches more variables for --jobs. > > > > Whose bright idea was to add a second similarly named function that does > > roughly the same thing but apparently differently? It can hardly get > > more confusing. > > You are absolutely right, it sucks that we have two very similar methods. > > You are invited to suggest how the situation can be improved. However, > rambling without presenting alternatives is not helpful in any way. > My suggestion would be for the person who introduced new methods and implicitly claimed the old methods to be "legacy" to put an actual effort to migrate consumers. -- Best regards, Michał Górny [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 512 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* [gentoo-dev] Obtaining values for --jobs and --load-average 2024-05-12 18:21 ` Michał Górny 2024-05-13 14:00 ` Florian Schmaus @ 2024-05-16 6:55 ` Florian Schmaus 1 sibling, 0 replies; 7+ messages in thread From: Florian Schmaus @ 2024-05-16 6:55 UTC (permalink / raw) To: gentoo-dev, Michał Górny [-- Attachment #1.1.1: Type: text/plain, Size: 2626 bytes --] On 12/05/2024 20.21, Michał Górny wrote: > On Sun, 2024-05-12 at 19:22 +0200, Florian Schmaus wrote: >> On 12/05/2024 04.26, Michał Górny wrote: >>> + if [[ ${PKGBUMPING} != ${PVR} ]]; then >>> + pushd "${DISTDIR}" >/dev/null || die >>> + >>> + ebegin "Unpacking crates" >>> + printf '%s\0' "${crates[@]}" | >>> + xargs -0 -P "$(makeopts_jobs)" -n 1 -- \ >> >> Consider using get_makeopts_jobs instead of makeopts_jobs, as it >> searches more variables for --jobs. I'm sorry, I was wrong. I thought I remembered that we only extended the new get_makeopts_* functions so that they also search other relevant variables besides MAKEOPTS for --jobs/--load-avgerage. This is wrong, the already existing makeopts_* functions were also changed at that time so that they take GNUMAKEFLAGS & Co. into account. As a reminder, this change was due to newer Portage versions setting reasonable default values for --jobs and --load-average. And GNUMAKEFLAGS is used for --load-average. The problem is that the order of the parameters of the makeopts_* functions is unfavorable. First comes the (optional) opts/flags string, then the (optional) default value. This means that as soon as you want to specify the default value explicitly, you also have to specify the opts/flags string. As a result, it can easily happen that only $MAKEOPTS is specified, which can lead to the default value set by portage not being taken into account, especially --load-average. This is, for example, the case with app-i18n/mozc and dev-qt/qtwebengine. The get_makeopts_* functions do not suffer from this issue. > My suggestion would be for the person who introduced new methods > and implicitly claimed the old methods to be "legacy" to put an actual > effort to migrate consumers. That is probably sensible and I believe also possible. I could only find a single ebuild where makeopts_ was used with a custom make opts/flags string: sci-libs/vtk which has some logic for NINJAOPTS. However, I wonder if it is ever sensible to specify other make opts/flags besides what _get_all_makeopts() from multiprocessing.eclass considers. If your make.conf contains NINJAOPS="-j 4 -l 4 <some-ninja-specific-args>" then you most likely always also want to set MAKEOPTS="-j 4" otherwise, only ninja builds are done with "-j 4". Of course, we could also simply add $NINJAOPTS to _get_all_makeopts(). Consequentially this raises the question if other similar variables should also be added (SAMUFLAGS, maybe?). I'd love to hear some opinions on this. - Flow [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 17797 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 618 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-05-16 6:55 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-05-12 2:26 [gentoo-dev] [PATCH] cargo.eclass: Optimize crate unpacking Michał Górny 2024-05-12 2:44 ` Sam James 2024-05-12 17:22 ` Florian Schmaus 2024-05-12 18:21 ` Michał Górny 2024-05-13 14:00 ` Florian Schmaus 2024-05-13 14:02 ` Michał Górny 2024-05-16 6:55 ` [gentoo-dev] Obtaining values for --jobs and --load-average Florian Schmaus
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox