From: "Michał Górny" <mgorny@gentoo.org>
To: gentoo-portage-dev@lists.gentoo.org
Cc: Hidehiko Abe <hidehiko@chromium.org>
Subject: Re: [gentoo-portage-dev] [PATCH] Rewrite doins in python (bug 624526)
Date: Wed, 23 Aug 2017 16:51:36 +0200 [thread overview]
Message-ID: <1503499896.2219.1.camel@gentoo.org> (raw)
In-Reply-To: <20170814073921.20837-1-zmedico@gentoo.org>
W dniu pon, 14.08.2017 o godzinie 00∶39 -0700, użytkownik Zac Medico
napisał:
> From: Hidehiko Abe <hidehiko@chromium.org>
>
> doins is written in bash. However, specifically in case that
> too many files are installed, it is very slow.
> This CL rewrites the script in python for performance.
>
> BUG=chromium:712659
> TEST=time (./setup_board --forace && \
> ./build_package --withdev && \
> ./build_image --noenable_rootfs_verification test)
> ===Before===
> real 21m35.445s
> user 93m40.588s
> sys 21m31.224s
>
> ===After===
> real 17m30.106s
> user 94m1.812s
> sys 20m13.468s
>
> Change-Id: Ib10f623961ba316753d58397cff5e72fbc343339
> Reviewed-on: https://chromium-review.googlesource.com/559225
> X-Chromium-Bug: 712659
> X-Chromium-Bug-url: https://bugs.chromium.org/p/chromium/issues/detail?id=712659
> X-Gentoo-Bug: 624526
> X-Gentoo-Bug-url: https://bugs.gentoo.org/624526
> ---
> bin/doins.py | 341 +++++++++++++++++++++++++++++++++++++++++++++++
> bin/ebuild-helpers/doins | 123 ++---------------
> 2 files changed, 353 insertions(+), 111 deletions(-)
> create mode 100644 bin/doins.py
>
> diff --git a/bin/doins.py b/bin/doins.py
> new file mode 100644
> index 000000000..4a17287ca
> --- /dev/null
> +++ b/bin/doins.py
> @@ -0,0 +1,341 @@
> +#!/usr/bin/python -b
> +# Copyright 2017 The Chromium OS Authors. All rights reserved.
> +# Use of this source code is governed by a BSD-style license that can be
> +# found in the LICENSE file.
> +
> +from __future__ import print_function
> +
> +import argparse
> +import errno
> +import grp
> +import logging
> +import os
> +import pwd
> +import re
> +import shlex
> +import shutil
> +import stat
> +import subprocess
> +import sys
> +
> +from portage.util import movefile
> +
> +# Change back to original cwd _after_ all imports (bug #469338).
> +# See also bin/ebuild-helpers/doins, which sets the cwd.
> +os.chdir(os.environ["__PORTAGE_HELPER_CWD"])
> +
> +_PORTAGE_ACTUAL_DISTDIR = (
> + (os.environb if sys.version_info.major >= 3 else os.environ).get(
> + b'PORTAGE_ACTUAL_DISTDIR', b'') + b'/')
> +
> +
> +def _parse_install_options(
> + options, inprocess_runner_class, subprocess_runner_class):
> + """Parses command line arguments for install command."""
> + parser = argparse.ArgumentParser()
> + parser.add_argument('-g', '--group', default=-1, type=_parse_group)
> + parser.add_argument('-o', '--owner', default=-1, type=_parse_user)
> + # "install"'s --mode option is complicated. So here is partially
> + # supported.
> + parser.add_argument('-m', '--mode', type=_parse_mode)
> + split_options = shlex.split(options)
> + namespace, remaining = parser.parse_known_args(split_options)
> + if remaining:
> + print('Unknown install options: %s, %r' % (options, remaining),
> + file=sys.stderr)
> + if os.environ.get('DOINSSTRICTOPTION', '') == '1':
> + sys.exit(1)
> + print('Continue with falling back to \'install\' '
> + 'command execution, which can be slower.',
> + file=sys.stderr)
> + return subprocess_runner_class(split_options)
> + return inprocess_runner_class(namespace)
> +
> +
> +def _parse_group(group):
> + """Parses gid."""
> + g = grp.getgrnam(group)
> + if g:
> + return g.gr_gid
> + return int(group)
> +
> +
> +def _parse_user(user):
> + """Parses uid."""
> + u = pwd.getpwnam(user)
> + if u:
> + return u.pw_uid
> + return int(user)
> +
> +
> +def _parse_mode(mode):
> + # "install"'s --mode option is complicated. So here is partially
> + # supported.
> + # In Python 3, the prefix of octal int must be '0o' rather than '0'.
> + # So, set base explicitly in that case.
> + return int(mode, 8 if re.search(r'^0[0-7]*$', mode) else 0)
> +
> +
> +def _set_attributes(options, path):
> + """Sets attributes the file/dir at given |path|.
> +
> + Args:
> + options: object which has |owner|, |group| and |mode| fields.
> + |owner| is int value representing uid. Similary |group|
> + represents gid.
> + If -1 is set, just unchanged.
> + |mode| is the bits of permissions.
> + path: File/directory path.
> + """
> + if options.owner != -1 or options.group != -1:
> + os.lchown(path, options.owner, options.group)
> + if options.mode is not None:
> + os.chmod(path, options.mode)
> +
> +
> +class _InsInProcessInstallRunner(object):
> + def __init__(self, parsed_options):
> + self._parsed_options = parsed_options
> + self._copy_xattr = (
> + 'xattr' in os.environ.get('FEATURES', '').split())
> + if self._copy_xattr:
> + self._xattr_exclude = os.environ.get(
> + 'PORTAGE_XATTR_EXCLUDE',
> + 'security.* system.nfs4_acl')
> +
> + def run(self, source, dest_dir):
> + """Installs a file at |source| into |dest_dir| in process."""
> + dest = os.path.join(dest_dir, os.path.basename(source))
> + try:
> + # TODO: Consider to use portage.util.file_copy.copyfile
> + # introduced by
> + # https://gitweb.gentoo.org/proj/portage.git/commit/
> + # ?id=8ab5c8835931fd9ec098dbf4c5f416eb32e4a3a4
> + # after uprev.
> + shutil.copyfile(source, dest)
> + _set_attributes(self._parsed_options, dest)
> + if self._copy_xattr:
> + movefile._copyxattr(
> + source, dest,
> + exclude=self._xattr_exclude)
> + except Exception:
> + logging.exception(
> + 'Failed to copy file: '
> + '_parsed_options=%r, source=%r, dest_dir=%r',
> + self._parsed_options, source, dest_dir)
> + return False
> + return True
> +
> +
> +class _InsSubprocessInstallRunner(object):
> + def __init__(self, split_options):
> + self._split_options = split_options
> +
> + def run(self, source, dest_dir):
> + """Installs a file at |source| into |dest_dir| by 'install'."""
> + command = ['install'] + self._split_options + [source, dest_dir]
> + return subprocess.call(command) == 0
> +
> +
> +class _DirInProcessInstallRunner(object):
> + def __init__(self, parsed_options):
> + self._parsed_options = parsed_options
> +
> + def run(self, dest):
> + """Installs a dir into |dest| in process."""
> + try:
> + os.makedirs(dest)
> + except OSError as e:
> + if e.errno != errno.EEXIST or not os.path.isdir(dest):
> + raise
> + _set_attributes(self._parsed_options, dest)
> +
> +
> +class _DirSubprocessInstallRunner(object):
> + """Runs real 'install' command to install files or directories."""
> +
> + def __init__(self, split_options):
> + self._split_options = split_options
> +
> + def run(self, dest):
> + """Installs a dir into |dest| by 'install' command."""
> + command = ['install', '-d'] + self._split_options + [dest]
> + subprocess.call(command)
> +
> +
> +class _InstallRunner(object):
> + def __init__(self):
> + self._ins_runner = _parse_install_options(
> + os.environ.get('INSOPTIONS', ''),
> + _InsInProcessInstallRunner,
> + _InsSubprocessInstallRunner)
> + self._dir_runner = _parse_install_options(
> + os.environ.get('DIROPTIONS', ''),
> + _DirInProcessInstallRunner,
> + _DirSubprocessInstallRunner)
> +
> + def install_file(self, source, dest_dir):
> + """Installs a file at |source| into |dest_dir| directory."""
> + return self._ins_runner.run(source, dest_dir)
> +
> + def install_dir(self, dest):
> + """Creates a directory at |dest|."""
> + self._dir_runner.run(dest)
> +
> +
> +def _doins(args, install_runner, relpath, source_root):
> + """Installs a file as if 'install' command runs.
> +
> + Installs a file at |source_root|/|relpath| into |args.dest|/|relpath|.
> + If |args.preserve_symlinks| is set, creates symlink if the source is a
> + symlink.
> +
> + Args:
> + args: parsed arguments. It should have following fields.
> + - preserve_symlinks: bool representing whether symlinks
> + needs to be preserved.
> + - dest: Destination root directory.
> + - insoptions: options for 'install' command.
> + intall_runner: _InstallRunner instance for file install.
> + relpath: Relative path of the file being installed.
> + source_root: Source root directory.
> +
> + Returns: True on success.
> + """
> + source = os.path.join(source_root, relpath)
> + dest = os.path.join(args.dest, relpath)
> + if os.path.islink(source):
> + # Our fake $DISTDIR contains symlinks that should not be
> + # reproduced inside $D. In order to ensure that things like
> + # dodoc "$DISTDIR"/foo.pdf work as expected, we dereference
> + # symlinked files that refer to absolute paths inside
> + # $PORTAGE_ACTUAL_DISTDIR/.
> + try:
> + if (args.preserve_symlinks and
> + not os.readlink(source).startswith(
> + _PORTAGE_ACTUAL_DISTDIR)):
> + linkto = os.readlink(source)
> + shutil.rmtree(dest, ignore_errors=True)
> + os.symlink(linkto, dest)
> + return True
> + except Exception:
> + logging.exception(
> + 'Failed to create symlink: '
> + 'args=%r, relpath=%r, source_root=%r',
> + args, relpath, source_root)
> + return False
> +
> + return install_runner.install_file(source, os.path.dirname(dest))
> +
> +
> +def _parse_args():
> + """Parses the command line arguments."""
> + parser = argparse.ArgumentParser()
> + parser.add_argument(
> + '--recursive', action='store_true',
> + help='If set, installs files recursively. Otherwise, '
> + 'just skips directories.')
> + parser.add_argument(
> + '--preserve_symlinks', action='store_true',
> + help='If set, a symlink will be installed as symlink.')
> + # If helper is dodoc, it changes the behavior for the directory
> + # install without --recursive.
> + parser.add_argument('--helper', help='Name of helper.')
> + parser.add_argument(
> + '--dest',
> + help='Destination where the files are installed.')
> + parser.add_argument(
> + 'sources', nargs='*',
> + help='Source file/directory paths to be installed.')
> +
> + args = parser.parse_args()
> +
> + # Encode back to the original byte stream. Please see
> + # http://bugs.python.org/issue8776.
> + if sys.version_info.major >= 3:
> + args.dest = os.fsencode(args.dest)
> + args.sources = [os.fsencode(source) for source in args.sources]
> +
> + return args
> +
> +
> +def _install_dir(args, install_runner, source):
> + """Installs directory at |source|.
> +
> + Returns:
> + True on success, False on failure, or None on skipped.
> + """
> + if not args.recursive:
> + if args.helper == 'dodoc':
> + print('!!! %s: %s is a directory' % (
> + args.helper, source),
> + file=sys.stderr)
> + return False
> + # Neither success nor fail. Return None to indicate skipped.
> + return None
> +
> + # Strip trailing '/'s.
> + source = source.rstrip(b'/')
> + source_root = os.path.dirname(source)
> + dest_dir = os.path.join(args.dest, os.path.basename(source))
> + install_runner.install_dir(dest_dir)
> +
> + relpath_list = []
> + for dirpath, dirnames, filenames in os.walk(source):
> + for dirname in dirnames:
> + relpath = os.path.relpath(
> + os.path.join(dirpath, dirname),
> + source_root)
> + dest = os.path.join(args.dest, relpath)
> + install_runner.install_dir(dest)
> + relpath_list.extend(
> + os.path.relpath(
> + os.path.join(dirpath, filename), source_root)
> + for filename in filenames)
> +
> + if not relpath_list:
> + # NOTE: Even if only an empty directory is installed here, it
> + # still counts as success, since an empty directory given as
> + # an argument to doins -r should not trigger failure.
> + return True
> + success = True
> + for relpath in relpath_list:
> + if not _doins(args, install_runner, relpath, source_root):
> + success = False
> + return success
> +
> +
> +def main():
> + args = _parse_args()
> + install_runner = _InstallRunner()
> +
> + if not os.path.isdir(args.dest):
> + install_runner.install_dir(args.dest)
> +
> + any_success = False
> + any_failure = False
> + for source in args.sources:
> + if (os.path.isdir(source) and
> + (not args.preserve_symlinks or
> + not os.path.islink(source))):
> + ret = _install_dir(args, install_runner, source)
> + if ret is None:
> + continue
> + if ret:
> + any_success = True
> + else:
> + any_failure = True
> + else:
> + if _doins(
> + args, install_runner,
> + os.path.basename(source),
> + os.path.dirname(source)):
> + any_success = True
> + else:
> + any_failure = True
> +
> + return 0 if not any_failure and any_success else 1
> +
> +
> +if __name__ == '__main__':
> + sys.exit(main())
> diff --git a/bin/ebuild-helpers/doins b/bin/ebuild-helpers/doins
> index 93052c2ea..de559780c 100755
> --- a/bin/ebuild-helpers/doins
> +++ b/bin/ebuild-helpers/doins
> @@ -24,10 +24,10 @@ if [ $# -lt 1 ] ; then
> fi
>
> if [[ "$1" == "-r" ]] ; then
> - DOINSRECUR=y
> + RECURSIVE_OPTION='--recursive'
> shift
> else
> - DOINSRECUR=n
> + RECURSIVE_OPTION=''
> fi
>
> if ! ___eapi_has_prefix_variables; then
> @@ -44,116 +44,17 @@ if [[ ${INSDESTTREE#${ED}} != "${INSDESTTREE}" ]]; then
> fi
>
> if ___eapi_doins_and_newins_preserve_symlinks; then
> - PRESERVE_SYMLINKS=y
> + SYMLINK_OPTION='--preserve_symlinks'
> else
> - PRESERVE_SYMLINKS=n
> + SYMLINK_OPTION=''
> fi
>
> -export TMP=$(mktemp -d "${T}/.doins_tmp_XXXXXX")
> -# Use separate directories to avoid potential name collisions.
> -mkdir -p "$TMP"/{1,2}
> +# Use safe cwd, avoiding unsafe import for bug #469338.
> +export __PORTAGE_HELPER_CWD=${PWD}
> +cd "${PORTAGE_PYM_PATH:-/usr/lib/portage/pym}"
>
> -[[ ! -d ${ED}${INSDESTTREE} ]] && dodir "${INSDESTTREE}"
> -
> -_doins() {
> - local mysrc="$1" mydir="$2" cleanup="" rval
> -
> - if [ -L "$mysrc" ] ; then
> - # Our fake $DISTDIR contains symlinks that should
> - # not be reproduced inside $D. In order to ensure
> - # that things like dodoc "$DISTDIR"/foo.pdf work
> - # as expected, we dereference symlinked files that
> - # refer to absolute paths inside
> - # $PORTAGE_ACTUAL_DISTDIR/.
> - if [ $PRESERVE_SYMLINKS = y ] && \
> - ! [[ $(readlink "$mysrc") == "$PORTAGE_ACTUAL_DISTDIR"/* ]] ; then
> - rm -rf "${ED}$INSDESTTREE/$mydir/${mysrc##*/}" || return $?
> - cp -P "$mysrc" "${ED}$INSDESTTREE/$mydir/${mysrc##*/}"
> - return $?
> - else
> - cp "$mysrc" "$TMP/2/${mysrc##*/}" || return $?
> - mysrc="$TMP/2/${mysrc##*/}"
> - cleanup=$mysrc
> - fi
> - fi
> -
> - install ${INSOPTIONS} "${mysrc}" "${ED}${INSDESTTREE}/${mydir}"
> - rval=$?
> - [[ -n ${cleanup} ]] && rm -f "${cleanup}"
> - [ $rval -ne 0 ] && echo "!!! ${helper}: $mysrc does not exist" 1>&2
> - return $rval
> -}
> -
> -_xdoins() {
> - local -i failed=0
> - while read -r -d $'\0' x ; do
> - _doins "$x" "${x%/*}"
> - ((failed|=$?))
> - done
> - return $failed
> -}
> -
> -success=0
> -failed=0
> -
> -for x in "$@" ; do
> - if [[ $PRESERVE_SYMLINKS = n && -d $x ]] || \
> - [[ $PRESERVE_SYMLINKS = y && -d $x && ! -L $x ]] ; then
> - if [ "${DOINSRECUR}" == "n" ] ; then
> - if [[ ${helper} == dodoc ]] ; then
> - echo "!!! ${helper}: $x is a directory" 1>&2
> - ((failed|=1))
> - fi
> - continue
> - fi
> -
> - while [ "$x" != "${x%/}" ] ; do
> - x=${x%/}
> - done
> - if [ "$x" = "${x%/*}" ] ; then
> - pushd "$PWD" >/dev/null
> - else
> - pushd "${x%/*}" >/dev/null
> - fi
> - x=${x##*/}
> - x_orig=$x
> - # Follow any symlinks recursively until we've got
> - # a normal directory for 'find' to traverse. The
> - # name of the symlink will be used for the name
> - # of the installed directory, as discussed in
> - # bug #239529.
> - while [ -L "$x" ] ; do
> - pushd "$(readlink "$x")" >/dev/null
> - x=${PWD##*/}
> - pushd "${PWD%/*}" >/dev/null
> - done
> - if [[ $x != $x_orig ]] ; then
> - mv "$x" "$TMP/1/$x_orig"
> - pushd "$TMP/1" >/dev/null
> - fi
> - find "$x_orig" -type d -exec dodir "${INSDESTTREE}/{}" \;
> - find "$x_orig" \( -type f -or -type l \) -print0 | _xdoins
> - if [[ ${PIPESTATUS[1]} -eq 0 ]] ; then
> - # NOTE: Even if only an empty directory is installed here, it
> - # still counts as success, since an empty directory given as
> - # an argument to doins -r should not trigger failure.
> - ((success|=1))
> - else
> - ((failed|=1))
> - fi
> - if [[ $x != $x_orig ]] ; then
> - popd >/dev/null
> - mv "$TMP/1/$x_orig" "$x"
> - fi
> - while popd >/dev/null 2>&1 ; do true ; done
> - else
> - _doins "${x}"
> - if [[ $? -eq 0 ]] ; then
> - ((success|=1))
> - else
> - ((failed|=1))
> - fi
> - fi
> -done
> -rm -rf "$TMP"
> -[[ $failed -ne 0 || $success -eq 0 ]] && { __helpers_die "${helper} failed"; exit 1; } || exit 0
> +"${PORTAGE_PYTHON:-/usr/bin/python}" \
> + "${PORTAGE_BIN_PATH:-/usr/lib/portage/bin}"/doins.py \
> + ${RECURSIVE_OPTION} ${SYMLINK_OPTION} \
> + --helper "${helper}" --dest "${ED}${INSDESTTREE}" "$@" || \
> +{ __helpers_die "${helper} failed"; exit 1; }
To be honest, I don't like the idea of using more Python inside ebuild
helpers. But if you're sure this is safe and not going to collide with
ebuilds doing random stuff with Python, feel free to proceed with it.
--
Best regards,
Michał Górny
prev parent reply other threads:[~2017-08-23 14:51 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-14 7:39 [gentoo-portage-dev] [PATCH] Rewrite doins in python (bug 624526) Zac Medico
2017-08-14 19:49 ` M. J. Everitt
2017-08-14 20:31 ` Zac Medico
2017-08-22 14:21 ` Brian Dolbec
2017-08-23 14:51 ` Michał Górny [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1503499896.2219.1.camel@gentoo.org \
--to=mgorny@gentoo.org \
--cc=gentoo-portage-dev@lists.gentoo.org \
--cc=hidehiko@chromium.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox