From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by finch.gentoo.org (Postfix) with ESMTPS id 655BA1382C5 for ; Wed, 24 Mar 2021 15:09:45 +0000 (UTC) Received: from pigeon.gentoo.org (localhost [127.0.0.1]) by pigeon.gentoo.org (Postfix) with SMTP id 88AEBE0863; Wed, 24 Mar 2021 15:09:40 +0000 (UTC) Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by pigeon.gentoo.org (Postfix) with ESMTPS id 2B1AFE0848 for ; Wed, 24 Mar 2021 15:09:40 +0000 (UTC) Received: (nullmailer pid 32637 invoked by uid 1000); Wed, 24 Mar 2021 15:09:35 -0000 Date: Wed, 24 Mar 2021 10:09:35 -0500 From: William Hubbs To: gentoo-dev@lists.gentoo.org Cc: mgorny@gentoo.org Subject: Re: [gentoo-dev] rfc: usrmerge script Message-ID: Mail-Followup-To: gentoo-dev@lists.gentoo.org, mgorny@gentoo.org References: Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-Id: Gentoo Linux mail X-BeenThere: gentoo-dev@lists.gentoo.org Reply-to: gentoo-dev@lists.gentoo.org X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="JdPdzKOsAAqRYX2S" Content-Disposition: inline In-Reply-To: X-Archives-Salt: 4635e455-452e-4ee7-b879-80d2787b0549 X-Archives-Hash: 10509c5265b5972237b6dad3f9389694 --JdPdzKOsAAqRYX2S Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Mar 24, 2021 at 08:48:41AM +0100, Micha=C5=82 G=C3=B3rny wrote: > On Wed, 2021-03-24 at 00:20 -0500, William Hubbs wrote: > > On Tue, Mar 23, 2021 at 10:23:11AM +0100, Micha=C5=82 G=C3=B3rny wrote: > > > On Sun, 2021-03-21 at 12:39 -0500, William Hubbs wrote: > > > > All, > > > >=20 > > > > the following is a script which will migrate a Gentoo system to the= usr > > > > merge layout. This is similar to the unsymlink-lib tool used to mig= rate > > > > a system from the 17.0 to the 17.1 profiles. > > > >=20 > > > > I'm attaching it here to get some comments before I package it, so > > > > please let me know if I have missed something. > > >=20 > > > To be honest, I don't think critical system modifications should be d= one > > > in shell script, and especially not via a fringe non-standards compla= int > > > shell implementation in busybox. Even if you can assume you make no > > > mistakes, shell is unreliable by design. For example, your script may > > > start behaving in unexpected ways if you run out of space. > >=20 > > I went with busybox shell in this case because busybox is a static > > binary and I figured with everything being moved around on the system it > > would be a good choice. > >=20 > > I could pretty easily go with straight sh/bash, I just was focusing on > > bb since the commands are built in. > >=20 > > You are right about shell behaving in weird ways if you run out of > > space, but that's true for anything that happens to be running when a > > system runs out of space. >=20 > This is not true. Most of the programming languages can behave sanely > in out-of-space conditions. Sure, I/O fails one way or another but if > it's just a matter of proper error handling. Shells on the other hand, > tend to use temporary files under the hood and can misbehave > in unexpected ways. =20 Another reason I went with bb is there are no deps outside of bb and the script. Also bb is on every gentoo system. In short, I went for something that I knew would be everywhere. If I'm going to do the work, it isn't going to be in python because I'm not a fan. I can write some in python, but I don't consider my skills there to be the best, and the indentation requirements make it tedious for me to debug. > > That is why I put the rollback directory in /var by > > default figuring there is more space there than in /, especially on > > systems with separate /var. >=20 > What really can help is reflinking on filesystems supporting that. =20 What really can help is more info instead of being terse like this. Which filesystems support it? > > The run_command wrapper in the script causes an immediate exit when the > > command it runs fails so things don't go any > > further than the failing command, so I'm not sure what else I can do > > with this situation. One option is to always echo the command we are > > about to run then just not run it if -d is specified. That means you > > would always see the commands the script is running. >=20 > Did you see the verbose explanatory messages unsymlink-lib produces > on every error condition? Compared to this, your script is printing > Windows-level 'error figure-it-out-yourself'. =20 This is early in development. Ideally I would handle most of the error conditions before people start using this on production systems; that is why I put it out here. > > > You don't seem to be handling file collisions at all. Even today we > > > have files like /bin/bzip2 and /usr/bin/bzip2, not to mention shared > > > libraries. Silently ignoring the problem or requiring the users to > > > manually ensure their system is clean is not going to solve it. >=20 > > I have added -i to the cp commands in my latest version of this so I'll > > see when we have duplicate names, and yes that is an issue, even without > > the /usr merge. >=20 > I don't think that's the best UI for this possible. I doubt that > a random user running the script will figure out why it is asking about > file replacements. =20 You are probably right, that's why I put this out here, and listed my thoughts about it below to generate discussion. > > I'm curious why we are duplicating all of these names in the first > > place honestly.=20 > >=20 > > One example of this is in coreutils, and we even say in the ebuild > > comment that we need to figure out why we are doing it. > >=20 > > There are a couple of ways forward for this. > >=20 > > 1) attempt to open bugs on the packages and remove the duplicate names > > by dropping symlinks and putting the files in their cannonical > > locations. this could lead to breakages if one of the alternate names= is > > expected by something until the /usr merge is done. > >=20 > > 2) try to guess at resolving the duplicate names as part of the > > =C2=A0=C2=A0=C2=A0usr merge process. > >=20 > > a. symmlinks in /bin or /sbin that point to the same name in /usr/bin or > > /usr/sbin should be removed. > > b. symlinks in /usr/bin or /usr/sbin that point to the same name in / > > should be removed. > > c. Other symlinks that I can think of should be preserved. > >=20 > > Which path for handling this is best? Do you have any thoughts? >=20 > This is a thing that should have been researched before writing > a script. =20 With all respect, knock it off. I'm not going to sit and read through all of the ebuilds in the tree looking for every little thing. > One case I'm aware of is using symlinks to replace tools with > alternatives, e.g. /usr/bin/bzip2 -> lbzip2 (/bin/bzip2 remains > as the original). Your example doesn't exist here, so I wouldn't have known about it if you hadn't mentioned it. It looks like that specific situation is controlled by the symlink use flag in lbzip, so it should be ok. > I don't know what's the best way forward. >=20 > >=20 > > > You don't seem to provide any helpful messages. When things fail, us= er > > > will be left in the blue with an error message from some system tool = (or > > > rather, cheap-ass busybox rewrite, I guess). > >=20 > > If the rollback directory is populated, you can use -r to recover, > > and the script will not let you perform the /usr merge if the rollback > > directory is not populated. > >=20 > > > Also, have you verified that busybox's cp(1) actually preserves all f= ile > > > properties (including xattrs, ACLs, caps...)? > >=20 > > The documentation for busybox states that -p preserves file attributes > > if possible, and by doing ls -l in the chroot I can see that > > ownership/permissions are preserved. So, logically I would guess it > > preserves the others. >=20 > I don't see how you would come to this conclusion. Just because some > tool is doing the absolute minimum it doesn't mean it does anything > else, and especially operations that require additional syscalls. Hmm, I think a better response here would have been to point out how I could look at the other attributes of a file you are talking about. > > If switching back to non-busybox is fine for this operation, I have no > > problem doing that either. > >=20 > > > Please don't forget to include tests with it. Docker's good for test= ing > > > stuff like this. > >=20 > > Docker can't be used during src_test that I'm aware of, so I'm not sure > > how Docker could be used to test this. >=20 > You can test it outside of ebuild. Some automated testing is better > than no testing at all. With all respect, I think most of this email was sarcasm, trolling etc instead of the type of constructive feedback we look for when sharing code on the list. I will do what I can to use parts of it, but I really have issues with your tone. William --JdPdzKOsAAqRYX2S Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iF0EABECAB0WIQTVeuxEZo4uUHOkQAluVBb0MMRlOAUCYFtWKgAKCRBuVBb0MMRl OAqKAKCn2guj+jeAgu5Tm2NHQiNw/cFoegCfRPtJSX4wOfcZqQlep9q6nKP11c0= =0DCs -----END PGP SIGNATURE----- --JdPdzKOsAAqRYX2S--