* [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Michał Górny @ 2022-01-10 5:39 UTC
To: gentoo-dev
Hi, everyone.
TL;DR: how to deal with setuptools (and newer distutils vendored by
setuptools) replacing .egg-info files with directories?
I know I'm reiterating the same topic, but I think we've reached the point
where we actually have to do something about this, and I still haven't found
a satisfactory solution.
For the people new to the problem, a quick recap: Python packages
install metadata called .egg-info (newer build systems use .dist-info
but that's another matter). Original Python distutils installed this
as a single file but setuptools replaced it with a directory.
Now that distutils is deprecated, setuptools is vendoring its own
version. We can switch between the system and the vendored version
using an envvar. By default, setuptools < 60 uses system distutils,
and >= 60 (masked) uses vendored distutils.
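To make the defaults concrete, here is a minimal Python sketch of the selection logic described above. It is only a model of the behaviour, not setuptools' own code: the function name `effective_distutils` and its parameters are made up for illustration, and only the SETUPTOOLS_USE_DISTUTILS variable and its "stdlib"/"local" values come from this discussion.

```python
import os


def effective_distutils(setuptools_major, environ=None):
    """Return "stdlib" or "local" for a given setuptools major version.

    Hypothetical helper: models the defaults described in this thread,
    it does not call into setuptools.
    """
    environ = os.environ if environ is None else environ
    # Default as described: < 60 uses system distutils, >= 60 the vendored copy.
    default = "local" if setuptools_major >= 60 else "stdlib"
    # An explicit SETUPTOOLS_USE_DISTUTILS setting overrides the default.
    return environ.get("SETUPTOOLS_USE_DISTUTILS", default)
```

Passing an explicit mapping instead of the real environment keeps the helper easy to try out without touching the shell.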
The big problem is that switching implies changing the format, so if you
install foo-X, then switch, then reinstall the same version of foo,
you're going to have the file replaced by a directory. This is not
supported by the PMS, and Portage handles it somewhat suboptimally
(renaming the old file and leaving it orphaned).
I should probably emphasize here that the .egg-info path contains
the package version, so this is a problem only if the same upstream
version is being reinstalled.
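A small sketch of why the version matters, assuming the usual NAME-VERSION-pyX.Y.egg-info naming. The regular expression and the `parse_egg_info` helper are illustrative assumptions, not part of any eclass:

```python
import re

# Assumed layout: NAME-VERSION-pyX.Y.egg-info
EGG_INFO_RE = re.compile(
    r"^(?P<name>.+?)-(?P<version>[^-]+)-py(?P<python>\d+\.\d+)\.egg-info$"
)


def parse_egg_info(filename):
    """Split an .egg-info file name into (name, version, python tag)."""
    match = EGG_INFO_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected .egg-info name: {filename!r}")
    return match.group("name"), match.group("version"), match.group("python")


def same_path(old, new):
    """True if two installs would write to the very same .egg-info path."""
    return parse_egg_info(old) == parse_egg_info(new)
```

With these helpers, `same_path("foo-1.0-py3.9.egg-info", "foo-1.1-py3.9.egg-info")` is false, so a version bump never hits the file-versus-directory collision; only a rebuild of the same version for the same Python does.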
You can easily reproduce the problem by playing with:
SETUPTOOLS_USE_DISTUTILS=stdlib
SETUPTOOLS_USE_DISTUTILS=local # vendored
and repeatedly building some DISTUTILS_USE_SETUPTOOLS=no packages.
From a quick grep, there are 179 packages using DISTUTILS_USE_SETUPTOOLS=no
right now (and there might be more using distutils without the declaration).
I don't think we can ignore the problem.
What we can do right now is force the SETUPTOOLS_USE_DISTUTILS=stdlib
default back via the eclass or patches for the time being. This will
let us finally unmask setuptools-60+ while delaying fixing the problem.
However, given pypa's tendency to remove deprecated stuff quickly, this
is unlikely to work for long.
Some ideas on fixing this:
1. For a start, with distutils deprecated, projects are actually migrating to
setuptools or other build systems upstream. This effectively solves
the problem for us since the .egg-info switch happens on version bump
and there is no file collision. However, this isn't going to help for
dead projects.
2. We could control the distutils version in ebuilds directly,
i.e. force "stdlib" for the current versions and have developers switch
to "local" on version bumps. Combined with 1., this will probably
increase the coverage a bit but dead packages will remain in the way.
It also relies on all devs understanding the problem.
3. The developers could explicitly bump versions (i.e. create "Gentoo
subversions") of packages that don't expect any updates. We can get
100% coverage this way but it's hard and requires patching.
4. We could have the eclasses switch to the "local" model and rename
the .egg-info files somehow at some point. The main question is "rename
how?"
5. We could have the eclasses convert .egg-info into the newer .dist-info
format. However, I'm not aware of any existing tool doing such
a conversion, I'm not convinced I want to write one right now,
and I'm not sure whether it would have compatibility implications.
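To give a rough idea of what option 5 would involve, here is a deliberately naive Python sketch. It is not an existing tool and makes several assumptions: the helper name `egg_info_to_dist_info` is made up, it only copies the core metadata into a METADATA file, and it does not translate requires.txt, entry points, or produce a RECORD, all of which a real conversion would have to consider.

```python
import re
import shutil
from pathlib import Path


def egg_info_to_dist_info(egg_info: Path) -> Path:
    """Create NAME-VERSION.dist-info next to an .egg-info file or directory.

    Hypothetical, incomplete conversion: only the core metadata is copied.
    The original .egg-info entry is left in place.
    """
    suffix = ".egg-info"
    if not egg_info.name.endswith(suffix):
        raise ValueError(f"not an .egg-info path: {egg_info}")
    stem = egg_info.name[: -len(suffix)]
    # .dist-info names carry no "-pyX.Y" tag, so strip it if present.
    stem = re.sub(r"-py\d+(\.\d+)?$", "", stem)
    dist_info = egg_info.with_name(stem + ".dist-info")
    # Old distutils: the .egg-info file itself holds the metadata.
    # setuptools: the metadata lives in PKG-INFO inside the directory.
    source = egg_info if egg_info.is_file() else egg_info / "PKG-INFO"
    dist_info.mkdir()
    shutil.copyfile(source, dist_info / "METADATA")
    return dist_info
```

Even this minimal version shows where the compatibility questions come from: everything beyond the core metadata needs an explicit mapping decision.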
These are all the options I can think of right now that don't make my
head explode. I'd like to hear your ideas.
--
Best regards,
Michał Górny
* Re: [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Anna @ 2022-01-10 12:01 UTC
To: gentoo-dev
On 2022-01-10 06:39, Michał Górny wrote:
> 1. For a start, with distutils deprecated, projects are actually migrating to
> setuptools or other build systems upstream. This effectively solves
> the problem for us since the .egg-info switch happens on version bump
> and there is no file collision.
Deprecation warnings may go unnoticed on upstream's end. Tinderbox can
open bugs (same as it does with setuptools warnings) for packages using
distutils to encourage maintainers to report this to upstream issue
trackers.
> However, this isn't going to help for dead projects.
Dead dependencies probably should be reported upstream too.
* Re: [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Michał Górny @ 2022-01-10 12:27 UTC
To: gentoo-dev
On Mon, 2022-01-10 at 17:01 +0500, Anna wrote:
> On 2022-01-10 06:39, Michał Górny wrote:
>
> > However, this isn't going to help for dead projects.
>
> Dead dependencies probably should be reported upstream too.
>
Unfortunately, upstreams often don't care. As long as it is installable
via pip... What's even worse, it's not uncommon for people to add new
dependencies on packages that haven't seen a release in 5+ years.
--
Best regards,
Michał Górny
* Re: [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Francesco Riosa @ 2022-01-10 14:06 UTC
To: gentoo development, Michał Górny
On Mon, 10 Jan 2022 at 06:39, Michał Górny <mgorny@gentoo.org> wrote:
[...]
>
> The big problem is that switching implies changing the format, so if you
> install foo-X, then switch, then reinstall the same version of foo,
> you're going to have the file replaced by a directory. This is not
> supported by the PMS, and Portage handles it somewhat suboptimally
> (renaming the old file and leaving it orphaned).
>
Is it possible to force the package to own the file, maybe by touching it in
src_install()?
This way we would have a dirty installation but no orphaned file, and it would
be back to normal at the next install.
* Re: [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Michał Górny @ 2022-01-10 14:43 UTC
To: gentoo-dev
On Mon, 2022-01-10 at 06:39 +0100, Michał Górny wrote:
> 4. We could have the eclasses switch to "local" model and rename
> the .egg-info files somehow at some point. The main question is "rename
> how?"
>
If anyone's interested, I've published a proof-of-concept for this:
https://github.com/gentoo/gentoo/pull/23721
Long story short, the eclass detects if vendored distutils are being
used and renames the directories from .egg-info to .g.egg-info then.
This basically means the tag changes from e.g. "py3.8" to "py3.8.g".
I'm testing this approach now and it doesn't seem to break anything.
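For readers who do not want to open the pull request, the renaming idea can be modelled roughly as below. This Python sketch is not the eclass code from the PR: the helper name `rename_egg_info_dirs` is made up, and it assumes that any .egg-info entry that is a directory was produced by the vendored distutils, whereas the eclass detects that explicitly.

```python
from pathlib import Path


def rename_egg_info_dirs(site_packages: Path) -> list:
    """Rename FOO.egg-info directories to FOO.g.egg-info.

    Hypothetical model of the approach; plain .egg-info files are untouched.
    """
    renamed = []
    # sorted() materialises the listing before anything gets renamed.
    for path in sorted(site_packages.glob("*.egg-info")):
        # Skip old-style files and entries that already carry the ".g" tag.
        if not path.is_dir() or path.name.endswith(".g.egg-info"):
            continue
        target = path.with_name(path.name[: -len(".egg-info")] + ".g.egg-info")
        path.rename(target)
        renamed.append(target)
    return renamed
```

Because the new directory name differs from the old file name, the directory never lands on top of a file installed by an earlier build of the same version.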
--
Best regards,
Michał Górny
* Re: [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Mike Gilbert @ 2022-01-11 1:19 UTC
To: Gentoo Dev
On Mon, Jan 10, 2022 at 9:43 AM Michał Górny <mgorny@gentoo.org> wrote:
>
> On Mon, 2022-01-10 at 06:39 +0100, Michał Górny wrote:
> > 4. We could have the eclasses switch to "local" model and rename
> > the .egg-info files somehow at some point. The main question is "rename
> > how?"
> >
>
> If anyone's interested, I've published a proof-of-concept for this:
>
> https://github.com/gentoo/gentoo/pull/23721
>
> Long story short, the eclass detects if vendored distutils are being
> used and renames the directories from .egg-info to .g.egg-info then.
> This basically means the tag changes from e.g. "py3.8" to "py3.8.g".
> I'm testing this approach now and it doesn't seem to break anything.
A possible alternative would be to define pkg_preinst in the eclass
and have it move the .egg-info file out of the way before portage
tries to replace it with a directory. That would be a significant API
change for distutils-r1 though.
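The check such a pkg_preinst would need can be sketched as follows. A real implementation would be bash in the eclass; this Python version only models the logic, and the function name and both directory parameters are hypothetical.

```python
from pathlib import Path


def find_blocking_egg_info(image_site: Path, live_site: Path) -> list:
    """List live .egg-info files that would be replaced by a directory.

    image_site: site-packages inside the install image (about to be merged).
    live_site:  the matching site-packages on the running system.
    """
    blocking = []
    for new in sorted(image_site.glob("*.egg-info")):
        old = live_site / new.name
        # Conflict: the image carries a directory where the system has a file.
        if new.is_dir() and old.is_file():
            blocking.append(old)
    return blocking
```

Each returned path is a file that would have to be moved or removed before the merge starts.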
* Re: [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Mike Gilbert @ 2022-01-11 3:49 UTC
To: Gentoo Dev
On Mon, Jan 10, 2022 at 8:19 PM Mike Gilbert <floppym@gentoo.org> wrote:
>
> On Mon, Jan 10, 2022 at 9:43 AM Michał Górny <mgorny@gentoo.org> wrote:
> >
> > On Mon, 2022-01-10 at 06:39 +0100, Michał Górny wrote:
> > > 4. We could have the eclasses switch to "local" model and rename
> > > the .egg-info files somehow at some point. The main question is "rename
> > > how?"
> > >
> >
> > If anyone's interested, I've published a proof-of-concept for this:
> >
> > https://github.com/gentoo/gentoo/pull/23721
> >
> > Long story short, the eclass detects if vendored distutils are being
> > used and renames the directories from .egg-info to .g.egg-info then.
> > This basically means the tag changes from e.g. "py3.8" to "py3.8.g".
> > I'm testing this approach now and it doesn't seem to break anything.
>
> A possible alternative would be to define pkg_preinst in the eclass
> and have it move the .egg-info file out of the way before portage
> tries to replace it with a directory. That would be a significant API
> change for distutils-r1 though.
Portage actually handles installation of a directory over a file by
renaming the file with a ".backup.nnnn" suffix.
* Installation of a directory is blocked by a file:
* '/usr/lib/python3.9/site-packages/layman-2.4.3-py3.9.egg-info'
* This file will be renamed to a different name:
* '/usr/lib/python3.9/site-packages/layman-2.4.3-py3.9.egg-info.backup.0000'
A lazy approach would be to just let Portage do this and advise people
to clean them up later if so desired.
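If someone wants to find these leftovers, a sketch along the following lines might do. It assumes the backups always follow the ".egg-info.backup.NNNN" pattern shown in the message above; the helper name is made up and nothing is deleted automatically.

```python
import re
from pathlib import Path

# Assumed pattern, based on the example above: FOO.egg-info.backup.0000
BACKUP_RE = re.compile(r"\.egg-info\.backup\.\d+$")


def find_egg_info_backups(site_packages: Path) -> list:
    """Return the .egg-info backup files left directly in site_packages."""
    return sorted(
        path
        for path in site_packages.iterdir()
        if path.is_file() and BACKUP_RE.search(path.name)
    )
```

The result is only a list of candidates; reviewing it before removing anything seems prudent, since these files are no longer owned by any package.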
* Re: [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Andreas K. Huettel @ 2022-01-11 18:23 UTC
To: gentoo-dev
>
> TL;DR: how to deal with setuptools (and newer distutils vendored by
> setuptools) replacing .egg-info files with directories?
> I should probably emphasize here that the .egg-info path contains
> the package version, so this is a problem only if the same upstream
> version is being reinstalled.
>
> You can easily reproduce the problem by playing with:
>
> SETUPTOOLS_USE_DISTUTILS=stdlib
> SETUPTOOLS_USE_DISTUTILS=local # vendored
> 2. We could control the distutils version in ebuilds directly,
> i.e. force "stdlib" for the current versions and have developers switch
> to "local" on version bumps. Combined with 1., this will probably
> increase the coverage a bit but dead packages will remain in the way.
> It also relies on all devs understanding the problem.
How about switching it with a new Python version?
(since that is also in the path...)
--
Andreas K. Hüttel
dilfridge@gentoo.org
Gentoo Linux developer
(council, toolchain, base-system, perl, libreoffice)
* Re: [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Michał Górny @ 2022-01-11 20:10 UTC
To: gentoo-dev
On Tue, 2022-01-11 at 19:23 +0100, Andreas K. Huettel wrote:
> >
> > TL;DR: how to deal with setuptools (and newer distutils vendored by
> > setuptools) replacing .egg-info files with directories?
>
> > I should probably emphasize here that the .egg-info path contains
> > the package version, so this is a problem only if the same upstream
> > version is being reinstalled.
> >
> > You can easily reproduce the problem by playing with:
> >
> > SETUPTOOLS_USE_DISTUTILS=stdlib
> > SETUPTOOLS_USE_DISTUTILS=local # vendored
>
> > 2. We could control the distutils version in ebuilds directly,
> > i.e. force "stdlib" for the current versions and have developers switch
> > to "local" on version bumps. Combined with 1., this will probably
> > increase the coverage a bit but dead packages will remain in the way.
> > It also relies on all devs understanding the problem.
>
> How about switching it with a new Python version?
> (since that is also in the path...)
>
I'm afraid upstream is likely to drop support for stdlib distutils
before we manage to deprecate all the old versions... or even before
3.11 comes out.
--
Best regards,
Michał Górny
* Re: [gentoo-dev] Looking for a solution to the distutils/setuptools .egg-info mess
From: Michał Górny @ 2022-01-13 10:36 UTC
To: gentoo-dev
On Mon, 2022-01-10 at 06:39 +0100, Michał Górny wrote:
>
> 5. We could have the eclasses convert .egg-info into the newer .dist-info
> format. However, I'm not aware of any existing tool doing such
> a conversion, and I'm not convinced I want to write one right now,
> and whether it wouldn't have compatibility implications.
>
Ok, here's a somewhat related idea. Since we're going to have to switch
to PEP517 builds anyway at some point, and PEP517 builds use .dist-info
rather than .egg-info, how about:
1. we set SETUPTOOLS_USE_DISTUTILS=stdlib for the time being,
2. then we switch to =local in PEP517 mode?
i.e. effectively skip the intermediate step of having .egg-info
directory in distutils ebuilds, and go straight for .dist-info.
In other words:
a. existing ebuilds will not be affected, and unmasking setuptools-60+
should not cause any problems,
b. we will explicitly switch to the new mode via eclass var and test
both potentially breaking changes simultaneously.
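Expressed as code, the proposed rule is tiny. The Python sketch below only models the decision; the `pep517_mode` flag stands in for whatever eclass variable ends up controlling it, and the function name is invented for this example.

```python
import os


def build_environment(pep517_mode, base=None):
    """Return a copy of the environment with the distutils flavour chosen.

    Hypothetical model of the proposal: legacy builds keep "stdlib",
    PEP517 builds switch to the vendored ("local") distutils.
    """
    env = dict(os.environ if base is None else base)
    env["SETUPTOOLS_USE_DISTUTILS"] = "local" if pep517_mode else "stdlib"
    return env
```

Existing ebuilds would correspond to `pep517_mode=False` and keep installing .egg-info files as before, so only ebuilds that opt in see both changes at once.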
--
Best regards,
Michał Górny