* [gentoo-project] How to improve detection of unmaintained packages?
@ 2019-03-23  7:32 Michał Górny
  2019-03-23  8:04 ` Joonas Niilola
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Michał Górny @ 2019-03-23  7:32 UTC
To: gentoo-project

Hi,

Gentoo still has a major problem with unmaintained packages.
I'm not talking about pure 'maintainer-needed' here but about packages
that have apparent maintainers and stay under the radar for a long time,
harming users in the process.  I'd like to ask for potential solutions
on how we could improve this and look for new maintainers sooner.


The current state
=================
The definition of an unmaintained package here is a bit blurry.  For our
needs, let's say that an unmaintained package is a package that is not
getting the attention of any of its maintainers, whose bugs are not
looked at, that does not receive version bumps or that simply fails to
build for a long time.

This is especially the case with 'revived herds', i.e. projects that
were formed from old herds.  Their main characteristic is that they
'maintain' a large number of loosely-related packages, and their
developers take care of only a small subset of them.  Sadly, we still
have people who cherish that model, and instead of taking the packages
they care about themselves, they shove them into one of 'their' herds.

So far we're rarely catching such cases directly.  Sometimes it happens
when another developer tries to use the package and notices the problem,
then finds that it was reported a long time ago and never received
any attention.

Sometimes, after retiring a developer, we notice that he had
'maintained' packages that were broken for years and never received any
attention.  There are even real cases of developers taking over broken
packages just to prevent them from being last-rited, but without ever
fixing them.

Then, some of the packages are noticed as a result of major API update
trackers, such as the openssl-1.1+ tracker or the ncurses[tinfo]
tracker.  Those API changes provoke build failures, and while
investigating them we discover that some of the software hasn't seen any
upstream attention since 2000 (!), not to mention maintainers that could
actually patch the issues.


Version bump-based inactivity?
==============================
One of the options would be to treat failure to bump packages as a sign
of inactivity.  With euscan and/or repology, we are at least able to
partially monitor and report new versions of software (I think someone
used to do that but I don't see those reports anymore).  While this
still requires some manual processing (esp. given that repology results
are sometimes mistaken), it would be a step forward.

The counterargument is that not all version bumps are meaningful to
Gentoo.  We'd have to at least be able to filter out development
releases if maintainers are not packaging them.  Sometimes we also skip
releases if they don't introduce anything meaningful to Gentoo users.
Finally, some developers reject new versions of software for various
reasons.


Bugzilla-based inactivity?
==========================
I've noticed something interesting in Fedora lately.  They have a policy
that if a package build failure is reported (note: they are reporting
them automatically) and the maintainer does not move the bug out of
the 'NEW' state, the package is automatically orphaned after 8 weeks.
Effectively, if the maintainer does not take care (or at least pretend
to take care) of the package, it is orphaned automatically.

I suppose we might be able to look into a similar policy in Gentoo.
However, there are two obvious counterarguments.  Firstly, this would
create 'busywork' that people would be required to do in order to
prevent their packages from being orphaned.  Secondly, a fair number of
developers would just do this 'busywork' on every new bug to avoid
the problem, rendering the measure ineffective.


What can we actually do?
========================
Do you have any specific ideas how we could actually improve
the situation?  I'm particularly looking for things we could do at least
semi-automatically, without having to spend tremendous effort looking
through thousands of unhandled bugs manually.

-- 
Best regards,
Michał Górny
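[Editor's sketch] The Fedora-style rule described above (a build-failure
bug left in 'NEW' for 8 weeks makes the package an orphan candidate) can
be illustrated roughly as follows.  This is only a minimal sketch under
stated assumptions: the BuildFailureBug record and the
orphan_candidates() helper are hypothetical names invented for
illustration, not part of Bugzilla or of any Fedora or Gentoo tooling,
and the bug data is assumed to have been fetched already.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# The 8-week window comes from the policy described in the message above.
ORPHAN_AFTER = timedelta(weeks=8)


@dataclass
class BuildFailureBug:
    """Hypothetical record for one automatically reported build failure."""
    package: str
    status: str      # e.g. "NEW", "ASSIGNED", "RESOLVED"
    reported: date


def orphan_candidates(bugs, today):
    """Return sorted names of packages with a bug still in NEW after 8 weeks."""
    return sorted({
        bug.package
        for bug in bugs
        if bug.status == "NEW" and today - bug.reported >= ORPHAN_AFTER
    })


if __name__ == "__main__":
    bugs = [
        BuildFailureBug("app-misc/foo", "NEW", date(2019, 1, 1)),
        BuildFailureBug("app-misc/bar", "NEW", date(2019, 3, 1)),
        BuildFailureBug("app-misc/baz", "ASSIGNED", date(2018, 1, 1)),
    ]
    print(orphan_candidates(bugs, today=date(2019, 3, 23)))
```

Note that the third bug escapes the check simply because its status was
changed once, which is exactly the 'busywork' weakness raised above.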
* Re: [gentoo-project] How to improve detection of unmaintained packages?
From: Joonas Niilola @ 2019-03-23  8:04 UTC
To: gentoo-project

On 3/23/19 9:32 AM, Michał Górny wrote:
>
> Bugzilla-based inactivity?
> ==========================
> I've noticed something interesting in Fedora lately.  They have a policy
> that if a package build failure is reported (note: they are reporting
> them automatically) and the maintainer does not update it from the 'NEW'
> state, it is automatically orphaned after 8 weeks.  Effectively,
> if the maintainer does not take care (or at least pretends to)
> of the package, it is orphaned automatically.
>
> I suppose we might be able to look for a similar policy in Gentoo.
> However, there are two obvious counterarguments.  Firstly, this would
> create 'busywork' that people would be required to do in order to
> prevent from orphaning their packages.  Secondly, a fair number of
> developers would just do this 'busywork' to every new bug just to avoid
> the problem, rendering the measure ineffective.
>

Third: Aren't you afraid this will result in huge load of packages being
"maintainer-needed", getting users involved with them and creating even
more workload for proxy-maint devs? Yes, it's still better than the
current situation, but even now there is still a problem of some PRs
being left to rot unnoticed :\

Anyway I'd like to see some action taken for disbanding inactive 'herds'
or projects that do not respond to bugs (if there's some easy way to
measure that?)
* Re: [gentoo-project] How to improve detection of unmaintained packages?
From: Toralf Förster @ 2019-03-23  8:48 UTC
To: gentoo-project

On 3/23/19 8:32 AM, Michał Górny wrote:
> Secondly, a fair number of
> developers would just do this 'busywork' to every new bug just to avoid
> the problem, rendering the measure ineffective.

What's the rationale behind this for a dev? Claiming the package furthermore?

-- 
Toralf
PGP 23217DA7 9B888F45
* Re: [gentoo-project] How to improve detection of unmaintained packages?
From: Michał Górny @ 2019-03-23  8:51 UTC
To: gentoo-project

On Sat, 2019-03-23 at 09:48 +0100, Toralf Förster wrote:
> On 3/23/19 8:32 AM, Michał Górny wrote:
> > Secondly, a fair number of
> > developers would just do this 'busywork' to every new bug just to avoid
> > the problem, rendering the measure ineffective.
> What's the rationale behind this for a dev? Claiming the package furthermore?
>

Well, let me expand on how I see it.  If we require that a dev touches
*every* bug against the package to avoid it being orphaned, we
introduce a silly case of accidentally orphaning packages just because
the developer failed to touch a single bug.  Therefore, some developers
would just touch all bugs ASAP to avoid the problem.  However, that
would only verify that the package is 'maintained' at the time the bug
is filed and not afterwards.

-- 
Best regards,
Michał Górny
* Re: [gentoo-project] How to improve detection of unmaintained packages?
From: Alec Warner @ 2019-03-23 14:17 UTC
To: gentoo-project

On Sat, Mar 23, 2019 at 3:32 AM Michał Górny <mgorny@gentoo.org> wrote:

> [...]
>
> Version bump-based inactivity?
> ==============================
> [...]
>
> The counterarguments for doing this is that not all version bumps are
> meaningful to Gentoo.  We'd have to at least be able to filter out
> development releases if maintainers are not doing them.  Sometimes we
> also skip releases if they don't introduce anything meaningful to Gentoo
> users.  Finally, some developers reject new versions of software for
> various reasons.
>

I've also considered just using time.

Many *packages* have not been touched in N time. While some software
doesn't get updates often, even routine maintenance should require edits
on a fairly regular basis.

>
> Bugzilla-based inactivity?
> ==========================
> [...]
>
> I suppose we might be able to look for a similar policy in Gentoo.
> However, there are two obvious counterarguments.  Firstly, this would
> create 'busywork' that people would be required to do in order to
> prevent from orphaning their packages.  Secondly, a fair number of
> developers would just do this 'busywork' to every new bug just to avoid
> the problem, rendering the measure ineffective.
>

Avoid letting the perfect be the enemy of the good here. Any metric can
be gamed by developers, but it turns out we must choose some metric to
drive the organization. I'm fairly sure not *all* developers will
automate this busywork, because *some* of us want to see the number of
unmaintained packages reduced, resulting in a net win.

>
> What can we actually do?
> ========================
> Do you have any specific ideas how we could actually improve
> the situation?  I'm particularly looking for things we could do at least
> semi-automatically, without having to spend tremendous effort looking
> through thousands of unhandled bugs manually.
>

So I'd recommend avoiding a specific implementation, which means: don't
trigger off of one specific signal.

Signals:
 1) euscan first, because it's the most accurate and plausibly already
    implemented.
 2) Date-based scanning; it's trivial to implement.

So now for each package, we have 2 straightforward signals: when was it
last touched, and how many versions behind is it?

Rules:
A package is unmaintained if it:
 - Has not been touched in 5 years
 - Is behind 3 versions AND hasn't been touched in 2 years
 - Is behind 5 versions AND hasn't been touched in 1 year

As we add more signals (e.g. doesn't build, or unfixed bugs) we can add
additional rules.

We could generate a QA report per package on the qa reports page.
If there is an API for requesting the QA report, we could cross-link
from p.g.o.

-A
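[Editor's sketch] The three rules proposed above combine two signals per
package: time since the last touch and the number of versions behind.  A
minimal illustration follows, with several assumptions: the function
names are hypothetical, "behind N versions" is read as "at least N
versions behind", and a year is approximated as 365.25 days.  How the
two inputs are obtained (euscan output, git history) is left out.

```python
from datetime import date


def years_since(last_touched, today):
    """Approximate age of the last change in years (assumes 365.25-day years)."""
    return (today - last_touched).days / 365.25


def is_unmaintained(versions_behind, last_touched, today):
    """Apply the three proposed rules; any single matching rule flags the package."""
    age = years_since(last_touched, today)
    return (
        age >= 5                                  # untouched for 5 years
        or (versions_behind >= 3 and age >= 2)    # 3 versions behind, 2 years
        or (versions_behind >= 5 and age >= 1)    # 5 versions behind, 1 year
    )


if __name__ == "__main__":
    today = date(2019, 3, 23)
    print(is_unmaintained(0, date(2013, 1, 1), today))   # old, but up to date
    print(is_unmaintained(3, date(2018, 1, 1), today))   # behind, but recent
    print(is_unmaintained(5, date(2018, 1, 1), today))   # far behind, over a year
```

Keeping the rules in one boolean expression makes it easy to append
further signals (build failures, unfixed bugs) as extra "or" clauses
later, which is the extension the message suggests.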
* Re: [gentoo-project] How to improve detection of unmaintained packages?
From: Raymond Jennings @ 2019-03-23 17:05 UTC
To: gentoo-project

On Sat, Mar 23, 2019 at 7:18 AM Alec Warner <antarus@gentoo.org> wrote:

> On Sat, Mar 23, 2019 at 3:32 AM Michał Górny <mgorny@gentoo.org> wrote:
>
>> [...]
>
> [...]
>
> Rules:
> A package is unmaintained if it:
>  - Has not been touched in 5 years
>  - Is behind 3 versions AND hasn't been touched in 2 years
>  - Is behind 5 versions AND hasn't been touched in 1 years
>
> As we add more signals (e.g. doesn't build, or unfixed bugs) we can add
> additional rules.
>
> [...]

As a side observation, I'd like to exempt a package from being flagged
as unmaintained if there's nothing wrong with it.  If upstream is idle
and the package is in a quiet state simply because there's no work that
needs doing, then the package should be left alone.

I think packages should be flagged in progressive phases.

Phase 1 could determine whether the package warrants attention, and my
proposed metric for this is whether there are outstanding bugs on the
bugzilla.  For this purpose an outstanding bug is anything regarding the
package, including revbumps and stablereqs, as well as actual
defect/qa/buildfail related bugs.  In essence, this uses the bugzilla as
a central point of data collection and a radar for trouble.

Phase 2 could take up any phase 1 candidates and actually audit them for
a lack of maintainership, i.e., "maintainer wanted" or "maintainer
needed" status could escalate the package in question to phase 2, as
could a timestamp check on the latest activity for the package.  If the
package is in "phase 1" status due to an outstanding bug, and either
lacks a maintainer altogether or fails a dormancy test, then the package
is promoted to "phase 2".

Phase 3 could be where we take remedial action.  If the package has a
maintainer, this would be a good point to contact them.  Perhaps a more
comprehensive audit of the package's lack of maintainership could also
happen here.  A package that has entered "phase 2" has already been
established as having outstanding bugs AND as having failed whatever
automated sort of audit is done to check for being unmaintained.

Phase 4 is the package being officially marked as unmaintained, and at
this point it could probably be put on treecleaner's radar or handled
however else we wish to handle unmaintained packages.  If the package
has a maintainer that failed to respond during phase 3, this could well
raise a concern of its own about that maintainer's performance.
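[Editor's sketch] The phased escalation proposed above can be read as a
small state classification.  The following is only an illustration
under assumptions: the PackageState fields, the Phase names and the
classify() function are hypothetical, the dormancy test is taken as an
already-computed boolean (no threshold is given in the thread), and the
phase 3 follow-up (contacting the maintainer or a manual audit) is
modelled as two flags.

```python
from dataclasses import dataclass
from enum import IntEnum


class Phase(IntEnum):
    QUIET = 0            # no outstanding bugs: left alone
    NEEDS_ATTENTION = 1  # phase 1: outstanding bugs exist
    AUDIT_FAILED = 2     # phase 2: no maintainer, or dormancy test failed
    FOLLOW_UP = 3        # phase 3: maintainer contacted / manual audit running
    UNMAINTAINED = 4     # phase 4: officially marked unmaintained


@dataclass
class PackageState:
    """Hypothetical per-package inputs for the classification."""
    open_bugs: int
    has_maintainer: bool
    dormant: bool                  # result of some dormancy test (assumed given)
    followup_started: bool = False
    followup_failed: bool = False  # e.g. maintainer did not respond


def classify(pkg):
    """Return the highest phase the package has reached."""
    if pkg.open_bugs == 0:
        return Phase.QUIET
    if pkg.has_maintainer and not pkg.dormant:
        return Phase.NEEDS_ATTENTION
    if not pkg.followup_started:
        return Phase.AUDIT_FAILED
    if not pkg.followup_failed:
        return Phase.FOLLOW_UP
    return Phase.UNMAINTAINED


if __name__ == "__main__":
    print(classify(PackageState(open_bugs=2, has_maintainer=True, dormant=True)).name)
```

Because each phase is only reachable through the previous one, a package
with no outstanding bugs never leaves the QUIET state in this model,
which is the exemption the message asks for.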
* Re: [gentoo-project] How to improve detection of unmaintained packages?
From: Michał Górny @ 2019-03-23 17:38 UTC
To: gentoo-project

On Sat, 2019-03-23 at 10:05 -0700, Raymond Jennings wrote:
> On Sat, Mar 23, 2019 at 7:18 AM Alec Warner <antarus@gentoo.org> wrote:
>
> > [...]
>
> As a side observation I'd like to exempt a package from being flagged as
> unmaintained if there's nothing wrong with it.  If upstream is idle and the
> package in a quiet state simply because there's no work needing done, then
> the package should be left alone.

This is the attitude that means that a few months later a single person
is overburdened with a few dozen unmaintained packages all suddenly
falling apart.  Just like ncurses[tinfo].  Or openssl-1.1.

-- 
Best regards,
Michał Górny
* Re: [gentoo-project] How to improve detection of unmaintained packages? 2019-03-23 17:38 ` Michał Górny @ 2019-03-23 17:53 ` Raymond Jennings 0 siblings, 0 replies; 13+ messages in thread From: Raymond Jennings @ 2019-03-23 17:53 UTC (permalink / raw To: gentoo-project [-- Attachment #1: Type: text/plain, Size: 8161 bytes --] On Sat, Mar 23, 2019 at 10:38 AM Michał Górny <mgorny@gentoo.org> wrote: > On Sat, 2019-03-23 at 10:05 -0700, Raymond Jennings wrote: > > On Sat, Mar 23, 2019 at 7:18 AM Alec Warner <antarus@gentoo.org> wrote: > > > > > > > > On Sat, Mar 23, 2019 at 3:32 AM Michał Górny <mgorny@gentoo.org> > wrote: > > > > > > > Hi, > > > > > > > > Gentoo is still having a major problem of unmaintained packages. > > > > I'm not talking about pure 'maintainer-needed' here but packages that > > > > have apparent maintainers and stay under the radar for long, harming > > > > users in the process. I'd like to query potential solutions as how > we > > > > could improve this and look for new maintainers sooner. > > > > > > > > > > > > The current state > > > > ================= > > > > The definition of an unmaintained package here is a bit blurry. For > our > > > > needs, let's say that an unmaintained package is a package that is > not > > > > getting attention of any of the maintainers, whose bugs are not > looked > > > > at, that does not receive version bumps or simply fails to build for > > > > a long time. > > > > > > > > This is especially the case with 'revived herds', i.e. projects that > > > > were formed from old herds. Their main characteristic is that they > > > > 'maintain' a large number of loosely-related packages, and their > > > > developers take care of only a small subset of them. Sadly, we still > > > > have people who cherish that model, and instead of taking packages > they > > > > care about themselves, they shove it into one of 'their' herds. > > > > > > > > So far we're rarely catching such cases directly. 
Sometimes it > happens > > > > when another developer tries to use the package and notices the > problem, > > > > then finds that it's been reported a long time ago and never received > > > > any attention. > > > > > > > > Sometimes, after retiring a developer we notice that he had > 'maintained' > > > > packages that were broken for years and never received any attention. > > > > There are even real cases of developers taking over broken packages > just > > > > to prevent them from being lastrited but without ever fixing them. > > > > > > > > Then, some of the packages are noticed as result of major API update > > > > trackers, such as the openssl-1.1+ tracker or ncurses[tinfo] tracker. > > > > Those API changes provoke build failures, and while investigating > them > > > > we discover that some of the software hasn't seen any upstream > attention > > > > since 2000 (!), not to mention maintainers that could actually patch > > > > the issues. > > > > > > > > > > > > Version bump-based inactivity? > > > > ============================== > > > > One of the options would be to monitor inactivity as negligence to > bump > > > > packages. With euscan and/or repology, we are at least able to > > > > partially monitor and report new versions of software (I think > someone > > > > used to do that but I don't see those reports anymore). While this > > > > still requires some manual processing (esp. given that repology > results > > > > are sometimes mistaken), it would be a step forward. > > > > > > > > The counterarguments for doing this is that not all version bumps are > > > > meaningful to Gentoo. We'd have to at least be able to filter out > > > > development releases if maintainers are not doing them. Sometimes we > > > > also skip releases if they don't introduce anything meaningful to > Gentoo > > > > users. Finally, some developers reject new versions of software for > > > > various reasons. > > > > > > > > > > I've also considered to just use time. 
> > > > > > Many *packages* have not been touched in N time. While some software > > > doesn't get updates often, even routine maintenance should require > edits on > > > a fairly regular basis. > > > > > > > > > > > > > > Bugzilla-based inactivity? > > > > ========================== > > > > I've noticed something interesting in Fedora lately. They have a > policy > > > > that if a package build failure is reported (note: they are reporting > > > > them automatically) and the maintainer does not update it from the > 'NEW' > > > > state, it is automatically orphaned after 8 weeks. Effectively, > > > > if the maintainer does not take care (or at least pretends to) > > > > of the package, it is orphaned automatically. > > > > > > > > I suppose we might be able to look for a similar policy in Gentoo. > > > > However, there are two obvious counterarguments. Firstly, this would > > > > create 'busywork' that people would be required to do in order to > > > > prevent from orphaning their packages. Secondly, a fair number of > > > > developers would just do this 'busywork' to every new bug just to > avoid > > > > the problem, rendering the measure ineffective. > > > > > > > > > > Avoid letting the perfect be the enemy of the good here. Any metric > can be > > > gamed by developers; but it turns out we must choose some metric to > drive > > > the organization. I'm fairly sure not *all* developers will automate > this > > > busywork; because *some* of us want to see the number of unmaintained > > > packages reduced; resulting in a net-win. > > > > > > > > > > > > > > What can we actually do? > > > > ======================== > > > > Do you have any specific ideas how we could actually improve > > > > the situation? I'm particularly looking for things we could do at > least > > > > semi-automatically, without having to spend tremendous effort looking > > > > through thousands of unhandled bugs manually. 
> > > > > > > > > > So I'd recommend avoiding a specific implementation; which means don't > > > trigger off of a specific signal. > > > > > > Signals: > > > 1) euscan first; because its most accurate and plausible already > > > implemented. > > > 2) Date-based scanning; its trivial to implement. > > > > > > So now for each package, we have 2 straightforward signals. When was it > > > last touched, how many versions behind? > > > > > > Rules: > > > A package is unmaintained if it: > > > - Has not been touched in 5 years > > > - Is behind 3 versions AND hasn't been touched in 2 years > > > - Is behind 5 versions AND hasn't been touched in 1 years > > > > > > As we add more signals (e.g. doesn't build, or unfixed bugs) we can add > > > additional rules. > > > > > > We could generate a QA report per package on the qa reports page. > > > If there is an API for request the QA report, we could cross-link from > > > p.g.o. > > > > > > -A > > > > > > > > > > > > > -- > > > > Best regards, > > > > Michał Górny > > > > > > > > > > As a side observation I'd like to exempt a package from being flagged as > > unmaintained if there's nothing wrong with it. If upstream is idle and > the > > package in a quiet state simply because there's no work needing done, > then > > the package should be left alone. > > This is the attitude that means that few months later a single person is > overburdened with a few dozens unmaintained packages all suddenly > falling apart. Just like ncurses[tinfo]. Or openssl-1.1. > I wanted to point out that a package shouldn't be flagged as unmaintained in the first place unless there is first a reason for it to be maintained. Those should be weeded out as candidates under the principle of "if it isn't broke don't fix it" since there's actually nothing wrong with the package remaining status quo. 
As it is, the phase 4 I proposed is meant to catch broken packages that either a) have no maintainer at all, or b) have a maintainer who is completely incommunicado, and not just busy. To clarify the context, though, could you give an example, however hypothetical, of "all suddenly falling apart"? Perhaps you mean a package that is a widespread dependency, whose revdeps all break at the same time due to some sort of API change or the like? Is this what you meant by ncurses and openssl-1.1? > > -- > Best regards, > Michał Górny >
* Re: [gentoo-project] How to improve detection of unmaintained packages? 2019-03-23 14:17 ` Alec Warner 2019-03-23 17:05 ` Raymond Jennings @ 2019-03-23 18:25 ` Rich Freeman 2019-03-23 19:22 ` Alec Warner 2019-03-23 20:14 ` Raymond Jennings 1 sibling, 2 replies; 13+ messages in thread From: Rich Freeman @ 2019-03-23 18:25 UTC To: gentoo-project On Sat, Mar 23, 2019 at 10:17 AM Alec Warner <antarus@gentoo.org> wrote: > > > Avoid letting the perfect be the enemy of the good here. Indeed, we need to avoid treating packages as unmaintained simply because they have open bugs. Many packages have bugs that are fairly trivial in nature, or build issues that only show up in fairly obscure configurations. These often affect only a single user. If we treeclean the package we don't actually fix the problem - we just drive it to an overlay. Now, instead of a package that works for 11/12 users and has an obscure bug, we have a package that isn't getting monitored for security issues or for other QA issues that might actually be fixed if they were pointed out. > Rules: > A package is unmaintained if it: > - Has not been touched in 5 years Do we really want to bump packages just for the sake of saying that they've been touched? That seems a bit much. > - Is behind 3 versions AND hasn't been touched in 2 years If we have the ability to detect if a package is behind upstream, perhaps we should actually file bugs about this so that the maintainer is aware. However, the fact that a newer version exists doesn't necessarily mean that there is a problem with the older version. For some types of software a maintainer might be picky about what updates they accept. For example, they might need to synchronize versions with other distros that update less often/etc. They should of course accept contributions from others willing to test, but the fact that somebody is maintaining a package on Gentoo doesn't obligate them to always support the latest version of that package.
Now, obviously if there is a security issue/etc then we should follow the existing security policies, but those are already established. -- Rich
* Re: [gentoo-project] How to improve detection of unmaintained packages? 2019-03-23 18:25 ` Rich Freeman @ 2019-03-23 19:22 ` Alec Warner 2019-03-23 19:36 ` Rich Freeman 2019-03-23 20:14 ` Raymond Jennings 1 sibling, 1 reply; 13+ messages in thread From: Alec Warner @ 2019-03-23 19:22 UTC (permalink / raw To: gentoo-project [-- Attachment #1: Type: text/plain, Size: 2587 bytes --] On Sat, Mar 23, 2019 at 2:26 PM Rich Freeman <rich0@gentoo.org> wrote: > On Sat, Mar 23, 2019 at 10:17 AM Alec Warner <antarus@gentoo.org> wrote: > > > > > > Avoid letting the perfect be the enemy of the good here. > > Indeed, we need to avoid treating packages as unmaintained simply > because they have open bugs. > > Many packages have bugs that are fairly trivial in nature, or build > issues that only show up in fairly obscure configurations. These > often affect only a single user. > So this is why I advocate for building a number of signals, and using a combination of signals to determine if a package is unmaintained. > > If we treeclean the package we don't actually fix the problem - we > just drive it to an overlay. Now instead of a package that works for > 11/12 users and has an obscure but, we now have a package that isn't > getting monitored for security issues, and other QA issues that might > actually be fixed if they were pointed out. > > > Rules: > > A package is unmaintained if it: > > - Has not been touched in 5 years > > Do we really want to bump packages just for the sake of saying that > they've been touched? That seems a bit much. > I'm not saying "we should absolutely remove packages that have not been touched in N years" but I am saying "we should review packages that have not been touched in N years". > > > - Is behind 3 versions AND hasn't been touched in 2 years > > If we have the ability to detect if a package is behind upstream, > perhaps we should actually file bugs about this so that the maintainer > is aware. 
> > However, the fact that a newer version exists doesn't necessarily mean > that there is a problem with the older version. For some types of > software a maintainer might be picky about what updates they accept. > For example, they might need to synchronize versions with other > distros that update less often/etc. They should of course accept > contributions from others willing to test, but the fact that somebody > is maintaining a package on Gentoo doesn't obligate them to always > support the latest version of that package. > > Now, obviously if there is a security issue/etc then we should follow > the existing security policies, but those are already established. > Would you be happier if there were some kind of opt-out or whitelist? Have you looked at mgorny's recent removals? It's mostly stuff that doesn't build and hasn't been touched in 5 years, and *yeah*, I want that stuff out of the tree; it's a net negative for everyone. Keeping packages in the tree isn't free. > > -- > Rich > >
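The staleness rules quoted in this thread (untouched for 5 years; 3 versions behind and untouched for 2 years; 5 versions behind and untouched for 1 year) can be sketched in Python as follows. This is only an illustration, not an existing Gentoo tool: the `PackageSignals` type, its field names and the `needs_review` function are hypothetical, and the sketch assumes the "versions behind" count comes from something like euscan and the last-touched date from the repository history. In line with the clarification above, a match means "review this package", not "remove it".

```python
from dataclasses import dataclass
from datetime import date, timedelta

YEAR = timedelta(days=365)  # approximation; leap days are ignored


@dataclass
class PackageSignals:
    """Hypothetical per-package signals; the field names are illustrative."""
    name: str
    last_touched: date    # e.g. date of the last commit touching the package
    versions_behind: int  # e.g. number of newer upstream versions found


# (minimum versions behind, minimum years untouched), as proposed in the thread
RULES = [(0, 5), (3, 2), (5, 1)]


def needs_review(pkg: PackageSignals, today: date) -> bool:
    """Return True if any of the thread's rules flags the package."""
    idle = today - pkg.last_touched
    return any(
        pkg.versions_behind >= behind and idle >= years * YEAR
        for behind, years in RULES
    )
```

Keeping the thresholds in a single `RULES` table is a design choice for the sketch: further signals mentioned in the thread (build failures, unfixed bugs) would need extra fields and rules, and an opt-out list could be checked before the rules are applied.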
* Re: [gentoo-project] How to improve detection of unmaintained packages? 2019-03-23 19:22 ` Alec Warner @ 2019-03-23 19:36 ` Rich Freeman 0 siblings, 0 replies; 13+ messages in thread From: Rich Freeman @ 2019-03-23 19:36 UTC To: gentoo-project On Sat, Mar 23, 2019 at 3:22 PM Alec Warner <antarus@gentoo.org> wrote: > > I'm not saying "we should absolutely remove packages that have not been touched in N years" but I am saying "we should review packages that have not been touched in N years". ++ > Have you looked at mgorny's recent removals? its mostly stuff that doesn't build and hasn't been touched in 5 years and *yeah* I want that stuff out of the tree; its a net negative for everyone. Keeping packages in the tree isn't free. Also, ++ I completely support the general intent. I'm just trying to maintain balance as well. A good approach would be to just auto-file a bug as a ping and let the maintainer ack it as a first step. If somebody is getting a lot of pings maybe look at it more closely, and if a ping is ignored then definitely react. Ask maintainers to include in their ack a brief rationale - it need not be extensive/etc, or even carefully scrutinized, but it could give some perspective. "Yes, I'm aware that upstream has v25 and we're on v20, but API was broken in v21 without SONAME change and most of the deps in the repo want v20 as everybody thinks upstream is crazy." As long as we aren't pinging the same packages often that shouldn't be a big deal and will also simplify review. -- Rich
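The ping-and-ack idea in the message above could be sketched roughly like this in Python. Everything here is an assumption made for illustration: the `Ping` record, the `triage` and `heavily_pinged` helpers, the 8-week grace period (borrowed from the Fedora policy mentioned at the start of the thread) and the threshold of 10 pings are not part of any existing Gentoo or Bugzilla tooling.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List, Optional


@dataclass
class Ping:
    """Hypothetical record of one auto-filed 'is this maintained?' bug."""
    package: str
    maintainer: str
    filed: date
    ack_rationale: Optional[str] = None  # brief rationale from the maintainer


def triage(ping: Ping, today: date,
           grace: timedelta = timedelta(weeks=8)) -> str:
    """Classify a ping: acked, still waiting, or ignored long enough to react."""
    if ping.ack_rationale:
        return "acked"
    if today - ping.filed > grace:
        return "escalate"
    return "waiting"


def heavily_pinged(pings: Iterable[Ping], threshold: int = 10) -> List[str]:
    """Maintainers with at least `threshold` pings, worth a closer look."""
    counts = Counter(p.maintainer for p in pings)
    return sorted(m for m, n in counts.items() if n >= threshold)
```

An ack only needs a non-empty rationale here, which matches the suggestion that the rationale be brief and not carefully scrutinized; how often a package may be pinged again is left out of the sketch.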
* Re: [gentoo-project] How to improve detection of unmaintained packages? 2019-03-23 18:25 ` Rich Freeman 2019-03-23 19:22 ` Alec Warner @ 2019-03-23 20:14 ` Raymond Jennings 1 sibling, 0 replies; 13+ messages in thread From: Raymond Jennings @ 2019-03-23 20:14 UTC To: gentoo-project On Sat, Mar 23, 2019 at 11:26 AM Rich Freeman <rich0@gentoo.org> wrote: > On Sat, Mar 23, 2019 at 10:17 AM Alec Warner <antarus@gentoo.org> wrote: > > > > > > Avoid letting the perfect be the enemy of the good here. > > Indeed, we need to avoid treating packages as unmaintained simply > because they have open bugs. > > Many packages have bugs that are fairly trivial in nature, or build > issues that only show up in fairly obscure configurations. These > often affect only a single user. > > If we treeclean the package we don't actually fix the problem - we > just drive it to an overlay. Now instead of a package that works for > 11/12 users and has an obscure but, we now have a package that isn't > getting monitored for security issues, and other QA issues that might > actually be fixed if they were pointed out. > > > Rules: > > A package is unmaintained if it: > > - Has not been touched in 5 years > > Do we really want to bump packages just for the sake of saying that > they've been touched? That seems a bit much. > > > - Is behind 3 versions AND hasn't been touched in 2 years > > If we have the ability to detect if a package is behind upstream, > perhaps we should actually file bugs about this so that the maintainer > is aware. > This is part of the idea behind my plan to have open bugs be the first (but probably not the only, as the later phases demonstrate) symptom of trouble. Apart from it not being fair to remove the package unless it's actually broken, it's also a good habit, IMO, to encourage bugs (as long as they're not frivolous) to be filed simply for documentation purposes.
> However, the fact that a newer version exists doesn't necessarily mean > that there is a problem with the older version. For some types of > software a maintainer might be picky about what updates they accept. > For example, they might need to synchronize versions with other > distros that update less often/etc. They should of course accept > contributions from others willing to test, but the fact that somebody > is maintaining a package on Gentoo doesn't obligate them to always > support the latest version of that package. > > Now, obviously if there is a security issue/etc then we should follow > the existing security policies, but those are already established. > > -- > Rich > >
* Re: [gentoo-project] How to improve detection of unmaintained packages? 2019-03-23 7:32 [gentoo-project] How to improve detection of unmaintained packages? Michał Górny ` (2 preceding siblings ...) 2019-03-23 14:17 ` Alec Warner @ 2019-03-23 20:32 ` Toralf Förster 3 siblings, 0 replies; 13+ messages in thread From: Toralf Förster @ 2019-03-23 20:32 UTC To: gentoo-project On 3/23/19 8:32 AM, Michał Górny wrote: > What can we actually do? > ======================== > Do you have any specific ideas how we could actually improve > the situation? This reminds me of my strategy when I started with the tinderbox. Because it was intended as a help to improve the QA state, I started by reporting only the most obvious issues for 2 common configs (in fact, my KDE desktop and my server). Much later, once the low-hanging fruit had been picked, I unleashed the tinderbox to look for a wider range of issues. -- Toralf PGP 23217DA7 9B888F45