On Sat, Mar 23, 2019 at 3:32 AM Michał Górny wrote:

> Hi,
>
> Gentoo is still having a major problem of unmaintained packages.
> I'm not talking about pure 'maintainer-needed' here but packages that
> have apparent maintainers and stay under the radar for long, harming
> users in the process. I'd like to query potential solutions as how we
> could improve this and look for new maintainers sooner.
>
>
> The current state
> =================
> The definition of an unmaintained package here is a bit blurry. For our
> needs, let's say that an unmaintained package is a package that is not
> getting attention of any of the maintainers, whose bugs are not looked
> at, that does not receive version bumps or simply fails to build for
> a long time.
>
> This is especially the case with 'revived herds', i.e. projects that
> were formed from old herds. Their main characteristic is that they
> 'maintain' a large number of loosely-related packages, and their
> developers take care of only a small subset of them. Sadly, we still
> have people who cherish that model, and instead of taking packages they
> care about themselves, they shove it into one of 'their' herds.
>
> So far we're rarely catching such cases directly. Sometimes it happens
> when another developer tries to use the package and notices the problem,
> then finds that it's been reported a long time ago and never received
> any attention.
>
> Sometimes, after retiring a developer we notice that he had 'maintained'
> packages that were broken for years and never received any attention.
> There are even real cases of developers taking over broken packages just
> to prevent them from being lastrited but without ever fixing them.
>
> Then, some of the packages are noticed as result of major API update
> trackers, such as the openssl-1.1+ tracker or ncurses[tinfo] tracker.
> Those API changes provoke build failures, and while investigating them
> we discover that some of the software hasn't seen any upstream attention
> since 2000 (!), not to mention maintainers that could actually patch
> the issues.
>
>
> Version bump-based inactivity?
> ==============================
> One of the options would be to monitor inactivity as negligence to bump
> packages. With euscan and/or repology, we are at least able to
> partially monitor and report new versions of software (I think someone
> used to do that but I don't see those reports anymore). While this
> still requires some manual processing (esp. given that repology results
> are sometimes mistaken), it would be a step forward.
>
> The counterarguments for doing this is that not all version bumps are
> meaningful to Gentoo. We'd have to at least be able to filter out
> development releases if maintainers are not doing them. Sometimes we
> also skip releases if they don't introduce anything meaningful to Gentoo
> users. Finally, some developers reject new versions of software for
> various reasons.
>

I've also considered just using time. Many *packages* have not been
touched in N time. While some software doesn't get updates often, even
routine maintenance should require edits on a fairly regular basis.

>
>
> Bugzilla-based inactivity?
> ==========================
> I've noticed something interesting in Fedora lately. They have a policy
> that if a package build failure is reported (note: they are reporting
> them automatically) and the maintainer does not update it from the 'NEW'
> state, it is automatically orphaned after 8 weeks. Effectively,
> if the maintainer does not take care (or at least pretends to)
> of the package, it is orphaned automatically.
>
> I suppose we might be able to look for a similar policy in Gentoo.
> However, there are two obvious counterarguments.
> Firstly, this would
> create 'busywork' that people would be required to do in order to
> prevent from orphaning their packages. Secondly, a fair number of
> developers would just do this 'busywork' to every new bug just to avoid
> the problem, rendering the measure ineffective.
>

Avoid letting the perfect be the enemy of the good here. Any metric can
be gamed by developers, but we still have to choose some metric to drive
the organization. I'm fairly sure not *all* developers will automate this
busywork, because *some* of us want to see the number of unmaintained
packages reduced, so the result is still a net win.

>
>
> What can we actually do?
> ========================
> Do you have any specific ideas how we could actually improve
> the situation? I'm particularly looking for things we could do at least
> semi-automatically, without having to spend tremendous effort looking
> through thousands of unhandled bugs manually.
>

So I'd recommend not tying this to a specific implementation, which
means not triggering off of any single signal.

Signals:
 1) euscan first, because it's the most accurate and plausibly already
    implemented.
 2) Date-based scanning; it's trivial to implement.

So now, for each package, we have 2 straightforward signals: when was it
last touched, and how many versions behind is it?

Rules:
A package is unmaintained if it:
 - Has not been touched in 5 years
 - Is behind 3 versions AND hasn't been touched in 2 years
 - Is behind 5 versions AND hasn't been touched in 1 year

As we add more signals (e.g. doesn't build, or unfixed bugs) we can add
additional rules.

We could generate a QA report per package on the QA reports page. If
there is an API for requesting the QA report, we could cross-link from
p.g.o.

-A

> --
> Best regards,
> Michał Górny
>
>
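
P.S. A rough sketch of how the rules above could be evaluated once both
signals are available. This is only an illustration under assumptions:
the `PackageSignals` type, its field names, and the `RULES` table are
hypothetical, "behind N versions" is read as "at least N", and "touched"
is measured in whole years. Nothing here calls euscan or any real Gentoo
tooling; the two signals are assumed to have been collected already.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PackageSignals:
    """Hypothetical per-package record holding the two signals."""
    name: str
    last_touched: date      # date of the last change to the package
    versions_behind: int    # upstream releases newer than ours


# Each rule is (minimum versions behind, minimum whole years untouched).
# Assumption: "behind 3 versions" means "at least 3 versions behind".
RULES = [
    (0, 5),  # not touched in 5 years, regardless of version lag
    (3, 2),  # behind 3 versions AND not touched in 2 years
    (5, 1),  # behind 5 versions AND not touched in 1 year
]


def years_since(then: date, today: date) -> int:
    """Whole years elapsed between `then` and `today`."""
    years = today.year - then.year
    if (today.month, today.day) < (then.month, then.day):
        years -= 1
    return years


def is_unmaintained(pkg: PackageSignals, today: date) -> bool:
    """True if any rule in RULES matches the package's signals."""
    idle = years_since(pkg.last_touched, today)
    return any(
        pkg.versions_behind >= behind and idle >= years
        for behind, years in RULES
    )
```

Keeping the thresholds in a table means a new signal (build failures,
unfixed bugs) only needs a wider tuple and one more comparison, not a
rewrite of the check.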