On Sat, Mar 23, 2019 at 10:38 AM Michał Górny <mgorny@gentoo.org> wrote:
On Sat, 2019-03-23 at 10:05 -0700, Raymond Jennings wrote:
> On Sat, Mar 23, 2019 at 7:18 AM Alec Warner <antarus@gentoo.org> wrote:
>
> >
> > On Sat, Mar 23, 2019 at 3:32 AM Michał Górny <mgorny@gentoo.org> wrote:
> >
> > > Hi,
> > >
> > > Gentoo is still having a major problem of unmaintained packages.
> > > I'm not talking about pure 'maintainer-needed' here but packages that
> > > have apparent maintainers and stay under the radar for long, harming
> > > users in the process.  I'd like to ask for potential solutions as to how we
> > > could improve this and look for new maintainers sooner.
> > >
> > >
> > > The current state
> > > =================
> > > The definition of an unmaintained package here is a bit blurry.  For our
> > > needs, let's say that an unmaintained package is a package that is not
> > > getting the attention of any of its maintainers, whose bugs are not looked
> > > at, that does not receive version bumps or simply fails to build for
> > > a long time.
> > >
> > > This is especially the case with 'revived herds', i.e. projects that
> > > were formed from old herds.  Their main characteristic is that they
> > > 'maintain' a large number of loosely-related packages, and their
> > > developers take care of only a small subset of them.  Sadly, we still
> > > have people who cherish that model, and instead of taking packages they
> > > care about themselves, they shove them into one of 'their' herds.
> > >
> > > So far we're rarely catching such cases directly.  Sometimes it happens
> > > when another developer tries to use the package and notices the problem,
> > > then finds that it's been reported a long time ago and never received
> > > any attention.
> > >
> > > Sometimes, after retiring a developer we notice that he had 'maintained'
> > > packages that were broken for years and never received any attention.
> > > There are even real cases of developers taking over broken packages just
> > > to prevent them from being lastrited but without ever fixing them.
> > >
> > > Then, some of the packages are noticed as a result of major API update
> > > trackers, such as the openssl-1.1+ tracker or ncurses[tinfo] tracker.
> > > Those API changes provoke build failures, and while investigating them
> > > we discover that some of the software hasn't seen any upstream attention
> > > since 2000 (!), not to mention maintainers that could actually patch
> > > the issues.
> > >
> > >
> > > Version bump-based inactivity?
> > > ==============================
> > > One of the options would be to treat failure to bump packages as a sign
> > > of inactivity.  With euscan and/or repology, we are at least able to
> > > partially monitor and report new versions of software (I think someone
> > > used to do that but I don't see those reports anymore).  While this
> > > still requires some manual processing (esp. given that repology results
> > > are sometimes mistaken), it would be a step forward.
> > >
> > > The counterargument to doing this is that not all version bumps are
> > > meaningful to Gentoo.  We'd have to at least be able to filter out
> > > development releases if maintainers are not packaging them.  Sometimes we
> > > also skip releases if they don't introduce anything meaningful to Gentoo
> > > users.  Finally, some developers reject new versions of software for
> > > various reasons.
> > >
> >
> > I've also considered just using time.
> >
> > Many *packages* have not been touched in N time. While some software
> > doesn't get updates often, even routine maintenance should require edits on
> > a fairly regular basis.
> >
> >
> > >
> > > Bugzilla-based inactivity?
> > > ==========================
> > > I've noticed something interesting in Fedora lately.  They have a policy
> > > that if a package build failure is reported (note: they are reporting
> > > them automatically) and the maintainer does not update it from the 'NEW'
> > > state, it is automatically orphaned after 8 weeks.  Effectively,
> > > if the maintainer does not take care (or at least pretends to)
> > > of the package, it is orphaned automatically.
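
A minimal sketch of the timeout rule as described above, for illustration
only.  It is based solely on the description in this message, not on
Fedora's actual tooling; the names `ORPHAN_AFTER` and `should_orphan` and
the plain-string bug status are hypothetical.

```python
from datetime import date, timedelta

# Assumption: the grace period is the 8 weeks mentioned in the message.
ORPHAN_AFTER = timedelta(weeks=8)


def should_orphan(status: str, reported_on: date, today: date) -> bool:
    """Return True if a build-failure bug has sat in 'NEW' for 8 weeks or more.

    Any status other than 'NEW' counts as the maintainer having reacted,
    which is exactly the 'busywork' loophole discussed in this thread.
    """
    return status == "NEW" and (today - reported_on) >= ORPHAN_AFTER
```

For example, a bug reported on 2019-01-01 and still 'NEW' on 2019-03-23
would trigger orphaning, while the same bug moved to any other state
would not.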
> > >
> > > I suppose we might be able to look for a similar policy in Gentoo.
> > > However, there are two obvious counterarguments.  Firstly, this would
> > > create 'busywork' that people would be required to do in order to
> > > prevent their packages from being orphaned.  Secondly, a fair number of
> > > developers would just do this 'busywork' to every new bug just to avoid
> > > the problem, rendering the measure ineffective.
> > >
> >
> > Avoid letting the perfect be the enemy of the good here. Any metric can be
> > gamed by developers; but it turns out we must choose some metric to drive
> > the organization. I'm fairly sure not *all* developers will automate this
> > busywork; because *some* of us want to see the number of unmaintained
> > packages reduced; resulting in a net-win.
> >
> >
> > >
> > > What can we actually do?
> > > ========================
> > > Do you have any specific ideas how we could actually improve
> > > the situation?  I'm particularly looking for things we could do at least
> > > semi-automatically, without having to spend tremendous effort looking
> > > through thousands of unhandled bugs manually.
> > >
> >
> > So I'd recommend avoiding a specific implementation; which means don't
> > trigger off of a specific signal.
> >
> > Signals:
> > 1) euscan first; because it's the most accurate and plausibly already
> > implemented.
> > 2) Date-based scanning; it's trivial to implement.
> >
> > So now for each package, we have 2 straightforward signals. When was it
> > last touched, how many versions behind?
> >
> > Rules:
> > A package is unmaintained if it:
> >   - Has not been touched in 5 years
> >   - Is behind 3 versions AND hasn't been touched in 2 years
> >   - Is behind 5 versions AND hasn't been touched in 1 year
> >
> > As we add more signals (e.g. doesn't build, or unfixed bugs) we can add
> > additional rules.
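
The three rules above could be sketched roughly as follows.  This is only
an illustration under stated assumptions: "behind N versions" is read as
"at least N versions behind", a year is approximated as 365.25 days, and
the `PackageSignals` record and function names are hypothetical (how the
two signals would actually be collected from euscan or the repository
history is not covered here).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PackageSignals:
    """Hypothetical per-package record holding the two signals."""
    name: str
    last_touched: date      # date of the last change to the package
    versions_behind: int    # upstream releases newer than the packaged one


def years_idle(pkg: PackageSignals, today: date) -> float:
    # Assumption: approximate a year as 365.25 days.
    return (today - pkg.last_touched).days / 365.25


def is_unmaintained(pkg: PackageSignals, today: date) -> bool:
    idle = years_idle(pkg, today)
    if idle >= 5:                               # not touched in 5 years
        return True
    if pkg.versions_behind >= 3 and idle >= 2:  # 3 behind AND 2 years idle
        return True
    if pkg.versions_behind >= 5 and idle >= 1:  # 5 behind AND 1 year idle
        return True
    return False
```

Keeping each rule as its own condition should make it easy to append
further rules once more signals (build failures, unfixed bugs) exist.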
> >
> > We could generate a QA report per package on the qa reports page.
> > If there is an API for requesting the QA report, we could cross-link from
> > p.g.o.
> >
> > -A
> >
> >
> >
> > > --
> > > Best regards,
> > > Michał Górny
> > >
> > >
> As a side observation, I'd like to exempt a package from being flagged as
> unmaintained if there's nothing wrong with it.  If upstream is idle and the
> package is in a quiet state simply because there's no work needing to be
> done, then the package should be left alone.

This is the attitude that means that a few months later a single person is
overburdened with a few dozen unmaintained packages all suddenly
falling apart.  Just like ncurses[tinfo].  Or openssl-1.1.

I wanted to point out that a package shouldn't be flagged as unmaintained in the first place unless there is a reason for it to need maintenance.  Packages with nothing wrong should be weeded out as candidates under the principle of "if it isn't broke don't fix it", since there's nothing wrong with such a package remaining at the status quo.

As it is, the phase 4 I proposed is meant to catch broken packages that either a) have no maintainer at all, or b) have a maintainer who is completely incommunicado, and not just busy.

To clarify context though, could you give an example, however hypothetical, of "all suddenly falling apart"?  Perhaps you mean a package that is a widespread dependency, and its revdeps all break at the same time due to some sort of API change or the like?  Is this what you meant by ncurses and openssl-1.1?

--
Best regards,
Michał Górny